dirfile-format (5) Linux Manual Page
dirfile-format — the dirfile database format specification file
Description
The dirfile format file fully specifies the raw and derived time streams and auxiliary information for a dirfile(5) database. The format file is a case-sensitive text file called format located in the dirfile directory. The explicit text encoding of the file is not specified by these standards, but must be 7-bit ASCII compatible. Examples of acceptable character encodings include all the ISO 8859 character sets (i.e. Latin-1 through Latin-10, among others), as well as the UTF-8 encoding of Unicode and UCS.
Syntax
The format file is composed of field specification lines and directive lines, optionally separated by blank lines or lines containing only whitespace. Lines are separated by the line-feed character (0x0A). Unless escaped (see below), the hash mark (#) is the comment delimiter; the comment delimiter, and any text following it to the end of the line, is ignored.
Tokens
Both field specification lines and directive lines consist of several tokens separated by whitespace. Whitespace consists of one or more whitespace characters. These are: space (0x20), horizontal tab (0x09), vertical tab (0x0B), form-feed (0x0C), and carriage return (0x0D). The first token of a directive line is always a reserved word. The first token of a field specification line is never a reserved word. Any amount of whitespace may precede the first token on a line.
Since tokens are separated by whitespace, a whitespace character can be included in a token in one of three ways: it may be escaped by preceding it with a backslash character (\), it may be replaced by a character escape sequence (see below), or the token may be enclosed in quotation marks ("). The quotation marks themselves will be stripped from the token. The null-token (that is, the token consisting of zero characters) may be specified by a pair of quotation marks with nothing between them (""). To include a literal quotation mark in a token, it must be escaped (\"). Similarly, a hash mark may be included in a token by including it in a quoted token or else by escaping it (\#); otherwise the hash mark will be understood as the comment delimiter.
It is a syntax error to have a line which contains unmatched quotation marks, or in which the last character is the backslash character.
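The tokenisation rules above can be illustrated with a short Python sketch. It is not taken from any dirfile implementation, and the name split_tokens is hypothetical. The sketch makes several assumptions that the text above does not settle: a backslash simply makes the following character literal (character escape sequences are not interpreted here), backslash escapes are honoured inside quoted text as well, and a backslash inside a comment is ignored.

```python
WHITESPACE = " \t\v\f\r"  # the five whitespace characters listed above


def split_tokens(line):
    """Split one format-file line into tokens (hypothetical helper)."""
    tokens = []
    current = []       # characters of the token being built
    in_token = False   # True once a token has started, even if still empty
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            if i + 1 >= len(line):
                raise ValueError("line ends with a backslash")
            # Assumption: the escaped character is taken literally.
            current.append(line[i + 1])
            in_token = True
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
            in_token = True          # lets "" produce the null-token
        elif in_quotes:
            current.append(ch)       # whitespace and # are literal here
        elif ch == "#":
            break                    # unescaped, unquoted hash: comment
        elif ch in WHITESPACE:
            if in_token:
                tokens.append("".join(current))
                current, in_token = [], False
        else:
            current.append(ch)
            in_token = True
        i += 1
    if in_quotes:
        raise ValueError("unmatched quotation mark")
    if in_token:
        tokens.append("".join(current))
    return tokens


print(split_tokens('  alpha "two words" b\\ c "" x\\#y # comment'))
# ['alpha', 'two words', 'b c', '', 'x#y']
```

The in_token flag is kept separately from the character buffer so that an empty pair of quotation marks still yields a token of zero characters.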
Several characters when escaped by a preceding backslash character are interpreted as special characters in tokens. The character escape sequences are:
\a - an alert (bell) character (ASCII 0x07 / U+0007)
\b - a backspace character (ASCII 0x08 / U+0008)
\e - an escape character (ASCII 0x1B / U+001B)
\f - a form-feed character (ASCII 0x0C / U+000C)
\n - a line-feed character (ASCII 0x0A / U+000A)
\r - a carriage return character (ASCII 0x0D / U+000D)
\t - a horizontal tab character (ASCII 0x09 / U+0009)
\v - a vertical tab character (ASCII 0x0B / U+000B)
\\ - a backslash character (ASCII 0x5C / U+005C)
\ooo - the single byte given by the octal number ooo.
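As a rough illustration of how these escape sequences might be decoded, the following Python sketch converts the raw bytes of a token into its final value. The function name decode_escapes is hypothetical and this is not the reference implementation. It assumes the conventional letters for the named escapes (\a, \b, \e, \f, \n, \r, \t, \v, \\), that an octal escape consists of one to three octal digits, that an octal value above 0xFF is an error, and that any other escaped character stands for itself; none of these details is fixed by the text above.

```python
# Named escapes: byte following the backslash -> resulting byte.
SIMPLE_ESCAPES = {
    ord("a"): 0x07, ord("b"): 0x08, ord("e"): 0x1B,
    ord("f"): 0x0C, ord("n"): 0x0A, ord("r"): 0x0D,
    ord("t"): 0x09, ord("v"): 0x0B, ord("\\"): 0x5C,
}
OCTAL_DIGITS = b"01234567"
BACKSLASH = 0x5C


def decode_escapes(raw):
    """Decode backslash escapes in a token given as bytes (hypothetical)."""
    out = bytearray()
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte != BACKSLASH:
            out.append(byte)
            i += 1
            continue
        i += 1                      # move past the backslash
        if i >= len(raw):
            raise ValueError("token ends with a backslash")
        nxt = raw[i]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 1
        elif nxt in OCTAL_DIGITS:
            # Assumption: consume at most three octal digits.
            j = i
            while j < len(raw) and j < i + 3 and raw[j] in OCTAL_DIGITS:
                j += 1
            value = int(raw[i:j], 8)
            if value > 0xFF:
                raise ValueError("octal escape does not fit in one byte")
            out.append(value)
            i = j
        else:
            # Assumption: any other escaped byte (space, ", #) is literal.
            out.append(nxt)
            i += 1
    return bytes(out)


print(decode_escapes(rb"a\tb\101\\c\#"))
# b'a\tbA\\c#'
```

The sketch works on bytes rather than text because an octal escape denotes a single byte, which need not be a complete character in the file's encoding.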
