Artifact 8a7ce84d6007a69b20f45443f1ee95519ea958eb
File www/fileformat.html part of check-in [b807acf62e] - Documentation updates by drh on 2007-07-24 12:52:32.
Fossil File Formats
The global state of a fossil repository is determined by an unordered set of files. Some files, such as those that represent wiki pages, trouble tickets, and the special "manifest" file, have a specific and well-defined format. Other files simply hold the content of files in the project. Files can be text or binary.
Each file in the repository is named by its SHA1 hash. Some files have a particular format which qualifies them as "manifests". A manifest assigns filenames to a subset of the files in the repository, in order to provide a snapshot of the state of the project at a point in time. Each manifest file corresponds to a version or baseline of the project.
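As a minimal sketch of this naming scheme, the Python snippet below derives a file's name from its content. The helper name artifact_name is hypothetical, and the uppercase spelling of the hash is an assumption borrowed from the F-line description in section 1.0.

```python
import hashlib

def artifact_name(content: bytes) -> str:
    """Return the 40-character SHA1 name for a file's content.

    Hypothetical helper; uppercase hex is assumed, following the
    convention this document gives for F-line and P-line hashes.
    """
    return hashlib.sha1(content).hexdigest().upper()

name = artifact_name(b"hello world\n")
print(name, len(name))
```

Because the name depends only on the content, two files with identical content share a single name in the repository.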
1.0 The Manifest File
Any file in the repository that follows the syntactic rules of a manifest is a manifest. Note that a manifest can be both a real manifest and also a content file, though this is rare.
A manifest is a line-oriented text file. Newline characters (ASCII 0x0a) separate lines. Each line begins with a single character "line type". Zero or more arguments may follow the line type. All arguments are separated from each other and from the line-type character by a single space character. There is no surplus white space between arguments and no leading or trailing whitespace except for the newline character that acts as the line separator.
All lines of the manifest occur in strict sorted lexicographical order. No line may be duplicated. The entire manifest file may be PGP clear-signed, but otherwise it may contain no additional text or data beyond what is described here.
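The ordering rule can be checked mechanically. The sketch below assumes that "lexicographical" means byte-wise comparison of whole lines and that any PGP clear-signature wrapper has already been removed; the function name lines_strictly_sorted is hypothetical.

```python
def lines_strictly_sorted(manifest: bytes) -> bool:
    """Check that manifest lines are in strictly increasing order.

    Assumptions (not stated in the specification): lines are compared
    byte-wise, and the input carries no PGP clear-signature wrapper.
    """
    # Drop the newline that follows the final line before splitting.
    body = manifest[:-1] if manifest.endswith(b"\n") else manifest
    lines = body.split(b"\n")
    # Strict "<" rejects both out-of-order lines and duplicated lines.
    return all(a < b for a, b in zip(lines, lines[1:]))
```

Using a strict comparison means a single pass covers both requirements: sorted order and no duplicate lines.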
Allowed lines in the manifest are as follows:
C checkin-comment
D time-and-date-stamp
F filename SHA1-hash
P SHA1-hash+
R repository-checksum
U user-login
Z manifest-checksum
A manifest must have exactly one C-line. The sole argument to the C-line is a check-in comment that describes the baseline that the manifest defines. The check-in comment is text. The following escape sequences are applied to the text: A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73). A newline (ASCII 0x0A) is represented as "\n" (ASCII 0x5C, 0x6E). A backslash (ASCII 0x5C) is represented as two backslashes "\\". Apart from space and newline, no other whitespace characters are allowed in the check-in comment. Nor are any unprintable characters allowed in the comment.
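The escaping rules for the comment text can be sketched as a pair of Python functions. The names encode_text and decode_text are hypothetical, and the sketch does not check for disallowed whitespace or unprintable characters.

```python
def encode_text(text: str) -> str:
    """Apply the C-line escapes: backslash, space, and newline."""
    # Escape backslashes first so the backslashes introduced by the
    # later replacements are not escaped a second time.
    return (text.replace("\\", "\\\\")
                .replace(" ", "\\s")
                .replace("\n", "\\n"))

def decode_text(escaped: str) -> str:
    """Reverse encode_text; reject escapes the document does not define."""
    mapping = {"s": " ", "n": "\n", "\\": "\\"}
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == "\\":
            if i + 1 >= len(escaped) or escaped[i + 1] not in mapping:
                raise ValueError("invalid escape sequence")
            out.append(mapping[escaped[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Decoding scans left to right rather than chaining replacements, because a chained approach would misread an escaped backslash followed by a literal "s" or "n".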
A manifest must have exactly one D-line. The sole argument to the D-line is a date-time stamp in the ISO8601 format. The date and time should be in coordinated universal time (UTC). The format is:
YYYY-MM-DDTHH:MM:SS
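A D-line in this format can be parsed with the Python standard library, as in the sketch below. The function name parse_d_line is hypothetical. The regular expression is there because strptime on its own also accepts fields without zero padding, which the format above does not appear to allow.

```python
import re
from datetime import datetime, timezone

_D_ARG = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}")

def parse_d_line(line: str) -> datetime:
    """Parse 'D YYYY-MM-DDTHH:MM:SS' into a UTC-aware datetime."""
    kind, _, arg = line.partition(" ")
    if kind != "D" or not _D_ARG.fullmatch(arg):
        raise ValueError("malformed D-line")
    # strptime raises ValueError for impossible dates such as month 13.
    stamp = datetime.strptime(arg, "%Y-%m-%dT%H:%M:%S")
    # The document says the stamp should be UTC, so attach that zone.
    return stamp.replace(tzinfo=timezone.utc)
```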
A manifest has zero or more F-lines. Each F-line defines a file (other than the manifest itself) which is part of the baseline that the manifest defines. There are two arguments. The first argument is the pathname of the file in the baseline relative to the root of the project file hierarchy. No ".." or "." directories are allowed within the filename. Space characters are escaped as in C-line comment text. Backslash characters and newlines are not allowed within filenames. The directory separator character is a forward slash (ASCII 0x2F). The second argument to the F-line is the full 40-character hexadecimal SHA1 hash of the file content. Upper-case letters ABCDEF are used for the higher digits of the hexadecimal.
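The F-line rules translate into a small validator. The sketch below uses a hypothetical function name, parse_f_line, and assumes that the only place a backslash may appear in the encoded pathname is inside the "\s" escape for a space.

```python
import re
from typing import Tuple

_SHA1_UPPER = re.compile(r"[0-9A-F]{40}")

def parse_f_line(line: str) -> Tuple[str, str]:
    """Parse 'F filename SHA1-hash' and return (pathname, hash)."""
    fields = line.split(" ")
    if len(fields) != 3 or fields[0] != "F":
        raise ValueError("expected: F filename SHA1-hash")
    _, escaped_name, sha1 = fields
    # Assumption: a backslash is legal only as part of the "\s" escape.
    if "\\" in escaped_name.replace("\\s", ""):
        raise ValueError("backslash not allowed in filename")
    name = escaped_name.replace("\\s", " ")
    # No "." or ".." directory components are allowed.
    if any(part in (".", "..") for part in name.split("/")):
        raise ValueError("'.' and '..' are not allowed in filename")
    if not _SHA1_UPPER.fullmatch(sha1):
        raise ValueError("hash must be 40 uppercase hexadecimal digits")
    return name, sha1
```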
A manifest has zero or one P-line. Most manifests have one P-line. The P-line has a varying number of arguments that define other manifests from which the current manifest is derived. Each argument is a 40-character uppercase hexadecimal SHA1 of a predecessor manifest. All arguments to the P-line must be unique to that line. The first predecessor is the manifest's direct ancestor. Other arguments define manifests with which the first was merged to yield the current manifest. Most manifests have a P-line with a single argument. The first manifest in the project has no ancestors and thus has no P-line.
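A sketch of how a reader of the format might split a P-line into the direct ancestor and the merged-in manifests follows. The function name parse_p_line and the shape of the return value are illustrative choices, not part of the specification.

```python
import re
from typing import List, Tuple

_SHA1_UPPER = re.compile(r"[0-9A-F]{40}")

def parse_p_line(line: str) -> Tuple[str, List[str]]:
    """Parse 'P SHA1-hash+' and return (direct_ancestor, merged_manifests)."""
    fields = line.split(" ")
    if fields[0] != "P" or len(fields) < 2:
        raise ValueError("expected: P followed by one or more hashes")
    parents = fields[1:]
    if not all(_SHA1_UPPER.fullmatch(p) for p in parents):
        raise ValueError("each argument must be 40 uppercase hex digits")
    # All arguments must be unique to the line.
    if len(set(parents)) != len(parents):
        raise ValueError("duplicate predecessor")
    return parents[0], parents[1:]
```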
A manifest may optionally have a single R-line. The R-line has a single argument which is the MD5 checksum of all files in the baseline except the manifest itself. The checksum is expressed as 32 characters of uppercase hexadecimal. The checksum is computed as follows: For each file in the baseline (except for the manifest itself) in strict sorted lexicographical order, take the pathname of the file relative to the root of the repository, append a single space (ASCII 0x20), the size of the file in ASCII decimal, a single newline character (ASCII 0x0A), and the complete text of the file. Compute the MD5 checksum of the result.
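The computation described above can be sketched in Python as follows. Several points are assumptions, since the text leaves them open: files are ordered by byte-wise comparison of their pathnames, pathnames are used unescaped, and a single MD5 is taken over the concatenation of all per-file records. The function name repository_checksum is hypothetical.

```python
import hashlib
from typing import Dict

def repository_checksum(files: Dict[bytes, bytes]) -> str:
    """Compute an R-line style checksum over {pathname: content}.

    Assumptions: byte-wise pathname ordering, unescaped pathnames,
    and one running MD5 across all files in the baseline.
    """
    md5 = hashlib.md5()
    for path in sorted(files):
        content = files[path]
        # Record header: pathname, space, decimal size, newline.
        md5.update(path + b" " + str(len(content)).encode("ascii") + b"\n")
        # Followed by the complete content of the file.
        md5.update(content)
    return md5.hexdigest().upper()

print(repository_checksum({b"README": b"hello\n", b"src/main.c": b""}))
```

Feeding the digest incrementally avoids building one large buffer when the baseline contains many files.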
Each manifest has a single U-line. The argument to the U-line is the login of the user who created the manifest. The login name is encoded using the same character escapes as are used for the check-in comment argument to the C-line.
A manifest has an optional Z-line as its last line. The argument to the Z-line is a 32-character uppercase hexadecimal MD5 hash of all prior lines of the manifest up to and including the newline character that immediately precedes the "Z". The Z-line is just a sanity check to prove that the manifest is well-formed and consistent.
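A minimal sketch of the Z-line check is shown below. It assumes the manifest is not PGP clear-signed (or that the signature wrapper has already been removed), and it treats a manifest without a Z-line as passing, since the line is optional. The function name z_line_ok is hypothetical.

```python
import hashlib

def z_line_ok(manifest: bytes) -> bool:
    """Verify the optional trailing Z-line of a manifest."""
    # The Z-line is the last line, so look for its final occurrence.
    marker = manifest.rfind(b"\nZ ")
    if marker < 0:
        return True  # no Z-line present, so there is nothing to verify
    # Hash everything up to and including the newline before the "Z".
    covered = manifest[: marker + 1]
    claimed = manifest[marker + 3 :].rstrip(b"\n")
    actual = hashlib.md5(covered).hexdigest().upper().encode("ascii")
    return actual == claimed
```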
2.0 Trouble Tickets
Each trouble ticket is a file in the repository and appears in a manifest for every baseline in which the ticket exists. Trouble tickets occur in a specific subdirectory of the file hierarchy. The name of the subdirectory that contains tickets is part of the local state of each repository. The filename of each trouble ticket has a ".tkt" suffix. The trouble ticket has a particular file format defined below.
To be continued...
3.0 Wiki Pages
Each wiki page is a file in the repository and appears in a manifest for every baseline in which that wiki page exists. Wiki pages occur in a specific subdirectory of the file hierarchy. The name of the subdirectory that contains wiki pages is part of the local state of each repository. The filename of each wiki page has a ".wiki" suffix. The base name of the file is the name of the wiki page. The wiki pages have a particular file format defined below.
To be continued...