Overview
SHA1 Hash: | 6680679c2e27c15dbd879fa99dc4534698cca5aa |
---|---|
Date: | 2007-11-24 14:06:38 |
User: | drh |
Comment: | Documentation updates. |
Tags And Properties
- branch=trunk inherited from [a28c83647d]
- sym-trunk inherited from [a28c83647d]
Changes
Modified www/fileformat.html from [d480398398] to [66d0102569].
@@ -24,30 +24,43 @@ its hash is computed. The name of a artifact in the repository is exactly the same SHA1 hash that is computed by sha1sum on the file as it exists in your source tree.</p> <p> -Some artifacts have a particular format which qualifies them -as "manifests". A manifest assigns filenames to a subset -of the artifacts in the repository, in order to provide a -snapshot of the state of the project at a point in time. -Each manifest corresponds to a version or baseline -of the project. -</p> +Some artifacts have a particular format which gives them special +meaning to fossil. Fossil recognizes:</p> + +<ul> +<li> Manifests </li> +<li> Clusters </li> +<li> Control Artifacts </li> +<li> Wiki Pages </li> +<li> Ticket Changes </li> +</ul> + +<p>These five artifact types are described in the sequel.</p> <h2>1.0 The Manifest</h2> + +<p>A manifest defines a baseline or version of the project +source tree. The manifest contains a list of artifacts for +each file in the project and the corresponding filenames, as +well as information such as parent baselines, the name of the +programmer who created the baseline, the date and time when +the baseline was created, and any check-in comments associated +with the baseline.</p> <p> Any artifact in the repository that follows the syntactic rules of a manifest is a manifest. Note that a manifest can be both a real manifest and also a content file, though this is rare. </p> <p> -A manifest is a line-oriented text file. Newline characters -(ASCII 0x0a) separate lines. Each line is called a "card". +A manifest is a text file. Newline characters +(ASCII 0x0a) separate the file into "cards". Each card begins with a single character "card type". Zero or more arguments may follow the card type. All arguments are separated from each other and from the card-type character by a single space character. 
There is no surplus white space between arguments @@ -186,11 +199,11 @@ <p> Allowed cards in the cluster are as follows: </p> <blockquote> -<b>M</b> <i>uuid</i> +<b>M</b> <i>uuid</i><br /> <b>Z</b> <i>checksum</i> </blockquote> <p> A cluster contains one or more "M" cards followed by a single "Z" @@ -219,11 +232,11 @@ Allowed cards in a control artifact are as follows: </p> <blockquote> <b>D</b> <i>time-and-date-stamp</i><br /> -<b>T</b> <i>tag-name uuid ?value?</i><br /> +<b>T</b> (<b>+</b>|<b>-</b>|<b>*</b>)<i>tag-name uuid ?value?</i><br /> <b>Z</b> <i>checksum</i><br /> </blockquote> <p> A control artifact must have one D card and one Z card and @@ -240,13 +253,15 @@ to which the tag is to be applied. The first value is the tag name. The first character of the tag is either "+", "-", or "*". A "+" means the tag should be added to the artifact. The "-" means the tag should be removed. The "*" character means the tag should be added to the artifact -and all direct decendents (but not branches) of the artifact. +and all direct decendents (but not branches) of the artifact down +to but not including the first decendent that contains a +more recent "-" tag with the same name. The optional third argument is the value of the tag. A tag -without a value is considered to be a boolean.</p> +without a value is a boolean.</p> <p>When two or more tags with the same name are applied to the same artifact, the tag with the latest (most recent) date is used.</p> @@ -254,30 +269,73 @@ to a baseline will override the check-in comment of that baseline for display purposes.</p> <h2>4.0 Wiki Pages</h2> -<p>A wiki page is an artifact in a format similar to manifests, +<p>A wiki page is an artifact with a format similar to manifests, clusters, and control artifacts. The artifact is divided into cards by newline characters. The format of each card is as in manifests, clusters, and control artifacts. 
Wiki artifacts accept the following card types:</p> <blockquote> <b>D</b> <i>time-and-date-stamp</i><br /> <b>L</b> <i>wiki-title</i><br /> +<b>P</b> <i>parent-uuid</i>+<br /> <b>U</b> <i>user-name</i><br /> -<b>W</b> <i>size</i> \n <i>text</i> \n<br /> +<b>W</b> <i>size</i> <b>\n</b> <i>text</i> <b>\n</b><br /> <b>Z</b> <i>checksum</i> </blockquote> +<p>The D card is the date and time when the wiki page was edited. +The P card specifies the parent wiki pages, if any. The L card +gives the name of the wiki page. The U card specifies the login +of the user who made this edit to the wiki page. The Z card is +the usual checksum over the either artifact.</p> + +<p>The W card is used to specify the text of the wiki page. The +argument to the W card is an integer which is the number of bytes +of text in the wiki page. That text follows the newline character +that terminates the W card. The wiki text is always followed by one +extra newline.</p> <h2>5.0 Ticket Changes</h2> + +<p>A ticket-change artifact represents a change to a trouble ticket. +The following cards are allowed on a ticket change artifact:</p> <blockquote> <b>D</b> <i>time-and-date-stamp</i><br /> <b>J</b> ?<b>+</b>?<i>name value</i><br /> <b>K</b> <i>ticket-uuid</i><br /> <b>U</b> <i>user-name</i><br /> <b>Z</b> <i>checksum</i> </blockquote> +<p> +The D card is the usual date and time stamp and represents the point +in time when the change was entered. The U card is the login of the +programmer who entered this change. The Z card is the checksum over +the entire artifact.</p> + +<p> +Every ticket has a UUID. The ticket to which this change is applied +is specified by the K card. A ticket exists if it contains one or +more changes. The first "change" to a ticket is what brings the +ticket into existance.</p> + +<p> +J cards specify changes to "fields" of the ticket. Each fossil +server has a ticket configuration which specifies the fields its +understands. 
This is not a limit on the fields that can appear +on the J cards, however. If a J card specifies a field that a +particular fossil server does not recognize, then that J card +is simply ignored.</p> + +<p> +The first argument of the J card is the field name. The second +value is the field value. If the field name begins with "+" then +the value is appended to the prior value. Otherwise, the value +on the J card replaces any previous value of the field. +The field name and value are both encoded using the character +escapes defined for the C card of a manifest. +</p>
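The J-card rules added in this diff (replace by default, append when the field name begins with "+", ignore fields the server does not recognize) can be sketched as below. This is only an illustration in Python, not fossil's own code: the function name `apply_j_cards`, the `known_fields` set, and the dict used as ticket state are all hypothetical. Decoding of the C-card character escapes is left out, since those escapes are not defined in this part of the document.

```python
def apply_j_cards(ticket, artifact_text, known_fields):
    """Apply the J cards of one ticket-change artifact to a dict of fields.

    ticket        -- dict mapping field name to current value (hypothetical state)
    artifact_text -- text of the artifact, cards separated by newlines
    known_fields  -- field names in this server's ticket configuration
    """
    for card in artifact_text.split("\n"):
        if not card.startswith("J "):
            continue  # D, K, U and Z cards are not handled in this sketch
        parts = card.split(" ")  # arguments are separated by a single space
        name = parts[1]
        value = parts[2] if len(parts) > 2 else ""
        append = name.startswith("+")
        field = name[1:] if append else name
        if field not in known_fields:
            continue  # unrecognized fields are simply ignored
        if append:
            ticket[field] = ticket.get(field, "") + value
        else:
            ticket[field] = value
    return ticket
```

Applying the artifacts of one ticket in order, starting from an empty dict, would yield the ticket's current field values under these assumptions.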
Modified www/selfcheck.html from [48d9461817] to [82d080c276].
@@ -15,10 +15,16 @@ of integrity so that you can have confidence that you will not lose your files. This note describes the defensive measures that fossil uses to help prevent file loss due to bugs. </p> +<p><i>Follow-up as of 2007-11-24:</i> +Fossil has been hosting itself and several other projects for +months now. Many bugs have been encountered. But, thanks in large +part to the defensive measures described here, no data has been +lost. The integrity checks are doing their job well.</p> + <h2>Atomic Check-ins With Rollback</h2> <p> The fossil repository is an <a href="http://www.sqlite.org/">SQLite version 3</a> database file. @@ -42,11 +48,11 @@ <p> The content files that comprise the global state of a fossil respository are stored in the repository as a tree. The leaves of the tree are stored as zlib-compressed BLOBs. Interior nodes are deltas from their -decendents. There is a lot of encoding going on here. There is +decendents. A lot of encoding is going on. There is zlib-compression which is relatively well-tested but still might cause corruption if used improperly. And there is the relatively new delta-encoding mechanism designed expressly for fossil. We want to make sure that bugs in these encoding mechanisms do not lead to loss of data. @@ -54,11 +60,12 @@ <p> To increase our confidence that everything in the repository is recoverable, fossil makes sure it can extract an exact replicate of every content file that it changes just prior to transaction -commit. So during the course of check-in, many different files +commit. So during the course of check-in (or other repository +operation) many different files in the repository might be modified. Some files are simply compressed. Other files are delta encoded and then compressed. While all this is going on, fossil makes a record of every file that is encoded and the SHA1 hash of the original content of that file. 
Then just before transaction commit, fossil re-extracts @@ -68,25 +75,25 @@ message is printed and the transaction rolls back. </p> <p> So, in other words, fossil always checks to make sure it can -re-extract a file before it commits a check-in of that file. +re-extract a file before it commits a change to that file. Hence bugs in fossil are unlikely to corrupt the repository in a way that prevents us from extracting historical versions of files. </p> <h2>Checksum Over All Files In A Baseline</h2> <p> -Manifest files that define a baseline have two fields (the -R-line and Z-line) that record MD5 hashs of the manifest itself +Manifest artifacts that define a baseline have two fields (the +R-card and Z-card) that record MD5 hashs of the manifest itself and of all other files in the manifest. Prior to any check-in commit, these checksums are verified to ensure that the baseline checked in agrees exactly with what is on disk. Similarly, the repository checksum is verified after a checkout to make sure that the entire repository was checked out correctly. Note that these added checks use a different hash (MD5 instead of SHA1) in order to avoid common-mode failures in the hash algorithm implementation. </p>
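The verify-before-commit idea described in selfcheck.html can be sketched roughly as follows. This is a simplified Python illustration, not fossil's implementation: the `blob` table, the function `store_with_selfcheck`, and the `encode` parameter are assumptions made for the example, and delta encoding is omitted so that only zlib compression is exercised.

```python
import hashlib
import sqlite3
import zlib


def store_with_selfcheck(db, files, encode=zlib.compress):
    """Write encoded blobs, then re-extract and verify each one before commit.

    db     -- sqlite3 connection with a table blob(name, data) (hypothetical schema)
    files  -- dict mapping name to original content as bytes
    encode -- encoder under test; the default is plain zlib compression
    """
    pending = []  # (name, SHA1 of the original content) for every blob written
    try:
        for name, content in files.items():
            db.execute(
                "INSERT OR REPLACE INTO blob(name, data) VALUES (?, ?)",
                (name, encode(content)),
            )
            pending.append((name, hashlib.sha1(content).hexdigest()))
        # Just before commit: re-extract everything that was changed.
        for name, expected in pending:
            (data,) = db.execute(
                "SELECT data FROM blob WHERE name = ?", (name,)
            ).fetchone()
            if hashlib.sha1(zlib.decompress(data)).hexdigest() != expected:
                raise ValueError(f"self-check failed for {name}")
        db.commit()
    except Exception:
        db.rollback()  # a failed check leaves the repository untouched
        raise
```

Passing a deliberately faulty `encode` function is one way to see the rollback path: the hash mismatch raises an error and none of the writes from that call are kept.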