<h1 align="center">The Fossil Sync Protocol</h1>

<p>Fossil supports the commands <b>push</b>, <b>pull</b>, and <b>sync</b>
for transferring information from one repository to another. The
command is run on the client repository. A URL for the server repository
is specified as part of the command. This document describes what happens
behind the scenes in order to synchronize the information on the two
repositories.</p>

<h2>1.0 Overview</h2>

<p>The global state of a Fossil repository consists of an unordered
collection of artifacts. Each artifact is identified by its SHA1 hash
expressed as a 40-character lower-case hexadecimal string.
Synchronization is simply the process of sharing artifacts between
servers so that all servers have copies of all artifacts. Because
artifacts are unordered, the order in which artifacts are received
at a server is inconsequential. It is assumed that the SHA1 hashes
of artifacts are unique - that every artifact has a different SHA1 hash.
To a first approximation, synchronization proceeds by sharing lists of
SHA1 hashes of available artifacts, then sharing those artifacts that
are not found on one side or the other of the connection. In practice,
a repository might contain millions of artifacts. The list of
SHA1 hashes for this many artifacts can be large.
So optimizations are
employed that usually reduce the number of SHA1 hashes that need to be
shared to a few hundred.</p>

<h2>2.0 Transport</h2>

<p>All communication between client and server is via HTTP requests.
The server is listening for incoming HTTP requests. The client
issues one or more HTTP requests and receives a reply for each
request.</p>

<p>The server might be running as an independent server
using the <b>server</b> command, or it might be launched from
inetd or xinetd using the <b>http</b> command. Or the server might
be launched from CGI. The details of how the server is configured
to "listen" for incoming HTTP requests are immaterial. The important
point is that the server is listening for requests and the client
is the issuer of the requests.</p>

<p>A single push, pull, or sync might involve multiple HTTP requests.
The client maintains state between all requests. But on the server
side, each request is independent.
The server does not preserve
any information about the client from one request to the next.</p>

<h3>2.1 Server Identification</h3>

<p>The server is identified by a URL argument that accompanies the
push, pull, or sync command on the client. (As a convenience to
users, the URL can be omitted on the client command and the same URL
from the most recent push, pull, or sync will be reused. This saves
typing in the common case where the client does multiple syncs to
the same server.)</p>

<p>The client modifies the URL by appending the method name "<b>/xfer</b>"
to the end. For example, if the URL specified on the client command
line is</p>

<blockquote>
http://fossil-scm.hwaci.com/fossil
</blockquote>

<p>Then the URL that is really used to do the synchronization will
be:</p>

<blockquote>
http://fossil-scm.hwaci.com/fossil/xfer
</blockquote>

<h3>2.2 HTTP Request Format</h3>

<p>The client always sends a POST request to the server.
The
general format of the POST request is as follows:</p>

<blockquote><pre>
POST /fossil/xfer HTTP/1.0
Host: fossil-scm.hwaci.com:80
Content-Type: application/x-fossil
Content-Length: 4216

<i>content...</i>
</pre></blockquote>

<p>In the example above, the pathname given after the POST keyword
on the first line is a copy of the URL pathname. The Host: parameter
is also taken from the URL. The content type is always either
"application/x-fossil" or "application/x-fossil-debug". The "x-fossil"
content type is the default.
The only difference is that "x-fossil"
content is compressed using zlib whereas "x-fossil-debug" is sent
uncompressed.</p>

<p>A typical reply from the server might look something like this:</p>

<blockquote><pre>
HTTP/1.0 200 OK
Date: Mon, 10 Sep 2007 12:21:01 GMT
Connection: close
Cache-control: private
Content-Type: application/x-fossil; charset=US-ASCII
Content-Length: 265

<i>content...</i>
</pre></blockquote>

<p>The content type of the reply is always the same as the content type
of the request.</p>

<h2>3.0 Fossil Synchronization Content</h2>

<p>A synchronization request between a client and server consists of
one or more HTTP requests as described in the previous section. This
section details the "x-fossil" content type.</p>

<h3>3.1 Line-oriented Format</h3>

<p>The x-fossil content type consists of zero or more "cards". Cards
are separated by the newline character ("\n"). Leading and trailing
whitespace on a card is ignored.
Blank cards are ignored.</p>

<p>Each card is divided into zero or more space-separated tokens.
The first token on each card is the operator. Subsequent tokens
are arguments. The set of operators understood by servers is slightly
different from the operators understood by clients, though the two
are very similar.</p>

<h3>3.2 Login Cards</h3>

<p>Every message from client to server begins with one or more login
cards. Each login card has the following format:</p>

<blockquote>
<b>login</b> <i>userid nonce signature</i>
</blockquote>

<p>The userid is the name of the user that is requesting service
from the server. The nonce is the SHA1 hash of the remainder of
the message - all text that follows the newline character that
terminates the login card. The signature is the SHA1 hash of
the concatenation of the nonce and the user's password.</p>

<p>For each login card, the server looks up the user and verifies
that the nonce matches the SHA1 hash of the remainder of the
message. It then checks the signature hash to make sure the
signature matches.
If everything
checks out, then the client is granted all privileges of the
specified user.</p>

<p>Privileges are cumulative. There can be multiple successful
login cards. The session privileges are the bit-wise OR of the
privileges of each individual login.</p>

<h3>3.3 File Cards</h3>

<p>Artifacts are transferred using "file" cards. (The name "file"
card comes from the fact that most artifacts correspond to files.)
File cards come in two different formats depending
on whether the artifact is sent directly or as a delta from some
other artifact.</p>

<blockquote>
<b>file</b> <i>artifact-id size</i> <b>\n</b> <i>content</i><br>
<b>file</b> <i>artifact-id delta-artifact-id size</i> <b>\n</b> <i>content</i>
</blockquote>

<p>File cards are different from all other cards in that they are
followed by in-line "payload" data. The content of the artifact
or the artifact delta consists of the first <i>size</i> bytes of the
x-fossil content that immediately follow the newline that
terminates the file card. No other cards have this characteristic.
</p>

<p>The first argument of a file card is the ID of the artifact that
is being transferred.
The artifact ID is the lower-case hexadecimal
representation of the SHA1 hash of the artifact.
The last argument of the file card is the number of bytes of
payload that immediately follow the file card. If the file
card has only two arguments, that means the payload is the
complete content of the artifact. If the file card has three
arguments, then the payload is a delta and the second argument is
the ID of another artifact that is the source of the delta.</p>

<p>File cards are sent in both directions: client to server and
server to client. A delta might be sent before the source of
the delta, so both client and server should remember deltas
and be able to apply them when their source arrives.</p>

<h3>3.4 Push and Pull Cards</h3>

<p>Among the first cards in a client-to-server message are
the push and pull cards. The push card tells the server that
the client is pushing content. The pull card tells the server
that the client wants to pull content. In the event of a sync,
both cards are sent.
The format is as follows:</p>

<blockquote>
<b>push</b> <i>servercode projectcode</i><br>
<b>pull</b> <i>servercode projectcode</i>
</blockquote>

<p>The <i>servercode</i> argument is the repository ID for the
client. The server will only allow the transaction to proceed
if the servercode is different from its own servercode. This
prevents a sync-loop. The <i>projectcode</i> is the identifier
of the software project that the client repository contains.
The projectcode for the client and server must match in order
for the transaction to proceed.</p>

<p>The server will also send a push card back to the client
during a clone. This is how the client determines what project
code to put in the new repository it is constructing.</p>

<h3>3.5 Clone Cards</h3>

<p>A clone card works like a pull card in that it is sent from
client to server in order to tell the server that the client
wants to pull content.
But unlike the pull card, the clone
card has no arguments.</p>

<blockquote>
<b>clone</b>
</blockquote>

<p>In response to a clone message, the server also sends the client
a push message so that the client can discover the projectcode for
this project.</p>

<h3>3.6 Igot Cards</h3>

<p>An igot card can be sent from either client to server or from
server to client in order to indicate that the sender holds a copy
of a particular artifact. The format is:</p>

<blockquote>
<b>igot</b> <i>artifact-id</i>
</blockquote>

<p>The argument of the igot card is the ID of the artifact that
the sender possesses.
The receiver of an igot card will typically check to see if
it also holds the same artifact and if not it will request the artifact
using a gimme card in either the reply or in the next message.</p>

<h3>3.7 Gimme Cards</h3>

<p>A gimme card is sent from either client to server or from server
to client. The gimme card asks the receiver to send a particular
artifact back to the sender.
The format of a gimme card is this:</p>

<blockquote>
<b>gimme</b> <i>artifact-id</i>
</blockquote>

<p>The argument to the gimme card is the ID of the artifact that
the sender wants. The receiver will typically respond to a
gimme card by sending a file card in its reply or in the next
message.</p>

<h3>3.8 Cookie Cards</h3>

<p>A cookie card can be used by a server to record a small amount
of state information on a client. The server sends a cookie to the
client. The client sends the same cookie back to the server on
its next request. The cookie card has a single argument which
is its payload.</p>

<blockquote>
<b>cookie</b> <i>payload</i>
</blockquote>

<p>The client is not required to return the cookie to the server on
its next request. Or the client might send a cookie from a different
server on the next request. So the server must not depend on the
cookie and the server must structure the cookie payload in such
a way that it can tell if the cookie it sees is its own cookie or
a cookie from another server.
(Typically the server will embed
its servercode as part of the cookie.)</p>

<h3>3.9 Error Cards</h3>

<p>If the server discovers anything wrong with a request, it generates
an error card in its reply. When the client sees the error card,
it displays an error message to the user and aborts the sync
operation. An error card looks like this:</p>

<blockquote>
<b>error</b> <i>error-message</i>
</blockquote>

<p>The error message is English text that is encoded in order to
be a single token.
A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73). A
newline (ASCII 0x0A) is "\n" (ASCII 0x5C, 0x6E). A backslash
(ASCII 0x5C) is represented as two backslashes "\\".
Apart from
space and newline, no other whitespace characters nor any
unprintable characters are allowed in
the error message.</p>

<h3>3.10 Unknown Cards</h3>

<p>If either the client or the server sees a card that is not
described above, then it generates an error and aborts.</p>

<h2>4.0 Phantoms And Clusters</h2>

<p>When a repository knows that an artifact exists and knows the ID of
that artifact, but it does not know the artifact content, then it stores that
artifact as a "phantom". A repository will typically create a phantom when
it receives an igot card for an artifact that it does not hold or when it
receives a file card that references a delta source that it does not
hold. When a server is generating its reply or when a client is
generating a new request, it will usually send gimme cards for every
phantom that it holds.</p>

<p>A cluster is a special artifact that tells of the existence of other
artifacts. Any artifact in the repository that follows the syntactic rules
of a cluster is considered a cluster.</p>

<p>A cluster is line-oriented. Each line of a cluster
is a card. The cards are separated by the newline ("\n") character.
Each card consists of a single character card type, a space, and a
single argument. No extra whitespace and no trailing or leading
whitespace is allowed. All cards in the cluster must occur in
strict lexicographical order.</p>

<p>A cluster consists of one or more "M" cards followed by a single
"Z" card. Each M card holds an argument which is an artifact ID for an
artifact in the repository. The Z card has a single argument which is the
lower-case hexadecimal representation of the MD5 checksum of all
preceding M cards up to and including the newline character that
occurred just before the Z that starts the Z card.</p>

<p>Any artifact that does not match the specifications of a cluster
exactly is not a cluster. There must be no extra whitespace in
the artifact. There must be one or more M cards. There must be a
single Z card with a correct MD5 checksum. And all cards must
be in strict lexicographical order.</p>

<h3>4.1 The Unclustered Table</h3>

<p>Every repository maintains a table named "<b>unclustered</b>"
which records the identity of every artifact and phantom it holds that is not
mentioned in a cluster. The entries in the unclustered table can
be thought of as leaves on a tree of artifacts.
Some of the unclustered
artifacts will be other clusters. Those clusters may contain other clusters,
which might contain still more clusters, and so forth. Beginning
with the artifacts in the unclustered table, one can follow the chain
of clusters to find every artifact in the repository.</p>

<h2>5.0 Synchronization Strategies</h2>

<h3>5.1 Pull</h3>

<p>A typical pull operation proceeds as shown below. Details
of the actual implementation may vary slightly but the gist of
a pull is captured in the following steps:</p>

<ol>
<li>The client sends login and pull cards.
<li>The client sends a cookie card if it has previously received a cookie.
<li>The client sends gimme cards for every phantom that it holds.
<hr>
<li>The server checks the login password and rejects the session if
the user does not have permission to pull.
<li>If the number of entries in the unclustered table on the server is
greater than 100, then the server constructs a new cluster artifact to
cover all those unclustered entries.
<li>The server sends file cards for every gimme card it received
from the client.
<li>The server sends igot cards for every artifact in its unclustered
table that is not a phantom.
<hr>
<li>The client adds the content of file cards to its repository.
<li>The client creates a phantom for every igot card in the server reply
that mentions an artifact that the client does not possess.
<li>The client creates a phantom for the delta source of file cards when
the delta source is an artifact that the client does not possess.
</ol>

<p>These ten steps represent a single HTTP round-trip request.
The first three steps are the processing that occurs on the client
to generate the request. The middle four steps are processing
that occurs on the server to interpret the request and generate a
reply. And the last three steps are the processing that the
client does to interpret the reply.</p>

<p>During a pull, the client will keep sending HTTP requests
until it holds all artifacts that exist on the server.</p>

<p>Note that the server tries
to limit the size of its reply message to something reasonable
(usually about 1MB), so it might stop sending file cards as
described in step (6) if the reply becomes too large.</p>

<p>Step (5) is the only way in which new clusters can be created.
By only creating clusters on the server, we hope to minimize the
amount of overlap between clusters in the common configuration where
there is a single server and many clients. The same synchronization
protocol will continue to work even if there are multiple servers
or if servers and clients sometimes change roles. The only negative
effect of these unusual arrangements is that more than the minimum
number of clusters might be generated.</p>

<h3>5.2 Push</h3>

<p>A typical push operation proceeds roughly as shown below. As
with a pull, the actual implementation may vary slightly.</p>

<ol>
<li>The client sends login and push cards.
<li>The client sends file cards for any artifacts that it holds that have
never before been pushed - artifacts that come from local check-ins.
<li>If this is the second or later cycle in a push, then the
client sends file cards for any gimme cards that the server sent
in the previous cycle.
<li>The client sends igot cards for every artifact in its unclustered table
that is not a phantom.
<hr>
<li>The server checks the login and push cards and issues an error if
anything is amiss.
<li>The server accepts file cards from the client and adds those artifacts
to its repository.
<li>The server creates phantoms for igot cards that mention artifacts it
does not possess or for file cards that mention delta source artifacts that
it does not possess.
<li>The server issues gimme cards for all phantoms.
<hr>
<li>The client remembers the gimme cards from the server so that it
can generate file cards in reply on the next cycle.
</ol>

<p>As with a pull, the steps of a push operation repeat until the
server knows all artifacts that exist on the client. Also, as with
pull, the client attempts to keep the size of the request from
growing too large by suppressing file cards once the
size of the request reaches 1MB.</p>

<h3>5.3 Sync</h3>

<p>A sync is just a pull and a push that happen at the same time.
The first three steps of a pull are combined with the first four steps
of a push. Steps (4) through (7) of a pull are combined with steps
(5) through (8) of a push.
And steps (8) through (10) of a pull
are combined with step (9) of a push.</p>

<h2>6.0 Summary</h2>

<p>Here are the key points of the synchronization protocol:</p>

<ol>
<li>The client sends one or more POST HTTP requests to the server.
The request and reply content type is "application/x-fossil".
<li>HTTP request content is compressed using zlib.
<li>The content of request and reply consists of cards with one
card per line.
<li>Card formats are:
<ul>
<li> <b>login</b> <i>userid nonce signature</i>
<li> <b>push</b> <i>servercode projectcode</i>
<li> <b>pull</b> <i>servercode projectcode</i>
<li> <b>clone</b>
<li> <b>file</b> <i>artifact-id size</i> <b>\n</b> <i>content</i>
<li> <b>file</b> <i>artifact-id delta-artifact-id size</i> <b>\n</b> <i>content</i>
<li> <b>igot</b> <i>artifact-id</i>
<li> <b>gimme</b> <i>artifact-id</i>
<li> <b>cookie</b> <i>cookie-text</i>
<li> <b>error</b> <i>error-message</i>
</ul>
<li>Phantoms are artifacts that a repository knows exist but does not possess.
<li>Clusters are artifacts that contain IDs of other artifacts.
<li>Clusters are created automatically on the server during a pull.
<li>Repositories keep track of all artifacts that are not named in any
cluster and send igot messages for those artifacts.
<li>Repositories keep track of all the phantoms they hold and send
gimme messages for those artifacts.
</ol>
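
<p>To make the card formats listed in the summary concrete, here is a
minimal Python sketch that splits an already-uncompressed message
(as in the "x-fossil-debug" content type) into cards. It is an
illustration only, not Fossil's own code: the function name
<b>parse_cards</b> and the (operator, arguments, payload) tuple layout
are hypothetical, and it assumes the <i>size</i> argument of a file card
is a decimal byte count.</p>

```python
from typing import List, Optional, Tuple

Card = Tuple[str, List[str], Optional[bytes]]


def parse_cards(message: bytes) -> List[Card]:
    """Split an uncompressed card stream into (operator, args, payload).

    Hypothetical helper for illustration; assumes decimal payload sizes.
    """
    cards: List[Card] = []
    pos = 0
    while pos < len(message):
        # A card runs up to the next newline (or to the end of the message).
        end = message.find(b"\n", pos)
        if end == -1:
            end = len(message)
        line = message[pos:end].strip()  # leading/trailing whitespace ignored
        pos = end + 1
        if not line:
            continue  # blank cards are ignored
        tokens = line.decode("ascii").split()
        operator, args = tokens[0], tokens[1:]
        payload = None
        if operator == "file":
            # The last argument is the number of payload bytes that
            # immediately follow the newline terminating the file card.
            size = int(args[-1])
            payload = message[pos:pos + size]
            if len(payload) != size:
                raise ValueError("file card payload is truncated")
            pos += size
        cards.append((operator, args, payload))
    return cards
```

<p>A file card with two arguments carries the complete artifact, and one
with three arguments carries a delta whose source is named by the second
argument, so a caller of this sketch can tell the two apart by checking
the length of the argument list.</p>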