d87ca60c58 2008-05-15   stephan: <h1 align="center">The Fossil Sync Protocol</h1>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Fossil supports commands <b>push</b>, <b>pull</b>, and <b>sync</b>
d87ca60c58 2008-05-15   stephan: for transferring information from one repository to another.  The
d87ca60c58 2008-05-15   stephan: command is run on the client repository.  A URL for the server repository
d87ca60c58 2008-05-15   stephan: is specified as part of the command.  This document describes what happens
d87ca60c58 2008-05-15   stephan: behind the scenes in order to synchronize the information on the two
d87ca60c58 2008-05-15   stephan: repositories.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h2>1.0 Overview</h2>
adc0b3bfb0 2008-07-15       drh: 
adc0b3bfb0 2008-07-15       drh: <p>The global state of a fossil repository consists of an unordered
e8c4f69c50 2008-10-24       drh: collection of artifacts.  Each artifact is identified by its SHA1 hash
e8c4f69c50 2008-10-24       drh: expressed as a 40-character lower-case hexadecimal string.
adc0b3bfb0 2008-07-15       drh: Synchronization is simply the process of sharing artifacts between
adc0b3bfb0 2008-07-15       drh: servers so that all servers have copies of all artifacts.  Because
adc0b3bfb0 2008-07-15       drh: artifacts are unordered, the order in which artifacts are received
adc0b3bfb0 2008-07-15       drh: at a server is inconsequential.  It is assumed that the SHA1 hashes
adc0b3bfb0 2008-07-15       drh: of artifacts are unique - that every artifact has a different SHA1 hash.
adc0b3bfb0 2008-07-15       drh: To a first approximation, synchronization proceeds by sharing lists of
adc0b3bfb0 2008-07-15       drh: SHA1 hashes of available artifacts, then sharing those artifacts that
adc0b3bfb0 2008-07-15       drh: are not found on one side or the other of the connection.  In practice,
adc0b3bfb0 2008-07-15       drh: a repository might contain millions of artifacts.  The list of
adc0b3bfb0 2008-07-15       drh: SHA1 hashes for this many artifacts can be large.  So optimizations are
adc0b3bfb0 2008-07-15       drh: employed that usually reduce the number of SHA1 hashes that need to be
adc0b3bfb0 2008-07-15       drh: shared to a few hundred.</p>
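<p>The naive exchange described above can be sketched as follows.  This is
only an illustration, not the actual implementation: the function names
<b>artifact_id</b> and <b>missing_artifacts</b> are hypothetical, and none of
the optimizations mentioned above are modeled.</p>

```python
import hashlib


def artifact_id(content: bytes) -> str:
    """Return the 40-character lower-case hex SHA1 hash that names an artifact."""
    return hashlib.sha1(content).hexdigest()


def missing_artifacts(local_ids: set, remote_ids: set) -> tuple:
    """Naive comparison of two complete ID lists.

    Returns (ids_to_fetch, ids_to_send) as seen from the local side.
    """
    ids_to_fetch = remote_ids - local_ids
    ids_to_send = local_ids - remote_ids
    return ids_to_fetch, ids_to_send
```

<p>Because artifacts are unordered, plain set differences are enough here;
no ordering information has to be exchanged.</p>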
adc0b3bfb0 2008-07-15       drh: 
adc0b3bfb0 2008-07-15       drh: <h2>2.0 Transport</h2>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>All communication between client and server is via HTTP requests.
d87ca60c58 2008-05-15   stephan: The server is listening for incoming HTTP requests.  The client
d87ca60c58 2008-05-15   stephan: issues one or more HTTP requests and receives replies for each
d87ca60c58 2008-05-15   stephan: request.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The server might be running as an independent server
d87ca60c58 2008-05-15   stephan: using the <b>server</b> command, or it might be launched from
d87ca60c58 2008-05-15   stephan: inetd or xinetd using the <b>http</b> command.  Or the server might
d87ca60c58 2008-05-15   stephan: be launched from CGI.  The details of how the server is configured
d87ca60c58 2008-05-15   stephan: to "listen" for incoming HTTP requests are immaterial.  The important
d87ca60c58 2008-05-15   stephan: point is that the server is listening for requests and the client
d87ca60c58 2008-05-15   stephan: is the issuer of the requests.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A single push, pull, or sync might involve multiple HTTP requests.
d87ca60c58 2008-05-15   stephan: The client maintains state between all requests.  But on the server
d87ca60c58 2008-05-15   stephan: side, each request is independent.  The server does not preserve
d87ca60c58 2008-05-15   stephan: any information about the client from one request to the next.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>2.1 Server Identification</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The server is identified by a URL argument that accompanies the
d87ca60c58 2008-05-15   stephan: push, pull, or sync command on the client.  (As a convenience to
d87ca60c58 2008-05-15   stephan: users, the URL can be omitted on the client command and the same URL
d87ca60c58 2008-05-15   stephan: from the most recent push, pull, or sync will be reused.  This saves
d87ca60c58 2008-05-15   stephan: typing in the common case where the client does multiple syncs to
d87ca60c58 2008-05-15   stephan: the same server.)</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The client modifies the URL by appending the method name "<b>/xfer</b>"
d87ca60c58 2008-05-15   stephan: to the end.  For example, if the URL specified on the client command
d87ca60c58 2008-05-15   stephan: line is</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
d87ca60c58 2008-05-15   stephan: http://fossil-scm.hwaci.com/fossil
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Then the URL that is really used to do the synchronization will
d87ca60c58 2008-05-15   stephan: be:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
d87ca60c58 2008-05-15   stephan: http://fossil-scm.hwaci.com/fossil/xfer
d87ca60c58 2008-05-15   stephan: </blockquote>
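<p>A minimal sketch of the URL rewrite.  The helper name <b>xfer_url</b> is
hypothetical, and stripping a trailing slash before appending is an
assumption that the text above does not state.</p>

```python
def xfer_url(base_url: str) -> str:
    """Append the "/xfer" method name to the URL given on the command line."""
    # Assumption: a trailing "/" on the base URL is dropped first.
    return base_url.rstrip("/") + "/xfer"
```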
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>2.2 HTTP Request Format</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The client always sends a POST request to the server.  The
d87ca60c58 2008-05-15   stephan: general format of the POST request is as follows:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote><pre>
d87ca60c58 2008-05-15   stephan: POST /fossil/xfer HTTP/1.0
d87ca60c58 2008-05-15   stephan: Host: fossil-scm.hwaci.com:80
d87ca60c58 2008-05-15   stephan: Content-Type: application/x-fossil
d87ca60c58 2008-05-15   stephan: Content-Length: 4216
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <i>content...</i>
d87ca60c58 2008-05-15   stephan: </pre></blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>In the example above, the pathname given after the POST keyword
d87ca60c58 2008-05-15   stephan: on the first line is a copy of the URL pathname.  The Host: parameter
d87ca60c58 2008-05-15   stephan: is also taken from the URL.  The content type is always either
d87ca60c58 2008-05-15   stephan: "application/x-fossil" or "application/x-fossil-debug".  The "x-fossil"
d87ca60c58 2008-05-15   stephan: content type is the default.  The only difference is that "x-fossil"
d87ca60c58 2008-05-15   stephan: content is compressed using zlib whereas "x-fossil-debug" is sent
d87ca60c58 2008-05-15   stephan: uncompressed.</p>
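<p>The request layout above might be assembled roughly like this.  The
function name <b>build_request</b> is hypothetical.  The sketch assumes that
"compressed using zlib" means a plain zlib stream; any additional framing
that a real implementation may put around the compressed data is not
described in this document and is not modeled.</p>

```python
import zlib
from urllib.parse import urlsplit


def build_request(url: str, cards: bytes, debug: bool = False) -> bytes:
    """Build the raw bytes of an HTTP/1.0 POST carrying x-fossil content."""
    parts = urlsplit(url)
    port = parts.port or 80
    if debug:
        # "x-fossil-debug" content is sent uncompressed.
        content_type = "application/x-fossil-debug"
        body = cards
    else:
        # Assumption: a bare zlib stream with no extra framing.
        content_type = "application/x-fossil"
        body = zlib.compress(cards)
    head = (
        f"POST {parts.path} HTTP/1.0\r\n"
        f"Host: {parts.hostname}:{port}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body
```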
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A typical reply from the server might look something like this:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote><pre>
d87ca60c58 2008-05-15   stephan: HTTP/1.0 200 OK
d87ca60c58 2008-05-15   stephan: Date: Mon, 10 Sep 2007 12:21:01 GMT
d87ca60c58 2008-05-15   stephan: Connection: close
d87ca60c58 2008-05-15   stephan: Cache-control: private
d87ca60c58 2008-05-15   stephan: Content-Type: application/x-fossil; charset=US-ASCII
d87ca60c58 2008-05-15   stephan: Content-Length: 265
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <i>content...</i>
d87ca60c58 2008-05-15   stephan: </pre></blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The content type of the reply is always the same as the content type
d87ca60c58 2008-05-15   stephan: of the request.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h2>3.0 Fossil Synchronization Content</h2>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A synchronization request between a client and server consists of
d87ca60c58 2008-05-15   stephan: one or more HTTP requests as described in the previous section.  This
d87ca60c58 2008-05-15   stephan: section details the "x-fossil" content type.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.1 Line-oriented Format</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The x-fossil content type consists of zero or more "cards".  Cards
d87ca60c58 2008-05-15   stephan: are separated by the newline character ("\n").  Leading and trailing
d87ca60c58 2008-05-15   stephan: whitespace on a card is ignored.  Blank cards are ignored.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Each card is divided into zero or more space separated tokens.
d87ca60c58 2008-05-15   stephan: The first token on each card is the operator.  Subsequent tokens
d87ca60c58 2008-05-15   stephan: are arguments.  The set of operators understood by servers is slightly
d87ca60c58 2008-05-15   stephan: different from the operators understood by clients, though the two
d87ca60c58 2008-05-15   stephan: are very similar.</p>
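<p>A rough sketch of the card tokenizer, assuming content that contains no
in-line payload data.  The name <b>parse_cards</b> is hypothetical, and
treating any run of whitespace as a token separator is an assumption.</p>

```python
def parse_cards(text: str) -> list:
    """Split x-fossil text into (operator, arguments) pairs.

    Leading and trailing whitespace is ignored and blank cards are skipped.
    """
    cards = []
    for line in text.split("\n"):
        # Assumption: any run of whitespace separates tokens.
        tokens = line.split()
        if not tokens:
            continue  # blank card
        cards.append((tokens[0], tokens[1:]))
    return cards
```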
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.2 Login Cards</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Every message from client to server begins with one or more login
d87ca60c58 2008-05-15   stephan: cards.  Each login card has the following format:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
d87ca60c58 2008-05-15   stephan: <b>login</b>  <i>userid  nonce  signature</i>
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The userid is the name of the user that is requesting service
d87ca60c58 2008-05-15   stephan: from the server.  The nonce is the SHA1 hash of the remainder of
d87ca60c58 2008-05-15   stephan: the message - all text that follows the newline character that
d87ca60c58 2008-05-15   stephan: terminates the login card.  The signature is the SHA1 hash of
d87ca60c58 2008-05-15   stephan: the concatenation of the nonce and the user's password.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>For each login card, the server looks up the user and verifies
d87ca60c58 2008-05-15   stephan: that the nonce matches the SHA1 hash of the remainder of the
d87ca60c58 2008-05-15   stephan: message.  It then checks the signature hash to make sure the
d87ca60c58 2008-05-15   stephan: signature matches.  If everything
d87ca60c58 2008-05-15   stephan: checks out, then the client is granted all privileges of the
d87ca60c58 2008-05-15   stephan: specified user.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Privileges are cumulative.  There can be multiple successful
d87ca60c58 2008-05-15   stephan: login cards.  The session privileges are the bit-wise OR of the
d87ca60c58 2008-05-15   stephan: privileges of each individual login.</p>
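<p>The nonce and signature computation for a single login card could look
like the sketch below.  The names <b>make_login_card</b> and
<b>verify_login</b> are hypothetical.  The sketch assumes that the nonce is
used in its hexadecimal form, that the password is appended after the nonce,
and that the password is encoded as UTF-8; messages with several login cards
are not handled.</p>

```python
import hashlib
import hmac


def _sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def make_login_card(userid: str, password: str, remainder: bytes) -> bytes:
    """Build the login card that precedes `remainder` in a message."""
    nonce = _sha1_hex(remainder)
    # Assumption: signature = SHA1(hex nonce followed by the password).
    signature = _sha1_hex((nonce + password).encode("utf-8"))
    return f"login {userid} {nonce} {signature}\n".encode("utf-8")


def verify_login(message: bytes, password: str) -> bool:
    """Check the first login card of `message` against a known password."""
    card, _, remainder = message.partition(b"\n")
    tokens = card.decode("utf-8", errors="replace").split()
    if len(tokens) != 4 or tokens[0] != "login":
        return False
    _, _userid, nonce, signature = tokens
    if nonce != _sha1_hex(remainder):
        return False  # the message body was altered or truncated
    expected = _sha1_hex((nonce + password).encode("utf-8"))
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
```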
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.3 File Cards</h3>
adc0b3bfb0 2008-07-15       drh: 
e8c4f69c50 2008-10-24       drh: <p>Artifacts are transferred using "file" cards.  (The name "file"
e8c4f69c50 2008-10-24       drh: card comes from the fact that most artifacts correspond to files.)
e8c4f69c50 2008-10-24       drh: File cards come in two different formats depending
e8c4f69c50 2008-10-24       drh: on whether the artifact is sent directly or as a delta from some
e8c4f69c50 2008-10-24       drh: other artifact.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
e8c4f69c50 2008-10-24       drh: <b>file</b> <i>artifact-id size</i> <b>\n</b> <i>content</i><br>
e8c4f69c50 2008-10-24       drh: <b>file</b> <i>artifact-id delta-artifact-id size</i> <b>\n</b> <i>content</i>
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>File cards are different from all other cards in that they are
e8c4f69c50 2008-10-24       drh: followed by in-line "payload" data.  The content of the artifact
e8c4f69c50 2008-10-24       drh: or the artifact delta consists of the first <i>size</i> bytes of the
d87ca60c58 2008-05-15   stephan: x-fossil content that immediately follow the newline that
d87ca60c58 2008-05-15   stephan: terminates the file card.  No other cards have this characteristic.
d87ca60c58 2008-05-15   stephan: </p>
d87ca60c58 2008-05-15   stephan: 
e8c4f69c50 2008-10-24       drh: <p>The first argument of a file card is the ID of the artifact that
e8c4f69c50 2008-10-24       drh: is being transferred.  The artifact ID is the lower-case hexadecimal
e8c4f69c50 2008-10-24       drh: representation of the SHA1 hash of the artifact.
d87ca60c58 2008-05-15   stephan: The last argument of the file card is the number of bytes of
d87ca60c58 2008-05-15   stephan: payload that immediately follow the file card.  If the file
d87ca60c58 2008-05-15   stephan: card has only two arguments, that means the payload is the
e8c4f69c50 2008-10-24       drh: complete content of the artifact.  If the file card has three
d87ca60c58 2008-05-15   stephan: arguments, then the payload is a delta and the second argument is
e8c4f69c50 2008-10-24       drh: the ID of another artifact that is the source of the delta.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>File cards are sent in both directions: client to server and
d87ca60c58 2008-05-15   stephan: server to client.  A delta might be sent before the source of
d87ca60c58 2008-05-15   stephan: the delta, so both client and server should remember deltas
d87ca60c58 2008-05-15   stephan: and be able to apply them when their source arrives.</p>
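<p>Because of the in-line payload, a parser has to work on bytes and skip
<i>size</i> bytes after each file card.  The sketch below shows one way to
do that; the name <b>parse_message</b> is hypothetical, and delta
application and hash verification are left out.</p>

```python
def parse_message(data: bytes) -> list:
    """Split x-fossil content into (operator, arguments, payload) triples.

    `payload` is None for every card except file cards.
    """
    cards = []
    pos = 0
    while pos < len(data):
        end = data.find(b"\n", pos)
        if end < 0:
            end = len(data)
        tokens = data[pos:end].decode("ascii").split()
        pos = end + 1
        if not tokens:
            continue  # blank card
        op, args = tokens[0], tokens[1:]
        payload = None
        if op == "file":
            if len(args) not in (2, 3):
                raise ValueError("file card needs two or three arguments")
            size = int(args[-1])
            # The payload is the first `size` bytes after the newline.
            payload = data[pos:pos + size]
            if len(payload) != size:
                raise ValueError("truncated file payload")
            pos += size
        cards.append((op, args, payload))
    return cards
```

<p>A file card with three arguments carries a delta; the caller would look at
<b>args[1]</b> to find the delta source and hold the delta until that source
is available.</p>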
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.4 Push and Pull Cards</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Among the first cards in a client-to-server message are
d87ca60c58 2008-05-15   stephan: the push and pull cards.  The push card tells the server that
d87ca60c58 2008-05-15   stephan: the client is pushing content.  The pull card tells the server
d87ca60c58 2008-05-15   stephan: that the client wants to pull content.  In the event of a sync,
d87ca60c58 2008-05-15   stephan: both cards are sent.  The format is as follows:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
d87ca60c58 2008-05-15   stephan: <b>push</b> <i>servercode projectcode</i><br>
d87ca60c58 2008-05-15   stephan: <b>pull</b> <i>servercode projectcode</i>
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The <i>servercode</i> argument is the repository ID for the
d87ca60c58 2008-05-15   stephan: client.  The server will only allow the transaction to proceed
d87ca60c58 2008-05-15   stephan: if the servercode is different from its own servercode.  This
d87ca60c58 2008-05-15   stephan: prevents a sync-loop.  The <i>projectcode</i> is the identifier
d87ca60c58 2008-05-15   stephan: of the software project that the client repository contains.
d87ca60c58 2008-05-15   stephan: The projectcode for the client and server must match in order
d87ca60c58 2008-05-15   stephan: for the transaction to proceed.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The server will also send a push card back to the client
d87ca60c58 2008-05-15   stephan: during a clone.  This is how the client determines what project
d87ca60c58 2008-05-15   stephan: code to put in the new repository it is constructing.</p>
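<p>The two server-side checks on a push or pull card can be sketched as
below.  The function name <b>check_push_pull</b> and the wording of the
returned messages are hypothetical.</p>

```python
def check_push_pull(args: list, my_servercode: str, my_projectcode: str):
    """Validate the arguments of a push or pull card.

    Returns None when the transaction may proceed, otherwise a message
    describing why it is refused.
    """
    if len(args) != 2:
        return "malformed push or pull card"
    servercode, projectcode = args
    if servercode == my_servercode:
        # Same repository ID on both ends: refusing prevents a sync-loop.
        return "server loop"
    if projectcode != my_projectcode:
        return "wrong project"
    return None
```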
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.5 Clone Cards</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A clone card works like a pull card in that it is sent from
d87ca60c58 2008-05-15   stephan: client to server in order to tell the server that the client
d87ca60c58 2008-05-15   stephan: wants to pull content.  But unlike the pull card, the clone
d87ca60c58 2008-05-15   stephan: card has no arguments.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
d87ca60c58 2008-05-15   stephan: <b>clone</b>
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>In response to a clone message, the server also sends the client
d87ca60c58 2008-05-15   stephan: a push message so that the client can discover the projectcode for
d87ca60c58 2008-05-15   stephan: this project.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.6 Igot Cards</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>An igot card can be sent from either client to server or from
d87ca60c58 2008-05-15   stephan: server to client in order to indicate that the sender holds a copy
e8c4f69c50 2008-10-24       drh: of a particular artifact.  The format is:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
e8c4f69c50 2008-10-24       drh: <b>igot</b> <i>artifact-id</i>
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
e8c4f69c50 2008-10-24       drh: <p>The argument of the igot card is the ID of the artifact that
d87ca60c58 2008-05-15   stephan: the sender possesses.
d87ca60c58 2008-05-15   stephan: The receiver of an igot card will typically check to see if
e8c4f69c50 2008-10-24       drh: it also holds the same artifact and if not it will request the artifact
d87ca60c58 2008-05-15   stephan: using a gimme card in either the reply or in the next message.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.7 Gimme Cards</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A gimme card is sent from either client to server or from server
d87ca60c58 2008-05-15   stephan: to client.  The gimme card asks the receiver to send a particular
e8c4f69c50 2008-10-24       drh: artifact back to the sender.  The format of a gimme card is this:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
e8c4f69c50 2008-10-24       drh: <b>gimme</b> <i>artifact-id</i>
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
e8c4f69c50 2008-10-24       drh: <p>The argument to the gimme card is the ID of the artifact that
d87ca60c58 2008-05-15   stephan: the sender wants.  The receiver will typically respond to a
d87ca60c58 2008-05-15   stephan: gimme card by sending a file card in its reply or in the next
d87ca60c58 2008-05-15   stephan: message.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.8 Cookie Cards</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A cookie card can be used by a server to record a small amount
d87ca60c58 2008-05-15   stephan: of state information on a client.  The server sends a cookie to the
d87ca60c58 2008-05-15   stephan: client.  The client sends the same cookie back to the server on
d87ca60c58 2008-05-15   stephan: its next request.  The cookie card has a single argument which
d87ca60c58 2008-05-15   stephan: is its payload.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
d87ca60c58 2008-05-15   stephan: <b>cookie</b> <i>payload</i>
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The client is not required to return the cookie to the server on
d87ca60c58 2008-05-15   stephan: its next request.  Or the client might send a cookie from a different
d87ca60c58 2008-05-15   stephan: server on the next request.  So the server must not depend on the
d87ca60c58 2008-05-15   stephan: cookie and the server must structure the cookie payload in such
d87ca60c58 2008-05-15   stephan: a way that it can tell if the cookie it sees is its own cookie or
d87ca60c58 2008-05-15   stephan: a cookie from another server.  (Typically the server will embed
d87ca60c58 2008-05-15   stephan: its servercode as part of the cookie.)</p>
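<p>One way a server could tell its own cookies apart is sketched here.  The
payload layout "<i>servercode</i>/<i>state</i>" and both function names are
purely illustrative; this document does not define a cookie layout.</p>

```python
def make_cookie(servercode: str, state: str) -> str:
    """Build a cookie payload that embeds the issuing server's code."""
    # Assumption: neither part contains whitespace or "/", so the payload
    # remains a single token.
    return f"{servercode}/{state}"


def read_cookie(servercode: str, payload: str):
    """Return the stored state, or None if the cookie is not ours."""
    prefix, sep, state = payload.partition("/")
    if not sep or prefix != servercode:
        return None  # missing, malformed, or issued by another server
    return state
```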
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.9 Error Cards</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>If the server discovers anything wrong with a request, it generates
d87ca60c58 2008-05-15   stephan: an error card in its reply.  When the client sees the error card,
d87ca60c58 2008-05-15   stephan: it displays an error message to the user and aborts the sync
d87ca60c58 2008-05-15   stephan: operation.  An error card looks like this:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <blockquote>
d87ca60c58 2008-05-15   stephan: <b>error</b> <i>error-message</i>
d87ca60c58 2008-05-15   stephan: </blockquote>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>The error message is English text that is encoded in order to
d87ca60c58 2008-05-15   stephan: be a single token.
d87ca60c58 2008-05-15   stephan: A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73).  A
d87ca60c58 2008-05-15   stephan: newline (ASCII 0x0A) is "\n" (ASCII 0x5C, 0x6E).  A backslash
d87ca60c58 2008-05-15   stephan: (ASCII 0x5C) is represented as two backslashes "\\".  Apart from
d87ca60c58 2008-05-15   stephan: space and newline, no other whitespace characters nor any
d87ca60c58 2008-05-15   stephan: unprintable characters are allowed in
d87ca60c58 2008-05-15   stephan: the error message.</p>
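<p>The escaping rules above translate into a small encoder and decoder.  The
names <b>encode_error</b> and <b>decode_error</b> are hypothetical, and
rejecting unknown escape sequences on decode is an assumption.</p>

```python
def encode_error(message: str) -> str:
    """Encode an error message so that it forms a single token."""
    out = []
    for ch in message:
        if ch == "\\":
            out.append("\\\\")
        elif ch == " ":
            out.append("\\s")
        elif ch == "\n":
            out.append("\\n")
        elif ch.isprintable():
            out.append(ch)
        else:
            raise ValueError("character not allowed in an error message")
    return "".join(out)


def decode_error(token: str) -> str:
    """Reverse encode_error()."""
    escapes = {"s": " ", "n": "\n", "\\": "\\"}
    out = []
    i = 0
    while i < len(token):
        ch = token[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        # Assumption: an unknown or dangling escape is an error.
        if i + 1 >= len(token) or token[i + 1] not in escapes:
            raise ValueError("bad escape sequence")
        out.append(escapes[token[i + 1]])
        i += 2
    return "".join(out)
```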
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>3.10 Unknown Cards</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>If either the client or the server sees a card that is not
d87ca60c58 2008-05-15   stephan: described above, then it generates an error and aborts.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h2>4.0 Phantoms And Clusters</h2>
adc0b3bfb0 2008-07-15       drh: 
e8c4f69c50 2008-10-24       drh: <p>When a repository knows that an artifact exists and knows the ID of
e8c4f69c50 2008-10-24       drh: that artifact, but it does not know the artifact content, then it stores that
e8c4f69c50 2008-10-24       drh: artifact as a "phantom".  A repository will typically create a phantom when
e8c4f69c50 2008-10-24       drh: it receives an igot card for an artifact that it does not hold or when it
d87ca60c58 2008-05-15   stephan: receives a file card that references a delta source that it does not
d87ca60c58 2008-05-15   stephan: hold.  When a server is generating its reply or when a client is
d87ca60c58 2008-05-15   stephan: generating a new request, it will usually send gimme cards for every
d87ca60c58 2008-05-15   stephan: phantom that it holds.</p>
d87ca60c58 2008-05-15   stephan: 
e8c4f69c50 2008-10-24       drh: <p>A cluster is a special artifact that tells of the existence of other
e8c4f69c50 2008-10-24       drh: artifacts.  Any artifact in the repository that follows the syntactic rules
d87ca60c58 2008-05-15   stephan: of a cluster is considered a cluster.</p>
d87ca60c58 2008-05-15   stephan: 
e8c4f69c50 2008-10-24       drh: <p>A cluster is line oriented.  Each line of a cluster
d87ca60c58 2008-05-15   stephan: is a card.  The cards are separated by the newline ("\n") character.
d87ca60c58 2008-05-15   stephan: Each card consists of a single character card type, a space, and a
d87ca60c58 2008-05-15   stephan: single argument.  No extra whitespace and no trailing or leading
d87ca60c58 2008-05-15   stephan: whitespace is allowed.  All cards in the cluster must occur in
d87ca60c58 2008-05-15   stephan: strict lexicographical order.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A cluster consists of one or more "M" cards followed by a single
e8c4f69c50 2008-10-24       drh: "Z" card.  Each M card holds an argument which is an artifact ID for an
e8c4f69c50 2008-10-24       drh: artifact in the repository.  The Z card has a single argument which is the
d87ca60c58 2008-05-15   stephan: lower-case hexadecimal representation of the MD5 checksum of all
d87ca60c58 2008-05-15   stephan: preceding M cards up to and including the newline character that
d87ca60c58 2008-05-15   stephan: occurred just before the Z that starts the Z card.</p>
d87ca60c58 2008-05-15   stephan: 
e8c4f69c50 2008-10-24       drh: <p>Any artifact that does not match the specifications of a cluster
d87ca60c58 2008-05-15   stephan: exactly is not a cluster.  There must be no extra whitespace in
e8c4f69c50 2008-10-24       drh: the artifact.  There must be one or more M cards.  There must be a
d87ca60c58 2008-05-15   stephan: single Z card with a correct MD5 checksum.  And all cards must
d87ca60c58 2008-05-15   stephan: be in strict lexicographical order.</p>
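<p>The cluster rules can be illustrated with a builder and a strict checker.
The names <b>build_cluster</b> and <b>parse_cluster</b> are hypothetical.
The sketch assumes that the Z card is itself terminated by a newline and
that artifact IDs are 40-character lower-case hexadecimal strings.</p>

```python
import hashlib
import re

_ID = re.compile(r"[0-9a-f]{40}")


def build_cluster(artifact_ids) -> bytes:
    """Build a cluster naming the given artifact IDs."""
    ids = sorted(set(artifact_ids))  # strict lexicographical order
    if not ids:
        raise ValueError("a cluster needs at least one M card")
    body = "".join(f"M {aid}\n" for aid in ids).encode("ascii")
    checksum = hashlib.md5(body).hexdigest()
    # Assumption: the Z card ends with a newline.
    return body + f"Z {checksum}\n".encode("ascii")


def parse_cluster(data: bytes):
    """Return the member IDs if `data` is a well-formed cluster, else None."""
    try:
        text = data.decode("ascii")
    except UnicodeDecodeError:
        return None
    if not text.endswith("\n"):
        return None
    lines = text[:-1].split("\n")
    if len(lines) < 2:
        return None  # need one or more M cards plus the Z card
    *m_lines, z_line = lines
    ids = []
    for line in m_lines:
        if not line.startswith("M ") or not _ID.fullmatch(line[2:]):
            return None
        ids.append(line[2:])
    if any(a >= b for a, b in zip(ids, ids[1:])):
        return None  # not in strict lexicographical order
    body = "".join(f"M {aid}\n" for aid in ids).encode("ascii")
    if z_line != "Z " + hashlib.md5(body).hexdigest():
        return None
    return ids
```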
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>4.1 The Unclustered Table</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Every repository maintains a table named "<b>unclustered</b>"
e8c4f69c50 2008-10-24       drh: which records the identity of every artifact and phantom it holds that is not
d87ca60c58 2008-05-15   stephan: mentioned in a cluster.  The entries in the unclustered table can
e8c4f69c50 2008-10-24       drh: be thought of as leaves on a tree of artifacts.  Some of the unclustered
e8c4f69c50 2008-10-24       drh: artifacts will be other clusters.  Those clusters may contain other clusters,
d87ca60c58 2008-05-15   stephan: which might contain still more clusters, and so forth.  Beginning
e8c4f69c50 2008-10-24       drh: with the artifacts in the unclustered table, one can follow the chain
e8c4f69c50 2008-10-24       drh: of clusters to find every artifact in the repository.</p>
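<p>Following the chain of clusters amounts to a simple graph walk.  In the
sketch below, <b>reachable_artifacts</b> is a hypothetical name and
<b>cluster_members</b> is an assumed in-memory mapping from a cluster's ID to
the IDs on its M cards; artifacts that are not clusters are simply absent
from the mapping.</p>

```python
def reachable_artifacts(unclustered, cluster_members: dict) -> set:
    """Return every artifact ID reachable from the unclustered table."""
    seen = set()
    stack = list(unclustered)
    while stack:
        aid = stack.pop()
        if aid in seen:
            continue
        seen.add(aid)
        # If this artifact is a cluster, visit the artifacts it names.
        stack.extend(cluster_members.get(aid, ()))
    return seen
```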
adc0b3bfb0 2008-07-15       drh: 
adc0b3bfb0 2008-07-15       drh: <h2>5.0 Synchronization Strategies</h2>
adc0b3bfb0 2008-07-15       drh: 
adc0b3bfb0 2008-07-15       drh: <h3>5.1 Pull</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A typical pull operation proceeds as shown below.  Details
d87ca60c58 2008-05-15   stephan: of the actual implementation may vary slightly but the gist of
d87ca60c58 2008-05-15   stephan: a pull is captured in the following steps:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <ol>
d87ca60c58 2008-05-15   stephan: <li>The client sends login and pull cards.
d87ca60c58 2008-05-15   stephan: <li>The client sends a cookie card if it has previously received a cookie.
d87ca60c58 2008-05-15   stephan: <li>The client sends gimme cards for every phantom that it holds.
d87ca60c58 2008-05-15   stephan: <hr>
d87ca60c58 2008-05-15   stephan: <li>The server checks the login password and rejects the session if
d87ca60c58 2008-05-15   stephan: the user does not have permission to pull.
d87ca60c58 2008-05-15   stephan: <li>If the number of entries in the unclustered table on the server is
e8c4f69c50 2008-10-24       drh: greater than 100, then the server constructs a new cluster artifact to
d87ca60c58 2008-05-15   stephan: cover all those unclustered entries.
d87ca60c58 2008-05-15   stephan: <li>The server sends file cards for every gimme card it received
d87ca60c58 2008-05-15   stephan: from the client.
e8c4f69c50 2008-10-24       drh: <li>The server sends igot cards for every artifact in its unclustered
d87ca60c58 2008-05-15   stephan: table that is not a phantom.
d87ca60c58 2008-05-15   stephan: <hr>
d87ca60c58 2008-05-15   stephan: <li>The client adds the content of file cards to its repository.
d87ca60c58 2008-05-15   stephan: <li>The client creates a phantom for every igot card in the server reply
e8c4f69c50 2008-10-24       drh: that mentions an artifact that the client does not possess.
d87ca60c58 2008-05-15   stephan: <li>The client creates a phantom for the delta source of file cards when
e8c4f69c50 2008-10-24       drh: the delta source is an artifact that the client does not possess.
d87ca60c58 2008-05-15   stephan: </ol>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>These ten steps represent a single HTTP round-trip request.
d87ca60c58 2008-05-15   stephan: The first three steps are the processing that occurs on the client
d87ca60c58 2008-05-15   stephan: to generate the request.  The middle four steps are processing
d87ca60c58 2008-05-15   stephan: that occurs on the server to interpret the request and generate a
d87ca60c58 2008-05-15   stephan: reply.  And the last three steps are the processing that the
d87ca60c58 2008-05-15   stephan: client does to interpret the reply.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>During a pull, the client will keep sending HTTP requests
e8c4f69c50 2008-10-24       drh: until it holds all artifacts that exist on the server.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Note that the server tries
d87ca60c58 2008-05-15   stephan: to limit the size of its reply message to something reasonable
d87ca60c58 2008-05-15   stephan: (usually about 1MB) so that it might stop sending file cards as
d87ca60c58 2008-05-15   stephan: described in step (6) if the reply becomes too large.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Step (5) is the only way in which new clusters can be created.
d87ca60c58 2008-05-15   stephan: By only creating clusters on the server, we hope to minimize the
d87ca60c58 2008-05-15   stephan: amount of overlap between clusters in the common configuration where
d87ca60c58 2008-05-15   stephan: there is a single server and many clients.  The same synchronization
d87ca60c58 2008-05-15   stephan: protocol will continue to work even if there are multiple servers
d87ca60c58 2008-05-15   stephan: or if servers and clients sometimes change roles.  The only negative
d87ca60c58 2008-05-15   stephan: effect of these unusual arrangements is that more than the minimum
d87ca60c58 2008-05-15   stephan: number of clusters might be generated.</p>
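<p>The repeated round trips of a pull can be simulated with two in-memory
repositories.  This is a heavily simplified sketch: login, cookies, deltas
and cluster creation are left out, every server artifact is treated as
unclustered, and the hypothetical <b>max_files</b> parameter stands in for
the reply size limit.  All names are illustrative.</p>

```python
import hashlib


def sha1_hex(content: bytes) -> str:
    return hashlib.sha1(content).hexdigest()


def server_reply(server_repo: dict, gimmes: list, max_files: int = 2):
    """Server side: file cards for gimme cards, then igot cards."""
    files = [server_repo[a] for a in gimmes if a in server_repo][:max_files]
    igots = sorted(server_repo)  # simplification: everything is unclustered
    return files, igots


def pull(client_repo: dict, server_repo: dict) -> int:
    """Run pull round trips until no phantoms remain; return the count."""
    phantoms = set()
    rounds = 0
    while True:
        rounds += 1
        # Client request: a gimme card for every phantom it holds.
        files, igots = server_reply(server_repo, sorted(phantoms))
        # Client adds the content of file cards to its repository.
        for content in files:
            aid = sha1_hex(content)
            client_repo[aid] = content
            phantoms.discard(aid)
        # Client creates a phantom for every igot it cannot satisfy.
        for aid in igots:
            if aid not in client_repo:
                phantoms.add(aid)
        if not phantoms:
            return rounds
```

<p>With three artifacts on the server and <b>max_files</b> set to 2, the
first round only discovers the IDs, the second delivers two artifacts and
the third delivers the last one.</p>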
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>5.2 Push</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A typical push operation proceeds roughly as shown below.  As
d87ca60c58 2008-05-15   stephan: with a pull, the actual implementation may vary slightly.</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <ol>
d87ca60c58 2008-05-15   stephan: <li>The client sends login and push cards.
e8c4f69c50 2008-10-24       drh: <li>The client sends file cards for any artifacts that it holds that have
e8c4f69c50 2008-10-24       drh: never before been pushed - artifacts that come from local check-ins.
d87ca60c58 2008-05-15   stephan: <li>If this is the second or later cycle in a push, then the
d87ca60c58 2008-05-15   stephan: client sends file cards for any gimme cards that the server sent
d87ca60c58 2008-05-15   stephan: in the previous cycle.
e8c4f69c50 2008-10-24       drh: <li>The client sends igot cards for every artifact in its unclustered table
d87ca60c58 2008-05-15   stephan: that is not a phantom.
d87ca60c58 2008-05-15   stephan: <hr>
d87ca60c58 2008-05-15   stephan: <li>The server checks the login and push cards and issues an error if
d87ca60c58 2008-05-15   stephan: anything is amiss.
e8c4f69c50 2008-10-24       drh: <li>The server accepts file cards from the client and adds those artifacts
d87ca60c58 2008-05-15   stephan: to its repository.
e8c4f69c50 2008-10-24       drh: <li>The server creates phantoms for igot cards that mention artifacts it
e8c4f69c50 2008-10-24       drh: does not possess or for file cards that mention delta source artifacts that
d87ca60c58 2008-05-15   stephan: it does not possess.
d87ca60c58 2008-05-15   stephan: <li>The server issues gimme cards for all phantoms.
d87ca60c58 2008-05-15   stephan: <hr>
d87ca60c58 2008-05-15   stephan: <li>The client remembers the gimme cards from the server so that it
d87ca60c58 2008-05-15   stephan: can generate file cards in reply on the next cycle.
d87ca60c58 2008-05-15   stephan: </ol>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>As with a pull, the steps of a push operation repeat until the
e8c4f69c50 2008-10-24       drh: server knows all artifacts that exist on the client.  Also, as with
d87ca60c58 2008-05-15   stephan: pull, the client attempts to keep the size of the request from
d87ca60c58 2008-05-15   stephan: growing too large by suppressing file cards once the
d87ca60c58 2008-05-15   stephan: size of the request reaches 1MB.</p>
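<p>A push can be simulated in the same simplified way.  Again login, deltas
and clusters are omitted, every client artifact is announced with an igot
card, and the hypothetical <b>max_files</b> parameter stands in for the
request size limit.  All names are illustrative.</p>

```python
import hashlib


def sha1_hex(content: bytes) -> str:
    return hashlib.sha1(content).hexdigest()


def server_handle_push(server_repo: dict, files: list, igots: list) -> list:
    """Server side: store file cards, then return gimme IDs for phantoms."""
    for content in files:
        server_repo[sha1_hex(content)] = content
    return sorted(a for a in igots if a not in server_repo)


def push(client_repo: dict, server_repo: dict, unsent: list,
         max_files: int = 2) -> int:
    """Run push cycles until the server asks for nothing more.

    `unsent` lists IDs of artifacts from local check-ins that have never
    been pushed.  Returns the number of cycles used.
    """
    wanted = list(unsent)
    cycles = 0
    while True:
        cycles += 1
        # File cards for new artifacts or for the previous cycle's gimmes.
        files = [client_repo[a] for a in wanted[:max_files]]
        # Simplification: igot cards for everything the client holds.
        gimmes = server_handle_push(server_repo, files, sorted(client_repo))
        if not gimmes:
            return cycles
        # Remember the gimme cards for the next cycle.
        wanted = gimmes
```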
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h3>5.3 Sync</h3>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>A sync is just a pull and a push that happen at the same time.
d87ca60c58 2008-05-15   stephan: The first three steps of a pull are combined with the first four steps
d87ca60c58 2008-05-15   stephan: of a push.  Steps (4) through (7) of a pull are combined with steps
d87ca60c58 2008-05-15   stephan: (5) through (8) of a push.  And steps (8) through (10) of a pull
d87ca60c58 2008-05-15   stephan: are combined with step (9) of a push.</p>
d87ca60c58 2008-05-15   stephan: 
adc0b3bfb0 2008-07-15       drh: <h2>6.0 Summary</h2>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <p>Here are the key points of the synchronization protocol:</p>
d87ca60c58 2008-05-15   stephan: 
d87ca60c58 2008-05-15   stephan: <ol>
d87ca60c58 2008-05-15   stephan: <li>The client sends one or more HTTP POST requests to the server.
d87ca60c58 2008-05-15   stephan:     The request and reply content type is "application/x-fossil".
d87ca60c58 2008-05-15   stephan: <li>HTTP request content is compressed using zlib.
d87ca60c58 2008-05-15   stephan: <li>The content of request and reply consists of cards with one
d87ca60c58 2008-05-15   stephan:     card per line.
d87ca60c58 2008-05-15   stephan: <li>Card formats are:
d87ca60c58 2008-05-15   stephan:     <ul>
d87ca60c58 2008-05-15   stephan:     <li> <b>login</b> <i>userid nonce signature</i>
d87ca60c58 2008-05-15   stephan:     <li> <b>push</b> <i>servercode projectcode</i>
d87ca60c58 2008-05-15   stephan:     <li> <b>pull</b> <i>servercode projectcode</i>
d87ca60c58 2008-05-15   stephan:     <li> <b>clone</b>
e8c4f69c50 2008-10-24       drh:     <li> <b>file</b> <i>artifact-id size</i> <b>\n</b> <i>content</i>
e8c4f69c50 2008-10-24       drh:     <li> <b>file</b> <i>artifact-id delta-artifact-id size</i> <b>\n</b> <i>content</i>
e8c4f69c50 2008-10-24       drh:     <li> <b>igot</b> <i>artifact-id</i>
e8c4f69c50 2008-10-24       drh:     <li> <b>gimme</b> <i>artifact-id</i>
d87ca60c58 2008-05-15   stephan:     <li> <b>cookie</b> <i>payload</i>
d87ca60c58 2008-05-15   stephan:     <li> <b>error</b> <i>error-message</i>
d87ca60c58 2008-05-15   stephan:     </ul>
e8c4f69c50 2008-10-24       drh: <li>Phantoms are artifacts that a repository knows exist but does not possess.
e8c4f69c50 2008-10-24       drh: <li>Clusters are artifacts that contain IDs of other artifacts.
d87ca60c58 2008-05-15   stephan: <li>Clusters are created automatically on the server during a pull.
e8c4f69c50 2008-10-24       drh: <li>Repositories keep track of all artifacts that are not named in any
e8c4f69c50 2008-10-24       drh: cluster and send igot messages for those artifacts.
d87ca60c58 2008-05-15   stephan: <li>Repositories keep track of all the phantoms they hold and send
e8c4f69c50 2008-10-24       drh: gimme messages for those artifacts.
d87ca60c58 2008-05-15   stephan: </ol>