File Annotation
2e275c1420 2009-01-23       drh: <h1 align="center">
2e275c1420 2009-01-23       drh: Branching, Forking, Merging, and Tagging
2e275c1420 2009-01-23       drh: </h1>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: In a simple and perfect world, the development of a project would proceed
2e275c1420 2009-01-23       drh: linearly, as shown in figure 1.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <center><table border=1 cellpadding=10 hspace=10 vspace=10>
2e275c1420 2009-01-23       drh: <tr><td align="center">
2e275c1420 2009-01-23       drh: <img src="branch01.gif"><br>
2e275c1420 2009-01-23       drh: Figure 1
2e275c1420 2009-01-23       drh: </td></tr></table></center>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Each circle represents a check-in.  For the sake of clarity, the check-ins
2e275c1420 2009-01-23       drh: are given small consecutive numbers.  In a real system, of course, the
2e275c1420 2009-01-23       drh: check-in numbers would be 40-character SHA1 hashes since it is not possible
2e275c1420 2009-01-23       drh: to allocate collision-free sequential numbers in a distributed system.
2e275c1420 2009-01-23       drh: But sequential numbers are easier to read, so we will substitute them for
2e275c1420 2009-01-23       drh: the 40-character SHA1 hashes in this document.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: The arrows in figure 1 show evolution of the project.  The initial
2e275c1420 2009-01-23       drh: check-in is 1.  Check-in 2 is derived from 1.  In other words, check-in 2
2e275c1420 2009-01-23       drh: was created by making edits to check-in 1 and then committing those edits.
2e275c1420 2009-01-23       drh: We say that 2 is a <i>child</i> of 1
2e275c1420 2009-01-23       drh: and that 1 is a <i>parent</i> of 2.
2e275c1420 2009-01-23       drh: Check-in 3 is derived from check-in 2, making
2e275c1420 2009-01-23       drh: 3 a child of 2.  We say that 3 is a <i>descendant</i> of both 1 and 2 and that 1
2e275c1420 2009-01-23       drh: and 2 are both <i>ancestors</i> of 3.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: We call the graph of check-ins a <i>tree</i>.  Check-in 1 is the <i>root</i>
2e275c1420 2009-01-23       drh: since it has no ancestors.  Check-in 4 is a <i>leaf</i> of the tree since
2e275c1420 2009-01-23       drh: it has no descendants.
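
The vocabulary above can be sketched in a few lines of Python.  This is only an illustration, not how fossil stores check-ins.  It assumes that figure 1 is the chain 1, 2, 3, 4, and the names <b>parent</b>, <b>ancestors</b>, and <b>descendants</b> are made up for the example.

```python
# Figure 1 as a child -> parent map.  Small integers stand in for SHA1 hashes.
# Assumption: figure 1 is the linear chain 1 -> 2 -> 3 -> 4.
parent = {2: 1, 3: 2, 4: 3}
checkins = [1, 2, 3, 4]

def ancestors(c):
    """Every check-in reached by following parent links upward from c."""
    found = []
    while c in parent:
        c = parent[c]
        found.append(c)
    return found

def descendants(c):
    """Every check-in that has c among its ancestors."""
    return [x for x in checkins if c in ancestors(x)]

roots = [c for c in checkins if not ancestors(c)]     # no ancestors
leaves = [c for c in checkins if not descendants(c)]  # no descendants

print(ancestors(3))     # [2, 1]
print(descendants(1))   # [2, 3, 4]
print(roots, leaves)    # [1] [4]
```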
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Alas, reality often interferes with the simple linear development of a
2e275c1420 2009-01-23       drh: project.  Suppose two programmers make independent modifications to check-in 2.
2e275c1420 2009-01-23       drh: After both changes are checked in, we have a check-in graph that looks
2e275c1420 2009-01-23       drh: like figure 2:
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <center><table border=1 cellpadding=10 hspace=10 vspace=10>
2e275c1420 2009-01-23       drh: <tr><td align="center">
2e275c1420 2009-01-23       drh: <img src="branch02.gif"><br>
2e275c1420 2009-01-23       drh: Figure 2
2e275c1420 2009-01-23       drh: </td></tr></table></center>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: The graph in figure 2 has two leaves: check-ins 3 and 4.  Check-in 2 has
2e275c1420 2009-01-23       drh: two children, check-ins 3 and 4.  We call this situation a <i>fork</i>.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Fossil tries to prevent forks.  Suppose the two programmers who were
2e275c1420 2009-01-23       drh: editing check-in 2 are named Alice and Bob.  Suppose Alice finished her
2e275c1420 2009-01-23       drh: edits first and did a commit, resulting in check-in 3.  Later, when Bob
2e275c1420 2009-01-23       drh: tried to commit his changes, fossil would try to verify that check-in 2
2e275c1420 2009-01-23       drh: was still a leaf.  Fossil would see that check-in 3 had occurred and would
2e275c1420 2009-01-23       drh: abort Bob's commit attempt with a message "would fork".  This allows Bob
2e275c1420 2009-01-23       drh: to do a "fossil update" which would pull in Alice's changes and merge them
2e275c1420 2009-01-23       drh: together with his own changes.  After merging, Bob could then commit
2e275c1420 2009-01-23       drh: check-in 4 as a child of check-in 3 and the result would be a linear graph
2e275c1420 2009-01-23       drh: as shown in figure 1.  This is how CVS works.  This is also how fossil
2e275c1420 2009-01-23       drh: works in "autosync" mode.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: But it might be that Bob is off-network when he does his commit, so he
2e275c1420 2009-01-23       drh: has no way of knowing that Alice has already committed her changes.
2e275c1420 2009-01-23       drh: Or, it could be that Bob has turned off "autosync" mode in fossil.  Or,
2e275c1420 2009-01-23       drh: maybe Bob just doesn't want to merge in Alice's changes before he has
2e275c1420 2009-01-23       drh: saved his own, so he forces the commit to occur using the "--force" option
2e275c1420 2009-01-23       drh: to the fossil <b>commit</b> command.  For whatever reason, two commits against
2e275c1420 2009-01-23       drh: check-in 2 have occurred and now the tree has two leaves.
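
The two outcomes for Bob can be modeled with a toy repository.  This is a sketch under stated assumptions: the <b>Repo</b> class, the <b>WouldFork</b> exception, and the <b>force</b> argument are hypothetical names chosen for the example, and only the shape of the graph is tracked.  A real update also merges file contents, which is not modeled here.

```python
class WouldFork(Exception):
    """Raised when a commit would give its parent a second child."""

class Repo:
    """Toy model of the check-in graph only; not fossil's implementation."""

    def __init__(self):
        self.parent = {1: None, 2: 1}   # check-ins 1 and 2 already exist
        self.next_id = 3

    def is_leaf(self, checkin):
        return checkin not in self.parent.values()

    def commit(self, base, force=False):
        # Refuse the commit if the base check-in already has a child,
        # unless the caller explicitly forces it.
        if not force and not self.is_leaf(base):
            raise WouldFork(f"would fork: check-in {base} is no longer a leaf")
        new = self.next_id
        self.next_id += 1
        self.parent[new] = base
        return new

# Case 1: Bob is refused, updates to Alice's check-in 3, then commits.
linear = Repo()
linear.commit(base=2)                 # Alice creates check-in 3
try:
    linear.commit(base=2)             # Bob is still based on check-in 2
    refused = False
except WouldFork:
    refused = True
linear.commit(base=3)                 # after updating, Bob creates check-in 4

# Case 2: Bob forces the commit, so check-in 2 gets two children.
forked = Repo()
forked.commit(base=2)                 # Alice creates check-in 3
forked.commit(base=2, force=True)     # Bob creates check-in 4 as a fork
```

In the first case the graph stays linear as in figure 1.  In the second case check-ins 3 and 4 are both children of check-in 2, as in figure 2.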
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: So which version of the project is the "latest" in the sense of having
2e275c1420 2009-01-23       drh: the most features and the most bug fixes?  When there is more than
2e275c1420 2009-01-23       drh: one leaf in the graph, you don't really know.  So we like to have
2e275c1420 2009-01-23       drh: graphs with a single leaf.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: To resolve this situation, Alice can use the fossil <b>merge</b> command
2e275c1420 2009-01-23       drh: to merge Bob's changes into her local copy of check-in 3.  Then she
2e275c1420 2009-01-23       drh: can commit the results as check-in 5.  This results in a tree as shown
2e275c1420 2009-01-23       drh: in figure 3.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <center><table border=1 cellpadding=10 hspace=10 vspace=10>
2e275c1420 2009-01-23       drh: <tr><td align="center">
2e275c1420 2009-01-23       drh: <img src="branch03.gif"><br>
2e275c1420 2009-01-23       drh: Figure 3
2e275c1420 2009-01-23       drh: </td></tr></table></center>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Check-in 5 is a direct child of check-in 3 because it was created by editing
2e275c1420 2009-01-23       drh: check-in 3.  But check-in 5 also inherits the changes from check-in 4 by
2e275c1420 2009-01-23       drh: virtue of the merge.  So we say that check-in 5 is a <i>merge child</i>
2e275c1420 2009-01-23       drh: of check-in 4 and that it is a <i>direct child</i> of check-in 3.
2e275c1420 2009-01-23       drh: The graph is now back to a single leaf (check-in 5).
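
One way to picture the difference between a direct child and a merge child is to keep the two kinds of parent link in separate tables.  The sketch below encodes figure 3 that way.  The table names and the <b>children</b> helper are illustrative assumptions, not fossil's internal schema.

```python
# Figure 3: check-in 5 was made by editing check-in 3 (direct parent)
# after merging in check-in 4 (merge parent).
direct_parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 3}
merge_parents = {5: [4]}

def children(c):
    """Return (direct children, merge children) of check-in c."""
    direct = [k for k, p in direct_parent.items() if p == c]
    merged = [k for k, ps in merge_parents.items() if c in ps]
    return direct, merged

def leaves():
    """Check-ins with no children of either kind."""
    return [c for c in direct_parent if children(c) == ([], [])]

print(children(3))   # ([5], [])   5 is a direct child of 3
print(children(4))   # ([], [5])   5 is a merge child of 4
print(leaves())      # [5]
```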
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: We have already seen that if fossil is in autosync mode then Bob would
2e275c1420 2009-01-23       drh: have been warned about the potential fork the first time he tried to
2e275c1420 2009-01-23       drh: commit check-in 4.  If Bob had updated his local check-out to merge in
2e275c1420 2009-01-23       drh: Alice's check-in 3 changes, then committed, then the fork would have
2e275c1420 2009-01-23       drh: never occurred.  The resulting graph would have been linear, as shown
2e275c1420 2009-01-23       drh: in figure 1.  Really the graph of figure 1 is a subset of figure 3.
2e275c1420 2009-01-23       drh: Hold your hand over the check-in 4 circle of figure 3 and then figure
2e275c1420 2009-01-23       drh: 3 looks exactly like figure 1 (except that the leaf has a different check-in
2e275c1420 2009-01-23       drh: number, but that is just a notational difference - the two check-ins have
2e275c1420 2009-01-23       drh: exactly the same content).  In other words, figure 3 is really a superset
2e275c1420 2009-01-23       drh: of figure 1.  The check-in 4 of figure 3 captures additional state which
2e275c1420 2009-01-23       drh: is omitted from figure 1.  Check-in 4 of figure 3 is a copy
2e275c1420 2009-01-23       drh: of Bob's local checkout before he merged in Alice's changes.  That snapshot
2e275c1420 2009-01-23       drh: of Bob's changes independent of Alice's changes is omitted from figure 1.
2e275c1420 2009-01-23       drh: Some people say that the approach taken in figure 3 is better because it
2e275c1420 2009-01-23       drh: preserves this extra intermediate state.  Others say that the approach
2e275c1420 2009-01-23       drh: taken in figure 1 is better because it is much easier to visualize a
2e275c1420 2009-01-23       drh: linear line of development and because the merging happens automatically
2e275c1420 2009-01-23       drh: instead of as a separate manual step.  We will not take sides in this
2e275c1420 2009-01-23       drh: debate.  We will simply point out that fossil enables you to do it either way.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <h2>Forking Versus Branching</h2>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Forking and having more than one leaf in the check-in tree is usually
2e275c1420 2009-01-23       drh: considered undesirable, and so forks are usually quickly resolved as
2e275c1420 2009-01-23       drh: shown in figure 3 above.
2e275c1420 2009-01-23       drh: But sometimes, one does want to have multiple leaves.  For example, a project
2e275c1420 2009-01-23       drh: might have one leaf that is the latest version of the project under
2e275c1420 2009-01-23       drh: development and another leaf that is the latest version that has been
2e275c1420 2009-01-23       drh: tested.
2e275c1420 2009-01-23       drh: When multiple leaves are desirable, we call the phenomenon <i>branching</i>
2e275c1420 2009-01-23       drh: instead of <i>forking</i>.
2e275c1420 2009-01-23       drh: Figure 4 shows an example of a project where there are two branches, one
2e275c1420 2009-01-23       drh: for development work and another for testing.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <center><table border=1 cellpadding=10 hspace=10 vspace=10>
2e275c1420 2009-01-23       drh: <tr><td align="center">
2e275c1420 2009-01-23       drh: <img src="branch04.gif"><br>
2e275c1420 2009-01-23       drh: Figure 4
2e275c1420 2009-01-23       drh: </td></tr></table></center>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: The hypothetical scenario of figure 4 is this:  The project starts and
2e275c1420 2009-01-23       drh: progresses to a point where (at check-in 2)
2e275c1420 2009-01-23       drh: it is ready to enter testing for its first release.
2e275c1420 2009-01-23       drh: In a real project, of course, there might be hundreds or thousands of
2e275c1420 2009-01-23       drh: check-ins before a project reaches this point, but for simplicity of
2e275c1420 2009-01-23       drh: presentation we will say that the project is ready after check-in 2.
2e275c1420 2009-01-23       drh: The project then splits into two branches that are used by separate
2e275c1420 2009-01-23       drh: teams.  The testing team, using the blue branch, finds and fixes a few
2e275c1420 2009-01-23       drh: bugs.  This is shown by check-ins 6 and 9.  Meanwhile the development
2e275c1420 2009-01-23       drh: team, working on the red branch, is busy adding features for the second
2e275c1420 2009-01-23       drh: release.  Of course, the development team would like to take advantage of
2e275c1420 2009-01-23       drh: the bug fixes implemented by the testing team.  So periodically, the
2e275c1420 2009-01-23       drh: changes in the test branch are merged into the dev branch.  This is
2e275c1420 2009-01-23       drh: shown by the dashed merge arrows between check-ins 6 and 7 and between
2e275c1420 2009-01-23       drh: check-ins 9 and 10.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: In both figures 2 and 4, check-in 2 has two children.  In figure 2,
2e275c1420 2009-01-23       drh: we called this a "fork".  In figure 4, we call it a "branch".  What is
2e275c1420 2009-01-23       drh: the difference?  As far as the internal fossil data structures are
2e275c1420 2009-01-23       drh: concerned, there is no difference.  The distinction is in the intent.
2e275c1420 2009-01-23       drh: In figure 2, the fact that check-in 2 has multiple children is an
2e275c1420 2009-01-23       drh: accident that stems from concurrent development.  In figure 4, giving
2e275c1420 2009-01-23       drh: check-in 2 multiple children is a deliberate act.  So, to a good
2e275c1420 2009-01-23       drh: approximation, we define forking to be by accident and branching to
2e275c1420 2009-01-23       drh: be by intent.  Apart from that, they are the same.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <h2>Tags And Properties</h2>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Tags and properties are used in fossil to help express the intent, and
2e275c1420 2009-01-23       drh: thus to distinguish between forks and branches.  Figure 5 shows the
2e275c1420 2009-01-23       drh: same scenario as figure 4 but with tags and properties added:
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <center><table border=1 cellpadding=10 hspace=10 vspace=10>
2e275c1420 2009-01-23       drh: <tr><td align="center">
2e275c1420 2009-01-23       drh: <img src="branch05.gif"><br>
2e275c1420 2009-01-23       drh: Figure 5
2e275c1420 2009-01-23       drh: </td></tr></table></center>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: A <i>tag</i> is a name that is attached to a check-in.  A
2e275c1420 2009-01-23       drh: <i>property</i> is a name/value pair.  Internally, fossil implements
2e275c1420 2009-01-23       drh: tags as properties with a NULL value.  So, tags and properties really
2e275c1420 2009-01-23       drh: are much the same thing, and henceforth we will use the word "tag"
2e275c1420 2009-01-23       drh: to mean either a tag or a property.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: A tag can be a one-time tag, a propagating tag, or a cancellation tag.
2e275c1420 2009-01-23       drh: A one-time tag only applies to the check-in to which it is attached.  A
2e275c1420 2009-01-23       drh: propagating tag applies to the check-in to which it is attached and also
2e275c1420 2009-01-23       drh: to all direct descendants of that check-in.  A <i>direct descendant</i>
2e275c1420 2009-01-23       drh: is a descendant through direct children.  Tag propagation does not
2e275c1420 2009-01-23       drh: cross merges.  Tag propagation also stops as soon
2e275c1420 2009-01-23       drh: as it encounters another check-in with the same tag.  A cancellation tag
2e275c1420 2009-01-23       drh: is attached to a single check-in in order to either override a one-time
2e275c1420 2009-01-23       drh: tag that was placed on that same check-in, or to block tag propagation.
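
The propagation rules above can be written out as a short sketch.  It is an illustration of the rules as stated, not fossil's implementation.  The graph, the tag names "color" and "reviewed", and the helper names are all invented for the example, and a valueless tag is represented by the value True.

```python
# Direct-parent links only.  Merge links are deliberately never consulted,
# because tag propagation does not cross merges.
direct_parent = {1: None, 2: 1, 3: 2, 4: 3, 5: 2, 6: 5}

# tags[checkin][name] = (kind, value); kind is "one-time", "propagating",
# or "cancel".  Hypothetical tags for illustration.
tags = {
    1: {"color": ("propagating", "red")},
    3: {"color": ("cancel", None)},
    5: {"color": ("propagating", "blue"), "reviewed": ("one-time", True)},
}

def propagated(c, name):
    """Value of `name` that flows out of check-in c to its direct children."""
    own = tags.get(c, {}).get(name)
    if own is not None:
        kind, value = own
        # Any tag of the same name stops propagation from further up;
        # only a propagating tag sends a value onward.
        return value if kind == "propagating" else None
    p = direct_parent[c]
    return propagated(p, name) if p is not None else None

def effective(c, name):
    """Value of `name` on check-in c itself, or None if c does not have it."""
    own = tags.get(c, {}).get(name)
    if own is not None:
        kind, value = own
        return None if kind == "cancel" else value
    p = direct_parent[c]
    return propagated(p, name) if p is not None else None

print(effective(2, "color"))      # red   (inherited from check-in 1)
print(effective(4, "color"))      # None  (blocked by the cancellation on 3)
print(effective(6, "color"))      # blue  (the tag on 5 replaces the one from 1)
print(effective(6, "reviewed"))   # None  (one-time tags stay on check-in 5)
```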
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Every repository is created with a single empty check-in that has two
2e275c1420 2009-01-23       drh: propagating tags, <b>branch=trunk</b> and <b>sym-trunk</b>.  In figure 5, that initial empty check-in is check-in 1.
2e275c1420 2009-01-23       drh: The <b>branch</b> tag tells (by its value)
2e275c1420 2009-01-23       drh: what branch the check-in is a member of.
2e275c1420 2009-01-23       drh: The default branch is called "trunk".  All tags that begin with "<b>sym-</b>"
2e275c1420 2009-01-23       drh: are symbolic name tags.  When a symbolic name tag is attached to a
2e275c1420 2009-01-23       drh: check-in, that allows you to refer to that check-in by its symbolic
2e275c1420 2009-01-23       drh: name rather than by its 40-character SHA1 hash name.  When a symbolic name
2e275c1420 2009-01-23       drh: tag propagates (as does the <b>sym-trunk</b> tag) then referring to that
2e275c1420 2009-01-23       drh: name is the same as referring to the most recent check-in with that name.
2e275c1420 2009-01-23       drh: Thus the two tags on check-in 1 cause all descendants to be in the
2e275c1420 2009-01-23       drh: "trunk" branch and to have the symbolic name "trunk".
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Check-in 4 has a <b>branch</b> tag which changes the name of the branch
2e275c1420 2009-01-23       drh: to "test".  The branch tag on check-in 4 propagates to check-ins 6 and 9.
2e275c1420 2009-01-23       drh: But because tag propagation does not follow merge links, the <b>branch=test</b>
2e275c1420 2009-01-23       drh: tag does not propagate to check-ins 7, 8, or 10.  Note also that the
2e275c1420 2009-01-23       drh: <b>branch</b> tag on check-in 4 blocks the propagation of <b>branch=trunk</b>
2e275c1420 2009-01-23       drh: so that it cannot reach check-ins 6 or 9.  This causes check-ins 4, 6, and
2e275c1420 2009-01-23       drh: 9 to be in the "test" branch and all others to be in the "trunk" branch.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Check-in 4 also has a <b>sym-test</b> tag, which gives the symbolic name
2e275c1420 2009-01-23       drh: "test" to check-ins 4, 6, and 9.  Because tags do not propagate across
2e275c1420 2009-01-23       drh: merges, check-ins 7, 8, and 10 do not inherit the <b>sym-test</b> tag and
2e275c1420 2009-01-23       drh: are hence not known by the name "test".
2e275c1420 2009-01-23       drh: To prevent the <b>sym-trunk</b> tag from propagating from check-in 1
2e275c1420 2009-01-23       drh: into check-ins 4, 6, and 9, there is a cancellation tag for
2e275c1420 2009-01-23       drh: <b>sym-trunk</b> on check-in 4.  The net effect of all of this is that
2e275c1420 2009-01-23       drh: check-ins on the trunk go by the symbolic name of "trunk" and check-ins
2e275c1420 2009-01-23       drh: that are on the test branch go by the symbolic name "test".
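
Resolving a propagating symbolic name to "the most recent check-in with that name" might look like the sketch below.  It rests on two assumptions: the check-in number is used as a stand-in for commit time, and check-ins 3 and 5 are taken to be on the trunk because the text places all check-ins other than 4, 6, and 9 there.  The <b>resolve</b> function is a hypothetical helper, not a fossil command.

```python
# Which check-ins carry each symbolic name once propagation has run,
# following the description of figure 5.
named = {
    "trunk": {1, 2, 3, 5, 7, 8, 10},
    "test": {4, 6, 9},
}

def resolve(name):
    """Return the most recent check-in that carries the symbolic name."""
    # Assumption: a larger check-in number means a later commit.
    return max(named[name])

print(resolve("trunk"))   # 10
print(resolve("test"))    # 9
```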
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: The <b>bgcolor=blue</b> tag on check-in 4 causes the background color
2e275c1420 2009-01-23       drh: of timelines to be blue for check-in 4 and its descendants.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Figure 5 also shows two one-time tags on check-in 9.  (The diagram does
2e275c1420 2009-01-23       drh: not make a graphical distinction between one-time and propagating tags.)
2e275c1420 2009-01-23       drh: The <b>sym-release-1.0</b> tag means that check-in 9 can be referred to
2e275c1420 2009-01-23       drh: using the more meaningful name "release-1.0".  The <b>closed</b> tag means
2e275c1420 2009-01-23       drh: that check-in 9 is a "closed leaf".  A closed leaf is a leaf that is intended
2e275c1420 2009-01-23       drh: to never have any children.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <h2>Review Of Terminology</h2>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Here is a list of definitions of key terms:
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: <blockquote><dl>
2e275c1420 2009-01-23       drh: <dt><b>Branch</b></dt>
2e275c1420 2009-01-23       drh: <dd><p>A branch is a set of check-ins that have the same value for their <b>branch</b> tag.</p></dd>
2e275c1420 2009-01-23       drh: <dt><b>Leaf</b></dt>
2e275c1420 2009-01-23       drh: <dd><p>A leaf is a check-in that has no children in the same branch.</p></dd>
2e275c1420 2009-01-23       drh: <dt><b>Closed Leaf</b></dt>
2e275c1420 2009-01-23       drh: <dd><p>A closed leaf is a leaf that has the <b>closed</b> tag.  Such leaves
2e275c1420 2009-01-23       drh: are intended to never be extended with descendants and hence are omitted
2e275c1420 2009-01-23       drh: from lists of leaves in the command-line and web interface.</p></dd>
2e275c1420 2009-01-23       drh: <dt><b>Open Leaf</b></dt>
2e275c1420 2009-01-23       drh: <dd><p>An open leaf is a leaf that is not closed.</p></dd>
2e275c1420 2009-01-23       drh: <dt><b>Fork</b></dt>
2e275c1420 2009-01-23       drh: <dd><p>A fork occurs when a check-in has two or more direct (non-merge)
2e275c1420 2009-01-23       drh: children in the same branch.</p></dd>
2e275c1420 2009-01-23       drh: <dt><b>Branch Point</b></dt>
2e275c1420 2009-01-23       drh: <dd><p>A branch point occurs when a check-in has two or more direct (non-merge)
2e275c1420 2009-01-23       drh: children in different branches.  A branch point is similar to a fork,
2e275c1420 2009-01-23       drh: except that the children are in different branches.</p></dd>
2e275c1420 2009-01-23       drh: </dl></blockquote>
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Check-in 4 of figure 3 is not a leaf because it has a child (check-in 5)
2e275c1420 2009-01-23       drh: in the same branch.  Check-in 9 of figure 5 also has a child (check-in 10)
2e275c1420 2009-01-23       drh: but that child is in a different branch, so check-in 9 is a leaf.  Because
2e275c1420 2009-01-23       drh: of the <b>closed</b> tag on check-in 9, it is a closed leaf.
2e275c1420 2009-01-23       drh: 
2e275c1420 2009-01-23       drh: Check-in 2 of figure 3 is considered a "fork"
2e275c1420 2009-01-23       drh: because it has two children in the same branch.  Check-in 2 of figure 5
2e275c1420 2009-01-23       drh: also has two children, but each child is in a different branch, hence in
2e275c1420 2009-01-23       drh: figure 5, check-in 2 is considered a "branch point".
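
The definitions in the terminology list can be applied mechanically.  The sketch below labels the check-ins of figures 3 and 5.  It is a rough illustration, not fossil's code.  The layout of the trunk in figure 5 (check-ins 3, 5, 7, 8, and 10 in a line) is an assumption, since the text only describes the test branch and the two merges, and the <b>classify</b> function is invented for the example.

```python
from collections import Counter

def classify(direct_parent, merge_parents, branch, closed=()):
    """Label each check-in as leaf, fork, or branch point per the definitions."""
    direct_kids = {c: [] for c in direct_parent}
    all_kids = {c: [] for c in direct_parent}
    for child, p in direct_parent.items():
        if p is not None:
            direct_kids[p].append(child)
            all_kids[p].append(child)
    for child, ps in merge_parents.items():
        for p in ps:
            all_kids[p].append(child)

    labels = {}
    for c in direct_parent:
        found = set()
        # Leaf: no children (direct or merge) in the same branch.
        if not any(branch[k] == branch[c] for k in all_kids[c]):
            found.add("closed leaf" if c in closed else "open leaf")
        per_branch = Counter(branch[k] for k in direct_kids[c])
        # Fork: two or more direct children that share a branch.
        if any(n >= 2 for n in per_branch.values()):
            found.add("fork")
        # Branch point: direct children in two or more different branches.
        if len(per_branch) >= 2:
            found.add("branch point")
        labels[c] = found
    return labels

# Figure 3: everything is on the trunk.
fig3 = classify(
    direct_parent={1: None, 2: 1, 3: 2, 4: 2, 5: 3},
    merge_parents={5: [4]},
    branch={c: "trunk" for c in range(1, 6)},
)

# Figure 5: check-ins 4, 6, and 9 are on "test", the rest on "trunk".
fig5 = classify(
    direct_parent={1: None, 2: 1, 3: 2, 4: 2, 5: 3,
                   6: 4, 7: 5, 8: 7, 9: 6, 10: 8},
    merge_parents={7: [6], 10: [9]},
    branch={c: "test" if c in (4, 6, 9) else "trunk" for c in range(1, 11)},
    closed={9},
)

print(fig3[2], fig3[4], fig3[5])   # {'fork'} set() {'open leaf'}
print(fig5[2], fig5[9])            # {'branch point'} {'closed leaf'}
```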