Artifact a61e6b72f9b6ea8686f69afc13ec216890f93dac

File www/concepts.wiki part of check-in [d87ca60c58] - initial ports of static .html to static /doc .wiki by stephan on 2008-05-15 20:25:46. Also file www/concepts.wiki part of check-in [f94f7e5f49] - Merge the fork back together. by drh on 2008-05-16 00:27:49.

Fossil Concepts

1.0 Introduction

Fossil is a software configuration management system: software designed to control and track the development of a software project and to record the history of the project. There are many such systems in use today. Fossil strives to distinguish itself from the others by being extremely simple to set up and operate.

This document is intended as a quick introduction to the concepts behind fossil.

2.0 Composition Of A Project

A software project normally consists of a "source tree". A source tree is a hierarchy of files that are used to generate the end product. The source tree changes over time as the software grows and expands and as features are added and bugs are fixed. A snapshot of the source tree at any point in time is called a "version" or "revision" or a "baseline" of the product. In fossil, we use the name "baseline".

A "repository" is a database that contains copies of all historical versions or baselines for a project. Baselines are normally stored in the repository in a highly space-efficient compressed format (delta encoding). But that is an implementation detail that you the user need not worry over. Think of the repository as a safe place where all your old baselines are securely stored away and available for retrieval whenever you need them.

A repository in fossil is a single file on your disk. This file might be rather large (dozens or hundreds of megabytes for a large or long running project) but it is nevertheless just a file. You can move it around, rename it, write it out to a memory stick, or do anything else you normally do with files.

Each source tree that is controlled by fossil is associated with a single repository on the local disk drive. You can tie two or more source trees to a single repository if you want (though one tree per repository is the most common configuration). So a single repository can be associated with many source trees, but each source tree is associated with only one repository.

Fossil source trees may not overlap. A fossil source tree is identified by a file named "_FOSSIL_" in the root directory of the source tree. Every file that is a sibling of _FOSSIL_ and every file in every subfolder is considered potentially a part of the source tree. The _FOSSIL_ file contains (among other things) the pathname of the repository with which the source tree is associated. On the other hand, the repository has no record of its source trees. So you are free to delete a source tree or move it around without consequence. But if you move or rename or delete a repository, then any source trees associated with that repository will no longer be able to locate their repository and will stop working.
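As an illustration of the rule above, the following sketch shows how a tool might locate the root of a source tree from any file or folder inside it: walk upward until a directory containing a file named "_FOSSIL_" is found. The function name find_source_tree_root is hypothetical, and this is not claimed to be how fossil itself performs the lookup.

```python
from pathlib import Path
from typing import Optional


def find_source_tree_root(start: Path) -> Optional[Path]:
    """Return the nearest enclosing directory that contains a _FOSSIL_ file.

    Hypothetical helper for illustration only. Returns None when no
    enclosing directory holds a _FOSSIL_ file.
    """
    current = start.resolve()
    # Check the starting location first, then each parent up to the filesystem root.
    for directory in [current, *current.parents]:
        if (directory / "_FOSSIL_").is_file():
            return directory
    return None


# Example: look for a source tree around the current working directory.
print(find_source_tree_root(Path.cwd()))
```

Because the lookup only ever moves upward, two source trees that overlapped would make the answer ambiguous for files in the shared region, which is one way to see why overlapping trees are not allowed.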

When multiple developers are working on the same project, each developer typically has his or her own local repository and an associated source tree in which to work. Developers share their work by "syncing" the content of their local repositories either directly or through a central server. Changes can be "pushed" from the local repository into a remote repository, or "pulled" from a remote repository into a local repository. Or one can do a "sync", which is a shortcut for doing both a push and a pull at the same time. Fossil also has the concept of "cloning". A "clone" is like a "pull", except that instead of beginning with an existing local repository, a clone begins with nothing and creates a new local repository that is a duplicate of a remote repository.
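The relationship between push, pull, sync, and clone can be sketched with a toy model. Here each repository is modeled as a plain Python dictionary mapping a name to some content; that modeling choice, and all of the function names, are assumptions made for illustration. Nothing here reflects how fossil actually transfers data.

```python
def push(local: dict, remote: dict) -> None:
    """Copy into `remote` everything that `local` has and `remote` lacks."""
    for name, content in local.items():
        remote.setdefault(name, content)


def pull(local: dict, remote: dict) -> None:
    """Copy into `local` everything that `remote` has and `local` lacks."""
    for name, content in remote.items():
        local.setdefault(name, content)


def sync(local: dict, remote: dict) -> None:
    """A push followed by a pull; afterwards both sides hold the same content."""
    push(local, remote)
    pull(local, remote)


def clone(remote: dict) -> dict:
    """Start from nothing and create a new repository duplicating `remote`."""
    return dict(remote)


mine = {"a": b"first change"}
server = {"b": b"second change"}
sync(mine, server)
print(sorted(mine) == sorted(server))
```

In this model a clone is simply a pull into an empty repository, which matches the description above.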

Communication between repositories is via HTTP. Remote repositories are identified by URL. You can also point a web browser at a repository and get human-readable status, history, and tracking information about the project.

2.1 Identification Of Artifacts

A particular version of a particular file is called an "artifact". Each artifact has a universally unique name which is the SHA1 hash of the content of that file expressed as 40 characters of lower-case hexadecimal. Such a hash is referred to as the Universally Unique Identifier or UUID for the artifact. The SHA1 algorithm was created with the purpose of providing a highly forgery-resistant identifier for a file. Given any file it is simple to find the UUID for that file. But given a UUID it is computationally intractable to generate a file that will have that UUID.

UUIDs look something like this:

6089f0b563a9db0a6d90682fe47fd7161ff867c8
59712614a1b3ccfd84078a37fa5b606e28434326
19dbf73078be9779edd6a0156195e610f81c94f9
b4104959a67175f02d6b415480be22a239f1f077
997c9d6ae03ad114b2b57f04e9eeef17dcb82788
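A name of this form can be computed with any SHA1 implementation. The sketch below uses Python's standard hashlib module and assumes the hash is taken over the raw bytes of the file with nothing added; the function name artifact_id is made up for this example.

```python
import hashlib


def artifact_id(content: bytes) -> str:
    """Return the SHA1 hash of `content` as 40 lower-case hexadecimal characters."""
    return hashlib.sha1(content).hexdigest()


# Two contents that differ by a single byte get unrelated names.
print(artifact_id(b"hello, fossil\n"))
print(artifact_id(b"hello, fossil!\n"))
```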

When referring to an artifact using fossil, you can use a unique prefix of the UUID that is four characters or longer. This saves a lot of typing. When displaying UUIDs, fossil will usually only show the first 10 digits since that is normally enough to uniquely identify a file.
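Prefix lookup of this kind can be sketched as follows. The function resolve_prefix and its error handling are assumptions for illustration, not fossil's own code; the only rules taken from the text are that a prefix must be at least four characters long and must match exactly one UUID.

```python
from typing import Iterable

# The example UUIDs listed earlier in this section.
KNOWN_UUIDS = [
    "6089f0b563a9db0a6d90682fe47fd7161ff867c8",
    "59712614a1b3ccfd84078a37fa5b606e28434326",
    "19dbf73078be9779edd6a0156195e610f81c94f9",
    "b4104959a67175f02d6b415480be22a239f1f077",
    "997c9d6ae03ad114b2b57f04e9eeef17dcb82788",
]


def resolve_prefix(prefix: str, uuids: Iterable[str]) -> str:
    """Return the single UUID that starts with `prefix` (hypothetical helper)."""
    if len(prefix) < 4:
        raise ValueError("a prefix must be four characters or longer")
    matches = [u for u in uuids if u.startswith(prefix.lower())]
    if not matches:
        raise LookupError("no artifact matches prefix " + prefix)
    if len(matches) > 1:
        raise LookupError("prefix " + prefix + " is ambiguous")
    return matches[0]


print(resolve_prefix("b410", KNOWN_UUIDS))
```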

Changing (or adding or removing) a single byte in a file results in a completely different UUID. And since the UUID is the name of the artifact, making any change to a file results in a new artifact. In this way, artifacts are immutable.

A repository is really just an unordered collection of artifacts. New artifacts can be added to the repository, but existing artifacts can never be removed. Fossil is designed in such a way that it can be handed a set of artifacts in any order and it can figure out the relationship between those artifacts and reconstruct the complete development history of a software project.
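The idea of a repository as an add-only, unordered collection of artifacts can be modeled in a few lines. The class ArtifactStore below is a toy written for this document, not a description of fossil's storage; it assumes artifacts are named by the SHA1 hash of their raw bytes.

```python
import hashlib


class ArtifactStore:
    """Toy add-only collection of artifacts keyed by the SHA1 of their content."""

    def __init__(self):
        self._artifacts = {}

    def add(self, content: bytes) -> str:
        """Store `content` and return its name; re-adding the same content is a no-op."""
        uuid = hashlib.sha1(content).hexdigest()
        self._artifacts.setdefault(uuid, content)
        return uuid

    def get(self, uuid: str) -> bytes:
        """Return the content stored under `uuid`."""
        return self._artifacts[uuid]

    def ids(self) -> set:
        """Return the set of all artifact names; a set has no order."""
        return set(self._artifacts)

    # Deliberately no remove() method: existing artifacts are never removed.


first = ArtifactStore()
second = ArtifactStore()
for blob in (b"one", b"two", b"three"):
    first.add(blob)
for blob in (b"three", b"one", b"two"):
    second.add(blob)

# The order in which artifacts arrive does not affect what the store contains.
print(first.ids() == second.ids())
```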

2.2 Manifests

At the root of a source tree is a special file called the "manifest". The manifest is a listing of all other files in that source tree. For each file, the manifest contains the (complete) UUID of the file and the name of the file as it appears on disk, and thus serves as a mapping from UUID to disk name. The UUID of the manifest is the UUID that identifies a baseline. When you look at a "timeline" of changes in fossil, the UUID associated with each check-in or commit is really just the UUID of the manifest for that baseline.
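To see why the UUID of a manifest can stand for a whole baseline, consider the toy model below. The listing layout used here (one line per file, holding a UUID and a name) is invented for this example and is NOT fossil's real manifest format; the function names are likewise hypothetical.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """SHA1 of `data` as 40 lower-case hexadecimal characters."""
    return hashlib.sha1(data).hexdigest()


def toy_manifest(files: dict) -> str:
    """Build an invented listing: one '<uuid> <name>' line per file, sorted by name.

    `files` maps a file name to the file's content as bytes.
    """
    lines = [sha1_hex(content) + " " + name for name, content in sorted(files.items())]
    return "\n".join(lines) + "\n"


def baseline_id(manifest_text: str) -> str:
    """The baseline is identified by the hash of the manifest itself."""
    return sha1_hex(manifest_text.encode("utf-8"))


before = {"README": b"hello\n", "main.c": b"int main(void){return 0;}\n"}
after = dict(before, README=b"hello, world\n")

# Changing one file changes that file's UUID, hence the manifest, hence the baseline ID.
print(baseline_id(toy_manifest(before)))
print(baseline_id(toy_manifest(after)))
```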

Fossil automatically generates a manifest whenever you "commit" a new baseline. So this is not something that you, the developer, need to worry about. The format of a manifest is intentionally designed to be simple to parse, so that if you want to read and interpret a manifest, either by hand or with a script, that is easy to do. But you will probably never need to do so.

In addition to identifying all files in the baseline, a manifest also contains a check-in comment, the date and time when the baseline was established, who created the baseline, and links to other baselines from which the current baseline is derived. There are also a couple of checksums used to verify the integrity of the baseline. And the whole manifest might be PGP clearsigned.

2.3 Key concepts

3.0 Fossil - The Program

Fossil is software. The implementation of fossil is in the form of a single executable named "fossil". To install fossil on your system, all you have to do is obtain a copy of this one executable file (either by downloading a precompiled version or compiling it yourself) and then put that file somewhere on your PATH.

Fossil is completely self-contained. It is not necessary to install any other software in order to use fossil. You do not need CVS, gzip, diff, rsync, Python, Perl, Tcl, Java, apache, PostgreSQL, MySQL, SQLite, patch, or any similar software on your system in order to use fossil effectively. You will want to have some kind of text editor for entering check-in comments. Fossil will use whatever text editor is identified by your VISUAL environment variable. Fossil will also use GPG to clearsign your manifests if you happen to have it installed, but fossil will skip that step if GPG is missing from your system. You can optionally set up fossil to use external "diff" programs, though a perfectly functional "diff" algorithm is built in and works fine for most people.

To uninstall fossil, simply delete the executable.

To upgrade an older version of fossil to a newer version, just replace the old executable with the new one. You might need to run a one-time command to restructure your repositories after an upgrade. Check the instructions that come with the upgrade for details.

To use fossil, simply type the name of the executable in your shell, followed by one of the various built-in commands and arguments appropriate for that command. For example:

fossil help

In the next section, when we say things like "use the help command" we mean to use the command name "help" as the first token after the name of the fossil executable, as shown above.
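For readers who drive fossil from a script rather than a shell, the same convention can be written down as a small helper. The function fossil_argv is hypothetical; it only assembles the argument list in the order described above and assumes the executable is reachable on your PATH under the name "fossil".

```python
import shutil
import subprocess


def fossil_argv(command: str, *args: str) -> list:
    """Build an argument list: executable name, then the command, then its arguments."""
    return ["fossil", command, *args]


print(fossil_argv("help"))

# Only attempt to run the command when a fossil executable is found on PATH.
if shutil.which("fossil"):
    subprocess.run(fossil_argv("help"), check=False)
```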

4.0 Workflow

  1. Establish a local repository using either the new command to start a new project, or the clone command to make a clone of a repository for an existing project.