<title>Fossil Performance</title>
<h1 align="center">Performance Statistics</h1>

The questions will inevitably arise: How does Fossil perform?
Does it use a lot of disk space or bandwidth? Is it scalable?

In an attempt to answer these questions, this report looks at five
projects that use fossil for configuration management and examines how
well they are working. The following table is a summary of the results.
Explanation and analysis follow the table.

<table border=1>
<tr>
<th>Project</th>
<th>Number Of Artifacts</th>
<th>Number Of Check-ins</th>
<th>Project Duration<br>(as of 2009-08-23)</th>
<th>Average Check-ins Per Day</th>
<th>Uncompressed Size</th>
<th>Repository Size</th>
<th>Compression Ratio</th>
<th>Clone Bandwidth</th>
</tr>

<tr align="center">
<td>SQLite
<td>28643
<td>6755
<td>3373 days<br>9.24 yrs
<td>2.00
<td>1.27 GB
<td>35.4 MB
<td>35:1
<td>982 KB up<br>12.4 MB down
</tr>

<tr align="center">
<td>Fossil
<td>4981
<td>1272
<td>764 days<br>2.1 yrs
<td>1.66
<td>144 MB
<td>8.74 MB
<td>16:1
<td>128 KB up<br>4.49 MB down
</tr>

<tr align="center">
<td>SLT
<td>2062
<td>67
<td>266 days
<td>0.25
<td>1.76 GB
<td>147 MB
<td>11:1
<td>1.1 MB up<br>141 MB down
</tr>

<tr align="center">
<td>TH3
<td>1999
<td>429
<td>331 days
<td>1.30
<td>70.5 MB
<td>6.3 MB
<td>11:1
<td>55 KB up<br>4.66 MB down
</tr>

<tr align="center">
<td>SQLite Docs
<td>1787
<td>444
<td>650 days<br>1.78 yrs
<td>0.68
<td>43 MB
<td>4.9 MB
<td>8:1
<td>46 KB up<br>3.35 MB down
</tr>

</table>

<h2>The Five Projects</h2>

The five projects listed above were chosen because they have been in
existence for a long time (relative to the age of fossil) or because
they have large amounts of content. The most important project using
fossil is SQLite. Fossil itself
is built on top of SQLite and so obviously SQLite has to predate fossil.
SQLite was originally versioned using CVS, but recently the entire 9-year
and 320-MB CVS history of SQLite was converted over to Fossil. This is
an important datapoint because it demonstrates fossil's ability to manage
a significant and long-running project.
The next-longest running fossil project is fossil itself, at 2.1 years.
The documentation for SQLite
(identified above as "SQLite Docs") was split off of the main SQLite
source tree and into its own fossil repository about 1.75 years ago.
The "SQL Logic Test" or "SLT" project is a massive
collection of SQL statements and their output used to compare the
processing of SQLite against MySQL, PostgreSQL, Microsoft SQL Server,
and Oracle.
Finally, "TH3" is a proprietary set of test cases for SQLite used to give
100% branch test coverage of SQLite on embedded platforms. All projects
except for TH3 are open-source.

<h2>Measured Attributes</h2>

In fossil, every version of every file, every wiki page, every change to
every ticket, and every check-in is a separate "artifact".
One way to
think of a fossil project is as a bag of artifacts. Of course, there is
a lot more than this going on in fossil. Many of the artifacts have meaning
and are related to other artifacts. But at a low level (for example when
synchronizing two instances of the same project) the only thing that matters
is the unordered collection of artifacts. In fact, one of the key
characteristics of fossil is that the entire project history can be
reconstructed simply by scanning the artifacts in an arbitrary order.

The number of check-ins is the number of times that the "commit" command
has been run. A single check-in might change 3 or 4 files, or it might
change several dozen different files. Regardless of the number of files
changed, it still only counts as one check-in.

The "Uncompressed Size" is the total size of all the artifacts within
the fossil repository assuming they were all uncompressed and stored
separately on the disk. Fossil makes use of delta compression between related
versions of the same file, and then uses zlib compression on the resulting
deltas. The total resulting repository size is shown after the uncompressed
size.

On the right end of the table, we show the "Clone Bandwidth". This is the
total number of bytes sent from client to server ("uplink") and from server
back to client ("downlink") in order to clone a repository.
These byte counts
include HTTP protocol overhead.

In the table and throughout this article,
"GB" means gigabytes (10<sup><small>9</small></sup> bytes)
not <a href="http://en.wikipedia.org/wiki/Gibibyte">gibibytes</a>
(2<sup><small>30</small></sup> bytes). Similarly, "MB" and "KB"
mean megabytes and kilobytes, not mebibytes and kibibytes.

<h2>Analysis And Supplemental Data</h2>

Perhaps the two most interesting datapoints in the above table are SQLite
and SLT. SQLite is a long-running project with long revision chains.
Some of the files in SQLite have been edited close to a thousand times.
Each of these edits is stored as a delta, and hence the SQLite project
gets excellent 35:1 compression. SLT, on the other hand, consists of
many large (megabyte-sized) SQL scripts that have one or maybe two
versions. There is very little delta compression occurring and so the
overall repository compression ratio is much lower. Note also that
quite a bit more bandwidth is required to clone SLT than SQLite.

For the first nine years of its development, SQLite was versioned by CVS.
The resulting CVS repository measured over 320MB in size. So, the
developers were
pleasantly surprised to see that this entire project could be cloned in
fossil using only about 13MB of network traffic.
The "sync" protocol
used by fossil has turned out to be surprisingly efficient. A typical
check-in on SQLite might use 3 or 4KB of network bandwidth total. Hardly
worth measuring. The sync protocol is efficient enough that, once a
repository has been cloned, fossil could easily be used over a dial-up
connection.
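
The derived columns of the summary table ("Average Check-ins Per Day" and "Compression Ratio") can be approximately recomputed from the other columns. The following is a minimal Python sketch, not part of fossil itself. The names <tt>PROJECTS</tt>, <tt>compression_ratio</tt>, <tt>checkins_per_day</tt>, and <tt>summarize</tt> are hypothetical. It assumes the published ratios are truncated to a whole number rather than rounded, which is only inferred from the fact that truncation reproduces every row of the table from the rounded sizes shown there.

```python
# Figures copied from the summary table. Sizes are in decimal megabytes
# (1 GB = 1000 MB), following the unit convention stated in the article.
PROJECTS = {
    # name: (check-ins, duration in days, uncompressed MB, repository MB)
    "SQLite": (6755, 3373, 1270.0, 35.4),
    "Fossil": (1272, 764, 144.0, 8.74),
    "SLT": (67, 266, 1760.0, 147.0),
    "TH3": (429, 331, 70.5, 6.3),
    "SQLite Docs": (444, 650, 43.0, 4.9),
}


def compression_ratio(uncompressed_mb, repo_mb):
    """Return N for an "N:1" ratio.

    Truncating (not rounding) is an assumption made here because it
    matches the published table; the article does not say how the
    ratios were derived.
    """
    return int(uncompressed_mb / repo_mb)


def checkins_per_day(checkins, days):
    """Average number of check-ins per day, to two decimal places."""
    return round(checkins / days, 2)


def summarize():
    """Map each project name to (check-ins per day, compression ratio)."""
    return {
        name: (checkins_per_day(c, d), compression_ratio(u, r))
        for name, (c, d, u, r) in PROJECTS.items()
    }


if __name__ == "__main__":
    for name, (rate, ratio) in summarize().items():
        print(f"{name}: {rate:.2f} check-ins/day, {ratio}:1")
```

Because the inputs are the rounded sizes from the table rather than exact byte counts, the results should be read as a consistency check on the table and not as a measurement.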