Version control



What follows is badly outdated. I leave it here for historical purposes, though maybe I should delete it. In early 2014 I switched to git. Bazaar is no longer well maintained, and both git and mercurial offer many advantages over it. I debated between git and mercurial (I liked mercurial better), and finally settled on git because it is the most widely used for the projects I am most interested in (from BioConductor to many R packages and Julia packages), because of the availability of github (bitbucket can use both mercurial and git), and because you can google your way around (virtually) any git question. There is great documentation for git freely available. For me a great place to start (and basically stop, since I just want to use git in fairly simple scenarios, not hack it) was Chacon's Pro Git.

What we want from our revision control system

General

Our code repositories are not that large, and are divided among "projects" (e.g., Tnasas, SignS), which are largely independent and live in separate directories. Each directory rarely contains more than 200 to 600 files (we do not place the tmp directories in our repositories, of course), or more than 150 MB; however, only 10 to 20 files are modified often, and these are rarely larger than a couple of hundred Kbytes. (In other words, it is not a serious issue for us if a version control system does not scale up to the Linux kernel.)

It is important that development can continue, with full revision control support, when we are off-line (on the train, behind impenetrable firewalls). In other words, it is not enough to check out a working copy to the laptop. Cherry-picking would be very convenient.

Additionally, it would be nice once and for all to use the same system to do the revision control for our code in Asterias and for our other projects (e.g., talks, courses, ad-hoc analyses for other people, etc). (E.g., that I use svn for asterias and RCS for my talks is a pain in the ass).

The following are some notes that document how/why I ended up choosing Bazaar-NG.

Specific requirements

Works on Linux, including i386 and amd64

This is a must, of course. Nicer if there are Debian packages. bazaar-ng, mercurial, darcs, and monotone fulfill all of these. Installing Bazaar and Mercurial from source under Linux is a piece of cake.

Works on Windows

Nice if possible, with native windows support (i.e., not just cygwin-based). Nice when preparing, e.g., courses for windows audience.

Easy to use a remote repository

For central storage and backups.

Easy to set up version control anywhere, for anything in one command

Just like I used to do with RCS. Sure, you can do it if you want with CVS or SVN, but RCS was simpler. And you can store the control files in the directory itself if you want.

  • bzr init, bzr add
  • hg init, hg add
  • darcs init
  • mtn setup

Line-by-line history of a file

Documentation

  • darcs: great
  • mercurial: very good, and excellent command-line help and suggestions
  • monotone: great, with nice tutorial
  • bzr: not as good; docs are available, but a comprehensive manual and tutorial that include the latest stuff are missing.
  • svk: a book as work in progress

Emacs support

DVC mode; not as well supported for Darcs. See the Darcs wiki for details.

Options available

From what I've read, CVS (and SVN) are, so far, de facto standards and have been around for a long while. CVS has several problems. Subversion is supposed to be just "CVS done right" but it is not "the next great thing in version control" (it follows the same general approach as CVS). Thus, the "great thing" in version control will come from somewhere else. And many new approaches allow (or even revolve around) distributed, not centralized, repositories. Also, there are many people working on better merging algorithms, cherry-picking, etc. (some of these issues are not that relevant for us now, but could be in the future), which are also somewhat related to making branching (and eventual remerging) a piece of cake.

Many of the options currently available are < 1.0 and in early stages of development. Many still have serious bugs. Many are undergoing very fast changes. Comments on some options (see also the links) follow.

svk

Built on top of subversion. It adds a layer to subversion, so you have all the good and bad of subversion plus some Perl code for the distributed stuff. But the underlying model seems more complex than, say, Darcs (see links explaining what the workflow is, and compare that to the simplicity of Darcs or monotone or Bazaar-NG or Mercurial), and setting up repositories in every directory where one wants version control is more of a pain in the ass. And true file renames do not seem to be available.

Monotone

Another great-looking project, and it has excellent documentation (a manual with a tutorial included). It is still beta (version is 0.26), and there are some concerns about speed. It is C++ and Lua. As with Bazaar-NG, reading the email lists, it looks like many important design and implementation decisions remain to be made. This is reinforced by the last release, which introduces major changes (both in the underlying formats and even in how the commands are called) that require care when moving from previous versions. So we can expect both important changes and major improvements. Setting up a "repository" is a piece of cake and its workflow and usage are very, very simple. Will keep an eye on this project too.

Darcs

Written in Haskell (so it might suffer from unpredictable, extremely low performance). It seems relatively mature. I did not find, in the bug tracker or mailing lists, as many serious/critical bugs as for the other projects. There is, however, the prominent problem of the exponential time issue when doing some merges, and the difficulties when dealing with huge repositories (e.g., Linux kernel). These, however, do not seem such a big deal for us if we do frequent commits and we each work with little overlap. Darcs might actually undergo important changes to improve these two issues, but I do not expect major changes in what a user would have to type or in the mode of working. It is very well documented, and has an active community. I am not sure I like the "theory of patches" that underlies Darcs; some people argue that this is not such a big deal, but I think I get confused. The workflow of Bazaar or Mercurial seems more "natural"; I don't think I got used to the Darcs approach during my playing with Darcs. Setting up a "repository" is a piece of cake and its workflow and usage are very, very simple.

Mercurial

Many people are very enthusiastic about it, and others extremely critical (see, e.g., comments by Kai here.) I've been using it, and I really like it. It is easy to use and the workflow seems very straightforward and natural to me. Command-line help is great, and I especially like the "suggestions" on what command you are likely to want after the one you just typed. The wiki is well organized, and documentation is very easy to find. Moreover, the recent "Quickstart" and "Mercurial Usage", by Sébastien Pierre, provide a very helpful two-page summary of commands and workflow. The email list is very friendly and responsive. And, as advertised, mercurial is lightning fast! I think people who like graphical tools will appreciate hgct (a GUI tool for choosing what to commit); "hg view", from hgk, the interactive history viewer, is intuitive and very useful with convoluted repository histories.

Several big projects are using Mercurial. Recently, Open Solaris decided to use Mercurial after an open discussion about the pros and cons of different systems, Theodore Tso is using Mercurial for e2fsprogs, and Xen is also using Mercurial.

I am a little bit concerned about the lack of full support for renames, though, in particular that renames are not used in merges. Developers are working on it, and just today (2006-05-17) it was mentioned on the list that they are trying to secure funding for this specific issue so it is likely that it will be dealt with soon.
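To make the rename concern concrete, here is a toy sketch in Python of a three-way merge done two ways: keyed by path, and keyed by a stable file identity. It is only an illustration under my own simplifying assumptions (whole-file contents as strings, invented names such as `merge_by_path`, `merge_by_id` and the id `"f1"`); it is not how Mercurial or Bazaar actually implement merging.

```python
def merge3(base, ours, theirs):
    """Three-way merge of a single value; returns (value, conflict_flag)."""
    if ours == theirs:
        return ours, False
    if ours == base:       # only "theirs" changed
        return theirs, False
    if theirs == base:     # only "ours" changed
        return ours, False
    return ours, True      # both changed in different ways


def merge_by_path(base, ours, theirs):
    """Trees map path -> content; a rename looks like delete + add."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        value, conflict = merge3(base.get(path), ours.get(path), theirs.get(path))
        if conflict:
            conflicts.append(path)
        elif value is not None:
            merged[path] = value
    return merged, conflicts


def merge_by_id(base, ours, theirs):
    """Trees map file_id -> (path, content); path and content merge separately."""
    merged, conflicts = {}, []
    for file_id in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = (tree.get(file_id, (None, None)) for tree in (base, ours, theirs))
        path, path_conflict = merge3(b[0], o[0], t[0])
        content, content_conflict = merge3(b[1], o[1], t[1])
        if path_conflict or content_conflict:
            conflicts.append(file_id)
        elif path is not None:
            merged[path] = content
    return merged, conflicts


# Hypothetical scenario: "ours" renames util.py to helpers.py,
# "theirs" edits util.py in place (content v1 -> v2).
by_path = merge_by_path(
    {"util.py": "v1"}, {"helpers.py": "v1"}, {"util.py": "v2"}
)
by_id = merge_by_id(
    {"f1": ("util.py", "v1")},
    {"f1": ("helpers.py", "v1")},
    {"f1": ("util.py", "v2")},
)
print(by_path)
print(by_id)
```

In the path-keyed merge the edit never reaches the renamed file: `helpers.py` keeps the old content and `util.py` is reported as a delete-versus-modify conflict. In the identity-keyed merge the rename and the edit combine cleanly. That difference is the kind of thing I have in mind when I say renames should be used in merges.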

My last two contenders were Mercurial and Bazaar, and the rename issue was what eventually led me to choose Bazaar.

Bazaar-NG

Very nice system, which seems to incorporate a lot of great ideas from other systems. (And it is written in Python). Has support from Canonical, the people that support Ubuntu. Documentation is not as great as, say, Darcs or Monotone, though developers are aware of this and thus it is likely that this will be improved. Setting up a "repository" is a piece of cake and its workflow and usage is very, very simple and, at least to me, it seemed very "natural". Cherry-picking works in the way I'd expect (there is an example in my notes on the use of Bazaar).

The email list is very active and helpful and there seems to be a good group of additional contributors (e.g., plugins' authors). Plugins do, in fact, provide a lot of extra functionality, and I particularly enjoy several of them (bzrtools, gannotate, bzrk, show-paths).

Several well known projects are using Bazaar; some of them are closely related to Ubuntu, but others aren't.

My main concerns with Bazaar relate to speed. Bazaar and Mercurial are very, very similar in terms of commands issued, but Mercurial is often much faster (by factors of 5 to 8x in informal tests I did). For me, this is most annoying when branching or cloning to try new things; operations that take 2 seconds with Mercurial can take 15 seconds with Bazaar, and that sometimes upsets my "train of thought". But then, looking at the big picture, most of the speed differences are largely irrelevant (I mean, what is time consuming is writing code and debugging; whether "bzr commit" takes 2 or 10 seconds is irrelevant at the end of the day). Developers are well aware of speed issues, and most of the work from the current version (0.8) till version 1.0 will deal with improving speed.
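For anyone who wants to repeat this kind of informal comparison, a minimal timing sketch in Python could look like the following. This is not the procedure I used for the numbers above; the helper name `time_command` is made up for illustration, and the command timed in the demo is just a stand-in so the sketch runs on any machine.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` runs of argv."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so failed runs are not timed.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Stand-in command: start the Python interpreter and exit immediately.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"best of 3: {elapsed:.3f} s")
```

To compare the two systems one would pass the equivalent commands (for example a status or log command run inside a working copy) to `time_command` for each tool. Commands that create a new directory, such as "hg clone" or "bzr branch", need the destination removed between runs, which this sketch does not do.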

I finally decided to choose Bazaar. My final contenders were Bazaar and Mercurial. Both have a very similar command set, a similar workflow, and both are intuitive and natural to work with. Both use Python (mercurial also uses some C), and it is very easy to go from one to the other (I often tried the same examples with both, just switching "bzr" and "hg"). The pros for Mercurial were speed, documentation, and help. The big pro for Bazaar was the support for renames. And since renaming is something I do often, Bazaar is the system I finally chose.

In case it is of any help, I have prepared some notes on the use of Bazaar; they are focused on our most common use of bzr, and some sections largely overlap what is available on the Bazaar wiki. Once I fix and eliminate the redundant parts, I'll put these notes in the wiki. Feel free to grab the original rst (reStructuredText) file for this html (but remember to give credit where credit is due).

Other systems

There are many others, but I did not look into them too much. They did not satisfy some of the important requirements for me, or seemed obscure or a pain to work with. Reasons why I did not look further into things such as Arch (including Arch 2.0), Git/Cogito, Arx, Codeville, etc, etc, are well detailed in the links below.


Some links with additional information

These are my bookmarks for revision control.

Date: 2006-05-17 (third revision; first version: 2006-05-02).



This page is copyright, ©, by Ramón Díaz-Uriarte, and is licensed under a Creative Commons License.