With the Git craze currently flourishing, it’s hard to see that there’s competition in the distributed version control system (DVCS) space. In fact, there is a strong contender that I’ve had the privilege of getting to know more intimately during the last couple of months. Mercurial was born around the same time as Git, in 2005, when an open and free distributed version control system for maintaining the Linux kernel was urgently needed. In the end it was Git that got the honor, but that doesn’t mean Mercurial hasn’t been entrusted with any projects of significance. Among the Mercurial users today are OpenJDK, NetBeans and Mozilla. Open source project hosting provider Google Code adopted Mercurial rather than Git after careful analysis.
I’m a bit of a “late adopter” when it comes to DVCS:s, but in the unlikely case that there are even later adopters, here’s a brief summary of the whole “distributed” thing. Version control systems are something all developers are (or should be) very familiar with. The most common ones, such as old rusty CVS and the more modern Subversion, are based on the client-server model: developers work on a working copy of a project stored in a central repository on a server. The client working copy is updated from the remote repository, and the fruits of the developer’s labor are committed the other way around. The central repo stores all revisions of the project ever committed, including commit comments, tags and branches. The working copy is essentially little more than a particular revision from the central repo plus the developer’s outstanding uncommitted changes against that revision.
Enter distributed version control systems. The idea is that developers no longer work with plain working copies of projects; instead they work with full repositories. These distributed repositories are no longer heavyweight, singular entities along the lines of a Subversion server. Rather, they are small, manageable things that live beside the working copy in the file system (in the case of Mercurial, in the .hg subdirectory of the working copy folder). The repo can easily be zipped up and mailed somewhere, but DVCS:s also provide native functionality for cloning repos between different computers (over a network) or just between different file paths on the same machine. Developers still commit and update the working copy as usual, but the counterpart is now the private local repo. Changes in the local repo are pushed to other repos (again over the network or on the same machine). The other way around is known as a pull. There is no predetermined repository topology: developers are free to push and pull between different repos in any direction, at any time.
A distributed version control system is a generalization of your typical vanilla client-server version control system. You could use a DVCS in a completely centralized way, always pushing changes from the local repo to a central repo after each commit. A DVCS, however, lends itself to a distributed development process well suited for open source projects, and provides some other cool possibilities:
1. Carry around entire projects, including the complete commit history, tags and branches, on a flash stick. Ideal in academic/corporate environments where restrictive firewalls make it problematic to connect to remote repositories.
2. Work on the same project with two different IDE:s. Having two of the Java heavyweights (Eclipse, NetBeans or IntelliJ) share a project folder is not something I’d recommend. A better approach is to clone a separate repo for each IDE and push/pull deltas between them to your heart’s content. I’ve used this approach with Java Swing apps, utilizing the excellent GUI builder Matisse in NetBeans to lay out the screens while doing the rest of the coding in my preferred IDE, Eclipse.
3. Speed up deployments. Modern web application frameworks tend to have many dependencies and thus produce voluminous deployment artifacts. On a typical ADSL hookup, a 50MB WAR (a realistic figure for a Grails app that uses a couple of plugins) could take several minutes to upload to a deployment server. A better approach is to clone a repo on the deployment server(s) and simply push the delta since the previous deploy. The WAR can then be built on the server faster than it would have taken to build it locally and then upload it.
4. Add version control to anything, anywhere with ease. Transforming an existing directory structure into the working copy of a full-blown pushable, taggable, revertable and diffable repository is trivial and unobtrusive. This can be immensely powerful.
Want to start tracking the system configuration of a Unix system? Simply:
~# cd /etc
/etc# hg init
Changed your mind and want to revert to the previous non-version-controlled state and remove all traces of the version control?
/etc# rm -rf .hg/