Vincent Danen shows you how to use the git-cvs plugin to convert CVS repositories to Git and reap the rewards of speed and efficiency that come with it.
A few weeks ago I found myself in the unenviable position of having to extract a patchset from an upstream CVS repository that spanned multiple files in multiple directories. Having gotten so used to Subversion's concept of one commit resulting in one easily retrievable diff, I was shocked to realize just how far we've come in terms of version control.
For instance, when you commit to CVS, each file gets its own distinct revision number, so there is no repository-wide "commit number 30" that you can diff to see everything that changed. In Subversion and Git, if you commit three files at once, you can very easily get a diff of the changes from that one commit. With CVS, you have to determine the differing revision for each of those three files and diff them individually, even if they were committed at the same time.
This makes trying to get a single coherent patch of multiple files out of CVS annoying at best, and downright infuriating at worst. Luckily, you can use the git-cvs plugin to convert a CVS repository into a Git repository, and it will do the behind-the-scenes magic to make these single-file-commits into coherent repository commits.
You must have the git-cvs plugin installed; on Fedora this can be done by installing the git-cvs package: yum install git-cvs cvsps (the cvsps utility is required to create the CVS patchsets). Once this is done, you can point Git to a CVS repository:
$ git cvsimport -v -d :pserver:email@example.com:/sources/classpath classpath
The above takes the GNU Classpath CVS repository located at cvs.sv.gnu.org:/sources/classpath and turns it into a local Git repository located in ./classpath/. Depending on the size of the CVS repository you are converting, this can take hours (the above literally took about 12 hours to accomplish).
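As a minimal sketch, the same import can be wrapped in a small shell function that names the target directory explicitly with -C (the git cvsimport option for the Git repository to import into) instead of relying on the default. The function name import_cvs_module and its argument order are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: import one CVS module into a named Git directory.
# Arguments: <cvsroot> <module> <target-dir>
import_cvs_module() {
    cvsroot=$1   # e.g. a :pserver: CVSROOT string
    module=$2    # CVS module to import
    target=$3    # directory that will hold the Git repository

    # -v: verbose progress, -d: CVSROOT, -C: Git repository to import into
    git cvsimport -v -d "$cvsroot" -C "$target" "$module"
}
```

With CVSROOT set to the repository string used above, calling import_cvs_module "$CVSROOT" classpath classpath should leave the converted history in ./classpath/.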
Once it is done, you can use the git log command to view a changelog of the commits. Gone are the CVS version numbers for each file; instead, each commit is assigned a unique identifier that Git uses to distinguish commits. Also notice that commits done at the same time, across multiple files, are now finally available as single commits.
By looking at the output of git log, you can see the commit ID of the changeset you want; note it and the commit ID of the previous commit. Now you can generate a diff of that changeset using:
$ git diff [old_revision] [new_revision]
The output is the entire changeset grouped together in one patch. Getting the differences across multiple files for a single commit with one command, rather than scouring individual files, finding each one's updated revision, and then stitching the individual diffs together, is a huge time saver. That is, if you don't count the 12 hours it took to convert the repository in the first place (in all fairness, there is a lot of data there; the git log output for that converted repository alone is 6MB of text). Of course, the cvsps tool can generate these patchsets on its own, but the conversion to Git also lets you use all the other neat Git-related functionality.
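To see the diff step in isolation, without waiting for a CVS import, here is a small self-contained sketch. It builds a throwaway Git repository, makes one commit that touches two files in two directories, and extracts that commit as a single patch. The file names, commit messages, and the changeset.patch output name are all made up for the example.

```shell
#!/bin/sh
# Sketch: one commit touching two files, extracted as a single patch.
set -e

repo=$(mktemp -d)            # scratch directory for the demo repository
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"

# First commit: two files in two directories.
mkdir -p src docs
echo "one" > src/a.txt
echo "one" > docs/b.txt
git add .
git commit -q -m "initial import"

# Second commit: change both files at once.
echo "two" >> src/a.txt
echo "two" >> docs/b.txt
git commit -q -a -m "update both files"

new=$(git rev-parse HEAD)    # the changeset we want
old=$(git rev-parse HEAD^)   # the commit just before it

# Older revision first, newer revision second, so the patch applies forward.
git diff "$old" "$new" > changeset.patch

# Count how many files the single patch covers.
files_in_patch=$(grep -c '^diff --git' changeset.patch)
echo "files in patch: $files_in_patch"
```

For an ordinary (non-merge) commit, git show "$new" prints the same change along with the commit message, which saves looking up the previous commit ID by hand.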
If, for some reason, you have stuck with CVS and have not moved to Subversion or Git, there are easy conversion tools that work quite well and will offer some great new features and capabilities that really make it worth the switch.