Use open source Subversion for personal document management

The open source version control system Subversion has rapidly become a developer favorite. Chad Perrin explores some of Subversion's features and explains how it can help you not only manage versions of source code but also personal documents.

This article is also available as a TechRepublic download.

There is an open source version control system, or revision control system, known as Subversion (svn for short) that has rapidly become a favorite of developers. It enjoys an excellent reputation and a wealth of free, online documentation, as well as a growing body of published texts on the subject of its efficient and practical use. It is stable, flexible, capable, security-conscious, free, open source software, and scales well for any size project.

The previous king of open source version control was CVS, the Concurrent Versions System. Subversion began as an attempt to build upon the solid, respected foundation of the venerable CVS, and to improve upon it based on the lessons of years of widespread CVS use. It has succeeded in all respects, if its ever-growing popularity is any indication.

Thanks in part to Subversion's nearly transparent operation, the large number of client applications available for it across operating system platforms, and its low overhead and ease of administration, version control isn't just for source code anymore. Subversion provides an excellent near-real-time backup system for any directory structure whose contents can be described in some way as a "project" or as a collection of projects. A growing number of BSD, Linux, and OpenSolaris users keep document directories in version control with Subversion, in addition to using it in its more traditional role in software development.


While a one-click installer for Windows is available, part of the reason for the runaway success of Subversion in open source communities must surely be the easy access in BSD and Linux software archives. In Debian GNU/Linux, for instance, installing Subversion, complete with a command-line client and administrative interface, is as simple as typing apt-get install subversion when signed in as the root user.

Similarly, the YUM package manager that is standard on Fedora Core Linux installs it by way of the command yum install subversion. For FreeBSD, either pkg_add -r subversion or make -C /usr/ports/devel/subversion install clean achieves the same effect, depending on whether you want to install from a binary package or from source. Even Mac OS X provides a software archive from which Subversion can be installed.
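Whichever package manager you use, it is worth confirming that both the client and the administrative tool ended up on your PATH. The following is a minimal sketch; the function name check_svn_tools is only an illustrative choice, not part of Subversion.

```shell
#!/bin/sh
# Report whether the Subversion command-line tools are on the PATH.
# check_svn_tools is a hypothetical helper name used for this example.
check_svn_tools() {
    for tool in svn svnadmin; do
        if command -v "$tool" >/dev/null 2>&1; then
            # --version --quiet prints only the version number
            echo "$tool: found, version $("$tool" --version --quiet)"
        else
            echo "$tool: missing"
        fi
    done
}

check_svn_tools
```

If either tool is reported as missing, the package installation most likely did not complete, or the tools were installed somewhere outside your PATH.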

Version control is superior to more mundane backup utilities for certain tasks. Chief among these is, of course, source code control for software development projects. Another is document management, where document and directory contents change regularly due to user activity, necessitating a means of undoing deletions and viewing older versions. This is accomplished by way of a changelog, commonly called a "revision history" in version control system jargon.

Automatic revision history management is most likely to be familiar to non-programmers from wiki software, the most famous examples of which use this technique to track changes to content and to allow undesirable changes to be reversed.

As part of the revision history mechanism, a version control system such as Subversion not only maintains a central data repository copy of the current version of files that have been entrusted to version control, but also maintains a log of changes that have been made from the present all the way back to the moment the files entered version control. Anyone who has been doing software development work for very long should be able to tell you how important the ability to roll back a file to a known-good state can be. This is in fact the central feature of any version control software: the primary reason it exists.
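To make the roll-back idea concrete, here is a small sketch that can be run safely because it works entirely in a throwaway directory. The file name notes.txt, the commit messages, and the temporary paths are all made up for the example; it assumes svn and svnadmin are installed and that mktemp -d works on your system.

```shell
#!/bin/sh
# Sketch: roll a file back to a known-good revision in a throwaway repository.
set -e
work=$(mktemp -d)
svnadmin create --fs-type fsfs "$work/repo"
svn checkout -q "file://$work/repo" "$work/wc"
cd "$work/wc"

echo "known-good text" > notes.txt
svn add -q notes.txt
svn commit -q -m "first version of notes"       # becomes revision 1

echo "a change I regret" > notes.txt
svn commit -q -m "bad edit"                      # becomes revision 2

svn update -q                    # bring the whole working copy to revision 2
svn log notes.txt                # show the revision history of the file
svn merge -q -r 2:1 notes.txt    # apply the reverse of the r1 -> r2 change
svn commit -q -m "roll notes.txt back to revision 1"
cat notes.txt
```

The bad edit is not erased by this: revision 2 remains in the history, and the roll-back is recorded as a new revision on top of it.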

Subversion does this and much more. For instance, it also provides the ability to resolve version conflicts when two people have been editing the same file at the same time. In the real world, users who employ good practices such as making regular commits when working on files in version control, and updating local copies before committing changes, rarely run afoul of others' work. That rarity is nonetheless accounted for by Subversion, with conflict resolution features built in. It also supports easy branching of modified versions of the main development trunk, merging of divergent development branches, varying levels of checkout and update permissions for various classes of user, and a number of other useful features that project managers often find invaluable.
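The conflict case can be tried out in the same throwaway fashion. This sketch simulates two users with two working copies of one local repository; the names alice and bob, the file todo.txt, and the choice to keep the local version are all assumptions made for illustration. It relies on the --accept option, which is available in Subversion 1.5 and later.

```shell
#!/bin/sh
# Sketch: provoke and resolve a conflict between two working copies.
set -e
work=$(mktemp -d)
svnadmin create --fs-type fsfs "$work/repo"

# First working copy: create the shared file (revision 1).
svn checkout -q "file://$work/repo" "$work/alice"
cd "$work/alice"
echo "shared line" > todo.txt
svn add -q todo.txt
svn commit -q -m "add todo.txt"

# Second working copy, checked out at revision 1.
svn checkout -q "file://$work/repo" "$work/bob"

# The first user changes the line and commits (revision 2).
echo "alice's edit of the line" > todo.txt
svn commit -q -m "alice edits todo.txt"

# The second user edits the same line without updating first.
cd "$work/bob"
echo "bob's edit" > todo.txt
svn update --non-interactive --accept postpone   # conflict is recorded, not resolved
svn status                                       # todo.txt is flagged with 'C'

# Resolve by keeping the local version in full, then commit.
svn resolve --accept mine-full todo.txt
svn commit -q -m "keep bob's version of todo.txt"
```

Keeping one side wholesale is only one possible resolution; in practice you would often edit the conflicted file by hand to combine both changes before marking it resolved.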

Personal document management

Another benefit of version control systems is that they allow you to work on a single project from a number of different locations, using a number of different computers, without having to keep any USB storage devices or CD-RW media on you at all times. As long as you have a version control client installed on the computer where you're going to work and have access to the server where the version control magic happens, you can check out the current version of the project and get to work.

Because multiple copies of the same data are brought to the same state whenever the checked-out copy is updated on each client machine, a version control system like Subversion can also serve as an excellent backup system for a collection of files. This covers your everyday personal documents as well as source code, provided you interpret "project" to mean any relatively small collection of data: small enough that you don't require a bandwidth-optimized weekly backup to minimize the time spent copying your data. A personal documents directory usually fits this description perfectly, especially when you don't keep many files that tend toward multiple-megabyte file sizes (such as music, video, and high-resolution image files).

If you are the type of computer user who understands that regular backups are extremely important as a precaution against hardware or file system failures, but just find yourself putting off regular backups because of the effort involved in configuring a traditional backup system or copying data to huge stacks of CD-R media, Subversion could be just what the doctor ordered. The simplicity of a tool like Subversion for personal document backups can save you from yourself, or at least from your own tendency to procrastinate, and all you need is a second computer running the Subversion server software.

Because Subversion is not tied to a single, purpose-specific graphical user interface the way many proprietary systems like Visual SourceSafe and ClearCase are, it is easily adapted to uses beyond source code, such as general document control. You can still have your GUI environment, however, because there are a number of stand-alone GUI clients for Subversion, and Subversion has been integrated with a number of other GUI tools, such as Eclipse and even Microsoft's Windows Explorer file browser, via the TortoiseSVN client.

Configuration and setup

For document management with Subversion, you will most likely want to use TortoiseSVN on a Microsoft Windows system, or the Subversion client software available through your operating system's software management toolset if you use a free UNIX-like OS. The basic Subversion package in your BSD or Linux software archives provides the server software, the command line client, and the admin tools. You do not need to install anything else on either your Subversion server or your desktop machine unless you want a graphical user interface client.

To set up a Microsoft Windows server for Subversion, you will most likely want to use the SVN 1-Click Setup installer available from the Web site of the maintainers of the Subversion project. Many MS Windows users who make use of Subversion choose to use a Linux or BSD server instead of a Windows server, however, and thus avoid the necessity of a separate installer.

Once you have the server software installed on your server, you need to create a version control repository. The following examples assume a UNIX-like shell command environment. Terms in [brackets] can be changed to suit your needs and preferences. Do not type the brackets. The '#' hash mark indicates that you are logged in as root, or using sudo for administrative access. Where [username] appears, use the login name (userid) of the user account you want to have access to the Subversion repository.

Listing A

# addgroup [svn-users]
# usermod -a -G [svn-users] [username]
# mkdir -m 770 [/home/svn-repos]
# svnadmin create --fs-type fsfs [/home/svn-repos]
# chgrp -R [svn-users] [/home/svn-repos]
# chmod -R g+rwX [/home/svn-repos]

Note that it is important to specify "fsfs" for the --fs-type option. Originally, Subversion repositories used a Berkeley DB back end ("bdb") by default, which proved to be somewhat unstable. The new default is fsfs, but some older Subversion releases and some nonstandard distributions may still use bdb. To be on the safe side, specify the fsfs format whenever you create a new repository, and you will ensure greater stability.
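If you are unsure which back end a repository uses, the repository records it in a small text file named fs-type inside its db directory. The sketch below creates a throwaway repository to show this; the temporary location is an assumption for the example, and for a real repository you would point at its actual path instead.

```shell
#!/bin/sh
# Sketch: create a repository and check which storage back end it uses.
set -e
repo="$(mktemp -d)/svn-repos"     # throwaway location for the example
svnadmin create --fs-type fsfs "$repo"

# The db/fs-type file names the back end ("fsfs" or "bdb").
cat "$repo/db/fs-type"
```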

Once you have the repository directory set up on the server, you can create the actual project repository inside that directory. Perhaps counterintuitively, you shouldn't do this from inside the repository, however. Instead, Listing B shows you how to create the repository inside your repository directory by creating a separate directory, containing a throw-away file, which will then be "checked in" to create the project repository itself.

It is a good idea to do this while logged in as the user account whose userid you specified in the previous set of commands. A good place to follow these steps would be in your user account's home directory. The use of a non-root user account is indicated by the '$' dollar sign prompt in the following example. Again, [brackets] around terms indicate that you can change them to suit your needs and preferences. Remember to leave the brackets themselves out when typing these commands.

Listing B

$ mkdir [project]
$ cd [project]
$ touch [file.txt]
$ svn import -m "[importing file to create project]" . file:///[home/svn-repos/project]

Once you have finished creating the project repository on the server, you are done. You can delete the project directory you just created before importing it to the repository, because the data in that directory is now safely stored on the server. Since the repository was created from a single empty file, created with the touch command, there's not much to lose by deleting it anyway.
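Before deleting the scratch directory, you may want to confirm that the import actually landed in the repository. The following sketch repeats the import against a throwaway repository and then lists its contents with svn ls; the temporary paths are assumptions for the example, while the project and file.txt names follow the listing above.

```shell
#!/bin/sh
# Sketch: import a scratch directory, then confirm what the repository holds.
set -e
base=$(mktemp -d)
svnadmin create --fs-type fsfs "$base/svn-repos"

mkdir "$base/project"
cd "$base/project"
touch file.txt
svn import -q -m "importing file to create project" . "file://$base/svn-repos/project"

svn ls "file://$base/svn-repos"            # lists the top level: project/
svn ls "file://$base/svn-repos/project"    # lists the imported file: file.txt
```

Note that svn import does not turn the scratch directory into a working copy, which is why a separate checkout is still needed afterward.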

On the client machine where you want to be able to work on your project or have access to your documents, you will then need to check out the contents of the repository to create a local copy. The same guidelines for using these commands for your own purposes as described above apply here, as well. As with the previous examples, this assumes you are using a free UNIX-like operating system such as a BSD, a Linux distribution, or OpenSolaris.

Listing C

$ cd [/home/username]
$ mkdir [svn-local]
$ cd [svn-local]
$ svn co svn+ssh://[hostname/home/svn-repos/project] [project]

At this point, it is normal to need to enter your password for the system where the repository is kept three times. This example assumes, in addition to your choice of operating system, that you have a fully functional SSH client installed on the client system and SSH server installed on the server system. The svn command knows how to "tunnel" its network requests through SSH to provide a secure, encrypted connection between the client and server systems, protecting your username, password, and data from malicious security crackers who might be "listening in" on your network.

The svn-local directory is not a necessity. You can simply place your newly checked out project or document directory directly in your user account's home directory, if you prefer. It is usually a good idea to keep directories whose contents are maintained in a Subversion repository separate from those directories that are not similarly backed up, however, to help eliminate confusion. In all the previous examples, you obviously do not need to name your directory of important data "project," you do not need to name your initial empty file "file.txt," and so on.

Once you have checked out a local copy of the project, you can remove the empty file from the project directory on the client machine. You should then add other files that you want to store in Subversion into the directory, and check them into the Subversion repository on the server. Listing D is an example of performing these operations.

Listing D

$ cd [/home/username/svn-local/project]
$ svn rm file.txt
$ cp -R [/home/username/project/*] .
$ svn add --force .
$ svn ci -m "[added all base files to the project directory]"

The cp -R command copies all contents of a directory of important data from that directory (in this case, /home/username/project) to the current directory (specified by the single period). The svn add --force . command adds everything in this directory and all its subdirectories that is not yet under version control; the --force option lets svn descend into the current directory even though the directory itself is already versioned. The svn ci command commits the current copy to the main repository on the server. After the add command, the svn client tool will show you a list of all the files you have added, with a capital letter 'A' at the beginning of each line, indicating that they have been Added. The ci command will ask you for your password to ensure you have authorization to commit to the repository.
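One way to add a whole directory tree in a single step, including subdirectories and file names containing spaces, is svn add --force on the current directory. The sketch below tries this in a throwaway repository and uses svn status before and after to show the change from unversioned ('?') to scheduled for addition ('A'). The file and directory names are invented for the example.

```shell
#!/bin/sh
# Sketch: add an entire tree of new files and check the result with svn status.
set -e
base=$(mktemp -d)
svnadmin create --fs-type fsfs "$base/repo"
svn checkout -q "file://$base/repo" "$base/project"
cd "$base/project"

# Stand-ins for the documents copied in with cp -R.
mkdir -p letters
echo "budget figures" > budget.txt
echo "hello" > "letters/hello world.txt"

svn status              # new items are marked '?'
svn add --force .       # schedule everything unversioned for addition
svn status              # the same items are now marked 'A'
svn ci -q -m "added all base files to the project directory"
```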

Using Subversion

Graphical user interface clients differ from one another in how they are used. With the command line client, however, the most important and basic commands for updating, managing, and committing your project files are as follows:

  • svn up: updates your local copy from the main repository. Updating before starting work on any files and, if others also have access to the repository, updating again just before you commit is a very good habit to have.
  • svn status: tells you which files have been modified, which have been added but not yet committed, which have been created but not yet added to version control, which have mysteriously vanished without being removed from version control (most common when you use plain rm where you should have used svn rm), and which have been properly removed from version control, among other states.
  • svn add: adds files to version control, as already demonstrated.
  • svn rm: removes files from version control. This also actually deletes the files from the local filesystem.
  • svn mv: moves a file from one place to another in your local copy of the repository without causing confusion for the Subversion database. It works basically just like the standard mv command, but is used in local copies of projects under version control.
  • svn mkdir: creates a directory, just like the normal mkdir command, but also automatically adds the directory to version control at the same time.
  • svn ci: commits all current changes in the local copy of the repository, in the current directory and in all subdirectories within it, to the main repository. Files that svn status marks with a question mark are not under version control and are left out of the commit until you add them with svn add, so check for them first; any conflicts (marked with a 'C') must be resolved before the commit will succeed. When executed without the -m option, svn ci will open your default text editor so you can enter a commit message. The -m option can be used to specify the commit message from the command line instead, as demonstrated in the above examples of how to create and use a Subversion repository. The purpose of the message is to make a note to yourself about the committed changes in case you have to sort through earlier revisions of the project at some point later on.
  • svn help: gives you handy usage information for the Subversion command line client, such as more in-depth explanations and examples of how to use each of these svn commands, as well as information on more commands than these. You can get more information about a specific svn command by naming it; for instance, svn help add gives information about the add command.
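The commands above fit together into a short working session. The following is a sketch of one such session, run against a throwaway local repository so that nothing real is affected; the directory and file names (docs, letters, draft.txt, final.txt) and the commit messages are made up for illustration.

```shell
#!/bin/sh
# Sketch: a typical edit-and-commit session using the basic svn commands.
set -e
base=$(mktemp -d)
svnadmin create --fs-type fsfs "$base/repo"
svn checkout -q "file://$base/repo" "$base/docs"
cd "$base/docs"

svn up -q                          # start from the latest repository state
svn mkdir -q letters               # create a directory already under version control
echo "Dear reader" > letters/draft.txt
svn status                         # 'A' for letters, '?' for letters/draft.txt
svn add -q letters/draft.txt       # put the new file under version control
svn ci -q -m "start a letters directory with a first draft"

svn mv -q letters/draft.txt letters/final.txt   # rename, keeping the file's history
svn ci -q -m "rename draft.txt to final.txt"

svn up -q
svn log                            # review the commit messages entered above
```

On a real project the only differences are that the working copy comes from your server rather than a temporary directory, and that the edits between svn up and svn ci are your actual work.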

This will not make you an instant Subversion guru. It may, however, serve as a useful introduction to version control with Subversion, and the use of a version control system is one of the most important skills for any serious programmer. More immediately, the example tasks in this article can provide a very real, tangible benefit, even if you're not a programmer.


Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.
