
More control with CVS

Using Concurrent Version System (CVS) to manage your development projects is a wise choice and learning CVS will test your mettle as a developer. Let Vincent Danen help you to control CVS and unleash its power!


In the previous two Daily Drill Downs dealing with source code management, Checking it out: CVS! and Source Code Management: Installing CVS, we looked at the Concurrent Version System (CVS). We learned how to install CVS, and we started our first CVS-aware project, which was a fictitious Web site. We learned how to make new CVS projects (or modules), how to import data, and how to check it out to create what we call sandboxes: local working copies in which a developer does his or her work prior to checking it back into the primary repository. In the second part of this series, we also examined many of the commands you may want to use when dealing with CVS-controlled files.

Now we will take a look at some other ways to increase the power of CVS. We will discuss some keywords you can embed in your text files that CVS will expand with meaningful information to give you the status of a file at first glance. We will also see how to manipulate revision numbers to make them more meaningful. Finally, we will learn how to access CVS repositories remotely so that you can share your work with other users or become part of a development community yourself.

CVS keywords
The macros used by CVS, also called keywords, are words in a text file on which CVS performs string substitution. If a keyword appears in the text, CVS will replace it with a string that the keyword represents. Keywords are of the form $keyword$ and must be placed in the text file in this fashion to be recognized by CVS. Once a keyword has been substituted the first time, CVS will continue to update the keyword string even though it is already expanded. Basically, you need to enter the keyword only once for it to be a permanent (yet changing) part of your file.

The valid keywords that can be used are as follows:
  • Author: The user ID of the person who committed the revision
  • Date: The date and time, in standard UTC format, when the revision was committed
  • Header: The full path of the repository Revision Control System (RCS) file, the revision number, the commit date and time, the user ID that committed it, the file's state, and the lock holder's user ID if the file is locked
  • Id: A shorter form of $Header$, which displays only the filename, not the full path
  • Name: The tag name used to retrieve the file; it will be empty if no explicit tag was given when the file was retrieved.
  • Locker: The user ID of the user holding a lock on the file; it will be empty if the file is not locked.
  • Log: The RCS filename, followed on the next lines by the revision log entries. (The first following line will contain the revision number, the commit date and time, and the user ID that committed it. The next lines will contain the log entries in reverse chronological order. Each line is prefixed by the same characters that prefixed the keyword itself. This allows you to have the log in a format that will be seen as a comment by compilers and interpreters.)
  • Revision: The revision number of the file
  • Source: The full path of the RCS file
  • State: The state of the file as set by the cvs admin -s command; if no state has been set, it will be Exp.

For example, let's assume that you have a Web page called index.php and you want to put some information about the file in it, but you don't want the information to be viewable by the general public. You might use the following at the top of your file:
<? // $Id$ ?>

If you were to commit the file now and then open it once it was complete, you might see something similar to this in its place:
<? // $Id: index.php,v 1.2 2001/01/04 12:28:32 joe Exp $ ?>

This tells us that the RCS file is index.php,v, the revision number is 1.2, and the commit was made on January 4, 2001 at 12:28:32 P.M. (UTC) by user joe. The RCS file contains all of the information on the current and previous revisions of the file. Since CVS is built upon RCS (we will take a look at RCS in a future Daily Drill Down), it stores each file's history in the same format, known as an RCS file, and follows the same naming convention of adding ,v to the filename.
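Because the fields of an expanded $Id$ string are separated by spaces, they are easy to pick apart in a script. What follows is only a minimal sketch in POSIX shell: the function name id_fields is made up for illustration, and it assumes the keyword has already been expanded in the form shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the fields of an expanded $Id$ keyword.
# Assumes the layout "$Id: file,v revision date time author state $".
id_fields() {
    # Split the string on whitespace into positional parameters:
    # $1='$Id:' $2=RCS file $3=revision $4=date $5=time $6=author $7=state
    set -- $1
    echo "file=$2 revision=$3 date=$4 time=$5 author=$6 state=$7"
}

id_fields '$Id: index.php,v 1.2 2001/01/04 12:28:32 joe Exp $'
```

Run against the example string, this prints the RCS filename, revision, date, time, author, and state as labeled fields on a single line.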

Now let’s assume that you have written a C program in which you want to record the entire revision log. You would use the $Log$ keyword, but you must do it in a special way. Since C uses the /* character sequence to start a comment and the */ character sequence to end it, we need to place our $Log$ keyword between these characters, but not on the same line. Each line of the $Log$ keyword expansion will start with the same characters that started the original line. For instance, if you were to use this code
/* $Log$ */

your expanded keyword might look like this:
/* $Log: foobar.c,v $
/* Revision 1.2 2001/01/04 12:28:32 joe
/* Example of how not to use the Log keyword
/* */


If you tried to compile this code, the compiler would at best warn about comment openers nested inside a comment, and at worst fail, because there are four beginning comment tokens and only one ending comment token. A better way to accomplish the same thing would be to use this code instead:
/*
 * $Log$
 */


Now this code, when expanded, will look like so:
/*
 * $Log: foobar.c,v $
 * Revision 1.2 2001/01/04 12:28:32 joe
 * Example of how to use the Log keyword
 */


This comment is proper and will not generate errors when you compile the program.

Typically, what I like to do with the files I have under CVS control is to use the $Id$ keyword alone. This gives me enough information that when I glance at a file, I can quickly see who last committed it, when they did so, and what the current revision number is.

Changing the revision number
Each file in the CVS repository has its own revision number. Often, these numbers are not in sync with each other because some files have changed more often than others. When the CVS repository for a given module is first created, all files in the module are assigned an initial revision number of 1.1. After each subsequent revision, the second number is increased by one.

The revision number is made up of two parts: The first number is the major number, followed by a decimal point, and then the second number, which is the minor number. So each file will always have a major number and minor number. It is the minor number that increases with each new revision. When a new file is added to the repository, CVS gives it a new revision number based on the highest major number in the directory with a minor number of 1. So if most of your files had a revision of 1.x but one or two had a revision of 2.x, the new file would be assigned a revision number of 2.1.
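The rule for new files can be sketched in a few lines of shell. This is only an illustration of the rule as described above, not code taken from CVS; the function name new_file_revision is hypothetical, and it assumes plain two-part revision numbers.

```shell
#!/bin/sh
# Hypothetical helper: given the revision numbers of the files already in a
# directory, print the revision a newly added file would receive.
new_file_revision() {
    highest=1                      # with no higher major number, the result is 1.1
    for rev in "$@"; do
        major=${rev%%.*}           # text before the first dot
        if [ "$major" -gt "$highest" ]; then
            highest=$major
        fi
    done
    echo "$highest.1"
}

new_file_revision 1.3 1.58 2.4     # prints 2.1
```

The minor numbers of the existing files play no part in the result; only the highest major number matters.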

There may come a time, however, when you want all files to be consistent, and you want each file to have the same revision number. Let's imagine for a moment that you are writing the next killer Linux application, and you made your debut release a few months ago. At that time, your program was not very amazing so you gave it a 1.0 release number. In the meantime, however, you have changed and enhanced your program, and now you want to release your 2.0 version. Unfortunately, all of your files are in an array of revision numbers ranging from 1.3 to 1.58. To clean things up a little bit, you want to give every single file in the repository a new revision number of 2.0.

To accomplish this, you must execute the commit command and specify a new revision number like this:
cvs commit -r 2.0

This will instruct CVS to go through every file in the directory and assign each file a new revision number of 2.0. The only thing you must keep in mind is that the new revision number you assign must be higher than any previous revision number. If you already have a file with a revision number of 2.1, then you will only be able to use a revision number of 2.2 or higher.
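Before running the commit, it can be worth checking that the number you have in mind really is higher than every existing revision. The sketch below shows one way to do that comparison in shell. The function names rev_higher and can_assign are hypothetical, and only two-part (major.minor) revision numbers are handled. Each part is compared as a whole number, so 1.58 counts as higher than 1.6.

```shell
#!/bin/sh
# Hypothetical helper: succeed if revision $1 is higher than revision $2.
# Only two-part (major.minor) revisions are handled.
rev_higher() {
    major_a=${1%%.*}; minor_a=${1#*.}
    major_b=${2%%.*}; minor_b=${2#*.}
    [ "$major_a" -gt "$major_b" ] ||
        { [ "$major_a" -eq "$major_b" ] && [ "$minor_a" -gt "$minor_b" ]; }
}

# Hypothetical helper: succeed if $1 is higher than every revision after it.
can_assign() {
    new=$1; shift
    for rev in "$@"; do
        rev_higher "$new" "$rev" || return 1
    done
    return 0
}

if can_assign 2.0 1.3 1.58; then echo "2.0 is allowed"; fi
if ! can_assign 2.0 1.3 2.1; then echo "2.0 is too low"; fi
```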

Making every file have the same revision number lets you easily determine which files belonged with a particular release, and you can, in the future, retrieve every file in the state it was in when the 2.0 release was made. The other option, and one that may work better, is to use a tag to identify a snapshot of the code at a particular point in time.

By tagging the code, you can retain individual revision numbers while still having the benefit of a coherent release system. Even though you have one file with a revision of 1.1 and another with 1.58, you can still retrieve the source code files as they were for the 2.0 release of your project, regardless of how much time has elapsed since the 2.0 release was made. Note that a CVS tag name must start with a letter and may contain only letters, digits, hyphens, and underscores, so the period in 2.0 has to be replaced with a hyphen. To tag the source tree this way, we invoke the tag command like this:
cvs tag rel-2-0

This will tag all files with the release tag name “rel-2-0.” In the second Daily Drill Down of this series, when we first imported Joe's Web site, we gave it a release tag of “base.” This is a similar concept. If you wanted to check out the files as they were when you first imported them into the repository, you could do so using the following command:
cvs checkout -r base website

This tells CVS to check out the module named “website” at the time of the release tag “base.” Likewise, in our previous example when we created a new tag called “rel-2-0,” we could check it out in a similar fashion:
cvs checkout -r rel-2-0 myproject

This will tell CVS to check out the module “myproject” at the time of the release tag “rel-2-0.” Using release tags in this fashion may be more useful than changing revision numbers. If you do decide to change revision numbers, you should also create a release tag afterwards. This will allow you to easily check out certain snapshots of code, as we've illustrated. You can also use the update command to accomplish the same thing by running the following from the directory that contains your myproject sandbox:
cvs update -r rel-2-0 myproject

This command will update all of the files in your sandbox to the state they were in when “rel-2-0” was tagged.
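CVS does not accept periods in tag names: a tag must start with a letter and may contain only letters, digits, hyphens, and underscores. A small helper can turn a version number into an acceptable tag before you run cvs tag. The following is a minimal sketch under that assumption; the function names release_tag and valid_tag and the rel- prefix are illustrative choices, not part of CVS.

```shell
#!/bin/sh
# Hypothetical helper: build a release tag name from a version number by
# replacing each period with a hyphen, e.g. 2.0 -> rel-2-0.
release_tag() {
    echo "rel-$1" | tr '.' '-'
}

# Hypothetical helper: succeed if $1 starts with a letter and contains
# only letters, digits, hyphens, and underscores.
valid_tag() {
    case $1 in
        [A-Za-z]*) ;;
        *) return 1 ;;
    esac
    case $1 in
        *[!A-Za-z0-9_-]*) return 1 ;;
    esac
    return 0
}

tag=$(release_tag 2.0)
valid_tag "$tag" && echo "cvs tag $tag"     # prints: cvs tag rel-2-0
```

The last line only prints the command it would run, so you can review the tag name before tagging a real source tree.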

Accessing CVS remotely
Having a local CVS repository might be ideal for single-user situations, but more often than not, if you're using CVS, it's for collaboration between developers. There are a few ways you can access CVS remotely, but not all of them are good. CVS traditionally used rsh for remote access, but I won't even discuss that because I do not believe in using rsh, or any of the other r-utilities, for that matter, due to their extremely insecure nature.

Later versions of CVS have built-in support for using ssh instead of rsh. I highly recommend using ssh as the method of communicating with a remote CVS server. Preferably, this would be your only way of connecting to a remote CVS server, as it offers the greatest security and protection for your data. Let's take a look at how to use ssh to connect to a remote CVS server.

First, you must have both CVS and ssh installed on your local computer. For the sake of remaining organized, create a new subdirectory in your home directory; we'll call ours ~/cvs. Then, let's make a few shell scripts to make life easier. The first we'll simply call checkout, and we’ll use it to check out CVS modules:
#!/bin/sh
# Check out the module named by the first argument over ssh.
CVSROOT=:ext:joe@remotecvs.com:/usr/local/cvsroot
CVS_RSH=ssh
export CVSROOT CVS_RSH
cvs -z3 checkout "$1"


This script is quite simple and extremely basic. We have to set up the CVSROOT environment variable first, and it may look a little different from what you've seen before because we're connecting to a remote server. The :ext: prefix of the string :ext:joe@remotecvs.com:/usr/local/cvsroot indicates to CVS that the repository is reached through an external program. The next part of the string is a typical ssh path: It indicates the user joe on the server remotecvs.com, with a path of /usr/local/cvsroot, which is the CVSROOT of the remote server. The next environment variable we set up is CVS_RSH, which tells CVS to use ssh as that external program. Then we export both environment variables so that the cvs command started by the script can see them.
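To make the structure of the remote CVSROOT string concrete, here is a small sketch that splits it into its parts using only shell parameter expansion. The function name cvsroot_parts is hypothetical, and the sketch assumes the :method:user@host:/path layout used in this article rather than every form CVS accepts.

```shell
#!/bin/sh
# Hypothetical helper: print the parts of a CVSROOT of the form
# :method:user@host:/path
cvsroot_parts() {
    rest=${1#:}                 # drop the leading colon
    method=${rest%%:*}          # text up to the next colon (ext)
    rest=${rest#*:}             # joe@remotecvs.com:/usr/local/cvsroot
    user=${rest%%@*}            # text before the @ sign
    rest=${rest#*@}             # remotecvs.com:/usr/local/cvsroot
    host=${rest%%:*}            # text before the colon
    repo=${rest#*:}             # everything after that colon
    echo "method=$method user=$user host=$host repo=$repo"
}

cvsroot_parts :ext:joe@remotecvs.com:/usr/local/cvsroot
```

For the string used in the checkout script, this reports the method ext, the user joe, the host remotecvs.com, and the repository path /usr/local/cvsroot.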

The next command is a typical CVS checkout command, with one notable exception. The -z3 command-line option tells CVS to compress the data it transfers, using compression level 3. This will conserve bandwidth between our system and the remote repository and usually makes the transfer faster. Finally, the $1 is a placeholder for the name of the module to check out, which means that you must execute the checkout script like this:
./checkout website

where the module you want to check out is called “website.”

Let's make two more scripts that are similar to the first one. The second script we will call commit, and we will use it to commit our changes to the repository. The third script we will call update, and we will use it to update our sandbox to the latest source files in the remote repository.

The commit script looks like this:
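#!/bin/sh
# Commit the changes in the sandbox directory named by the first argument.
CVSROOT=:ext:joe@remotecvs.com:/usr/local/cvsroot
CVS_RSH=ssh
export CVSROOT CVS_RSH
cd "$1" || exit 1    # stop if the sandbox directory does not exist
cvs -z3 commit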


The update script looks like this:
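#!/bin/sh
# Update the sandbox directory named by the first argument.
CVSROOT=:ext:joe@remotecvs.com:/usr/local/cvsroot
CVS_RSH=ssh
export CVSROOT CVS_RSH
cd "$1" || exit 1    # stop if the sandbox directory does not exist
cvs -z3 update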


Both scripts take the name of the sandbox directory to update or commit as their only argument. The only other requirement for this kind of setup is that you must have ssh access to the remote CVS server. For public projects, you may not get ssh access to the remote CVS server easily, and you may have to rely on the CVS pserver, which is a password-protected server to allow CVS access. Unfortunately, pserver is not very secure because it transmits the information and your password in clear-text form. We will take a look at pserver in the next part of this series.
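Since the three scripts differ only in the CVS command they run, they could also be folded into a single helper. The following is a sketch of that idea, not a replacement the series depends on. The function name remote_cvs and the CVS_CMD variable are hypothetical. CVS itself does not read CVS_CMD; it exists here only so that you can substitute another program, such as echo, for a dry run.

```shell
#!/bin/sh
# Sketch of one helper in place of the checkout, commit, and update scripts.
# CVS_CMD is a hypothetical override used for dry runs; it defaults to cvs.
CVSROOT=:ext:joe@remotecvs.com:/usr/local/cvsroot
CVS_RSH=ssh
export CVSROOT CVS_RSH

remote_cvs() {
    action=$1
    target=$2
    case $action in
        checkout)
            # checkout takes the module name and creates the sandbox
            ${CVS_CMD:-cvs} -z3 checkout "$target"
            ;;
        commit|update)
            # commit and update run inside the existing sandbox; the
            # subshell keeps the caller's working directory unchanged
            ( cd "$target" && ${CVS_CMD:-cvs} -z3 "$action" )
            ;;
        *)
            echo "usage: remote_cvs checkout|commit|update name" >&2
            return 1
            ;;
    esac
}
```

After defining the function, remote_cvs checkout website does the same job as ./checkout website. Setting CVS_CMD=echo first prints the arguments that would be passed to cvs instead of contacting the server.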

Conclusion
You should now know enough about CVS to make very heavy use of it. From this and the previous two Daily Drill Downs in this series, you have gotten a quick and comprehensive look at CVS and how it works. Now you know about keyword substitution, which will help you quickly identify file information for a CVS-managed file. You also know about changing release information and creating snapshots of code by using release tags. Finally, you understand the most secure method of connecting to a CVS repository remotely by using ssh.

In the next Daily Drill Down of this series, we will take a look at a few other methods of connecting to a CVS repository, including the pserver program included with CVS, and using cvsweb, a Web interface to the CVS repository.

About Vincent Danen

Vincent Danen works on the Red Hat Security Response Team and lives in Canada. He has been writing about and developing on Linux for over 10 years and is a veteran Mac user.
