By Gerald M. Weinberg
Contract professionals who work with software often find themselves acting as code archaeologists, digging through old code to try to understand the code itself. At the same time, if they're attentive, they can often learn about the culture that produced the code. In doing so, they may learn useful things about the organization they've been hired to work in.
A little archaeology among my old writings suggests I named the field of "code archaeology" at least as far back as the summer of 1983, though of course, many of us had been practicing it for some years before that. Here are some of the things I've dug up.
Contract Professional on TechRepublic
Contract Professional (CP) is a monthly magazine written specifically for contract IT workers. CP provides information on technology trends, training and job opportunities, tax and financial issues, and contractor-friendly cities. To read more, visit Contract Professional online. This article is the first of a two-part series that appeared in an earlier issue of CP.
Different ways of doing the same thing
One of the earliest examples I can recall was the multiple ways of editing and converting more or less the same input in the original FORTRAN, which had different algorithms for converting input numbers and code constants to floating point. This discrepancy arose because of the social organization of the original FORTRAN team: the input routines were developed by different people from those who developed the code translation algorithms.
The existence of many "temporary" structures
A graduate student of mine and I discovered in our research that error-prone modules could often be identified by the string "TEMP" in one or more of their names: temporary variables that remain in the code as delivered. I suspect these tell us about the haste in which the work was done and the lack of review.
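As a rough illustration of how such a dig might be automated, here is a minimal Python sketch that counts distinct names containing "TEMP" in each module. The function names, the identifier pattern (which allows COBOL-style hyphens), and the sample modules are all assumptions made for this example; they are not taken from the original research.

```python
import re

# Assumed identifier pattern: a letter or underscore, then letters, digits,
# underscores, or hyphens (hyphens so COBOL-style names are kept whole).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def temp_names(source):
    """Return the distinct identifiers in `source` that contain 'TEMP'.

    Crude by design: names such as TEMPLATE or TEMPERATURE also match.
    """
    found = []
    for name in IDENTIFIER.findall(source):
        if "TEMP" in name.upper() and name not in found:
            found.append(name)
    return found


def rank_modules(modules):
    """Rank modules (name -> source text) by their count of TEMP-style names."""
    counts = {name: len(temp_names(text)) for name, text in modules.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data, invented for illustration.
sample_modules = {
    "billing": "TEMP1 = X\nTEMP-TOTAL = TEMP1 + Y",
    "report": "TOTAL = A + B",
}
ranking = rank_modules(sample_modules)
```

A high count is only a hint about where to dig first, not proof that a module is faulty.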
Routines that transform interface data before and after connection
These routines could have been eliminated entirely by some agreement on interfaces in advance. Permuting the order of parameters in a subroutine call, or the order of items in a message, is perhaps the most revealing instance of this type. It indicates a total lack of cooperation among developers and an unwillingness to budge a millimeter.
A plethora of programming languages used in the same system
By "plethora" I mean more than one language, or perhaps more than two in special circumstances. I once "dug" a system that had 14. This usually indicates a total lack of management control over the development project, so each programmer uses a favorite language. It's not exactly code archaeology, but an even clearer indication of this problem is the inability to find anyone in an organization who can say how many programming languages are used in their system.
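When nobody can say how many languages a system uses, a first estimate can be dug out of the file names. The sketch below is one hedged way to do that in Python. The extension-to-language table is an assumption, deliberately incomplete, and the file paths are invented.

```python
from collections import Counter
from pathlib import PurePath

# Assumed, deliberately incomplete mapping from file extension to language.
EXTENSION_TO_LANGUAGE = {
    ".cbl": "COBOL",
    ".cob": "COBOL",
    ".f": "FORTRAN",
    ".for": "FORTRAN",
    ".f90": "FORTRAN",
    ".c": "C",
    ".py": "Python",
    ".java": "Java",
}


def languages_in_use(paths):
    """Count source files per language, skipping unrecognized extensions."""
    counts = Counter()
    for path in paths:
        suffix = PurePath(path).suffix.lower()
        language = EXTENSION_TO_LANGUAGE.get(suffix)
        if language is not None:
            counts[language] += 1
    return counts


# Hypothetical file list, invented for illustration.
sample_paths = ["payroll/pay.cbl", "pipes/PIPES.FOR", "lib/util.c", "README"]
language_counts = languages_in_use(sample_paths)
```

Extensions undercount languages embedded in scripts, job control, or generated code, so the result is a lower bound at best.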
Structure without nesting
A reader once sent me a code artifact from a very large bank that had COBOL standards that included "use structured programming techniques" and "avoid nested IFs." The code contained stuff like this:
IF INPUT-CODE = "A" PERFORM A-PROCESSING.
IF INPUT-CODE = "B" PERFORM B-PROCESSING.
IF INPUT-CODE = "C" PERFORM C-PROCESSING.
....<for long lists>
The code certainly seems to meet the standards, but it misses the point. They probably meant "avoid overly nested IFs." Apparently, this organization couldn't follow the simple two-word standard, "Be reasonable."
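One way to see what the flat style gives up, sketched in Python rather than COBOL: with independent IFs, every test runs even after one has matched, so a handler that changes the code can trigger a second branch. The side effect below (the "A" handler resetting the code to "B") is a hypothetical assumption made for this example. Nothing in the bank's code is known to have done this.

```python
def flat_ifs(code):
    """Independent tests, mirroring the flat COBOL style above."""
    log = []
    if code == "A":
        log.append("A-PROCESSING")
        code = "B"  # hypothetical side effect inside the A handler
    if code == "B":  # still evaluated after the A branch has run
        log.append("B-PROCESSING")
    if code == "C":
        log.append("C-PROCESSING")
    return log


def chained_ifs(code):
    """Mutually exclusive tests: at most one branch can run."""
    log = []
    if code == "A":
        log.append("A-PROCESSING")
        code = "B"  # same hypothetical side effect, now harmless
    elif code == "B":
        log.append("B-PROCESSING")
    elif code == "C":
        log.append("C-PROCESSING")
    return log


flat_result = flat_ifs("A")
chained_result = chained_ifs("A")
```

Under that assumption, the flat version runs both the A and B handlers for an input of "A", while the chained version runs only the A handler.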
Rat's nesting without structure
A correspondent once showed me an engineering package with pages of FORTRAN code having a general schema like this:
IF(INDEX .NE. 1)GOTO 100
100 INDEX = 1
Instead of this:
IF(INDEX .NE. 1)WRITE(5,6)
INDEX = 1
The package is used to figure the arrangement of pipes in nuclear power plants. Code like this is probably what's given nuclear power plants a bad name. I've seen this pattern so many times that I suspect it follows from another standard. The idea behind the standard may have been sound, because early FORTRAN didn't allow grouping of statements with some kind of DO-END bracketing, and this way, if the program ever needs modification, there is lots of room to insert new code. But if you're going to insert that much new code, perhaps you ought to redesign instead. And clearly, this organization never went back into this ancient (but still very active) code to bring it up to date with the latest FORTRAN, which the company uses.
Slow that once was fast
Another correspondent sent me an example from a manufacturer of embedded computers for process control. The chip had no square-root capability, so there was the following (pseudo-code for the machine language) integer square-root function in the run-time library:
S := 0
DO WHILE (S*S < X)
    S := S + 1
END
Elegantly simple, it got the right answer every time, but it was a trifle slow, which was how it was eventually discovered lurking deep in the code.
The archaeological deduction here is interesting, because I suspect that originally, much of the time, the value of X was zero, or at least a very small integer, in which case this routine could actually be very fast. This is a common situation: the environment changes beyond anything the designer anticipated.
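The deduction can be checked with a small Python sketch of the same loop, instrumented with a step counter. The function name and the counter are additions for illustration; the original was machine language.

```python
import math


def slow_isqrt(x):
    """Linear search from the pseudo-code: smallest s with s*s >= x.

    Returns (s, steps), where steps is the number of loop iterations.
    """
    s = 0
    steps = 0
    while s * s < x:
        s += 1
        steps += 1
    return s, steps


small = slow_isqrt(0)          # no iterations at all
large = slow_isqrt(1_000_000)  # one iteration per unit of the root
rounded_up = slow_isqrt(10)    # not a perfect square: the loop stops at 4
rounded_down = math.isqrt(10)  # the standard library rounds down to 3
```

The iteration count equals the returned root, so the cost grows with the square root of X: nothing for zero, a thousand iterations for a million. For inputs that are not perfect squares, this loop rounds up, whereas Python's `math.isqrt` rounds down, so the two agree only on perfect squares.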
Modifications upon modifications
Layers of modifications are common in dirt archaeology and are probably the most common source of really bizarre code as well. I've seen hundreds of examples, as have many of my readers, but they generally take too long to unravel and explain—all wonderful fun for code archaeologists.
Generally, though, most of this outlandish code originated for some good reason. Then things changed, and perhaps the code didn't. Or perhaps the code did change—under the hands of someone who no longer understood the original reasoning and so was afraid to modify the code in any radical way to better suit the present circumstances.
Consider this advice for the contract software professional coming upon such abnormalities:
- Learn to dig and understand.
- Try to appreciate how the code got that way. Understanding will teach you important things about the organization to which you've been attached.
- Be gracious in your interpretations. The people you're working with—including the manager—are very likely the same people who have touched this code at some time in the past. They won't generally appreciate ridicule; they were doing the best they could at the time.
- If you want to change such quaint code, start by finding out who was involved and discussing it in a noncritical way. You may have missed something. And if you think you can avoid trouble by bypassing the originators, you may run into vicious resistance if you're discovered.
Gerald M. Weinberg and his wife, Dani, run Weinberg & Weinberg, a consultancy that also conducts professional workshops for IT consultants.