
Counting lines of code can help measure progress

One developer has come around to thinking that counting lines of code can yield useful information. He explains why and shares a sample app that automates the process.


Throughout my career, I've heard horror stories from developers about having lines of code counted to measure their performance. I've also read several books that argue against this approach to gauging productivity. For the longest time, I believed that monitoring the number of lines a developer produces was a poor measure of the work accomplished. But as I’ve grown older, my opinion has changed. In this article, I'll tell you why.

A change of heart
Over the last few years, I started to reconsider the value of counting the lines of code produced. I realized that no measurement stands alone, so I decided not to treat line counting as the sole performance measure. After all, I wouldn’t measure a development team’s performance solely on whether they met the deadline when the code could have an exorbitant number of bugs. Many other factors can help measure developer performance. For instance, consider augmenting line counting with project goal completion and time tracking. Also consider using your source code control program to summarize history information and create general activity reports for each developer. The cvs history command, for example, can produce output that may be fed into another program for parsing.

Also, I realized that my resistance to counting the lines of code I produced was not based on facts but rather on the feeling that I would be considered less productive if I spent time on design, meetings, and so forth. But when I gave it more thought, it seemed silly not to count lines of code.

Development is often compared to building a house. I’ve built a house, and I can tell you that while I did want the bricklayers to be finished on time, that wasn't enough to satisfy me when I had to write the check. Generally, I’d check their progress every day, and I would notice how much brick had been laid. I’d clean up the work area (they don’t clean up after themselves very well) and make sure that everything else was okay so that their next day would be spent laying more brick. Those bricks are just like lines of code. If managers count the lines of code in a project on a weekly basis, they will have a good idea of how things are going. Factor in project status updates, documentation creation, and testing milestones, and managers can measure project progress. After all, I am a developer; if I don’t produce any code, what sort of work have I been doing?

Another factor driving my change of heart was the desire to work from home part of the time each week. How would my employer know that progress was being made? Sure, I always hit my deadlines, and I make progress updates to the project plan. But some tasks run longer than a single week. What other measurements can I use to show that I am performing?

Line counting to the rescue
I soon realized that counting the lines of code I had produced on a weekly basis would quickly give an indication that work was taking place. I put together a script using Windows Script Host (WSH) that goes through designated directories and collects file statistics in addition to counting the lines of code in source files. The script is shown in Listing A. Although it was written for WSH, something similar can be implemented in any scripting language or on any platform.

I purposely made the script verbose and easy to use. It takes advantage of the newer WSH features that allow an XML-based format and automatically control the required and optional command-line arguments. An example call appears in Listing B.

I usually put the call in a .bat file, so it is easier to manage. The command produces plain-text output that can be redirected to a file or imported into a spreadsheet. Listing C shows a sample output from running the routine. Of course, you can easily modify the script output to fit your needs.
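For readers who do not use WSH, the same idea can be sketched in Python. This is not the script from Listing A; it is a minimal illustration that assumes source files can be recognized by extension. The function name collect_stats, the SOURCE_EXTENSIONS set, and the output layout are all hypothetical choices made for this example.

```python
import os
import sys

# Assumption: these extensions mark source files. Adjust for your project.
SOURCE_EXTENSIONS = {".py", ".js", ".vbs", ".sql", ".java", ".c", ".h", ".cs"}


def collect_stats(directories, source_extensions=SOURCE_EXTENSIONS):
    """Walk each directory, counting lines in source files and bytes in all other files."""
    stats = {"source_files": 0, "source_lines": 0, "other_files": 0, "other_bytes": 0}
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                extension = os.path.splitext(name)[1].lower()
                if extension in source_extensions:
                    # Count every line, including blanks and comments.
                    with open(path, "r", encoding="utf-8", errors="replace") as handle:
                        stats["source_lines"] += sum(1 for _ in handle)
                    stats["source_files"] += 1
                else:
                    # Documents, spreadsheets, and images are tracked by size only.
                    stats["other_files"] += 1
                    stats["other_bytes"] += os.path.getsize(path)
    return stats


if __name__ == "__main__":
    # Directories come from the command line; default to the current directory.
    totals = collect_stats(sys.argv[1:] or ["."])
    for key, value in totals.items():
        print(f"{key}\t{value}")
```

Run weekly against the same directories, the tab-separated totals can be appended to a file and compared from week to week.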

Understanding line counts
The key to using source code line counts is including additional information to explain variances. If you look at the sample output in Listing C, you'll notice that the routine counts binary files and their sizes. This is handy if you're doing a lot of administrative work, because that work will show up with the source code line counts. Files such as word processing documents, graphic design tool files, and spreadsheets can all be counted, and their size can be taken into account with the line counts.

Another key to properly using the line counts is to include all of your work. When I first started collecting the data, I realized that I needed to include the directory where I keep all the database scripts. This directory would often contain many lines of code for stored procedures and table additions/modifications that should count as lines of source code.

A second item I wasn't including was the temporary throwaway work that I often perform for proof of concept. I frequently create small programs for testing. I never kept these in source control since they were throwaway code, so I was missing out on a lot of work. I solved this by creating a prototypes directory in my source code repository and checking in prototype and proof of concept work.

I realized this was a good idea even if I wasn’t counting lines of code. After all, much of that work may be needed later, and having version history and a controlled backup is vital. An added bonus for me is that if I do reuse a piece of that code, it will be counted twice, since I will paste it into the project I am working on and clean it up. Some might feel that is cheating. But the time saved on a project weeks later is a direct result of the work on the proof of concept code, so the extra lines counted when it is reused are a fair reflection of performance.



What the line counts don't show
As with any tool, it's just as important to know how not to use it as it is to know how to use it. One of the big shortcomings of this technique is that it takes quite a bit more work to measure the line counts for an individual. For those of you still not convinced about counting lines of code, this should come as good news. For my purposes, I am interested in making sure there is progress and that the line counts change over time (up or down). Because of that, I don’t use the line counts to prove that I worked but rather to show that the team has made progress. Line counts don’t spot bad developers or individuals who write bad code. If someone on your team isn’t pulling his or her weight, it's up to other team members and the manager to notice and act.

Make sure that you establish a baseline. I measured the line counts for several weeks before telling my manager I was doing it. That way, there was an established normal rate of change, and the measured counts, rather than anyone's perception of normal code production, defined what normal productivity looked like.
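One way to make a baseline concrete is to compute the average week-to-week change in total line count and how much it typically varies. The sketch below is only an illustration: the function names are hypothetical, and the rule of flagging a week that falls more than two standard deviations from the average is an assumption of this example, not something the article's script does.

```python
from statistics import mean, stdev


def weekly_changes(totals):
    """Turn a list of weekly line-count totals into week-to-week differences."""
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


def baseline(totals):
    """Return (average change, standard deviation). Needs at least three weekly totals."""
    changes = weekly_changes(totals)
    return mean(changes), stdev(changes)


def is_unusual(change, base_mean, base_stdev, tolerance=2.0):
    """Flag a change more than `tolerance` standard deviations from the baseline average."""
    return abs(change - base_mean) > tolerance * base_stdev
```

A flagged week is not a verdict. It is a prompt to look at what else was going on, such as a design phase or a large cleanup.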

Try using line counts
Combining line count data with other statistics like bugs found or resolved can really help to draw a picture of team performance. Seeing an increase in binary file sizes with no code increase can point to periods of time when design and administration are taking away from coding (lots of word processing documents and spreadsheets generate increased binary file sizes). On the other hand, noticing a decrease in line counts can indicate a refactoring or cleanup of code.
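The two patterns described above can be written down as a simple rule of thumb. The following sketch assumes you already have the week's change in source lines and in binary file bytes; the function name interpret_week and the wording of the notes are hypothetical, and the output is a hint for discussion rather than a diagnosis.

```python
def interpret_week(line_delta, binary_delta):
    """Suggest possible explanations for one week's change in lines and binary bytes."""
    notes = []
    if line_delta < 0:
        notes.append("line count fell: possible refactoring or cleanup")
    if binary_delta > 0 and line_delta <= 0:
        notes.append("binary size grew without code growth: possible design or administrative work")
    if line_delta > 0:
        notes.append("line count grew")
    return notes
```

Pairing these notes with bug counts or project status updates for the same week gives a fuller picture than either number alone.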

To count or not to count
Do you already count lines of code? If so, share your experience in the discussion or send an e-mail. If not, tell us why you don’t—or won’t—use source code line counts.

 
