
Different types of documentation for programmers

Justin James goes back to basics in this overview of the documentation types developers should know. Read about "self-documenting" code, UML, and more.

A new developer might be confused about which kinds of documentation are important, because we often use the very general term documentation rather than specifying the type. Here's a look at some of the types of documentation out there.

Code comments

When developers talk about documentation, comments inserted directly into the source code are probably what most of them have in mind. This is especially true for recent graduates or newer programmers who encountered comments in school but never learned about more rigorous forms of documentation.

Comments have lost a lot of their utility over the years. Now that languages allow longer, more descriptive variable names and development platforms support other kinds of documentation, comments no longer serve as the de facto documentation solution. That said, code comments still have value.

Code comments should not be used to replace descriptive variable names, though they are excellent for explaining the logic underlying a piece of code — not necessarily the how of a code block but the why. For example, a useful comment might be, "Spec says that a name must be three characters long and have only letters" to explain a piece of validation code. Comments become more useful when they directly refer to the specification, a bug, or other external documentation in an easy-to-reference way. "Fix for bug #598" or "Refer to change request A991" can go a long way in helping future maintainers understand the thinking behind an otherwise incomprehensible piece of code. Writing useful comments along these lines should become a habit if it isn't one already.
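As a rough sketch of a comment that explains the why rather than the how, the validation rule quoted above might look like this in Python. The function name is_valid_name is hypothetical, and restricting the check to ASCII letters is an assumption of this sketch rather than something the quoted spec states.

```python
def is_valid_name(name: str) -> bool:
    # Spec says that a name must be three characters long and have only
    # letters. Assumption in this sketch: "letters" means ASCII letters,
    # because str.isalpha() on its own also accepts accented and
    # non-Latin characters.
    return len(name) == 3 and name.isascii() and name.isalpha()
```

Without the comment, a maintainer would see the length and character checks but have no idea where the number three came from or whether it is safe to change.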

"Self-documenting" code

Thanks to systems that allow variable, class, and function names to be longer than they used to be, it is much easier to write "self-documenting" code; that is to say, the names of things convey their meaning without the need for inline comments. For example, a function such as "prfltocnsl" does not let the potential user know what it does as well as "PrintFloatToConsole." Like using inline comments wisely, this should become a standard practice for developers.
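To make the contrast concrete, here is a minimal Python sketch of the same function written both ways. The parameters (the number of decimal places and the output stream) are assumptions added for illustration; the original example only gives the two function names.

```python
import sys
from typing import Optional, TextIO


# Cryptic: the caller has to read the body to learn what v, d, and o mean.
def prfltocnsl(v, d=2, o=None):
    if o is None:
        o = sys.stdout
    o.write(f"{v:.{d}f}\n")


# Self-documenting: the names carry the meaning, so no comment is needed.
def print_float_to_console(
    value: float,
    decimal_places: int = 2,
    console: Optional[TextIO] = None,
) -> None:
    if console is None:
        console = sys.stdout
    console.write(f"{value:.{decimal_places}f}\n")
```

Both functions behave identically; only the second one can be used correctly from its signature alone.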

Generated API-style documentation

Some languages allow you to embed detailed documentation within the source code in a structured format (XML, for example) that automated tools can use to generate packaged help. Some systems (like Visual Studio) can pick it up and use it in other ways too. This can be really useful, but it takes a lot of work to produce documentation that is actually worth reading.

How many times have you seen the documentation for a ToString() method read something like, "Produces the string representation of the object"? Gee, that was... uh... completely obvious, thanks. How about letting me know how that happens instead? For example, let's say you have a collection with a ToString() method. Instead of forcing the developer to guess what the output will look like, the documentation should provide an example of what a sample instance looks like when ToString() is called. Likewise, too many times the "examples" just show the obvious syntax in a basic "hello world" context without explaining how or why you would really do any of this.
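Here is a hedged sketch of what more helpful embedded documentation could look like, using a Python docstring as a stand-in for the XML comments discussed above. The ShoppingCart class and its output format are invented purely for illustration; the point is that the documentation states the format and shows a sample instance.

```python
class ShoppingCart:
    """A hypothetical collection of (item name, quantity) pairs."""

    def __init__(self) -> None:
        self._items = []

    def add(self, name: str, quantity: int) -> None:
        """Append an item and its quantity to the cart."""
        self._items.append((name, quantity))

    def __str__(self) -> str:
        """Return the items as a bracketed, comma-separated list.

        Each item is rendered as "name x quantity", in insertion order.
        An empty cart is rendered as "[]".

        Example:
            >>> cart = ShoppingCart()
            >>> cart.add("apple", 2)
            >>> cart.add("pear", 1)
            >>> str(cart)
            '[apple x 2, pear x 1]'
        """
        body = ", ".join(f"{name} x {quantity}" for name, quantity in self._items)
        return f"[{body}]"
```

A reader of the generated help now knows the delimiter, the ordering, and the empty case without running the code.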

Bug tracker, task list, or project management system

There has been an explosion in tools that allow teams to enter bugs, tasks, to-do lists, and so on. The tools allow items to be tracked very granularly and let the user assign gobs of metadata to any given item. With this metadata, managers can do things like make graphs, charts, and reports showing a ton of different stats, like average bug resolution time or the number of features implemented per developer. Some of these systems can tie into your source code system, so that you can easily view code check-ins in the context of the tasks they addressed (this is a very handy feature).

While the stats that can be pulled are often of dubious relevance for evaluating quality or productivity, these systems have lots of value. Being able to rapidly find and mine common bugs, change requests, and so on is a big help. It's also nice to not have to wade through endless piles of separate pieces of paper, emails, or files trying to figure out where someone stuck that change request so you can figure out why you spent three weeks making a change that everyone seems to hate.
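The link between check-ins and tasks can be sketched in a few lines of Python. This is not how any particular tracker works; it assumes a simple convention in which a check-in message mentions a ticket as "#" followed by digits, and the function name group_commits_by_ticket is made up for this example.

```python
import re
from collections import defaultdict

# Assumed convention: a check-in message references a ticket as "#<digits>".
TICKET_REFERENCE = re.compile(r"#(\d+)")


def group_commits_by_ticket(commit_messages):
    """Map each referenced ticket number to the messages that mention it."""
    commits_by_ticket = defaultdict(list)
    for message in commit_messages:
        # set() keeps a message from being listed twice under one ticket.
        for ticket_id in set(TICKET_REFERENCE.findall(message)):
            commits_by_ticket[int(ticket_id)].append(message)
    return dict(commits_by_ticket)
```

Given messages such as "Fix for bug #598", the result lets you look up a ticket and see every check-in that claims to address it.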

UML

UML (Unified Modeling Language) is a standardized modeling notation designed for documenting applications. UML models can be consumed by a variety of tools to produce documents, database diagrams, process flowcharts, and more. Even better, some tools can take UML and stub out applications and databases based upon it.

UML is particularly prevalent in the Java ecosystem, thanks to the Rational Suite of tools that IBM owns. UML seems to be considered an enterprise development tool, due to the learning curve and cost of the tools associated with it.

Ad-hoc documents

This style of documentation is sadly too prevalent. With ad-hoc documentation, you usually lack version control. It's also difficult to search and, worst of all, you tend to end up with multiple, slightly different copies of the same document floating all over the place.

There are some uses for these kinds of files, but they work a lot better when they are part of a more rigorous documentation system, for example as attachments to a bug ticket or change request.

Your documentation stories

If you have documentation stories that you want to share with your fellow developers, tell us about them in the forums.

J.Ja

About

Justin James is the Lead Architect for Conigent.
