Different types of documentation for programmers

Justin James goes back to basics in this overview of the documentation types developers should know. Read about "self-documenting" code, UML, and more.

A new developer might be confused about which kinds of documentation are important, because we often use the very general term documentation rather than specifying the type. Here's a look at some of the types of documentation out there.

Code comments

When developers talk about documentation, comments inserted directly into the source code are probably what most of them have in mind. This is especially true for recent graduates or newer programmers who encountered comments in school but never learned about more rigorous forms of documentation.

Comments have lost a lot of their utility over the years. Between languages and tools that allow longer, more descriptive variable names and development platforms that support other kinds of documentation, comments are no longer the de facto documentation solution. That said, code comments still have value.

Code comments should not be used to replace descriptive variable names, though they are excellent for explaining the logic underlying a piece of code -- not necessarily the how of a code block but the why. For example, a useful comment might be, "Spec says that a name must be three characters long and have only letters" to explain a piece of validation code. Comments become more useful when they directly refer to the specification, a bug, or other external documentation in an easy-to-reference way. "Fix for bug #598" or "Refer to change request A991" can go a long way in helping future maintainers understand the thinking behind an otherwise incomprehensible piece of code. Writing useful comments along these lines should become a habit if it isn't one already.
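As a minimal sketch of the validation example above, here is how such a "why" comment might look in Python. The function name is_valid_name is hypothetical, and reading "three characters long" as exactly three characters is an assumption.

```python
def is_valid_name(name: str) -> bool:
    # A "how" comment such as "check length and letters" would only repeat the code.
    # The "why" comment records where the rule comes from:
    # Spec says that a name must be three characters long and have only letters.
    # Note: str.isalpha() also accepts non-ASCII letters; whether the spec
    # allows those is an open assumption in this sketch.
    return len(name) == 3 and name.isalpha()
```

A future maintainer who wonders why four-letter names are rejected can go straight to the spec instead of guessing whether the check is a bug.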

"Self-documenting" code

Thanks to systems that allow variable, class, and function names to be longer than they used to be, it is much easier to write "self-documenting" code; that is to say, the names of things convey their meaning without the need for inline comments. For example, a function such as "prfltocnsl" does not let the potential user know what it does as well as "PrintFloatToConsole." Like using inline comments wisely, this should become a standard practice for developers.
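To make the contrast concrete, here is a small Python sketch of the same idea. The snake_case names follow Python convention rather than the "PrintFloatToConsole" spelling, and the helper format_float and the two-decimal default are assumptions added for illustration.

```python
# Terse name: the reader has to open the function body to learn what it does.
def prfltocnsl(v):
    print(f"{v:.2f}")


# Descriptive names: the call site explains itself without a comment.
def format_float(value: float, decimal_places: int = 2) -> str:
    """Format a float with a fixed number of decimal places."""
    return f"{value:.{decimal_places}f}"


def print_float_to_console(value: float, decimal_places: int = 2) -> None:
    """Print a float to standard output with a fixed number of decimal places."""
    print(format_float(value, decimal_places))
```

A call such as print_float_to_console(total_price) needs no inline comment, while prfltocnsl(tp) almost certainly does.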

Generated API-style documentation

Some languages allow you to embed detailed documentation within the source code in a format (typically XML) that automated tools can use to generate packaged help. Some systems (like Visual Studio) can pick it up and use it in other ways too. This can be really useful, but it takes a lot of work to produce documentation that is actually helpful.

How many times have you seen the documentation for a ToString() method read something like, "Produces the string representation of the object"? Gee, that was... uh... completely obvious, thanks. How about letting me know how that happens instead? For example, let's say you have a collection with a ToString() method. Instead of forcing the developer to guess what the output will look like, the documentation should provide an example of what a sample instance looks like when ToString() is called. Likewise, too many times the "examples" just show the obvious syntax in a basic "hello world" context without explaining how or why you would really do any of this.
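Here is a rough sketch of embedded documentation that shows sample output. It uses a Python docstring in place of the XML comments mentioned above, with __str__ standing in for ToString(). The ShoppingCart class and its output format are hypothetical, invented only for this illustration.

```python
class ShoppingCart:
    """A hypothetical collection used only to illustrate the docstring style."""

    def __init__(self, items):
        self.items = list(items)

    def __str__(self) -> str:
        """Return the items as a comma-separated list in square brackets.

        Example:
            >>> str(ShoppingCart(["apple", "pear"]))
            '[apple, pear]'
            >>> str(ShoppingCart([]))
            '[]'
        """
        return "[" + ", ".join(str(item) for item in self.items) + "]"
```

Because the examples are written in doctest form, Python's standard doctest module can execute them, which helps keep the documented output from drifting away from what the code actually produces.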

Bug tracker, task list, or project management system

There has been an explosion in tools that allow teams to enter bugs, tasks, to-do lists, and so on. The tools allow items to be tracked very granularly, and for the user to assign gobs of metadata to any given item. With this metadata, managers can do things like make graphs, charts, and reports showing a ton of different stats, like average bug resolution time or the number of features implemented per developer. Some of these systems can tie into your source code system, so that you can easily view code check-ins in the context of the tasks they addressed (this is a very handy feature).

While the stats that can be pulled are often of dubious relevance towards evaluating quality or productivity, these systems have lots of value. Being able to rapidly find and mine common bugs, change requests, and so on is a big help. It's also nice to not have to wade through endless piles of separate pieces of paper, emails, or files trying to figure out where someone stuck that change request so you can figure out why you spent three weeks making a change that everyone seems to hate.

UML

UML (Unified Modeling Language) is a standardized modeling notation designed for documenting applications. A variety of tools can consume UML models to produce documents, database diagrams, process flowcharts, and more. Even better, some tools can take UML and stub out applications and databases based upon it.

UML is particularly prevalent in the Java ecosystem, thanks to the Rational Suite of tools that IBM owns. UML seems to be considered an enterprise development tool, due to the learning curve and cost of the tools associated with it.

Ad-hoc documents

This style of documentation is sadly too prevalent. With ad-hoc documentation, you usually lack version control. It's also difficult to search and, worst of all, you tend to get multiple copies of the documents with differences floating all over the place.

There are some uses for these kinds of files, but they work a lot better when they are part of a more rigorous documentation system, for example attached to a bug ticket or change request.

Your documentation stories

If you have documentation stories that you want to share with your fellow developers, tell us about them in the forums.

J.Ja

About

Justin James is the Lead Architect for Conigent.

5 comments
cougar.b

I have no idea if this comment is useful to anyone else, because I'm a novice who is working in a number of languages--formerly a senior technical writer, which also changes my perspective. As I prepare for a programming task and start to flesh it out, I use the comments sections that I create on the fly for brainstorming, planning, and stream of consciousness reflections on where I'm going and why. This means that my to-do list for any section is always right there in the code where I'm working, as well as all of those crazy ideas that come out of nowhere and seem to get lost in the shuffle if you don't process them somehow. When I'm done with a piece of code, I can remove all the crap, store ideas that are still relevant in another location, and edit what remains down to a fairly good description of why I'm doing certain things. Maybe if I was a more mature coder, this would all seem like a useless exercise, but for me, where I'm at, it makes the documentation part totally useful to me personally and ultimately creates better comment documentation for others. When I worked at Fujitsu in the late 90s, I corresponded with Sun programmers about a module that they had for exporting javadocs into FrameMaker, and they responded by reworking the module to do exactly what I needed it to do. I did this because maintaining documentation in printed APIs and in code meant that I could never keep up. The new system allowed me to write once, publish in two places. So I have a lot of respect for those comment sections.

sysop-dr

I find your list lacking as it doesn't outline anything required for any proper QA regimen. A QA plan. Who does what, what other docs are needed, what testing will be done, by whom. Who checks that everything is being done to the plan. Number one would be a document outlining the basic premise of the software, what it will do, scope, users and a rough estimate of time and costs against estimated revenue. Cost benefit analysis, should we even try to do this. A document showing you thought about what level of quality assurance you need to use to build a quality product. You would then set out requirements. These may not all be known at this point but try to get as many as you can down. Then a design. What parts of your program, the interfaces between them. What does what, how will it look, what inputs and what outputs. A programmer's handbook, info programmers need, what tools and processes to be used. Then code and document your code. Updating the previous documents as you go. And making unit test code as well. Unit tests should never be in the same file as your code that is being tested and a build of your software should not rely on unit test code. Unit tests should be repeatable and output something that can be compared for regression testing. Software configuration management and change management. Change management is work flow for changes. How this is handled should be documented and your change management software should capture this information for auditing. Test plans. Test reports of some type. Integration testing, automated if possible and output in a way that it can be checked and compared to previous integration test runs. User manuals. Maintenance and update plans. I write software for nuclear reactors. We have to be able to prove we make our software in a quality way and show that the software is tested, tested properly and will work without error. We have ISO standards for software quality assurance and software quality processes.
Your requirements may not require all of the above, but even if you simplify and put some of those documents together you should be doing all of the above in some manner. Then not only do you know your software does what it's supposed to but that it also doesn't do what it's not supposed to. And also when someone like me comes to you and wants to use your software I can check that it's done in a quality manner and I can defend using it because, believe me, I have to defend using the software I do. If you do all of these things then not only does your software have a better chance of working but then a lot of people in business and government will be able to use your software.

TamaraPeters

Nice article about traditional documentation ... I find a lot of interest among developers in documentation they can update themselves, in a more real-time fashion. Along with this, there is value for the 'curator' role, a librarian / editor who organizes, edits, prunes the documentation tree. Atlassian is one vendor we work with whose public wiki is a nice example of crowd-sourced documentation combined with other types of information. Open source projects often have similar documentation sites. Any thoughts on how that fits into the paradigm you are describing?

Justin James

... like Visual Studio, comments that start with "TODO" are treated specially and added to a task list, you might find that helpful with your workflow. J.Ja

Justin James

Almost no one views creating documentation as their job; they see it as an annoying fact of life, like getting the tires on your car rotated. If there is no one over their shoulder reminding them to do it, they start to forget really quickly. The more work a tool makes of it (for example, leaving the development environment to open a document, wiki editor, etc.), the less likely it is to be used, unfortunately. While I haven't used the Atlassian tool directly and can't comment on it, I know from just about every app I've ever written that the more effort (even if it's "just five seconds more") it takes to get something done, the less likely it will be to get done unless it produces huge, obvious benefits, or the process makes it unavoidable (like "can't check into source control without doing it" or something). J.Ja