Documentation. It’s not generally much fun to produce, but in a pinch, it can save hours of troubleshooting time. Without reasonable documentation, IT administration can become a nightmare, even for people who understand the environment, let alone the poor saps who have to come in afterward and try to figure it out.

In this article, I will discuss my current documentation strategy. This is not intended to provide a comprehensive list of everything you should document but rather to give you an example of a strategy you can use to build effective documentation for managing your systems.

Documentation guidelines
The type of documentation I developed is based on a number of factors:

  • The size of the environment
  • What is being documented
  • Personal preference
  • Management preference

I am going to focus here on documentation related to a project I recently completed. While this documentation was developed specifically for this new build-out, I am currently working on moving all of our infrastructure to this documentation model as well.

I prefer to track as much as possible, within reason. At first glance, it may appear that I track some extraneous information, but some of that information has been vital to the continued operation of my environment. For example, I have always tracked the circuit ID that each of my servers is connected to. At one point, we had a circuit that had to be replaced by our building maintenance team, and I was able to determine what the exact impact on my environment would be.
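The circuit example above amounts to a simple lookup: given a circuit ID, which servers draw power from it? Here is a minimal sketch in Python, assuming the inventory has been exported as a list of records. The field names (`name`, `circuit_ids`) and the sample values are hypothetical and are not taken from my actual workbook.

```python
from collections import defaultdict

# Hypothetical inventory rows; server names and circuit IDs are made up.
SERVERS = [
    {"name": "web01", "circuit_ids": ["C-101", "C-102"]},
    {"name": "db01", "circuit_ids": ["C-102"]},
    {"name": "app01", "circuit_ids": ["C-103"]},
]


def servers_by_circuit(servers):
    """Build an index from circuit ID to the sorted names of servers fed by it."""
    index = defaultdict(list)
    for server in servers:
        for circuit_id in server["circuit_ids"]:
            index[circuit_id].append(server["name"])
    return {circuit_id: sorted(names) for circuit_id, names in index.items()}


def impact_of_outage(servers, circuit_id):
    """Return the servers that lose at least one power feed if the circuit goes down."""
    return servers_by_circuit(servers).get(circuit_id, [])


print(impact_of_outage(SERVERS, "C-102"))
```

A server with redundant power supplies on separate circuits may survive the outage, so the result is best read as "servers to check" rather than "servers that will go down."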

My documentation currently consists of documents from three programs: Microsoft Visio, Word, and Excel. All of the primary documentation is kept in an Excel workbook. The workbook itself is difficult to print because of the number of columns I track, so I have a Word document that connects to the Excel file via a mail merge and automatically generates a hard-copy printout with all the pertinent information for each server on one page. Because the printout is generated from the workbook, I only have to update the information in one place, which makes it very easy to keep the documentation up to date.
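For readers who do not use Word, the same one-page-per-server idea can be approximated with a short script. The following is only a sketch, not the mail merge I actually use: it assumes the workbook has been exported to CSV, and the column names and sample rows are hypothetical.

```python
import csv
import io
from string import Template

# Hypothetical CSV export of the workbook; column names are assumptions.
SAMPLE_CSV = """server_name,environment,function,rack
web01,production,Web server,25
db01,staging,Database server,26
"""

# One "page" per server, filled in from a single row of the export.
PAGE_TEMPLATE = Template(
    "Server: $server_name\n"
    "Environment: $environment\n"
    "Function: $function\n"
    "Rack: $rack\n"
)


def render_pages(csv_text):
    """Return one filled-in page of text per row in the CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [PAGE_TEMPLATE.substitute(row) for row in reader]


for page in render_pages(SAMPLE_CSV):
    print(page)
```

`Template.substitute` raises an error if a column named in the template is missing from the export, which is a useful early warning that the workbook layout has changed.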

In addition to tracking my documentation in Word and Excel, I find it extremely useful to have visual diagrams for my entire infrastructure. It’s much easier to draw a picture than it is to write 1,000 words. It is also important to have something to refer to quickly when there is a crisis to solve or when management asks for it. Therefore, I keep a number of documents in my documentation folder that are nothing but Visio diagrams.

What I have described above is simple and manageable. All of the primary documentation is in Excel, and I can print the information in table form or in a book form via a mail merge to Word. If I want a picture to refer to, I can easily reference a diagram in Visio. Now, let’s take a look at the information I track.

What I track and why
Table A shows a list of the things I track, along with the details of how and why I track them.

Table A
Field | Why I track it
Server name | This is an obvious requirement.
Environment | I also like to know which environment (development, staging, or production) the server is in. This helps me determine how critical an outage is depending on the time of day, etc.
Product | Which product does the server support? In the event of an actual outage, it’s nice to be able to notify the right people.
Function | Each product consists of specific components. I generally track which component is on which server. For example, is the primary function of the server to be a Web server or a database server?
Domain | For a Windows server, which domain does the server belong to?
Administrative user | This is the user ID for the administrator or root user. In addition, I put the password for the admin user in the workbook.
Dell OpenManage user | We use a lot of Dell servers, as well as their OpenManage components, which require their own user ID and password. Both of these values are stored.
Remote administration | Depending on the server, I manage it with VNC, pcAnywhere, or Windows Terminal Services. If the software is to be administered using a specific user, I put the user ID and password information in the worksheet as well.
Database server user | This is the database account and password on the machine if a database is installed.
Contacts | If the server happens to fail or needs maintenance, who are the primary and secondary contacts and how can they be reached within 24 hours?
OS/Version/Patch level | This information is always nice to have on hand, especially when a bad patch comes out and you need to find out which servers it was installed on (particularly if you did not do the installation yourself).
OS licensing | I always track the vendor I purchased the license from, the purchase order number, the invoice number, and how much it cost. With Microsoft’s somewhat heavy-handed licensing policies, this information will eventually be very useful to prove compliance. For those of you running Linux, feel free to leave this field blank.
Location | Since we have our servers in a hosting facility that is geographically separate from our office, I list the cage number, rack number, and rack position (starting from the top) that the server is in. I also list KVM information in this area: which KVM the server is connected to and the port on that KVM. Our KVMs can be cascaded together with each subsequent unit taking port numbers in sequence (e.g., port 1 on the second 8-port KVM would be port 9 overall).
Network setup | In our environment, a server can be multihomed to a number of different networks depending on its function and how it is managed. For each network connection, I list the DNS name, IP address, subnet mask, default gateway, DNS servers, network adapter MAC address, patch panel port, and which blade/port on the backbone switch it connects to.
Power information | In this section, I list the power connections for each server. I indicate which server power supply plugs into which power strip and what the circuit ID is for that particular power strip. Each of our power strips is connected to its own 20-amp circuit. Each of my cabinets is completely self-contained, and there are six power strips in each cabinet.
System information | Here, I list the server manufacturer, model, size, serial number, asset tag, purchase date, purchase invoice, and other detailed information about how the hardware is configured, such as RAID sizes, boot volume sizes, etc.
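The cascaded KVM numbering in the Location entry is simple arithmetic, and it is worth writing down once so nobody has to count ports in the cage. The sketch below assumes every KVM in the chain has the same number of ports (eight, as in the example); the function names are my own illustration.

```python
def overall_kvm_port(unit, port, ports_per_unit=8):
    """Convert a 1-based (unit, port) pair on a cascaded KVM chain
    into the overall port number across the whole chain."""
    if unit < 1 or not 1 <= port <= ports_per_unit:
        raise ValueError(f"invalid KVM position: unit {unit}, port {port}")
    return (unit - 1) * ports_per_unit + port


def locate_kvm_port(overall, ports_per_unit=8):
    """Convert an overall port number back into a 1-based (unit, port) pair."""
    if overall < 1:
        raise ValueError(f"invalid overall port: {overall}")
    unit_index, port_index = divmod(overall - 1, ports_per_unit)
    return unit_index + 1, port_index + 1


# Port 1 on the second 8-port KVM is port 9 overall.
print(overall_kvm_port(2, 1))
print(locate_kvm_port(9))
```

If the chain mixes KVMs of different sizes, this formula no longer holds and the per-unit port counts would have to be summed instead.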

It’s important to keep in mind that compiling all this information in one place also represents a serious security risk. If the wrong person, inside or outside your network, obtained this document, he or she could do major damage to your network. Therefore, you’ll want to make sure that you set very restrictive permissions on the files themselves, or better yet, the directory that holds these files. Only your high-level system administrators (and possibly some of your managers) should have access to the files. It’s also not a bad idea to password-protect the files as a second layer of defense.
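As an illustration of locking down the directory, here is a minimal sketch that restricts a documentation folder to its owner. It assumes the files live on a POSIX host (Linux or similar); on a Windows file server, you would set NTFS permissions instead, and `os.chmod` would not have this effect. The function name and the demo file are hypothetical.

```python
import os
import stat
import tempfile


def lock_down(directory):
    """Restrict a directory tree so only the owning user can access it."""
    for root, _dirs, files in os.walk(directory):
        os.chmod(root, stat.S_IRWXU)  # directories: owner only (0o700)
        for name in files:
            path = os.path.join(root, name)
            os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # files: owner read/write (0o600)


# Demonstrate on a throwaway directory rather than real documentation.
with tempfile.TemporaryDirectory() as docs_dir:
    workbook = os.path.join(docs_dir, "servers.csv")
    with open(workbook, "w") as handle:
        handle.write("placeholder\n")
    lock_down(docs_dir)
    print(oct(stat.S_IMODE(os.stat(workbook).st_mode)))
```

Owner-only access is the strictest case. If several administrators need the files, a dedicated group with group read access is the more realistic setup.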

Final steps
Gathering all the information shown above can be a daunting task, but it definitely pays off in the long run. When a crisis occurs, no one has to run around looking for documentation. If and when I leave the company, I’ll be able to easily transition things over to a new person since I have everything documented so well. Now that it is all gathered and set up in a consistent format, keeping it current is very manageable.

However, there is one more step to the documentation that I find very important. For everything I write down in the documentation for the physical infrastructure, there is a matching label somewhere in the cage at our hosting provider. For added convenience, each label is affixed to the piece of equipment it describes rather than stuck on something random.

For example, I list the power strips in the documentation in this format:

rack number/power strip number

Each power strip in every rack is labeled the same way (e.g., 25/2). In addition, we use specific colors in every Visio diagram for consistency. In our actual physical infrastructure, we use the same colors so that they match the documentation. For instance, red lines on our diagrams mean private network connections; in our cages, private network cables are always red. This makes troubleshooting much easier and makes the infrastructure much simpler for a new person to learn.
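A consistent label format also makes the labels easy to check mechanically. The sketch below formats and validates labels in the rack number/power strip number form. The function names are hypothetical, and the upper bound of six strips per rack comes from the cabinet layout described in Table A, so adjust it for your own racks.

```python
import re

# rack number / power strip number, e.g. "25/2"
LABEL_PATTERN = re.compile(r"^(\d+)/(\d+)$")


def format_strip_label(rack, strip):
    """Build a power strip label from a rack number and a strip number."""
    return f"{rack}/{strip}"


def parse_strip_label(label, strips_per_rack=6):
    """Split a label such as '25/2' into (rack, strip), rejecting bad input."""
    match = LABEL_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a rack/strip label: {label!r}")
    rack, strip = int(match.group(1)), int(match.group(2))
    if not 1 <= strip <= strips_per_rack:
        raise ValueError(f"strip number out of range in {label!r}")
    return rack, strip


print(parse_strip_label("25/2"))
print(format_strip_label(25, 2))
```

Running every label in the workbook through a check like this catches typos before they end up printed and stuck to a power strip.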

I hope that this article has given you some ideas of things to track in your own infrastructure. Remember that when you set up documentation, you’re not always doing it for yourself. Often, you’re doing it for management or for the poor admin who has to come in after you and try to figure the mess out. Thus, you need to keep it simple.
