Nagios is a free, open source network monitoring system that runs on Linux and UNIX operating systems. It can monitor hosts as well as the services that run on them.
Nagios, originally known as NetSaint, has been available since early 1999. Since then it has been used by thousands of network administrators in education, government, corporate, and non-profit organizations. Although often considered to be purely a network administrator's tool, Nagios can also be extended to benefit the end user. In this article, I'll talk about how my organization extended Nagios to benefit its end user base.
Our medium-sized K-12 school district had trouble notifying end users of service interruptions and system outages. Many of our IT services -- such as phone, voice mail, e-mail, Web services, and remote access -- required 24x7 availability. However, due to staffing shortages and budget shortfalls, our IT staff did not have the resources to provide the level of support one might find in the corporate arena. Here's how we increased our ability to provide system status information to our technologically challenged user base.
After researching a number of possibilities, we decided to leverage our pre-existing monitoring system: Nagios. Using some of Nagios' advanced features, as well as some simple scripts, we were able to build the solution without purchasing any software. The result was a simple Web page where end users could check whether any critical systems were online or offline. This reduced the number of unnecessary support calls that our IT staff had to process.
How we did it
First, for this method to work, I'm going to assume that you:
- Already have Nagios installed and working.
- Have a network administrator's level of understanding of Windows Server, Linux, and Nagios.
- Host your organization's help desk/support Web site on a server other than the Nagios server, which runs Apache over Linux.
Step 1: Set up FTP
Here are the steps for setting up FTP:
- Set up a directory on the server housing the help desk/support Web site and allow FTP access to it.
- Set access to the directory based on login or IP address (only the Nagios server needs access).
- Allow create/delete access inside that directory.
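Once the directory is in place, it helps to confirm from the Nagios server that create and delete access really work. The following is a minimal sketch, assuming curl is installed on the Nagios server. The FTP URL and credentials are placeholders of my own, not values from our setup.

```shell
#!/bin/sh
# Smoke test for the Step 1 FTP directory: upload a probe file, then delete it.
# FTP_URL and FTP_USER are placeholder assumptions; replace them with your own.
FTP_URL="${FTP_URL:-ftp://helpdesk.example.org/status/}"
FTP_USER="${FTP_USER:-nagios:changeme}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Upload the file given as $1, then remove it from the server again.
# A leading "-" in curl's --quote argument sends the FTP command after the transfer.
ftp_probe() {
    probe=$1
    run curl --silent --show-error --user "$FTP_USER" \
        --upload-file "$probe" --quote "-DELE $(basename "$probe")" "$FTP_URL"
}
```

Calling ftp_probe ./probe.htm with DRY_RUN=1 set prints the curl command without touching the network. Without DRY_RUN, a zero exit status suggests that both the upload and the delete were accepted.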
Step 2: Configure Nagios
The goal is for Nagios to send status update files over to the FTP directory created in Step 1. This can be done by using Nagios event handlers, which run an external command whenever a host changes state.
- Create a directory named eventhandlers in the libexec directory. For example: /usr/local/nagios/libexec/eventhandlers
- Create simple HTML files and place them in the
- Create one HTML file named server1-online.htm and make the background color green and the text say "ONLINE".
- Create one HTML file named server1-offline.htm and make the background color red and the text say "OFFLINE".
- Create two event handler scripts for the FTP file update process and place them in the eventhandlers directory.
- Sample script hdss_eh:
#!/bin/sh
# Primary Event Handler processing file
# This file calls the actual processing file
/usr/local/nagios/libexec/eventhandlers/hdss_eh2 "$1" "$2" "$3" "$4"
- Sample script hdss_eh2 can be found on the Nagios documentation site.
- Define the event handler command.
Add the following lines to the misccommands.cfg file:
# 'check_hdss_eh' command definition
define command{
	command_name	check_hdss_eh
	command_line	/usr/local/nagios/libexec/eventhandlers/hdss_eh $HOSTSTATE$ $STATETYPE$ $HOSTATTEMPT$ $HOSTNAME$
	}
- Assign the event handler script to the host to monitor.
Add the following line to the host entry in hosts.cfg:
event_handler check_hdss_eh
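The hdss_eh2 processing script is not reproduced in this article, so here is a rough sketch of what such a script could look like. It is a hypothetical stand-in, not the script we ran. It assumes curl is available, that the HTML files sit in the eventhandlers directory, and that only HARD states should change the page. The FTP URL, the credentials, and the choice to treat UNREACHABLE as offline are also assumptions of mine.

```shell
#!/bin/sh
# Hypothetical stand-in for hdss_eh2; the original script is not shown here.
# Called by hdss_eh as: hdss_eh2 <host state> <state type> <attempt> <host name>

# Assumed settings; none of these values come from the article.
PAGE_DIR="${PAGE_DIR:-/usr/local/nagios/libexec/eventhandlers}"
FTP_URL="${FTP_URL:-ftp://helpdesk.example.org/status/}"
FTP_USER="${FTP_USER:-nagios:changeme}"

# Print the name of the page to publish, or nothing when no update is due.
# Only HARD states count, so one missed check does not flip the status page.
select_page() {
    state=$1; state_type=$2; host=$3
    [ "$state_type" = "HARD" ] || return 0
    case "$state" in
        UP)               echo "$host-online.htm" ;;
        DOWN|UNREACHABLE) echo "$host-offline.htm" ;;
    esac
}

# Upload one page to the FTP directory from Step 1, keeping its file name.
publish_page() {
    curl --silent --show-error --user "$FTP_USER" \
        --upload-file "$PAGE_DIR/$1" "$FTP_URL"
}

# Do the work only when all four handler arguments are present.
if [ "$#" -ge 4 ]; then
    page=$(select_page "$1" "$2" "$4")
    if [ -n "$page" ]; then
        publish_page "$page"
    fi
fi
```

Keeping the decision in select_page, separate from the upload, makes the logic easy to check by hand before any file leaves the Nagios server.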
Step 3: Set up help desk Web site status page
- Create an HTML page on the help desk/support Web site and link to it.
- On the page, use an iframe or another method to display the server1-online.htm file in a small frame (or table).
- Set the page (or the individual frame if using iframe) to update at a regular interval such as every ten seconds.
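The HTML involved is small enough to generate from a script, which makes it easy to preview in a browser before publishing anything. The sketch below writes a status file like those described in Step 2 and a bare-bones status page with an iframe and a ten-second refresh. The function names, page title, and frame size are illustrative assumptions, not our production markup.

```shell
#!/bin/sh
# Write a one-line status file: $1 = output file, $2 = background color, $3 = text.
write_fragment() {
    cat > "$1" <<EOF
<html><body style="background-color: $2"><p>$3</p></body></html>
EOF
}

# Write a status page that embeds one status file in an iframe.
# $1 = output file, $2 = label, $3 = iframe source, $4 = refresh interval in seconds.
write_status_page() {
    out=$1; label=$2; frame_src=$3; refresh=${4:-10}
    cat > "$out" <<EOF
<html>
<head>
<meta http-equiv="refresh" content="$refresh">
<title>System Status</title>
</head>
<body>
<h1>System Status</h1>
<p>$label: <iframe src="$frame_src" width="120" height="40"></iframe></p>
</body>
</html>
EOF
}
```

For example, write_fragment server1-online.htm green ONLINE followed by write_status_page status.htm server1 server1-online.htm 10 produces two files that can be opened locally to check the layout.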
Step 4: Test and understand logic
Now test and debug the process to confirm that it works. When everything is configured correctly, the following sequence occurs:
- Host changes state (UP to DOWN or DOWN to UP)
- Host's configuration tells Nagios to execute an external command
- External command determines if the system is up or down
- If the system is up, the server1-online.htm file is sent via FTP to the help desk Web server directory
- If the system is down, the server1-offline.htm file is sent via FTP to the help desk Web server directory
- The "Status" page on the help desk Web site refreshes at its regular interval and shows whether the system is up or down
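You do not have to wait for a real outage to exercise this chain. One way to test it, sketched below, is to call the event handler by hand with the same four arguments that the command definition passes: host state, state type, attempt number, and host name. The function name is my own, and the HARD state type and attempt number 1 are arbitrary test values.

```shell
#!/bin/sh
# HANDLER defaults to the hdss_eh path used in this article; override it to test
# a different script.
HANDLER="${HANDLER:-/usr/local/nagios/libexec/eventhandlers/hdss_eh}"

# Run the handler as if the host named in $2 had just entered the state in $1.
simulate_event() {
    state=$1; host=$2
    # HARD and attempt 1 are arbitrary test values (assumptions).
    "$HANDLER" "$state" HARD 1 "$host"
}
```

Running simulate_event DOWN server1 and then reloading the status page should show the offline file. Running simulate_event UP server1 should bring the online file back.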
The result is a status Web page that is easily accessible and updates automatically. Figure A shows a sample of what the Help Desk/Support Status page could look like.
Figure A: A sample help desk/support status page.
Even inexperienced end users can browse to the page and see if any critical systems are available or unavailable with up-to-the-minute accuracy. You can do all this without having to provide end users with direct access to the Nagios system. And, since the output is HTML, it can be customized or integrated to fit into virtually any application. This is just one scenario that shows how powerful an open source tool like Nagios is, and how its power extends beyond that of the network administrator.