Vincent Danen explains how to get started with the Munin server-monitoring tool, which can help you spot problems before they become serious.
One of the most important things for any system administrator to do is keep an eye on the systems he or she is responsible for. Usually this involves an arsenal of programs that monitor various things: tools that watch and parse log files, reports that get emailed on a daily basis, and so on.
Many of these tools, however, report passively, and you will often get their reports after the fact. For instance, a report mailed at 4 a.m. saying that a hard drive is full doesn't help much when you could have known at 9 p.m. that it was 95% full. Likewise, a report about an unavailable system doesn't help much when you could have known an hour earlier that the load average was climbing incredibly high.
There is a tool that can help you spot and diagnose trouble before it becomes serious: Munin, a networked resource-monitoring tool that can monitor just the host it is installed on, or a whole network of hosts. It is even possible to tie Munin into other tools, like Nagios, for active alerting.
On Red Hat Enterprise Linux 5 or CentOS 5, setting up Munin is simple. On the server (server.host.com), we will set up Munin to monitor itself and another client system (client.host.com). The client will report to the server the status of various plugins: HDD temperatures, disk I/O, load average, network throughput, and so on. Munin does not ship with Red Hat Enterprise Linux 5 itself, but it is available via EPEL or RPMForge. If you already have the EPEL or RPMForge repository set up, you can begin right away; if not, you will need to set one of them up first, following that repository's own instructions.
Then, on the server, install munin and munin-node:
# yum install munin munin-node
This will pull in the required dependencies. On the client, you only need to install munin-node and its dependencies.
On the server, edit /etc/munin/munin.conf. The uncommented configuration file will look something like this:
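As a rough sketch of such a file, assuming the usual packaged directory paths and an invented placeholder address of 192.168.100.10 for the client (the client's real address is not stated here, so substitute your own):

```text
# Directory paths: typical packaged defaults, may differ on your system
dbdir   /var/lib/munin
htmldir /var/www/html/munin
logdir  /var/log/munin
rundir  /var/run/munin

# The server monitoring itself
[localhost]
    address 127.0.0.1
    use_node_name yes

# The client; 192.168.100.10 is an assumed placeholder address
[client.host.com]
    address 192.168.100.10
    use_node_name yes
```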
Here, we tell Munin to monitor the localhost and also client.host.com, and we provide their IP addresses. On the server, start the munin-node program:
# service munin-node start
On the client, edit /etc/munin/munin-node.conf. The defaults are mostly fine; however, you will want to add two things to the end of the configuration file:
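A minimal sketch of those two lines, using the client host name and the server address from this walkthrough:

```text
# Name this node reports to the server
host_name client.host.com
# Server allowed to connect, written as an anchored regular expression
allow ^192\.168\.100\.5$
```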
This tells munin-node that the hostname to present is client.host.com rather than localhost.localdomain. This is required because Munin on the server checks that the host name reported by the node matches the one configured in its munin.conf file. We also tell munin-node to allow connections from 192.168.100.5, the IP address of the server; the address has to be written as a regular expression, which is why it looks so strange. To tighten this up further, you should add an iptables firewall rule so that only that same IP address can connect to port 4949 (this assumes the INPUT chain's default policy, or a later rule, drops other traffic to that port). For instance:
# iptables -A INPUT -s 192.168.100.5 -p tcp --dport 4949 -j ACCEPT
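Coming back to the allow line for a moment: the escaped pattern is easy to mistype by hand. The following is a small sketch of a shell helper that builds it from a plain IPv4 address; the function name munin_allow_regex is made up for illustration and is not part of Munin.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Munin): turn a dotted IPv4
# address into the anchored pattern used on munin-node's "allow" line.
munin_allow_regex() {
    # Escape every dot so it matches a literal ".", then anchor both
    # ends so that only this exact address matches.
    printf '^%s$\n' "$(printf '%s' "$1" | sed 's/\./\\./g')"
}

# Example with the server address used in this article
munin_allow_regex 192.168.100.5
```

The printed pattern can be pasted after the allow keyword in munin-node.conf.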
Finally, start the munin-node service on client.host.com as we previously did on the server. By default, the graphs for Munin are viewable at http://server.host.com/munin/. After about five minutes, the first data will be collected. There won't be much to see yet, of course, but Munin will be working and you will see empty charts for both hosts. Once Munin has had a few hours to run, the graphs will look far more interesting.
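If the charts for a host stay empty, it can help to talk to its node directly from the server. munin-node speaks a simple text protocol on port 4949, with commands such as list and fetch. The sketch below wraps that in a small function; the function name is invented for illustration, and it assumes a netcat binary called nc is installed (some netcat variants need an extra option, such as -q 1, to wait for the reply).

```shell
#!/bin/sh
# Hypothetical helper (not part of Munin): send one command to a node.
# Assumes "nc" (netcat) is installed and port 4949 is reachable.
munin_node_query() {
    node_host=$1   # e.g. client.host.com
    node_cmd=$2    # e.g. "list" or "fetch load"
    # Send the command, then "quit" so the node closes the connection.
    printf '%s\nquit\n' "$node_cmd" | nc "$node_host" 4949
}

# Example calls, to be run from the server:
#   munin_node_query client.host.com list
#   munin_node_query client.host.com "fetch load"
```

A refused or hanging connection usually points at the allow line or the firewall rule rather than at Munin's graphing side.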
This is really all you need to do to start monitoring with Munin. You can adjust plugins, enable some and disable others, even create your own. This allows you to really tailor Munin to your needs and report only on the things you care about (you may not want to see file table or inode table usage, for instance).
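To give a flavour of what creating your own plugin involves: a plugin is just an executable that prints graph metadata when called with the argument config, and prints current values otherwise. Below is a minimal sketch of a hypothetical plugin that counts logged-in users; the plugin's purpose, the field name users, and the labels are all invented for illustration. On a packaged install, plugins typically live in /etc/munin/plugins and munin-node needs a restart to pick up new ones, but check where your package keeps them.

```shell
#!/bin/sh
# Hypothetical Munin plugin sketch: number of logged-in users.
# The field name "users" and the labels are illustrative choices.
plugin_main() {
    if [ "$1" = "config" ]; then
        # Metadata describing the graph and its single data series.
        echo 'graph_title Logged-in users'
        echo 'graph_vlabel users'
        echo 'users.label users'
        return 0
    fi
    # Normal run: print the current value for the "users" field.
    # tr strips the padding that some wc implementations add.
    echo "users.value $(who | wc -l | tr -d ' ')"
}

plugin_main "$@"
```

Running the script by hand, once with config and once without arguments, is a quick way to check the output before handing it to munin-node.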
Munin is really quite slick, and is very easy to set up. It is very much worth looking into, as it can really help to pinpoint slowdowns or problems with systems, and can even help to head off little problems before they become big problems.