Execute local Nagios checks on remote systems


I've covered the system monitoring tool Nagios in a previous tip, focusing on using it to monitor remote systems and explaining some of its basic configuration. In addition to monitoring remote services and local statistics, Nagios can, with a little work and planning, monitor local statistics on remote servers.

To start with, you need a way to execute the Nagios check plug-ins, which are all stand-alone programs, on the remote system and see the results locally. For this, you need a combination of Netcat and a "super-server" such as xinetd or ipsvd. First, however, copy some of the Nagios plug-ins from a local system to the remote one (if they share the same architecture) or compile the plug-ins on the remote server. The plug-ins of primary interest are check_disk, check_load, check_procs, and check_users. Copy these files somewhere on the remote system, such as /usr/local/nagios.
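Before going further, it can help to confirm that the four plug-ins actually landed on the remote system and are executable. The following is a minimal sketch, not part of the original setup: the function name missing_plugins is hypothetical, and the default directory simply mirrors the example path above.

```shell
#!/bin/sh
# Print the name of each expected plug-in that is missing or not executable.
# The directory argument defaults to the example path used in this tip.
missing_plugins() {
    mp_dir="${1:-/usr/local/nagios}"
    for mp_name in check_disk check_load check_procs check_users; do
        [ -x "${mp_dir}/${mp_name}" ] || echo "${mp_name}"
    done
}
```

Running missing_plugins /usr/local/nagios on the remote system should print nothing once all four files are in place.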

Next, you need a wrapper script that will execute each program. To avoid opening a separate network port for each plug-in, the wrapper takes arguments that determine which plug-in to execute. This wrapper is a simple shell script, so copy the snippet below into a file like /usr/local/bin/nagioscheck and make it executable:

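#!/bin/bash
# bash is required: "read -t" is not available in a plain POSIX sh.

# Read one request line ("<check> <plug-in arguments>") from the connection.
read -r -t 10 cmd

if [ "${cmd}" = "" ]; then
    exit 0
fi

# Do not expand wildcards in arguments supplied by the client.
set -f

# The first word selects the plug-in; the rest is passed to it as arguments.
# -s keeps carg empty when the request contains no arguments at all.
cname="$(echo "${cmd}" | cut -d ' ' -f 1)"
carg="$(echo "${cmd}" | cut -s -d ' ' -f 2-)"

# ${carg} is left unquoted on purpose so it splits into separate arguments.
case "${cname}" in
    disk)
        /usr/local/nagios/check_disk ${carg}
        ;;
    users)
        /usr/local/nagios/check_users ${carg}
        ;;
    procs)
        /usr/local/nagios/check_procs ${carg}
        ;;
    load)
        /usr/local/nagios/check_load ${carg}
        ;;
    *)
        # Unknown check name: do nothing.
        ;;
esac

exit 0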
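Because the wrapper hands whatever arrives on the socket to a plug-in, it may be worth rejecting unexpected input before anything is executed. The sketch below is one possible approach, not part of the original wrapper: the function name valid_request and the set of permitted characters are assumptions, chosen to cover typical thresholds such as 250, 20%, or 5.0,4.0,3.0.

```shell
#!/bin/sh
# Return 0 only if the request names a known check and its arguments
# consist solely of characters from a conservative allow-list.
valid_request() {
    vr_name="${1%% *}"                    # text before the first space
    vr_args=""
    case "$1" in
        *" "*) vr_args="${1#* }" ;;       # text after the first space, if any
    esac

    case "${vr_name}" in
        disk|users|procs|load) ;;
        *) return 1 ;;
    esac

    # Delete every permitted character; anything left over is suspicious.
    vr_left="$(printf '%s' "${vr_args}" | tr -d 'A-Za-z0-9 ,.:%/_=@~-')"
    [ -z "${vr_left}" ]
}
```

In the wrapper, a call such as valid_request "${cmd}" || exit 0 placed right after the read would drop any request that fails the check.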

Finally, using ipsvd with runit, we would make a supervised service to execute this program on an incoming connection. In this instance, we use port 122 and the following run script:

#!/bin/execlineb
/bin/fdmove -c 2 1
/bin/export PATH "/sbin:/bin:/usr/sbin:/usr/bin"
/sbin/chpst -e /etc/sysconfig/env/tcpsvd/
/sbin/chpst -e ./env/
/bin/multisubstitute {
    import -D "localhost" HOSTNAME
    import -D 0 IP
    import -D 122 PORT
    import -D 20 MAX_CONN
    import -D 5 MAX_PER_HOST
    import -D 20 MAX_BACKLOG
    import OPTIONS
}
/sbin/tcpsvd -v -l ${HOSTNAME} -x peers.cdb -c ${MAX_CONN} -C ${MAX_PER_HOST} -b ${MAX_BACKLOG} ${IP} ${PORT}
     /usr/local/bin/nagioscheck

The above is a run script suitable for use with runit on an Annvix system. It can easily be rewritten in bash or adapted to another system (e.g., as an initscript, or using xinetd or inetd instead). The benefit of using ipsvd is the ACL control it provides, which is essential for a system such as this to ensure that only the Nagios server can connect.
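If the service is moved to a super-server without comparable ACL support, the wrapper itself could refuse unknown peers as a second line of defence. The following is a rough sketch under two assumptions: that the super-server exports the client address as TCPREMOTEIP (tcpsvd follows this UCSPI-TCP convention, but verify it for your setup), and that NAGIOS_SERVER_IP is a hypothetical variable holding the address of your Nagios server.

```shell
#!/bin/sh
# Hypothetical setting: the only address allowed to request checks.
# 192.0.2.10 is a documentation placeholder, not a real server.
NAGIOS_SERVER_IP="${NAGIOS_SERVER_IP:-192.0.2.10}"

# Succeed only when the connecting peer is the Nagios server.
# TCPREMOTEIP is assumed to be set by the super-server for each connection.
peer_allowed() {
    [ -n "${TCPREMOTEIP:-}" ] && [ "${TCPREMOTEIP}" = "${NAGIOS_SERVER_IP}" ]
}
```

A line such as peer_allowed || exit 0 at the top of nagioscheck would then ignore connections from any other host. This complements, rather than replaces, the ACL in peers.cdb.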

Once this service is started, configure the Nagios server. First, in the services definition file, usually services.cfg, add:

define service{
        use                             local-service
        hostgroup_name                  remote
        service_description             Total Processes
        check_command                   check_remote_procs!250!400!RSZDT
        }

And in the commands.cfg file:

define command{
        command_name    check_remote_procs
        command_line    $USER1$/check_remotelocals procs -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -s $ARG3$
        }

The astute will notice these are essentially copies of the check_local_procs definitions that come predefined with Nagios. The name has been changed along with the command_line option to call your new plug-in, check_remotelocals. You'll notice the first argument is the item to check (procs), and the -H $HOSTADDRESS$ option has been added. Otherwise it's almost identical. Likewise, you will want to make similar copies for check_remote_load, check_remote_users, and check_remote_disk.
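Since the remaining command definitions differ only in the check name and trailing options, a small helper can print them for pasting into commands.cfg. This is a sketch only: the function name remote_command is hypothetical, and the extra option for disk (-p $ARG3$) is assumed from the stock check_local_disk sample, so compare it with your own configuration first.

```shell
#!/bin/sh
# Print one "define command" block.
#   $1 = check name understood by the remote wrapper (load, users, disk)
#   $2 = extra options appended after the -w/-c thresholds (may be empty)
# Single quotes keep the Nagios macros ($USER1$, $ARG1$, ...) literal.
remote_command() {
    printf 'define command{\n'
    printf '        command_name    check_remote_%s\n' "$1"
    printf '        command_line    $USER1$/check_remotelocals %s -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$%s\n' "$1" "$2"
    printf '        }\n'
}

# Example calls (uncomment to print all three definitions):
# remote_command load ""
# remote_command users ""
# remote_command disk ' -p $ARG3$'
```

Each definition still needs a matching service entry, with thresholds appropriate to that plug-in, in services.cfg.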

The check_remotelocals plug-in is a bash script that contacts the remote server and retrieves the information you need. Make this file executable and store it with the other Nagios plug-ins, for instance in /usr/local/nagios/libexec:

#!/bin/bash
# Ask the remote wrapper to run one check and print its output.

port=122

usage()
{
    printf "\ncheck_remotelocals [check] -H [host] [nagios_args]\n\n"
}

# First argument: which check to run remotely (procs, load, users, disk).
cmd="${1}"
if [ "${cmd}" = "" ] || [ "${2}" = "" ]; then
    usage
    exit 1
fi
shift

# Next must come -H followed by the host to contact.
if [ "${1}" != "-H" ] || [ "${2}" = "" ]; then
    usage
    exit 1
fi
host="${2}"
shift; shift

# Everything left is handed to the remote plug-in unchanged.
nagargs="${*}"
if [ "${nagargs}" = "" ]; then
    usage
    exit 1
fi

echo "${cmd} ${nagargs}" | nc -q 1 "${host}" "${port}"

This script uses Netcat to contact the remote server on the specified port and sends it the custom command (procs, load, and so on) followed by the plug-in's command-line arguments. You can test the script prior to restarting the Nagios server by using:

./check_remotelocals procs -H www.myhost.com -w 250 -c 400 -s RSZDT

PROCS OK: 123 processes with STATE = RSZDT

That indicates there are 123 processes in that state on the remote server. If you see that returned, you know the connection works.
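One caveat: Nagios decides a service's state from the plug-in's exit code, and in a pipeline ending in nc the exit code is Netcat's, not the remote plug-in's. A possible workaround is to derive the code from the status word in the returned text. The sketch below assumes the output starts with an optional service name followed by OK, WARNING, CRITICAL, or UNKNOWN, as in the example above; the function name status_to_exit is hypothetical.

```shell
#!/bin/sh
# Map the status word at the start of a plug-in's output to the standard
# Nagios exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
status_to_exit() {
    se_head="${1%%:*}"            # drop everything from the first colon
    se_head="${se_head%% - *}"    # drop everything from the first " - "
    case "${se_head}" in
        *CRITICAL) return 2 ;;
        *WARNING)  return 1 ;;
        *UNKNOWN)  return 3 ;;
        *OK)       return 0 ;;
        *)         return 3 ;;    # empty or unrecognised output
    esac
}
```

In check_remotelocals, this would mean capturing the Netcat output in a variable, echoing it, and then exiting with the code from status_to_exit, so that a failed connection shows up as UNKNOWN rather than OK.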

This is a very brief look at creating custom plug-ins and wrappers with Nagios, but shows the flexibility Nagios provides with its completely open plug-in architecture. By gluing a few things together with ipsvd, Netcat, and Nagios, you can monitor system load, process usage, free disk space, and other items securely on remote servers via Nagios.


About

Vincent Danen works on the Red Hat Security Response Team and lives in Canada. He has been writing about and developing on Linux for over 10 years and is a veteran Mac user.

2 comments
tech

I admit, I'm not hugely experienced with bash scripting or the various distros of Linux. But having implemented quite a complex Nagios solution from scratch, learning as I went, I have tried most of the different methods. I have just one Linux box, the Nagios server, monitoring 60+ windoze servers, spanning over 700 services at last count. I use a combination of Check_nt and NRPE_nt. Check_nt is remarkably simple to set up, and monitors services, various loads, and other system stats "out-of-the-box". NRPE_nt was also a cinch to set up, and merely executes local Perl scripts on the monitored servers, returning the output to the Nagios server as if it were a local check. When I attempted to set up various check routines using xinetd or inetd, it quickly went wrong and confused me no end! The beauty of using NRPE_nt is that you can write the check script in any language you wish; hell, it could even be a DOS batch file if you want! Well, there's my two cents! Sorry if I've stated the obvious, but I would hate a newbie at Nagios (me 6 months ago) to attempt the above assuming it's the simplest solution! Ben Benson.

garnerl

How does this compare in terms of performance and security to Nagios' built-in NRPE?