Network monitoring with Nagios and OpenBSD

One of Nagios' key features is its extensibility; new functionality can be easily added thanks to its plugin-based architecture, the external command interface and the Apache web server. In this chapter, we will take a look at a few common issues that can be addressed with some of the most popular addons for Nagios.

5.1 NRPE

Suppose you want Nagios to monitor local services on remote hosts, such as disk space usage, system load or the number of users currently logged in. These are not network services, so they can't be checked directly with the standard plugins: what we need is some kind of agent, installed on the remote systems, that Nagios can periodically query for the status of local services.

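That's exactly what the Nagios Remote Plugin Executor (NRPE) does: it allows you to execute local plugins on remote hosts. It is made up of two components: the check_nrpe plugin, which Nagios runs on the monitoring server to send requests, and the NRPE daemon, which runs on each monitored host and executes the requested plugins locally.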

On the monitored host, the Nagios plugins package will also be installed as a dependency of the NRPE package: this allows the NRPE agent to use the standard Nagios plugins to perform local checks. The package installation automatically creates the _nrpe user and group that the daemon will run as, and copies a sample nrpe.cfg configuration file to /etc/:

/etc/nrpe.cfg

# The syslog facility that should be used for logging purposes
log_facility=daemon

# Path to the pid file (ignored if running under inetd)
pid_file=/var/run/nrpe.pid

# Address to bind to, to avoid binding on all interfaces (ignored if running
# under inetd)
server_address=172.16.0.170
# Port to wait connections on (ignored if running under inetd)
server_port=5666

# User and group the NRPE daemon should run as (ignored if running under inetd)
nrpe_user=_nrpe
nrpe_group=_nrpe

# Comma-delimited list of IP addresses or hostnames that are allowed to connect
# to the NRPE daemon (ignored if running under inetd)
allowed_hosts=127.0.0.1,172.16.0.164

# Don't allow clients to specify arguments to commands that are executed
dont_blame_nrpe=0

# Uncomment the following option to prefix all commands with a specific string
#command_prefix=/usr/bin/sudo

# Don't log debugging messages to the syslog facility
debug=0

# Maximum length (in seconds) of executed plugins
command_timeout=60

# Command definitions are in the form
#
#   command[<command_name>]=<command_line>
#
# Thus, when the NRPE daemon receives a request to execute the command
# 'command_name', it will run the *local* script specified by 'command_line'.
# Note: macros are NOT allowed within command definitions
command[check_users]=/usr/local/libexec/nagios/check_users -w 5 -c 10
command[check_load]=/usr/local/libexec/nagios/check_load -w 15,10,5 -c 30,25,20
command[check_disk1]=/usr/local/libexec/nagios/check_disk -w 20 -c 10 -p /dev/wd0a
command[check_total_procs]=/usr/local/libexec/nagios/check_procs -w 150 -c 200
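
Any executable that follows the Nagios plugin conventions (one line of output and an exit status of 0 for OK, 1 for WARNING, 2 for CRITICAL or 3 for UNKNOWN) can be referenced in a command definition. As a minimal sketch, the hypothetical check_threshold function below compares an integer value against a warning and a critical threshold; the function name, its arguments and the wording of the output are illustrative assumptions, not part of NRPE or of the Nagios plugins package:

```shell
#!/bin/sh
# Hypothetical helper (not shipped with NRPE or the Nagios plugins).
# Usage: check_threshold <label> <value> <warning> <critical>
# Prints one status line and returns the matching plugin exit status.
check_threshold() {
    label=$1 value=$2 warn=$3 crit=$4

    # Reject missing or non-integer input with UNKNOWN (3)
    for n in "$value" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*) echo "$label UNKNOWN - invalid arguments"; return 3 ;;
        esac
    done

    if [ "$value" -ge "$crit" ]; then
        echo "$label CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "$label WARNING - value is $value"
        return 1
    fi
    echo "$label OK - value is $value"
    return 0
}

# Example: warn at 5 logged-in users, go critical at 10
#   check_threshold USERS "$(who | wc -l | tr -d ' ')" 5 10
```

A script built around such a function could then be listed in nrpe.cfg just like the standard plugins above.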

To run NRPE as a standalone daemon, simply type:

# /usr/local/sbin/nrpe -c /etc/nrpe.cfg -d

and add nrpe to the pkg_scripts variable in /etc/rc.conf.local(8) to start it automatically after reboot:

/etc/rc.conf.local

pkg_scripts="nrpe"

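Alternatively, to run NRPE under inetd(8), add the following line to /etc/inetd.conf and the matching entry to /etc/services:

/etc/inetd.conf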

nrpe	stream	tcp	nowait	_nrpe:_nrpe	/usr/local/sbin/nrpe	nrpe -c /etc/nrpe.cfg -i

/etc/services

nrpe	5666/tcp	# Nagios Remote Plugin Executor

Then enable the inetd(8) daemon, which is disabled by default starting from OpenBSD 5.4:

/etc/rc.conf.local

inetd_flags=""

# /etc/rc.d/inetd start

Now, on the Nagios server, you can perform checks using NRPE simply by defining commands such as the following (just make sure that the command name passed to the "-c" option has a corresponding command definition in the nrpe.cfg file on the remote host):

/var/www/etc/nagios/commands.cfg

define command {
    command_name    check-disk1-nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_disk1
}
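
Before wiring the command into a service definition, it can help to run check_nrpe by hand from the Nagios server and look at its exit status, since Nagios derives the service state from it. The sketch below assumes the plugin lives in /usr/local/libexec/nagios (the directory used for the other plugins in this chapter) and that www1 is one of your monitored hosts; state_name and try_nrpe are hypothetical convenience functions:

```shell
#!/bin/sh
# Hypothetical helper: map a plugin exit status to the Nagios state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Assumed plugin location; adjust it to match $USER1$ in your resource file.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/libexec/nagios/check_nrpe}

# Run a remote NRPE command by hand and report the resulting state.
# Usage: try_nrpe <host> <nrpe_command>
try_nrpe() {
    status=0
    output=$("$CHECK_NRPE" -H "$1" -c "$2") || status=$?
    echo "$(state_name "$status"): $output"
    return "$status"
}

# Example (run on the Nagios server):
#   try_nrpe www1 check_disk1
```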

5.2 NSCA

Now suppose you want to monitor the correct execution of a process on a remote host, like a scheduled backup or a cron job. This is still a "local" service but, unlike disk space usage or system load, it makes more sense for the job itself to notify Nagios of its exit status. That's the perfect task for the Nagios Service Check Acceptor (NSCA), a daemon that runs on the Nagios server and accepts passive service check results from clients.

NSCA is similar to NRPE in that it is made up of a daemon process and a client application, but now the roles are inverted: the daemon process runs on the Nagios server while remote hosts use the send_nsca utility to communicate their status to the daemon. NSCA then forwards the check results to Nagios through the external command interface (so make sure you have enabled external commands in the main configuration file).
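
As a quick sanity check, the sketch below looks for the check_external_commands directive in the main configuration file. It assumes the file is /var/www/etc/nagios/nagios.cfg, as elsewhere in this document; the external_commands_enabled function name is made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if external commands are enabled in the
# given Nagios main configuration file.
external_commands_enabled() {
    # Match "check_external_commands=1", tolerating blanks around "=" and
    # skipping commented-out lines
    grep -Eq '^[[:space:]]*check_external_commands[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Example:
#   if external_commands_enabled /var/www/etc/nagios/nagios.cfg; then
#       echo "external commands are enabled"
#   else
#       echo "external commands are disabled"
#   fi
```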

5.2.1 Server configuration

NSCA can run either as a standalone daemon or under inetd(8). To install the server component we need to add the following packages on the Nagios server:

/etc/nsca.cfg

# Path to the pid file (ignored if running under inetd)
pid_file=/var/run/nsca.pid

# Address to bind to (optional)
server_address=172.16.0.164
# Port to wait connections on
server_port=5667

# User and group the NSCA daemon should run as (ignored if running under inetd)
nsca_user=_nagios
nsca_group=_nagios

# chroot(2) directory for the NSCA daemon
nsca_chroot=/var/www/var/nagios/rw

# Don't log debugging messages to the syslog facility
debug=0

# Path to the command file (relative to the chroot directory)
command_file=nagios.cmd
# File where to dump service check results if the command file does not exist
alternate_dump_file=nsca.dump

# Do not aggregate writes to the external command file
aggregate_writes=0
# Open the external command file in write mode
append_to_file=0

# Maximum packet age (in seconds)
max_packet_age=30

# Password to use to decrypt incoming packets
password=password
# Decryption method (16 = RIJNDAEL-256). It must match the encryption method
# used by the client
decryption_method=16

You should set restrictive permissions (600) on the configuration file in order to keep the decryption password protected. To run NSCA as a standalone daemon, simply type:

# /usr/local/sbin/nsca -c /etc/nsca.cfg

and add nsca to the pkg_scripts variable in /etc/rc.conf.local(8) to start it automatically after reboot:

/etc/rc.conf.local

pkg_scripts="nagios nsca"

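Alternatively, to run NSCA under inetd(8), add the following line to /etc/inetd.conf and the matching entry to /etc/services:

/etc/inetd.conf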

nsca	stream	tcp	nowait	_nagios:_nagios	/usr/local/sbin/nsca	nsca -c /etc/nsca.cfg --inetd

/etc/services

nsca	5667/tcp	# Nagios Service Check Acceptor

Then enable the inetd(8) daemon and start it:

/etc/rc.conf.local

inetd_flags=""

# /etc/rc.d/inetd start

5.2.2 Client configuration

Once the NSCA client is installed on the remote host, edit the encryption parameters in the /etc/send_nsca.cfg configuration file:

/etc/send_nsca.cfg

# Password to use to encrypt outgoing packets
password=password
# Encryption method (16 = RIJNDAEL-256)
encryption_method=16
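
Because the password and the encryption method must be identical on both sides, a mismatch is a likely cause of rejected packets. The following sketch compares the two settings, assuming you have copies of both files at hand (they normally live on different hosts); cfg_get and nsca_settings_match are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helper: print the value of a "key=value" setting in a
# configuration file. Commented-out lines are skipped; if the key appears
# more than once, the last occurrence is printed.
# Usage: cfg_get <key> <file>
cfg_get() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Succeed if the client file ($1, send_nsca.cfg) and the server file
# ($2, nsca.cfg) agree on both the password and the encryption method.
nsca_settings_match() {
    [ "$(cfg_get password "$1")" = "$(cfg_get password "$2")" ] &&
    [ "$(cfg_get encryption_method "$1")" = "$(cfg_get decryption_method "$2")" ]
}

# Example:
#   nsca_settings_match ./send_nsca.cfg ./nsca.cfg || echo "settings differ"
```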

The send_nsca utility reads data from standard input and expects, for service checks, a tab-separated sequence of host name, service description (i.e. the value of the service_description directive in the service definition), return code and output; e.g.:

echo "www1\tbackup\t0\tBackup completed successfully" | /usr/local/libexec/nagios/send_nsca -H nagios.kernel-panic.it

and, for host checks, a tab-separated sequence of host name, return code and output; e.g.:

echo "router1\t2\tRouter #1 is down" | /usr/local/libexec/nagios/send_nsca -H nagios.kernel-panic.it

You can override the default delimiter (tab) with send_nsca's "-d" option. Now, if everything is working fine, each message received by the NSCA daemon should produce a line like the following in the Nagios log file:

/var/www/var/log/nagios/nagios.log

[1167325538] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;www1;backup;0;Backup completed successfully
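
To make a job report its own exit status, the job can be wrapped in a small script. The sketch below is one possible approach under a few assumptions: service_result and report_job are hypothetical names, the server name, host name and service description are supplied by the caller, and any non-zero exit status is reported as CRITICAL (2). It uses printf rather than echo because not every shell's echo expands "\t":

```shell
#!/bin/sh
# Assumed location of the client utility (same path as in the examples above).
SEND_NSCA=${SEND_NSCA:-/usr/local/libexec/nagios/send_nsca}

# Print one passive service check result in the format send_nsca expects:
# host name, service description, return code and output, separated by tabs.
service_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Run a command and submit its outcome as a passive service check result.
# Usage: report_job <nagios_server> <host> <service> <command> [args...]
report_job() {
    server=$1 host=$2 service=$3
    shift 3
    # Run the job first, so that its own output never reaches send_nsca
    if "$@"; then
        code=0 text="$service completed successfully"
    else
        code=2 text="$service failed"
    fi
    service_result "$host" "$service" "$code" "$text" | "$SEND_NSCA" -H "$server"
}

# Example crontab usage (the backup script path is a placeholder):
#   report_job nagios.kernel-panic.it www1 backup /path/to/backup-script
```

The service description passed to report_job must match the service_description of a service defined for that host on the Nagios server.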

5.3 NagVis and NDO

NagVis is a visualization addon for Nagios; it can be used to give users a graphical view of Nagios data. It requires the installation of MySQL and a few graphics libraries:

Apache is already up and running, so we only need to enable the PHP modules we have just installed and restart Apache:

# ln -sf /etc/php-5.3.sample/mysql.ini /etc/php-5.3/mysql.ini
# ln -sf /etc/php-5.3.sample/gd.ini /etc/php-5.3/gd.ini
# apachectl restart
/usr/sbin/apachectl restart: httpd restarted

5.3.1 Installing NDO and MySQL

Prior to version 1.0, NagVis was able to pull data from Nagios directly through its web interface; this is no longer supported, and NagVis now expects monitoring data to be stored in a MySQL database, which requires the installation of the Nagios Data Output Utils (NDOUTILS) addon.

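The NDOUTILS addon allows you to export current and historical data from one or more Nagios instances to a MySQL database, thus providing the interface between Nagios and MySQL. This addon consists of several parts, but we will need only two of them: the NDOMOD event broker module, which is loaded by Nagios and exports its data, and the NDO2DB daemon, which receives that data and stores it in the database. Both are built from the source tarball: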

# tar -zxvf ndoutils-x.x.x.tar.gz
[ ... ]
# cd ndoutils-x.x.x
# ./configure --enable-mysql --with-mysql-lib=/usr/local/lib \
>   --with-ndo2db-user=_nagios --with-ndo2db-group=_nagios
[ ... ]
# make

Note: if make fails to compile the dbhandlers.c file, try applying the ndo-openbsd.patch patch (written for version 1.5.2) by running the following command from outside the ndoutils source tree:

# patch -p0 < ndo-openbsd.patch

Now we can start MySQL, assign a password to the root account and create the appropriate database and user. The database creation script can be found in the db/ directory of the extracted tarball.

# cp /usr/local/share/mysql/my-medium.cnf /etc/my.cnf
# /usr/local/bin/mysql_install_db
[ ... ]
# mysqld_safe &
[1] 26984
131123 17:33:56 mysqld_safe Logging to '/var/mysql/nagios.kernel-panic.it.err'.
131123 17:33:56 mysqld_safe Starting mysqld daemon with databases from /var/mysql
# /usr/local/bin/mysql_secure_installation
[ ... ]
Enter current password for root (enter for none): <enter>
[ ... ]
Set root password? [Y/n] Y
New password: root
Re-enter new password: root
[ ... ]
Remove anonymous users? [Y/n] Y
[ ... ]
Disallow root login remotely? [Y/n] Y
[ ... ]
Remove test database and access to it? [Y/n] Y
[ ... ]
Reload privilege tables now? [Y/n] Y
[ ... ]
# mysql -u root -p
password: root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Server version: 5.0.51a-log OpenBSD port: mysql-server-5.0.51a

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create database nagios;
Query OK, 1 row affected (0.02 sec)

mysql> use nagios;
Database changed
mysql> \.  db/mysql.sql
[...]
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON nagios.* TO 'ndouser'@'localhost' IDENTIFIED BY 'ndopasswd';
mysql> FLUSH PRIVILEGES;
mysql> \q

Next, we install the event broker module, the daemon and their sample configuration files:

# cp src/ndomod-3x.o /usr/local/libexec/nagios/ndomod.o
# cp config/ndomod.cfg-sample /var/www/etc/nagios/ndomod.cfg
# cp src/ndo2db-3x /usr/local/sbin/ndo2db
# cp config/ndo2db.cfg-sample /var/www/etc/nagios/ndo2db.cfg

/var/www/etc/nagios/ndomod.cfg

instance_name=default
output_type=unixsocket
output=/var/nagios/rw/ndo.sock

output_buffer_items=5000
buffer_file=/var/nagios/rw/ndomod.tmp

file_rotation_interval=14400
file_rotation_timeout=60

reconnect_interval=15
reconnect_warning_interval=15
data_processing_options=-1
config_output_options=3

/var/www/etc/nagios/ndo2db.cfg

lock_file=/var/run/nagios/ndo2db.lock

ndo2db_user=_nagios
ndo2db_group=_nagios

socket_type=unix
socket_name=/var/www/var/nagios/rw/ndo.sock

db_servertype=mysql
db_host=localhost
db_port=3306
db_name=nagios
db_prefix=nagios_
db_user=ndouser
db_pass=ndopasswd

max_timedevents_age=1440
max_systemcommands_age=10080
max_servicechecks_age=10080
max_hostchecks_age=10080
max_eventhandlers_age=44640

debug_level=0
debug_verbosity=1
debug_file=/var/www/var/log/nagios/ndo2db.debug
max_debug_file_size=1000000

Then we have to specify the event broker module that Nagios must load at startup, by adding the following line to the main configuration file:

/var/www/etc/nagios/nagios.cfg

broker_module=/usr/local/libexec/nagios/ndomod.o config_file=/var/www/etc/nagios/ndomod.cfg

Now we can start the NDO2DB daemon, make its socket group-writable and restart Nagios:

# /usr/local/sbin/ndo2db -c /var/www/etc/nagios/ndo2db.cfg
# chmod 660 /var/www/var/nagios/rw/ndo.sock
# /etc/rc.d/nagios restart

To start the NDO2DB daemon on boot, we need to create an additional rc.d(8) script:

/etc/rc.d/ndo2db

#!/bin/sh

daemon="/usr/local/sbin/ndo2db"
daemon_flags="-c /var/www/etc/nagios/ndo2db.cfg"

. /etc/rc.d/rc.subr

rc_reload=NO

rc_pre() {
        rm -f /var/www/var/nagios/rw/ndo.sock
}

rc_start() {
        ${daemon} ${daemon_flags} && chmod 660 /var/www/var/nagios/rw/ndo.sock
}

rc_cmd $1

/etc/rc.conf.local

pkg_scripts="mysqld nagios ndo2db nsca"

5.3.2 Configuring NagVis

Now that we have installed all the necessary prerequisites, we can download and extract the NagVis tarball (the install script is Linux-based, so we'll have to do things manually):

# tar -zxvf nagvis-x.x.x.tar.gz -C /var/www/nagios/
[ ... ]
# mv /var/www/nagios/nagvis-x.x.x /var/www/nagios/nagvis
# mkdir -p /var/www/nagios/nagvis/var/tmpl/{cache,compile}
# chown -R www /var/www/nagios/nagvis/{etc,var}
# mv /var/www/nagios/nagvis/docs /var/www/nagios/nagvis/share

Below is a sample NagVis configuration file; please refer to the documentation for a detailed description of each parameter:

/var/www/nagios/nagvis/etc/nagvis.ini.php

; <?php return 1; ?>

[global]
language               = "en_US"
refreshtime            = 60
dateformat             = "Y-m-d H:i:s"

[defaults]
backend                = "ndomy_1"
; Default icons' size (icons can be found in
; /var/www/nagios/nagvis/images/iconsets)
icons                  = "std_medium"
recognizeservices      = 1
onlyhardstates         = 0
backgroundcolor        = "#fff"
contextmenu            = 1
eventbackground        = 0
eventhighlight         = 1
eventhighlightduration = 10000
eventhighlightinterval = 500
eventlog               = 0
eventloglevel          = "info"
eventlogheight         = 75
eventloghidden         = 1
eventscroll            = 1
eventsound             = 1
headermenu             = 1
headertemplate         = "default"
hovermenu              = 1
hovertemplate          = "default"
hoverdelay             = 0
hoverchildsshow        = 1
hoverchildslimit       = 10
hoverchildsorder       = "asc"
hoverchildssort        = "s"
showinlists            = 1
urltarget              = "_self"
hosturl                = "[htmlcgi]/status.cgi?host=[host_name]"
hostgroupurl           = "[htmlcgi]/status.cgi?hostgroup=[hostgroup_name]"
serviceurl             = "[htmlcgi]/extinfo.cgi?type=2&host=[host_name]&service=[service_description]"
servicegroupurl        = "[htmlcgi]/status.cgi?servicegroup=[servicegroup_name]&style=detail"

[wui]
autoupdatefreq         = 25
maplocktime            = 5
allowedforconfig       = nagiosadmin

[paths]
base                   = "/nagios/nagvis/"
htmlbase               = "/nagios/nagvis"
htmlcgi                = "/cgi-bin/nagios"

[index]
backgroundcolor        = "#fff"
cellsperrow            = 4
headermenu             = 1
headertemplate         = "default"
showrotations          = 1

[automap]
defaultparams          = "&maxLayers=2"
showinlists            = 0

[worker]
interval               = 10
requestmaxparams       = 0
requestmaxlength       = 1900
updateobjectstates     = 30

[backend_ndomy_1]
backendtype            = "ndomy"
dbhost                 = "127.0.0.1"
dbport                 = 3306
dbname                 = "nagios"
dbuser                 = "ndouser"
dbpass                 = "ndopasswd"
dbprefix               = "nagios_"
dbinstancename         = "default"
maxtimewithoutupdate   = 180
htmlcgi                = "/cgi-bin/nagios"

; In this example, the browser switches between the 'dmz' and 'lan' maps every
; 15 seconds. The rotation is enabled by specifying the URL:
; https://your.nagios.server/nagios/nagvis/index.php?rotation=kp
[rotation_kp]
maps                   = "dmz,lan"
interval               = 15

5.3.3 Maps definition

Now we have to create the images that NagVis will use as the background for each map and put them in the /var/www/nagios/nagvis/images/maps/ directory.

Once the map images are ready, we can tell NagVis where to place objects on the map by creating and editing the map configuration files. Each map must have a corresponding configuration file (in /var/www/nagios/nagvis/etc/maps/) with the same name, plus the ".cfg" extension. Below is a sample map configuration file; the syntax is rather simple, so you can easily tweak it to include your own hosts and services (please refer to the documentation for further details).

/var/www/nagios/nagvis/etc/maps/dmz.cfg

# The 'global' statement sets some default values that will be inherited by all
# other objects
define global {
# List of users allowed to view this map
    allowed_user=nagiosadmin,operator
# List of users allowed to modify this map via the web interface
    allowed_for_config=nagiosadmin
# Default iconset (if omitted, it is inherited from the main configuration file)
    iconset=std_medium
# Background image
    map_image=dmz.png
}

# Display the status of our 'www1' web server
define host {
    host_name=www1
# Coordinates of the host on the map
    x=268
    y=166
# Set this to '1' if you want the host status to also include the status
# of its services
    recognize_services=0
}

# Display the status of the 'WWW' service on the 'www1' web server
define service {
    host_name=www1
    service_description=WWW
    x=588
    y=165
# As you can see, 'global' options can be overridden in subsequent objects
    iconset=std_small
}

# Display the worst state of hosts in the 'WWW' hostgroup
define hostgroup {
    hostgroup_name=WWW
    x=298
    y=363
    recognize_services=1
}

# Display the worst state of services in the 'www-services' servicegroup
define servicegroup {
    servicegroup_name=www-services
    x=609
    y=363
}

# Display the worst state of objects represented in another NagVis map
define map {
    map_name=lan
    x=406
    y=323
}

# Draw a textfield on the map
define textbox {
# Text may include HTML
    text="This is the DMZ network"
    x=490
    y=394
    w=117
}

To allow the web interface to modify NagVis' configuration, make sure that all configuration files belong to, and are writable by, the www user.
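
Since every map needs its own configuration file, a small check can catch typos in the map names listed in a rotation. The sketch below takes the comma-separated value of a "maps" option and the maps directory; missing_maps is a hypothetical helper, not part of NagVis:

```shell
#!/bin/sh
# Hypothetical helper: print the name of every map in a comma-separated list
# that has no matching <name>.cfg file in the given directory.
# Usage: missing_maps <maps_list> <maps_dir>
missing_maps() {
    # Split the list on commas, one map name per line
    echo "$1" | tr ',' '\n' | while read -r map; do
        [ -n "$map" ] || continue
        [ -f "$2/$map.cfg" ] || echo "$map"
    done
}

# Example:
#   missing_maps "dmz,lan" /var/www/nagios/nagvis/etc/maps
```

If the function prints nothing, every map in the list has a configuration file.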