One of Nagios' key features is its extensibility; new functionality can be easily added thanks to its plugin-based architecture, the external command interface and the Apache web server. In this chapter, we will take a look at a few common issues that can be addressed with some of the most popular addons for Nagios.
Suppose you want Nagios to monitor local services on remote hosts, such as disk space usage, system load or the number of users currently logged in. These are not network services, so they can't be checked directly with standard plugins: what we need is some kind of agent, installed on the remote systems, that Nagios can periodically query for the status of local services.
Well, that's exactly what the Nagios Remote Plugin Executor (NRPE) does: it allows you to execute local plugins on remote hosts. It is made up of two components:
Both the agent and the plugin are available from the following package:
In addition, the Nagios plugins package will be installed on the monitored host as a dependency: this will allow the NRPE agent to take advantage of the standard Nagios plugins to perform local checks. The package installation automatically creates the _nrpe user and group that the daemon will run as and copies a sample nrpe.cfg configuration file to /etc/:
# The syslog facility that should be used for logging purposes
log_facility=daemon
# Path to the pid file (ignored if running under inetd)
pid_file=/var/run/nrpe.pid
# Address to bind to, to avoid binding on all interfaces (ignored if running
# under inetd)
server_address=172.16.0.170
# Port to wait connections on (ignored if running under inetd)
server_port=5666
# User and group the NRPE daemon should run as (ignored if running under inetd)
nrpe_user=_nrpe
nrpe_group=_nrpe
# Comma-delimited list of IP addresses or hostnames that are allowed to connect
# to the NRPE daemon (ignored if running under inetd)
allowed_hosts=127.0.0.1,172.16.0.164
# Don't allow clients to specify arguments to commands that are executed
dont_blame_nrpe=0
# Uncomment the following option to prefix all commands with a specific string
#command_prefix=/usr/bin/sudo
# Don't log debugging messages to the syslog facility
debug=0
# Maximum length (in seconds) of executed plugins
command_timeout=60
# Command definitions are in the form
#
# command[<command_name>]=<command_line>
#
# Thus, when the NRPE daemon receives a request to execute the command
# 'command_name', it will run the *local* script specified by 'command_line'.
# Note: macros are NOT allowed within command definitions
command[check_users]=/usr/local/libexec/nagios/check_users -w 5 -c 10
command[check_load]=/usr/local/libexec/nagios/check_load -w 15,10,5 -c 30,25,20
command[check_disk1]=/usr/local/libexec/nagios/check_disk -w 20 -c 10 -p /dev/wd0a
command[check_total_procs]=/usr/local/libexec/nagios/check_procs -w 150 -c 200
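Since the Nagios server can only request commands that are defined in this file, it can be handy to list the command names the agent currently exposes. The following is a minimal sketch, not something shipped with NRPE: the nrpe_list_commands function name is made up for this example, and it assumes that every definition follows the command[<command_name>]=<command_line> form, one per line, starting in the first column.

```shell
# Hypothetical helper (not part of NRPE): print the name of every
# command[<command_name>]=<command_line> definition read from standard input.
# Commented-out definitions are skipped because they do not start with "command[".
nrpe_list_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p'
}
```

Running "nrpe_list_commands < /etc/nrpe.cfg" against the sample file above should print check_users, check_load, check_disk1 and check_total_procs, one per line.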
To run NRPE as a standalone daemon, simply type:
# /usr/local/sbin/nrpe -c /etc/nrpe.cfg -d
and add nrpe to the pkg_scripts variable in /etc/rc.conf.local(8) to start it automatically after reboot:
pkg_scripts="nrpe"
Alternatively, you can run NRPE under inetd(8) by adding the following line in /etc/inetd.conf(8):
nrpe stream tcp wait _nrpe:_nrpe /usr/local/sbin/nrpe nrpe -c /etc/nrpe.cfg -i
and by adding the nrpe service in /etc/services(5):
nrpe 5666/tcp # Nagios Remote Plugin Executor
Then enable the inetd(8) daemon, which is disabled by default starting from OpenBSD 5.4:
inetd_flags=""
and start it:
# /etc/rc.d/inetd start
Now, on the Nagios server, you can perform checks using NRPE simply by defining commands such as the following (only make sure that the command name passed to the "-c" option has a corresponding command definition in the nrpe.cfg file on the remote host!):
define command {
    command_name    check-disk1-nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_disk1
}
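Before relying on such a command in a service definition, you may want to run check_nrpe by hand from the Nagios server and see what comes back. The sketch below wraps that call and translates the exit status into a state name, following the usual plugin return codes (0 is OK, 1 is WARNING, 2 is CRITICAL, anything else is treated as UNKNOWN). The nagios_state and nrpe_try function names and the CHECK_NRPE variable are illustrative, and the default path to check_nrpe is an assumption based on the plugin directory used in the nrpe.cfg file above.

```shell
# Translate a plugin exit code into the conventional Nagios state name.
nagios_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Hypothetical wrapper: run one NRPE command on a remote host and print
# "<STATE>: <plugin output>". Usage: nrpe_try <host> <command_name>
# CHECK_NRPE is an assumed override; the default path may differ on your system.
nrpe_try() {
    out=$("${CHECK_NRPE:-/usr/local/libexec/nagios/check_nrpe}" -H "$1" -c "$2") && rc=0 || rc=$?
    printf '%s: %s\n' "$(nagios_state "$rc")" "$out"
    return "$rc"
}
```

For instance, "nrpe_try 172.16.0.170 check_disk1" would query the agent configured above; the function returns the plugin's own exit status, so it can also be used in shell conditionals.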
Now suppose you want to monitor the correct execution of a process on a remote host, like a scheduled backup or a crontab job. This is still a "local" service, but, unlike disk space usage or system load, it would probably sound more logical to make it the responsibility of the job itself to notify Nagios of its exit status. That's the perfect job for the Nagios Service Check Acceptor (NSCA), which is a daemon program, meant to run on the Nagios server, designed to accept passive service check results from clients.
NSCA is similar to NRPE in that it is made up of a daemon process and a client application, but now the roles are inverted: the daemon process runs on the Nagios server while remote hosts use the send_nsca utility to communicate their status to the daemon. NSCA then forwards the check results to Nagios through the external command interface (so make sure you have enabled external commands in the main configuration file).
NSCA can run either as a standalone daemon or under inetd(8). To install the server component we need to add the following packages on the Nagios server:
Next, we need to edit the /etc/nsca.cfg configuration file:
# Path to the pid file (ignored if running under inetd)
pid_file=/var/run/nsca.pid
# Address to bind to (optional)
server_address=172.16.0.164
# Port to wait connections on
server_port=5667
# User and group the NSCA daemon should run as (ignored if running under inetd)
nsca_user=_nagios
nsca_group=_nagios
# chroot(2) directory for the NSCA daemon
nsca_chroot=/var/www/var/nagios/rw
# Don't log debugging messages to the syslog facility
debug=0
# Path to the command file (relative to the chroot directory)
command_file=nagios.cmd
# File where to dump service check results if the command file does not exist
alternate_dump_file=nsca.dump
# Do not aggregate writes to the external command file
aggregate_writes=0
# Open the external command file in write mode
append_to_file=0
# Maximum packet age (in seconds)
max_packet_age=30
# Password to use to decrypt incoming packets
password=password
# Decryption method (16 = RIJNDAEL-256). It must match the encryption method
# used by the client
decryption_method=16
You should set restrictive permissions (600) on the configuration file in order to keep the decryption password protected. To run NSCA as a standalone daemon, simply type:
# /usr/local/sbin/nsca -c /etc/nsca.cfg
and add nsca to the pkg_scripts variable in /etc/rc.conf.local(8) to start it automatically after reboot:
pkg_scripts="nagios nsca"
Alternatively, you can run it under inetd(8) by adding the following line in /etc/inetd.conf(8):
nsca stream tcp wait _nagios:_nagios /usr/local/sbin/nsca nsca -c /etc/nsca.cfg --inetd
and by adding the nsca service in /etc/services(5):
nsca 5667/tcp # Nagios Service Check Acceptor
Then enable the inetd(8) daemon:
inetd_flags=""
and start it:
# /etc/rc.d/inetd start
On the client side, we need to install the following packages:
and edit the encryption parameters in the /etc/send_nsca.cfg configuration file:
# Password to use to encrypt outgoing packets
password=password
# Encryption method (16 = RIJNDAEL-256)
encryption_method=16
The send_nsca utility reads data from standard input and expects, for service checks, a tab separated sequence of host name, service description (i.e. the value of the service_description directive in the service definition), return code and output; e.g.:
echo "www1\tbackup\t0\tBackup completed successfully" | /usr/local/libexec/nagios/send_nsca -H nagios.kernel-panic.it
and, for host checks, a tab separated sequence of host name, return code and output; e.g.:
echo "router1\t2\tRouter #1 is down" | /usr/local/libexec/nagios/send_nsca -H nagios.kernel-panic.it
You can override the default delimiter (tab) with send_nsca's "-d" option. Now, if everything is working fine, each message received by the NSCA daemon should produce a line like the following in the Nagios log file:
[1167325538] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;www1;backup;0;Backup completed successfully
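To make a scheduled job report its own outcome, as described at the beginning of this section, the job can be run through a small wrapper. What follows is only a sketch under a few assumptions: the nsca_service_line and report_job function names and the SEND_NSCA and NSCA_SERVER variables are made up for this example, a zero exit status is reported as OK (0) and any other as CRITICAL (2), and only the first line of the job output is submitted. The send_nsca path, server name and configuration file are the ones used earlier in this section.

```shell
# Build one passive service check result in the tab-separated format
# expected by send_nsca: host name, service description, return code, output.
nsca_service_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical wrapper: run a job and submit its outcome as a passive check.
# Usage: report_job <host_name> <service_description> <command> [args...]
# Assumptions: exit status 0 maps to OK (0), anything else to CRITICAL (2);
# SEND_NSCA and NSCA_SERVER are made-up override variables.
report_job() {
    host=$1
    service=$2
    shift 2
    if output=$("$@" 2>&1); then code=0; else code=2; fi
    # Submit only the first line of the job output as the plugin output
    summary=$(printf '%s\n' "$output" | head -n 1)
    nsca_service_line "$host" "$service" "$code" "${summary:-no output}" |
        "${SEND_NSCA:-/usr/local/libexec/nagios/send_nsca}" \
            -H "${NSCA_SERVER:-nagios.kernel-panic.it}" -c /etc/send_nsca.cfg
}
```

A crontab entry could then call something like "report_job www1 backup /path/to/backup-script", where the script path is a placeholder; the "backup" service must of course be defined for host www1 on the Nagios server.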
NagVis is a visualization addon for Nagios; it can be used to give users a graphical view of Nagios data. It requires the installation of MySQL and a few graphics libraries:
Apache is already up and running, so we only need to enable the php modules we have just installed and restart Apache:
# ln -sf /etc/php-5.3.sample/mysql.ini /etc/php-5.3/mysql.ini
# ln -sf /etc/php-5.3.sample/gd.ini /etc/php-5.3/gd.ini
# apachectl restart
/usr/sbin/apachectl restart: httpd restarted
Prior to version 1.0, NagVis could pull data directly from the Nagios web interface; this is no longer supported, and NagVis now expects monitoring data to be stored in a MySQL database, which requires the installation of the Nagios Data Output Utils (NDOUTILS) addon.
The NDOUTILS addon allows you to export current and historical data from one or more Nagios instances to a MySQL database, thus providing the interface between Nagios and MySQL. This addon consists of several parts, but we will need only two of them:
Next, we need to download, extract and compile the NDOUTILS tarball:
# tar -zxvf ndoutils-x.x.x.tar.gz
[ ... ]
# cd ndoutils-x.x.x
# ./configure --enable-mysql --with-mysql-lib=/usr/local/lib \
> --with-ndo2db-user=_nagios --with-ndo2db-group=_nagios
[ ... ]
# make
Note: if make fails to compile the dbhandlers.c file, try installing this patch (applies to version 1.5.2) by running the following command from outside the ndoutils source tree:
# patch -p0 < ndo-openbsd.patch
Now we can start MySQL, assign a password to the root account and create the appropriate database and user. The database creation script can be found in the db/ directory of the extracted tarball.
# cp /usr/local/share/mysql/my-medium.cnf /etc/my.cnf
# /usr/local/bin/mysql_install_db
[ ... ]
# mysqld_safe &
[1] 26984
131123 17:33:56 mysqld_safe Logging to '/var/mysql/nagios.kernel-panic.it.err'.
131123 17:33:56 mysqld_safe Starting mysqld daemon with databases from /var/mysql
# /usr/local/bin/mysql_secure_installation
[ ... ]
Enter current password for root (enter for none): <enter>
[ ... ]
Set root password? [Y/n] Y
New password: root
Re-enter new password: root
[ ... ]
Remove anonymous users? [Y/n] Y
[ ... ]
Disallow root login remotely? [Y/n] Y
[ ... ]
Remove test database and access to it? [Y/n] Y
[ ... ]
Reload privilege tables now? [Y/n] Y
[ ... ]
# mysql -u root -p
password: root
Welcome to the MySQL monitor. Commands end with ; or \g.
Server version: 5.0.51a-log OpenBSD port: mysql-server-5.0.51a
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create database nagios;
Query OK, 1 row affected (0.02 sec)
mysql> use nagios;
Database changed
mysql> \. db/mysql.sql
[...]
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON nagios.* TO 'ndouser'@'localhost' IDENTIFIED BY 'ndopasswd';
mysql> FLUSH PRIVILEGES;
mysql> \q
Now we need to manually copy the binaries and configuration files:
# cp src/ndomod-3x.o /usr/local/libexec/nagios/ndomod.o
# cp config/ndomod.cfg-sample /var/www/etc/nagios/ndomod.cfg
# cp src/ndo2db-3x /usr/local/sbin/ndo2db
# cp config/ndo2db.cfg-sample /var/www/etc/nagios/ndo2db.cfg
and edit the NDOMOD configuration file:
instance_name=default
output_type=unixsocket
output=/var/nagios/rw/ndo.sock
output_buffer_items=5000
buffer_file=/var/nagios/rw/ndomod.tmp
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=15
reconnect_warning_interval=15
data_processing_options=-1
config_output_options=3
and the NDO2DB configuration file:
lock_file=/var/run/nagios/ndo2db.lock
ndo2db_user=_nagios
ndo2db_group=_nagios
socket_type=unix
socket_name=/var/www/var/nagios/rw/ndo.sock
db_servertype=mysql
db_host=localhost
db_port=3306
db_name=nagios
db_prefix=nagios_
db_user=ndouser
db_pass=ndopasswd
max_timedevents_age=1440
max_systemcommands_age=10080
max_servicechecks_age=10080
max_hostchecks_age=10080
max_eventhandlers_age=44640
debug_level=0
debug_verbosity=1
debug_file=/var/www/var/log/nagios/ndo2db.debug
max_debug_file_size=1000000
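NDOMOD and NDO2DB have to meet on the same UNIX socket, yet the two paths above differ by the /var/www prefix. This presumably reflects Nagios running chrooted in /var/www while NDO2DB runs outside the chroot; that is an assumption on our part, as are the cfg_get and ndo_sockets_match function names and the NAGIOS_CHROOT variable used in the following sketch, which simply compares the two settings.

```shell
# Print the value of a "key=value" directive from a configuration file
# (last occurrence wins; comment lines are ignored since they start with "#").
cfg_get() {
    sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Hypothetical check: succeed if the socket written by NDOMOD, once prefixed
# with the chroot directory, is the one NDO2DB listens on.
# Usage: ndo_sockets_match <ndomod.cfg> <ndo2db.cfg>
# NAGIOS_CHROOT is an assumed variable, defaulting to /var/www.
ndo_sockets_match() {
    mod_sock=$(cfg_get output "$1")
    db_sock=$(cfg_get socket_name "$2")
    [ "${NAGIOS_CHROOT:-/var/www}$mod_sock" = "$db_sock" ]
}
```

For example: ndo_sockets_match /var/www/etc/nagios/ndomod.cfg /var/www/etc/nagios/ndo2db.cfg || echo "socket paths differ".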
Then we have to specify the event broker module that Nagios must load at startup, by adding the following line to the main configuration file:
broker_module=/usr/local/libexec/nagios/ndomod.o config_file=/var/www/etc/nagios/ndomod.cfg
and, finally, we can start the NDO2DB daemon and restart Nagios:
# /usr/local/sbin/ndo2db -c /var/www/etc/nagios/ndo2db.cfg
# chmod 660 /var/www/var/nagios/rw/ndo.sock
# /etc/rc.d/nagios restart
To start the NDO2DB daemon on boot, we need to create an additional rc.d(8) script:
#!/bin/sh

daemon="/usr/local/sbin/ndo2db"
daemon_flags="-c /var/www/etc/nagios/ndo2db.cfg"

. /etc/rc.d/rc.subr

rc_reload=NO

rc_pre() {
        rm -f /var/www/var/nagios/rw/ndo.sock
}

rc_start() {
        ${daemon} ${daemon_flags} && chmod 660 /var/www/var/nagios/rw/ndo.sock
}

rc_cmd $1
and add ndo2db to the pkg_scripts variable in /etc/rc.conf.local(8):
pkg_scripts="mysqld nagios ndo2db nsca"
Now that we have installed all the necessary prerequisites, we can download and extract the NagVis tarball (the install script is Linux-based, so we'll have to do things manually):
# tar -zxvf nagvis-x.x.x.tar.gz -C /var/www/nagios/
[ ... ]
# mv /var/www/nagios/nagvis-x.x.x /var/www/nagios/nagvis
# mkdir -p /var/www/nagios/nagvis/var/tmpl/{cache,compile}
# chown -R www /var/www/nagios/nagvis/{etc,var}
# mv /var/www/nagios/nagvis/docs /var/www/nagios/nagvis/share
Below is a sample NagVis configuration file; please refer to the documentation for a detailed description of each parameter:
; <?php return 1; ?>
[global]
language = "en_US"
refreshtime = 60
dateformat = "Y-m-d H:i:s"
[defaults]
backend = "ndomy_1"
; Default icons' size (icons can be found in
; /var/www/nagios/nagvis/images/iconsets)
icons = "std_medium"
recognizeservices = 1
onlyhardstates = 0
backgroundcolor = "#fff"
contextmenu = 1
eventbackground = 0
eventhighlight = 1
eventhighlightduration = 10000
eventhighlightinterval = 500
eventlog = 0
eventloglevel = "info"
eventlogheight = 75
eventloghidden = 1
eventscroll = 1
eventsound = 1
headermenu = 1
headertemplate = "default"
hovermenu = 1
hovertemplate = "default"
hoverdelay = 0
hoverchildsshow = 1
hoverchildslimit = 10
hoverchildsorder = "asc"
hoverchildssort = "s"
icons = "std_medium"
onlyhardstates = 0
recognizeservices = 1
showinlists = 1
urltarget = "_self"
hosturl = "[htmlcgi]/status.cgi?host=[host_name]"
hostgroupurl = "[htmlcgi]/status.cgi?hostgroup=[hostgroup_name]"
serviceurl = "[htmlcgi]/extinfo.cgi?type=2&host=[host_name]&service=[service_description]"
servicegroupurl = "[htmlcgi]/status.cgi?servicegroup=[servicegroup_name]&style=detail"
[wui]
autoupdatefreq = 25
maplocktime = 5
allowedforconfig = nagiosadmin
[paths]
base = "/nagios/nagvis/"
htmlbase = "/nagios/nagvis"
htmlcgi = "/cgi-bin/nagios"
[index]
backgroundcolor = #fff
cellsperrow = 4
headermenu = 1
headertemplate = "default"
showrotations = 1
[automap]
defaultparams = "&maxLayers=2"
showinlists = 0
[worker]
interval = 10
requestmaxparams = 0
requestmaxlength = 1900
updateobjectstates = 30
[backend_ndomy_1]
backendtype = "ndomy"
dbhost = "127.0.0.1"
dbport = 3306
dbname = "nagios"
dbuser = "ndouser"
dbpass = "ndopasswd"
dbprefix = "nagios_"
dbinstancename = "default"
maxtimewithoutupdate = 180
htmlcgi = "/cgi-bin/nagios"
; In this example, the browser switches between the 'dmz' and 'lan' maps every
; 15 seconds. The rotation is enabled by specifying the URL:
; https://your.nagios.server/nagios/nagvis/index.php?rotation=kp
[rotation_kp]
maps = "dmz,lan"
interval = 15
Now we have to create the images for NagVis to use as the background for each map and put them in the /var/www/nagios/nagvis/images/maps/ directory. You can find a few examples here.
Once the map images are ready, we can tell NagVis where to place objects on the map by creating and editing the maps configuration files. Each map must have a corresponding configuration file (in /var/www/nagios/nagvis/etc/maps/) with the same name, plus the ".cfg" extension. Below is a sample map configuration file; syntax is rather simple, so you can easily tweak it to include your own hosts and services (please refer to the documentation for further details).
# The 'global' statement sets some default values that will be inherited by all
# other objects
define global {
    # List of users allowed to view this map
    allowed_user=nagiosadmin,operator
    # List of users allowed to modify this map via the web interface
    allowed_for_config=nagiosadmin
    # Default iconset (if omitted, it is inherited from the main configuration file)
    iconset=std_medium
    # Background image
    map_image=dmz.png
}
# Display the status of our 'www1' web server
define host {
    host_name=www1
    # Coordinates of the host on the map
    x=268
    y=166
    # Set this to '1' if you want the host status to also include the status
    # of its services
    recognize_services=0
}
# Display the status of the 'WWW' service on the 'www1' web server
define service {
    host_name=www1
    service_description=WWW
    x=588
    y=165
    # As you can see, 'global' options can be overridden in subsequent objects
    iconset=std_small
}
# Display the worst state of hosts in the 'WWW' hostgroup
define hostgroup {
    hostgroup_name=WWW
    x=298
    y=363
    recognize_services=1
}
# Display the worst state of services in the 'www-services' servicegroup
define servicegroup {
    servicegroup_name=www-services
    x=609
    y=363
}
# Display the worst state of objects represented in another NagVis map
define map {
    map_name=lan
    x=406
    y=323
}
# Draw a textfield on the map
define textbox {
    # Text may include HTML
    text="This is the DMZ network"
    x=490
    y=394
    w=117
}
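When a map has to show many hosts, the corresponding blocks can be generated instead of typed by hand. The helper below is a hypothetical sketch (the nagvis_host_block name is ours) that only emits the host_name, x and y options used in the sample above; any other option would have to be added manually.

```shell
# Hypothetical generator: print a NagVis "define host" block for a map file.
# Usage: nagvis_host_block <host_name> <x> <y>
nagvis_host_block() {
    printf 'define host {\n    host_name=%s\n    x=%s\n    y=%s\n}\n' "$1" "$2" "$3"
}
```

The output can be appended to a map configuration file, e.g. "nagvis_host_block www2 268 266 >> /var/www/nagios/nagvis/etc/maps/dmz.cfg", where www2 and its coordinates are placeholders for one of your own hosts.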
To allow the web interface to modify NagVis' configuration, make sure that all configuration files belong to, and are writable by, the www user.
# chown www /var/www/nagios/nagvis/etc/maps/*.cfg
# chmod 644 /var/www/nagios/nagvis/etc/maps/*.cfg