Use this Nagios monitoring tutorial for proactive IT monitoring

Learn how to install and run Nagios to monitor your organization's IT assets. Follow these steps so you're prepared to catch problems before they get out of hand.

IT administrators today must be proactive -- rather than reactive -- through aggressive and continuous monitoring of IT infrastructure. It's their job to catch potential issues early, and save businesses from costly extended outages, data loss -- or both.

Nagios, an IT system monitoring tool, enables admins to catch issues before they become full-blown catastrophes. Learn more about the monitoring tool and how to get started with this tutorial, which covers installation and configuration of the following:

  • All prerequisite software needed by Nagios Core on a Debian-based Linux server;
  • Nagios Core on a Debian-based Linux server, which serves as the Nagios server;
  • Nagios Remote Plugin Executor (NRPE) on a separate Debian-based Linux server (the Nagios host) as well as the Nagios server; and
  • Nagios Plugins on the Nagios host as well as the Nagios server.

We'll run tests for each stage in the process to ensure the example installations and configurations succeeded. By the end, we'll have a Nagios server that's able to monitor a reporting Nagios host.

A brief overview of Nagios

Nagios, released in 2002, is a foundation for many present-day infrastructure monitoring systems. While initially designed to run strictly under Linux, Nagios now also runs under Unix variants such as FreeBSD, Solaris, Apple OS X and IBM Power.

Nagios comes in two flavors: Nagios Core and Nagios XI. Nagios Core -- the open source version -- is ideal for small- to mid-sized businesses and startups. Nagios XI -- the paid proprietary version -- offers additional features such as graphs, capacity planning and detailed reports. It's a good choice for larger organizations and businesses with strict reporting and auditing requirements, such as financial institutions and companies that deal with HIPAA data.

Nagios handles core metrics such as disk space, network activity, memory and other basic services on servers, as well as specific services and applications such as Secure Shell (SSH), Apache, SMTP, CRM and disaster recovery devices.

IT admins new to Nagios are often unsure which IT components, services and network devices they should monitor in their infrastructure. To avoid feeling overwhelmed, start with mission-critical IT components. With Nagios, IT admins can easily add, modify and remove components later.

Nagios project prerequisites

The components required to successfully perform the steps outlined in this Nagios monitoring tutorial are:

  • two working Debian-based servers (must have root access);
  • internet access; and
  • at least a passing familiarity with the Linux command line.

How to install and configure Nagios Core

First, install the Nagios Core server. While Nagios can monitor multiple OSes, the server must reside on a Linux or Unix variant such as FreeBSD or Solaris. In this tutorial, we'll install Nagios on an Ubuntu 19.10 server, but these steps should work on any Debian-based distro.

Next, update the repository cache index and install the Nagios dependencies.

# sudo apt update
# sudo apt install -y build-essential apache2 php openssl perl make php-gd libgd-dev libapache2-mod-php libperl-dev libssl-dev daemon wget apache2-utils unzip
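Before moving on, it can help to confirm that the build tools from these packages are actually on the PATH. The following is a minimal sketch, not part of the Nagios tooling: the function name missing_cmds is hypothetical, and the list of commands in the example call is an assumption based on the packages installed above.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is NOT found on the PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Example call; an empty result means every listed tool was found.
# The command list is an assumption, adjust it to your needs.
missing_cmds gcc make perl php wget unzip htpasswd
```

If the call prints any names, rerun the apt install step before compiling.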

Now, create the nagios user and group, and the nagcmd group. We'll also add the Apache www-data user to the nagios and nagcmd groups.

# sudo useradd nagios
# sudo groupadd nagcmd
# sudo usermod -a -G nagcmd nagios
# sudo usermod -a -G nagios,nagcmd www-data
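To double-check the group memberships created above, a small check like the following could be used. It is a sketch only: in_group is a hypothetical helper name, and it relies on the standard id utility.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the user ($1) is a member of the group ($2).
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

# Example usage after running the commands above (not executed here):
#   in_group www-data nagcmd && echo "www-data is in nagcmd"
#   in_group nagios nagcmd   && echo "nagios is in nagcmd"
```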

Download the latest version of Nagios Core, which at the time of publication is version 4.4.5, and extract it.

# cd /tmp
# wget https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.4.5.tar.gz
# tar -zxvf nagios-4.4.5.tar.gz

Then, configure the Nagios source for compilation.

# cd /tmp/nagios-4.4.5/
# sudo ./configure --with-nagios-group=nagios --with-command-group=nagcmd --with-httpd_conf=/etc/apache2/sites-enabled/

Once complete, you'll see a configuration summary.

Nagios configuration summary

From here, build the Nagios files and install them.

# sudo make all
# sudo make install

Next, install the init script, the sample configuration files, the external command permissions and the Apache web configuration.

# sudo make install-init
# sudo make install-config
# sudo make install-commandmode
# sudo make install-webconf
# sudo /usr/bin/install -c -m 644 sample-config/httpd.conf /etc/apache2/sites-available/nagios.conf

To receive alerts, edit the contacts.cfg file -- /usr/local/nagios/etc/objects/contacts.cfg -- and change nagios@localhost to the desired email address.

define contact {
        contact_name    nagiosadmin       ; Short name of user
        use             generic-contact   ; Inherit default values from generic-contact template (defined above)
        alias           Nagios Admin      ; Full name of user
        email           nagios@localhost  ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
}
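If you prefer to script this edit rather than open an editor, something along these lines could work. It is a hedged sketch: set_contact_email is a hypothetical helper, admin@example.com is a placeholder address, and the -i flag assumes GNU sed, as shipped on Debian and Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: replace the placeholder address on the "email" line
# of a contacts file ($1) with the given address ($2). Assumes GNU sed (-i).
set_contact_email() {
    file=$1
    email=$2
    sed -i "s|^\([[:space:]]*email[[:space:]]\{1,\}\)nagios@localhost|\1$email|" "$file"
}

# Example, run as root (not executed here); the address is a placeholder:
#   set_contact_email /usr/local/nagios/etc/objects/contacts.cfg admin@example.com
```

Only the line that starts with the email directive is changed, so the trailing comment stays in place.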

Next, set and verify the nagiosadmin password, which you will use to log into the web interface.

# sudo htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

In /usr/local/nagios/etc/nagios.cfg, uncomment the following line by removing the leading # so that Nagios reads the definitions for remote servers from the servers directory.

cfg_dir=/usr/local/nagios/etc/servers

Then, create the servers directory that the line points to.

# sudo mkdir -p /usr/local/nagios/etc/servers
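The uncomment step can also be scripted. The sketch below assumes GNU sed and that the line appears in nagios.cfg as a commented-out cfg_dir entry; enable_servers_dir is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: remove the leading "#" from the cfg_dir line for the
# servers directory in the given file ($1). Other lines are left untouched.
# Assumes GNU sed (-i).
enable_servers_dir() {
    sed -i 's|^#[[:space:]]*\(cfg_dir=/usr/local/nagios/etc/servers\)[[:space:]]*$|\1|' "$1"
}

# Example, run as root (not executed here):
#   enable_servers_dir /usr/local/nagios/etc/nagios.cfg
```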

Use the commands below to enable the Apache modules that the Nagios web interface needs.

# sudo a2enmod rewrite
# sudo a2enmod cgi

Restart the Apache server and launch the Nagios Core server.

# sudo service apache2 restart
# sudo service nagios start

Test Nagios

Test the Nagios installation from the command line.

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

The output should look similar to the one below.

Nagios test code output

Access the Nagios Core website

Once the Nagios Core server is installed and configured for basic local monitoring, we can look at the reports in the web interface.

Open the web browser and enter localhost/nagios in the URL bar.

Nagios Core website login

Enter nagiosadmin and the password you created earlier. Click Sign In.

Nagios Core splash screen

The Nagios Core splash screen will appear. Notice the green check mark in the middle shows Nagios is successfully running, along with the process identification number, or PID. The left-hand frame shows a glimpse of the various options, services and settings that Nagios Core offers.

Even though a remote server is not configured, Nagios automatically configures some basic monitors on the Nagios server, our localhost. To take a look, click Hosts from the left frame.

Nagios hosts dashboard

The dashboard shows one host is up and eight services are up and monitored. It also displays the date of the last check, how long the server has been up and some status information. To look at the services, click Services in the left frame.

Nagios services

Here is the Services dashboard, which displays the status details for the eight default services that Nagios Core configured: Current Load, Current Users, HTTP, PING, Root Partition, SSH, Swap Usage and Total Processes. To see more status details for a service, click the actual service link.

Install and configure NRPE

It's crucial that NRPE and Nagios Plugins are installed on all servers and workstations you plan to monitor, including the Nagios server itself. Nagios uses NRPE to execute plugins on remote client systems. The Nagios server receives the results and populates the dashboard.

Let's install NRPE on one of the remote Linux machines. First, download the NRPE source on the remote machine, or host.

# cd /tmp
# sudo wget --no-check-certificate -O nrpe.tar.gz https://github.com/NagiosEnterprises/nrpe/archive/nrpe-3.2.1.tar.gz
# tar xzf nrpe.tar.gz

Next, configure the source code for compilation.

# cd /tmp/nrpe-nrpe-3.2.1/
# sudo ./configure --enable-command-args --with-ssl-lib=/usr/lib/x86_64-linux-gnu/

After the configure script finishes successfully, you'll see a configuration summary.

*** Configuration summary for nrpe 3.2.1 2017-09-01 ***:

General Options:
-------------------------
NRPE port: 5666
NRPE user: nagios
NRPE group: nagios
Nagios user: nagios
Nagios group: nagios

Review the options above for accuracy. If they look okay,
type 'make all' to compile the NRPE daemon and client
or type 'make' to get a list of make options.

Then, compile the source, create the groups and users, and install the binaries and configuration files.

# sudo make all
# sudo make install-groups-users
# sudo make install
# sudo make install-config

Next, update the services file so that Nagios and any related components can translate the nrpe service name to its port number, which in this case is 5666.

# sudo sh -c "echo >> /etc/services"
# sudo sh -c "echo '# Nagios services' >> /etc/services"
# sudo sh -c "echo 'nrpe 5666/tcp' >> /etc/services"
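Running the commands above twice would append the entry twice. A guarded variant, sketched here under the assumption that an existing entry starts with the service name nrpe, only appends when the entry is missing. The helper name add_nrpe_service is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append the nrpe entry to a services file ($1) only if
# no line starting with "nrpe" followed by whitespace is already present.
add_nrpe_service() {
    file=$1
    if ! grep -q '^nrpe[[:space:]]' "$file"; then
        printf '\n# Nagios services\nnrpe 5666/tcp\n' >> "$file"
    fi
}

# Example, run as root (not executed here):
#   add_nrpe_service /etc/services
```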

Install the service/daemon.

# sudo make install-init
# sudo systemctl enable nrpe.service

Then, tweak the NRPE configuration file -- /usr/local/nagios/etc/nrpe.cfg -- on the host. Specifically, add the Nagios server's IP address after 127.0.0.1 on the allowed_hosts line. In this example, the server's address is 10.25.5.2.

allowed_hosts=127.0.0.1,10.25.5.2

If there is more than one Nagios server, enter the IP addresses for each one. Use a comma as a separator.
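As an illustration of the comma-separated format, the following sketch appends an address to allowed_hosts only if it is not already listed. The helper name allow_nagios_server is hypothetical, and the code assumes GNU sed and an allowed_hosts line without spaces around the commas.

```shell
#!/bin/sh
# Hypothetical helper: add an IP address ($2) to the allowed_hosts line of an
# NRPE configuration file ($1), unless that address is already listed.
# Assumes GNU sed (-i) and no spaces in the comma-separated list.
allow_nagios_server() {
    cfg=$1
    ip=$2
    current=$(sed -n 's/^allowed_hosts=//p' "$cfg" | head -n 1)
    case ",$current," in
        *",$ip,"*) return 0 ;;   # already present, nothing to do
    esac
    sed -i "s/^allowed_hosts=.*/&,$ip/" "$cfg"
}

# Example, run as root (not executed here):
#   allow_nagios_server /usr/local/nagios/etc/nrpe.cfg 10.25.5.2
```

Calling the helper once per Nagios server builds up the list one address at a time.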

Next, change dont_blame_nrpe=0 to dont_blame_nrpe=1.

This change enables clients to specify arguments to commands, which in turn enables more advanced NRPE configurations.

And for our final edit to /usr/local/nagios/etc/nrpe.cfg, ensure that the following commands are uncommented.

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

Each entry maps a command name to a plugin invocation that NRPE runs on request. Together, they cover basic checks such as the number of users logged in, system load, disk usage, zombie processes and the total number of processes.
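To confirm which command entries are active in the file, meaning uncommented, a short filter like this one can list their names. It is a sketch only, and list_nrpe_commands is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every uncommented command[...] entry
# in the given NRPE configuration file ($1), one name per line.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Example (not executed here):
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
```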

Finally, start the NRPE daemon.

# sudo systemctl start nrpe.service

Install and configure the Nagios Plugins

NRPE will not work properly without the Nagios Plugins. Download the Nagios Plugin source to the host and extract our tarball.

# cd /tmp
# sudo wget --no-check-certificate -O nagios-plugins.tar.gz https://github.com/nagios-plugins/nagios-plugins/releases/download/release-2.3.1/nagios-plugins-2.3.1.tar.gz
# sudo tar -zxvf /tmp/nagios-plugins.tar.gz

Next, compile the source code and install the binaries and configuration files.

# cd /tmp/nagios-plugins-2.3.1/
# sudo ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# sudo make
# sudo make install

Note that while the Nagios Plugin package we just installed contains most of the plugins, there are some plugins that require other libraries not included. To install those, see the Nagios website.

Monitoring your Linux host

Once the Nagios server and NRPE are installed, make the host visible to the Nagios server and include the services you wish to monitor. To accomplish this, create host configuration files in the /usr/local/nagios/etc/servers directory on the Nagios server.

The host configuration file defines the host, along with the defined services you wish to monitor on the host machine, such as PING.

A single file can contain all the hosts and services, but it's not recommended. Instead, use a separate file for each host you wish to monitor, containing the host definition and the services you want to monitor on that host. Each file must have a .cfg extension; otherwise, Nagios will not recognize it as a configuration file. A common practice is to name the file after the server and add the .cfg extension -- for example, debian-server.cfg.

To get you started, a sample host file is included below. Copy the sample and save it in /usr/local/nagios/etc/servers to use as a template for host configuration files.

# Nagios Host configuration file template
define host {
        use                          linux-server
        host_name                    mtr-ubuntu
        alias                        Ubuntu Host
        address                      192.168.1.6
        register                     1
}

define service {
      host_name                       mtr-ubuntu
      service_description             PING
      check_command                   check_ping!100.0,20%!500.0,60%
      max_check_attempts              2
      check_interval                  2
      retry_interval                  2
      check_period                    24x7
      check_freshness                 1
      contact_groups                  admins
      notification_interval           2
      notification_period             24x7
      notifications_enabled           1
      register                        1
}

define service {
      host_name                       mtr-ubuntu
      service_description             Check Users
      check_command                   check_local_users!20!50
      max_check_attempts              2
      check_interval                  2
      retry_interval                  2
      check_period                    24x7
      check_freshness                 1
      contact_groups                  admins
      notification_interval           2
      notification_period             24x7
      notifications_enabled           1
      register                        1
}

define service {
      host_name                       mtr-ubuntu
      service_description             Local Disk
      check_command                   check_local_disk!20%!10%!/
      max_check_attempts              2
      check_interval                  2
      retry_interval                  2
      check_period                    24x7
      check_freshness                 1
      contact_groups                  admins
      notification_interval           2
      notification_period             24x7
      notifications_enabled           1
      register                        1
}

define service {
      host_name                       mtr-ubuntu
      service_description             Check SSH
      check_command                   check_ssh
      max_check_attempts              2
      check_interval                  2
      retry_interval                  2
      check_period                    24x7
      check_freshness                 1
      contact_groups                  admins
      notification_interval           2
      notification_period             24x7
      notifications_enabled           1
      register                        1
}

define service {
      host_name                       mtr-ubuntu
      service_description             Total Process
      check_command                   check_local_procs!250!400!RSZDT
      max_check_attempts              2
      check_interval                  2
      retry_interval                  2
      check_period                    24x7
      check_freshness                 1
      contact_groups                  admins
      notification_interval           2
      notification_period             24x7
      notifications_enabled           1
      register                        1
}
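With one file per host, the template above lends itself to a small generator. The sketch below prints a host definition and the PING service from the sample for a given host name, alias and address. The function name host_cfg and the example values (debian-server, 192.0.2.10) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a host definition plus the PING service from the
# sample template. Arguments: host name, alias, IP address.
host_cfg() {
    name=$1
    alias_text=$2
    address=$3
    cat <<EOF
define host {
        use                          linux-server
        host_name                    $name
        alias                        $alias_text
        address                      $address
        register                     1
}

define service {
      host_name                       $name
      service_description             PING
      check_command                   check_ping!100.0,20%!500.0,60%
      max_check_attempts              2
      check_interval                  2
      retry_interval                  2
      check_period                    24x7
      check_freshness                 1
      contact_groups                  admins
      notification_interval           2
      notification_period             24x7
      notifications_enabled           1
      register                        1
}
EOF
}

# Example (not executed here); the values are placeholders:
#   host_cfg debian-server "Debian Host" 192.0.2.10 > debian-server.cfg
#   sudo cp debian-server.cfg /usr/local/nagios/etc/servers/
```

The remaining services from the sample can be added to the generated file by hand or by extending the here-document.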

IT administrators will find a treasure trove of sample check commands in /usr/local/nagios/etc/objects/commands.cfg to use as examples of how to add more services.
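Note that the check_local_* commands in commands.cfg run their plugins on the Nagios server itself. To run one of the commands defined in the host's nrpe.cfg, such as check_load, a common pattern is a check_nrpe command definition on the server. The sketch below is an assumption-laden illustration, not part of the steps above: it assumes the check_nrpe plugin is present in the server's plugin directory (the NRPE source offers a make install-plugin target for this), that the generic-service template from the sample configuration is available, and the helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a check_nrpe command definition.
# Define this once on the Nagios server, not once per host.
nrpe_command_cfg() {
    cat <<'EOF'
define command {
      command_name                    check_nrpe
      command_line                    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
EOF
}

# Hypothetical helper: print a service that runs an NRPE command on a host.
# Arguments: host name, name of a command[...] entry in the host's nrpe.cfg.
# Assumes the generic-service template exists in the server configuration.
nrpe_service_cfg() {
    cat <<EOF
define service {
      use                             generic-service
      host_name                       $1
      service_description             NRPE $2
      check_command                   check_nrpe!$2
}
EOF
}

# Example (not executed here):
#   nrpe_command_cfg                    >> mtr-ubuntu.cfg
#   nrpe_service_cfg mtr-ubuntu check_load >> mtr-ubuntu.cfg
```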

Once the first host configuration file is complete, check for mistakes.

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there is a mistake in the hosts config file, Nagios will return an error.
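Because the verification command reports errors, it can act as a gate so that Nagios is restarted only when the configuration is valid. The following is a minimal sketch that assumes the check exits with a nonzero status on failure; validate_then is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: run a validation command ($1) and, only if it
# succeeds, run a follow-up command ($2). Returns nonzero when the
# validation fails, without running the follow-up.
validate_then() {
    validate_cmd=$1
    restart_cmd=$2
    if sh -c "$validate_cmd" >/dev/null 2>&1; then
        sh -c "$restart_cmd"
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}

# Example (not executed here):
#   validate_then "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" \
#                 "sudo service nagios restart"
```

The validation output is discarded here to keep the sketch short; rerun the check by hand to see the detailed error messages.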

Now, restart all services. On the remote machine (host), restart the NRPE service.

# sudo systemctl restart nrpe.service

On the Nagios server, restart both Apache and Nagios.

# sudo service apache2 restart
# sudo service nagios restart

To test the setup, open a web browser, go to the Nagios web interface (localhost/nagios on the Nagios server, as before) and click Hosts.

host test

Note that the remote server now shows on the dashboard along with the Nagios server (localhost).

Nagios server dashboard

See that both the monitored servers are up, as are all services.

Services dashboard

Note the dashboard for the remote host and the detailed information provided.

The newly built Nagios server can also monitor Windows servers. The Nagios website offers more Nagios Plugins, and IT admins can even write their own.
