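These notes record the steps for configuring Nagios on CentOS 7, following an original article. For the complete procedure and the underlying principles, refer to that original article, which is very detailed.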

The goal is to set up one Linux machine as the monitoring host. It monitors some services on itself, plus the services on one other Linux machine and one Windows machine.

Installing Nagios:

Install the base packages:

[root@nagios-a ~]# yum install gcc glibc glibc-common gd gd-devel xinetd openssl-devel -y

Create the nagios user and group, then create the Nagios directory and set its ownership:

[root@nagios-a ~]# useradd -s /sbin/nologin nagios
[root@nagios-a ~]# mkdir /usr/local/nagios
[root@nagios-a ~]# chown -R nagios.nagios /usr/local/nagios

Check that the ownership was applied:

[root@nagios-a ~]# ll -d /usr/local/nagios/
drwxr-xr-x. 2 nagios nagios 6 Mar 13 16:35 /usr/local/nagios/

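Download the Nagios tarball from the Nagios website. Note that www.nagios.org hosts the open-source Nagios Core, while www.nagios.com offers the paid, more powerful Nagios XI. These notes cover the open-source Nagios Core.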

wget https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.4.3.tar.gz

tar -zxvf nagios-4.4.3.tar.gz 

cd nagios-4.4.3/

 ./configure --prefix=/usr/local/nagios/

make all

make install

make install-init

make install-commandmode

make install-config

Go to the Nagios directory, check that the installed files are all there, and verify that Nagios was installed successfully:

cd /usr/local/nagios/
[root@nagios-a nagios]# ls
bin  etc  libexec  sbin  share  var
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.4.3
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2019-01-15
License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 8 services.
        Checked 1 hosts.
        Checked 1 host groups.
        Checked 0 service groups.
        Checked 1 contacts.
        Checked 1 contact groups.
        Checked 24 commands.
        Checked 5 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 1 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
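
The pre-flight output above ends with a Total Errors line, which makes it easy to check from a script. The following is a minimal sketch, assuming the output format shown above and the install paths used in these notes; preflight_errors and check_config are hypothetical helper names, not part of Nagios.

```shell
# Sketch only: helper names are made up for illustration.

# Read `nagios -v` output on stdin and print the number that follows
# "Total Errors:", or "unknown" when no such line is present.
preflight_errors() {
    awk -F: '/^Total Errors/ { gsub(/[ \t]/, "", $2); print $2; found = 1 }
             END { if (!found) print "unknown" }'
}

# Assumed paths, matching the --prefix used in these notes.
NAGIOS_BIN=/usr/local/nagios/bin/nagios
NAGIOS_CFG=/usr/local/nagios/etc/nagios.cfg

# Succeed only when the pre-flight check reports zero errors.
check_config() {
    errors=$("$NAGIOS_BIN" -v "$NAGIOS_CFG" 2>&1 | preflight_errors)
    [ "$errors" = "0" ]
}
```

Running check_config before a restart would then stop a broken configuration from being loaded, under the assumption that the output keeps this format.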

Install and configure PHP:

yum install php
vim /etc/httpd/conf/httpd.conf

User daemon 
Group daemon  
change to
User nagios 
Group nagios 


<IfModule dir_module> 
  DirectoryIndex index.html 
</IfModule> 
change to
<IfModule dir_module> 
  DirectoryIndex index.html index.php 
</IfModule> 

ServerName 192.200.1.121:80

Listen 80

Append the following to the end of this file so that the Nagios web pages can only be accessed after authentication:

#setting for nagios 
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin" 
<Directory "/usr/local/nagios/sbin"> 
     AuthType Basic 
     Options ExecCGI 
     AllowOverride None 
     Order allow,deny 
     Allow from all 
     AuthName "Nagios Access" 
     # File used to authenticate access to this directory
     AuthUserFile /usr/local/nagios/etc/htpasswd              
     Require valid-user 
</Directory> 
Alias /nagios "/usr/local/nagios/share" 
<Directory "/usr/local/nagios/share"> 
     AuthType Basic 
     Options None 
     AllowOverride None 
     Order allow,deny 
     Allow from all 
     AuthName "nagios Access" 
     AuthUserFile /usr/local/nagios/etc/htpasswd 
     Require valid-user 
</Directory>

The configuration above names htpasswd as the authentication file for these directories. Create that file now:

/usr/bin/htpasswd -c /usr/local/nagios/etc/htpasswd nagios

Edit /usr/local/nagios/etc/cgi.cfg:

default_user_name=nagios
authorized_for_system_information=nagiosadmin,nagios
authorized_for_configuration_information=nagiosadmin,nagios
authorized_for_system_commands=nagios
authorized_for_all_services=nagiosadmin,nagios
authorized_for_all_hosts=nagiosadmin,nagios
authorized_for_all_service_commands=nagiosadmin,nagios
authorized_for_all_host_commands=nagiosadmin,nagios

Then start the services:

systemctl start httpd.service
/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

Nagios is now installed and can be reached through its web pages.

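Note that Nagios itself is only an empty shell; the actual monitoring is done by its plugins. Next, install the Nagios plugins, which are downloaded from the same site, nagios.org.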


After downloading, extract and install:

tar -zxvf nagios-plugins-2.2.1.tar.gz 
cd nagios-plugins-2.2.1/
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make all
make install

ls /usr/local/nagios/libexec/|wc -l
# Count the installed plugins

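With the plugins installed, edit the configuration files to define the hosts to monitor. Start with the local machine:

Under /usr/local/nagios/etc/objects/ there is a localhost.cfg file. It exists by default and does not need to be changed.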

define host {

    use                     linux-server            ; Name of host template to use
                                                    ; This host definition will inherit all variables that are defined
                                                    ; in (or inherited by) the linux-server host template definition.
    host_name               localhost
    alias                   1.121
    address                 127.0.0.1
}

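That completes the local machine. Next, configure monitoring of the Windows host:

Under /usr/local/nagios/etc/objects/ there is also a windows.cfg file that exists by default. Edit it so that Nagios can monitor the Windows host. The change is simple: put the Windows host's IP address in the address field: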

define host {

    use                     windows-server          ; Inherit default values from a template
    host_name               winserver               ; The name we're giving to this host
    alias                   My Windows Server       ; A longer name associated with the host
    address                 192.200.10.202          ; IP address of the host
}

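Of course, alias can be changed to any name.

Restart Nagios. The services on localhost are monitored normally and the Windows host responds to ping, but the Windows services cannot be monitored. The reason is that the Windows machine has no Nagios agent installed, so Nagios cannot query its services. Next, install the agent on Windows: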

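The Nagios monitoring agent for Windows is called NSClient++. Go to the official site, http://www.nsclient.org, and download the installer; I downloaded NSCP-0.5.2.35-x64.msi. Once the download finishes, double-click it to install.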


I chose the Complete setup type.


After installation, remember to start the NSClient++ service in the Windows Services console.


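Some services then come up, but others still do not appear to be monitored correctly. Next, edit the NSClient++ configuration file. The default install location is C:\Program Files\NSClient++, and nsclient.ini in that directory is the configuration file: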

# If you want to fill this file with all available options run the following command:
#   nscp settings --generate --add-defaults --load-all
# If you want to activate a module and bring in all its options use:
#   nscp settings --activate-module <MODULE NAME> --add-defaults
# For details run: nscp settings --help


; in flight - TODO
[/settings/default]

; Undocumented key
allowed hosts = 192.200.1.121


; in flight - TODO
[/settings/NRPE/server]

; Undocumented key
verify mode = none

; Undocumented key
insecure = true


; in flight - TODO
[/modules]

; Undocumented key
CheckExternalScripts = enabled

; Undocumented key
CheckHelpers = enabled

; Undocumented key
CheckNSCP = enabled

; Undocumented key
CheckDisk = enabled

; Undocumented key
WEBServer = enabled

; Undocumented key
CheckSystem = enabled

; Undocumented key
NSClientServer = enabled

; Undocumented key
CheckEventLog = enabled

; Undocumented key
NSCAClient = enabled

; Undocumented key
NRPEServer = enabled

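Change disabled to enabled for these modules so that the services can be monitored. After the change, remember to restart the NSClient++ service in the Services console; monitoring should then work normally.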
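
If many modules need flipping, the edit can be scripted instead of done by hand. The following is a rough sketch, assuming a copy of nsclient.ini is available on a machine with a POSIX shell and awk; enable_modules is a hypothetical helper name, and it only touches keys inside the [/modules] section.

```shell
# enable_modules FILE
# Print FILE with "= disabled" changed to "= enabled", but only for keys
# inside the [/modules] section. Hypothetical helper, not an NSClient++ tool.
enable_modules() {
    awk '
        # Strip a trailing carriage return so Windows line endings still match.
        { line = $0; sub(/\r$/, "", line) }
        # A line starting with "[" opens a new section.
        line ~ /^\[/ { in_modules = (line == "[/modules]") }
        in_modules && /=[ \t]*disabled[ \t\r]*$/ { sub(/=[ \t]*disabled/, "= enabled") }
        { print }
    ' "$1"
}
```

The function writes to standard output, so the result can be reviewed before it replaces the real file; the NSClient++ service still has to be restarted afterwards.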

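Of course, if there are services you do not want monitored, edit the configuration accordingly. For example, to stop monitoring certain Windows services, comment out the corresponding service definitions in windows.cfg: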

#define service {
#
#    use                     generic-service
#    host_name               winserver
#    service_description     W3SVC
#    check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
#}



# Create a service for monitoring the Explorer.exe process
# Change the host_name to match the name of the host you defined above

#define service {

#    use                     generic-service
#    host_name               winserver
#    service_description     Explorer
#    check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
#}

Likewise, for the local Linux machine, comment out the corresponding services in localhost.cfg:

#define service {
#
#    use                     local-service           ; Name of service template to use
#    host_name               localhost
#   service_description     HTTP
#    check_command           check_http
#    notifications_enabled   0
#}
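
Commenting out a whole define service block by hand is easy to get wrong when a file holds many services. The following is a minimal sketch of doing it with awk, assuming each block starts with "define service" and ends with a line that begins with a closing brace, as in the listings above; comment_out_service is a hypothetical helper name.

```shell
# comment_out_service FILE DESCRIPTION
# Print FILE with the "define service" block whose service_description
# equals DESCRIPTION prefixed by "#". Hypothetical helper, not a Nagios tool.
comment_out_service() {
    awk -v want="$2" '
        # Return the value of a service_description line, without comments.
        function desc_of(s) {
            sub(/^[ \t]*service_description[ \t]+/, "", s)
            sub(/;.*$/, "", s)
            sub(/[ \t]+$/, "", s)
            return s
        }
        # Print the buffered block, commented out if it matches.
        function emit_block(   i, hit, prefix) {
            hit = 0
            for (i = 1; i <= n; i++)
                if (block[i] ~ /^[ \t]*service_description[ \t]/ && desc_of(block[i]) == want)
                    hit = 1
            prefix = hit ? "#" : ""
            for (i = 1; i <= n; i++)
                print prefix block[i]
            n = 0
        }
        /^[ \t]*define[ \t]+service[ \t]*[{]/ { in_block = 1 }
        in_block {
            block[++n] = $0
            if ($0 ~ /^[ \t]*[}]/) { emit_block(); in_block = 0 }
            next
        }
        { print }
        END { if (n) emit_block() }
    ' "$1"
}
```

Blocks that are already commented out pass through unchanged, because their first line starts with "#" rather than "define".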

Then restart the Nagios service; the commented-out services no longer appear.


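Monitoring of the local machine and the Windows host is now configured. Next, configure Nagios to monitor other Linux machines:

Monitoring other Linux machines generally requires adding two files by hand under /usr/local/nagios/etc/objects/: hosts.cfg and services.cfg. hosts.cfg describes the monitored hosts, including their IP addresses and host names, while services.cfg defines the services to monitor. A simple example configuration follows: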

hosts.cfg

define host{
        use                     linux-server
        host_name               1.131
        address                 192.200.1.131
        }

define host{
        use                     linux-server
        host_name               2.76
        address                 192.200.2.76
        }
#####################################################################
define hostgroup{
        hostgroup_name          test-servers
        alias                   test servers
        members                 1.131,2.76
        }
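
When more hosts are added later, the host entries above can be generated rather than typed. The following is a minimal sketch, assuming every host uses the linux-server template and the same layout as the hosts.cfg listing above; host_block and gen_hosts are hypothetical helper names.

```shell
# host_block NAME ADDRESS
# Print one host definition in the same layout as the hosts.cfg listing.
host_block() {
    printf 'define host{\n'
    printf '        use                     linux-server\n'
    printf '        host_name               %s\n' "$1"
    printf '        address                 %s\n' "$2"
    printf '        }\n'
}

# gen_hosts
# Read "name address" pairs on stdin, one per line, and print a host
# definition for each non-empty line.
gen_hosts() {
    while read -r name addr; do
        if [ -n "$name" ]; then
            host_block "$name" "$addr"
        fi
    done
}
```

For example, piping the two lines "1.131 192.200.1.131" and "2.76 192.200.2.76" into gen_hosts prints the two host definitions; the hostgroup definition would still be written by hand.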

hosts.cfg simply defines two Linux hosts to monitor and one host group. The use directive inherits the attribute values of the linux-server host template, which is defined in templates.cfg as follows:

define host {

    name                            linux-server            ; The name of this host template
    use                             generic-host            ; This template inherits other values from the generic-host template
    check_period                    24x7                    ; By default, Linux hosts are checked round the clock
    check_interval                  5                       ; Actively check the host every 5 minutes
    retry_interval                  1                       ; Schedule host check retries at 1 minute intervals
    max_check_attempts              10                      ; Check each Linux host 10 times (max)
    check_command                   check-host-alive        ; Default command to check Linux hosts
    notification_period             workhours               ; Linux admins hate to be woken up, so we only notify during the day
                                                            ; Note that the notification_period variable is being overridden from
                                                            ; the value that is inherited from the generic-host template!
    notification_interval           120                     ; Resend notifications every 2 hours
    notification_options            d,u,r                   ; Only send notifications for specific host states
    contact_groups                  admins                  ; Notifications get sent to the admins by default
    register                        0                       ; DON'T REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}


services.cfg simply defines one service, used to test whether the monitored hosts can be reached:


define service{
        use                     local-service
        host_name               1.131,2.76
        service_description     check-host-alive
        check_command           check-host-alive
     }

Now that the two configuration files are written, nagios.cfg (located in etc under the Nagios directory) must reference them. Just add these two lines:

cfg_file=/usr/local/nagios/etc/objects/hosts.cfg

cfg_file=/usr/local/nagios/etc/objects/services.cfg
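
Adding the same cfg_file line twice is an easy mistake when the step is repeated. The following is a small sketch of an append that skips lines that are already present; add_cfg_file is a hypothetical helper name, and the paths are the ones used in these notes.

```shell
# add_cfg_file MAIN_CFG OBJECT_FILE
# Append "cfg_file=OBJECT_FILE" to MAIN_CFG unless that exact line exists.
add_cfg_file() {
    line="cfg_file=$2"
    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! grep -Fxq -- "$line" "$1"; then
        printf '%s\n' "$line" >> "$1"
    fi
}
```

It could be called once per object file, for example with /usr/local/nagios/etc/nagios.cfg as the first argument and /usr/local/nagios/etc/objects/hosts.cfg as the second.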

Simple monitoring of the remote Linux hosts is now complete. After restarting Nagios, both Linux hosts show as up.


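To monitor more services on a remote host, the plugins must of course also be installed there. Taking the remote host 1.131 as an example, install the Nagios plugins and NRPE, again downloaded from the Nagios site: first add the nagios user, then fetch the plugin and NRPE tarballs, extract them, and install: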

useradd nagios

wget https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
wget https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-3.2.1/nrpe-3.2.1.tar.gz
tar -zxvf nagios-plugins-2.2.1.tar.gz
tar -zxvf nrpe-3.2.1.tar.gz
cd nagios-plugins-2.2.1/
./configure --prefix=/usr/local/nagios
make
make all
make install

chown nagios.nagios /usr/local/nagios
chown -R nagios.nagios /usr/local/nagios/libexec

cd ../nrpe-3.2.1/
./configure --prefix=/usr/local/nagios
make all
make install
# Installs the sample nrpe.cfg that is edited in the next step
make install-config

Then edit the configuration file to allow the monitoring host to connect. The file to change is nrpe.cfg under /usr/local/nagios/etc:

allowed_hosts=127.0.0.1,192.200.1.121

Here, just append the monitoring host's IP to allowed_hosts. Then start the NRPE service and confirm that the port is open before testing from the monitoring host:

/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@nagios-b etc]# netstat -tunlp | grep nrpe
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN      12545/nrpe          
tcp6       0      0 :::5666                 :::*                    LISTEN      12545/nrpe     
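
The same port check can be scripted so that it gives a yes or no answer. The following is a minimal sketch, assuming netstat-style output like the above, where the local address is the fourth column; port_listening is a hypothetical helper name.

```shell
# port_listening PORT
# Read `netstat -tunlp` style output on stdin and succeed if some LISTEN
# line has a local address (4th column) ending in ":PORT".
port_listening() {
    awk -v port="$1" '
        /LISTEN/ && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

For example, piping netstat -tunlp into port_listening 5666 returns success only when something is listening on the NRPE port.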

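The port is being listened on. Now test from the monitoring host (this assumes the check_nrpe plugin has also been installed there):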

[root@nagios-a ~]# /usr/local/nagios/libexec/check_nrpe -H 192.200.1.131
NRPE v3.2.1

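The NRPE version is displayed correctly.

Other services of the monitored host can now be configured. Edit services.cfg under /usr/local/nagios/etc/objects on the monitoring host, for example as shown here. Note that the check_local_* commands in the default commands.cfg run the plugin on the monitoring host itself, so reading these values from 1.131 would need commands that go through check_nrpe instead: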

define service{
        use                     local-service
        host_name               1.131
        service_description     check-host-alive
        check_command           check-host-alive
     }

define service {

    use                     local-service
    host_name               1.131
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}



# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service {

    use                     local-service
    host_name               1.131
    service_description     Root Partition
    check_command           check_local_disk!20%!10%!/
}



# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service {

    use                     local-service
    host_name               1.131
    service_description     Current Users
    check_command           check_local_users!20!50
}



# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service {

    use                     local-service
    host_name               1.131
    service_description     Total Processes
    check_command           check_local_procs!250!400!RSZDT
}



# Define a service to check the load on the local machine.

define service {

    use                     local-service
    host_name               1.131
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}



# Define a service to check the swap usage the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free

define service {

    use                     local-service
    host_name               1.131
    service_description     Swap Usage
    check_command           check_local_swap!20%!10%
}



# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service {

    use                     local-service
    host_name               1.131
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
}

Then check the configuration files for errors:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.4.3
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2019-01-15
License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 20 services.
        Checked 4 hosts.
        Checked 3 host groups.
        Checked 0 service groups.
        Checked 1 contacts.
        Checked 1 contact groups.
        Checked 24 commands.
        Checked 5 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 4 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check

Output like the above shows that there are no problems, so Nagios can be restarted safely.


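All of these services are now monitored normally.

The configuration is now complete. An email alert can also be set up; this is configured in contacts.cfg, with content like the following: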

define contact {

    contact_name            xiaobai                 ; Short name of user
    use                     generic-contact         ; Inherit default values from generic-contact template (defined above)
    alias                   Nagios Admin            ; Full name of user
    email                   *****@***.com           ;<<***** CHANGE THIS TO YOUR EMAIL ADDRESS ****** 
}



###############################################################################
#
# CONTACT GROUPS
#
###############################################################################

# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.

define contactgroup {

    contactgroup_name       admins
    alias                   Nagios Administrators
    members                 xiaobai
}

Put your address in the email field and you will receive alert emails.

At this point, my Nagios configuration is fully complete.

That's all.