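The previous post on installing Nagios only set up the basic Nagios components. The home page opens, but until the relevant configuration files are edited, many of the pages in the left-hand menu will not load, so the installation is little more than an empty shell. This post looks at Nagios configuration: the basic settings and what each configuration file is for.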

 

Nagios configuration directory

The Nagios configuration files are in the etc directory (/usr/local/nagios/etc), as shown in the figure below:

[Figure: Nagios configuration files in the etc directory]

 

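Overview of the configuration files

File            Description
cgi.cfg         Controls access to the CGIs.
nagios.cfg      Main configuration file; its settings control how the Nagios daemon behaves.
resource.cfg    Resource file for user-defined macros. Its main use is to hold sensitive configuration details that the CGIs must not be able to read.
objects         A directory of configuration files that define Nagios objects, such as commands.cfg, contacts.cfg, and localhost.cfg.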

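Configuration files in the objects directory

File              Description
commands.cfg      Command definitions; the commands defined here can be referenced from other configuration files.
contacts.cfg      Defines contacts and contact groups.
localhost.cfg     Defines monitoring of the local host.
printer.cfg       Template for monitoring printers; not enabled by default.
switch.cfg        Template for monitoring switches and routers; not enabled by default.
templates.cfg     Host and service templates that other configuration files can reference.
timeperiods.cfg   Defines the time periods Nagios uses for monitoring.
windows.cfg       Template for monitoring Windows hosts; not enabled by default.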

 

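Hands-on configuration steps

The changes below first get Nagios monitoring the resource usage of the local machine. Before editing any of these files, back each one up so that you can roll back to the previous version if something goes badly wrong (copy each configuration file and name the copy xxxx.bak). The files involved are listed below: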

[root@bogon etc]# cd /usr/local/nagios/etc/
[root@bogon etc]# ls
cgi.cfg htpasswd nagios.cfg objects resource.cfg
[root@bogon etc]# cd objects/
[root@bogon objects]# ls
commands.cfg contacts.cfg localhost.cfg printer.cfg switch.cfg templates.cfg timeperiods.cfg windows.cfg
[root@bogon objects]#

[Figure: listing of the etc and objects directories]
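The backup step described above can be scripted. The following is a minimal sketch, assuming a POSIX shell; the function name backup_cfgs is hypothetical, and appending .bak to the full file name is just one reading of the xxxx.bak convention.

```shell
# backup_cfgs DIR
# Copy every *.cfg file under DIR (including subdirectories such as objects/)
# to a sibling file with ".bak" appended, preserving mode and timestamps.
backup_cfgs() {
    find "$1" -type f -name '*.cfg' -exec sh -c '
        for f in "$@"; do
            cp -p "$f" "$f.bak"
        done
    ' sh {} +
}
```

Calling backup_cfgs /usr/local/nagios/etc would leave cgi.cfg.bak next to cgi.cfg, and so on; rolling back is then a matter of copying the .bak file over the edited one.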

1) Edit cgi.cfg first

In cgi.cfg, find the following parameters:
default_user_name=guest
authorized_for_system_information=nagiosadmin
authorized_for_configuration_information=nagiosadmin
authorized_for_system_commands=nagiosadmin
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin
Change them as follows (if it is unclear why the user is kerry, see the basic installation post earlier in this Nagios series):
default_user_name=kerry
authorized_for_system_information=nagiosadmin,kerry
authorized_for_configuration_information=nagiosadmin,kerry
authorized_for_system_commands=nagiosadmin,kerry
authorized_for_all_services=nagiosadmin,kerry
authorized_for_all_hosts=nagiosadmin,kerry
authorized_for_all_service_commands=nagiosadmin,kerry
authorized_for_all_host_commands=nagiosadmin,kerry

[Figure: edited cgi.cfg]
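The cgi.cfg edit above follows a regular pattern, so it can be sketched as a small filter. This assumes a POSIX shell with awk and that the parameters are uncommented, as in the listing above; the function name add_cgi_user is hypothetical. It prints the modified file to standard output rather than editing in place.

```shell
# add_cgi_user USER FILE
# Print FILE with USER set as default_user_name and appended to every
# authorized_for_* list that does not already contain it.
add_cgi_user() {
    awk -v user="$1" '
        /^default_user_name=/ { $0 = "default_user_name=" user }
        /^authorized_for_[a-z_]+=/ {
            eq = index($0, "=")
            n = split(substr($0, eq + 1), names, ",")
            found = 0
            for (i = 1; i <= n; i++)
                if (names[i] == user) found = 1
            # Append only when the user is missing, so reruns are harmless.
            if (!found) $0 = $0 (n > 0 ? "," : "") user
        }
        { print }
    ' "$2"
}
```

One way to use it is add_cgi_user kerry cgi.cfg > cgi.cfg.new, then compare the two files before replacing the original.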

 

2) Edit resource.cfg

As shown in the figure, find $USER1$=/usr/local/nagios//libexec and change it to $USER1$=/usr/local/nagios/libexec

[Figure: edited resource.cfg]

3) Edit nagios.cfg

Update the following parameters, removing the redundant / characters from their paths:

[Figure: edited nagios.cfg]

log_file=/usr/local/nagios/var/nagios.log
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
cfg_file=/usr/local/nagios/etc/objects/services.cfg #not configured for now
object_cache_file=/usr/local/nagios/var/objects.cache
precached_object_file=/usr/local/nagios/var/objects.precache
resource_file=/usr/local/nagios/etc/resource.cfg
status_file=/usr/local/nagios/var/status.dat
command_check_interval=1 #not configured for now
command_file=/usr/local/nagios/var/rw/nagios.cmd
lock_file=/usr/local/nagios/var/nagios.lock
temp_file=/usr/local/nagios/var/nagios.tmp
log_archive_path=/usr/local/nagios/var/archives
check_result_path=/usr/local/nagios/var/spool/checkresults
state_retention_file=/usr/local/nagios/var/retention.dat
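Removing the doubled slashes by hand in resource.cfg and nagios.cfg is repetitive, so here is a minimal sketch of a filter that does it. It assumes a POSIX shell with sed, and it only touches uncommented lines whose value starts with /; the function name squeeze_slashes is hypothetical.

```shell
# squeeze_slashes FILE
# Print FILE with every run of slashes collapsed to a single "/", but only on
# uncommented "name=/path" lines, so comment text is left untouched.
squeeze_slashes() {
    sed -e '/^[^#]*=\//s#//*#/#g' "$1"
}
```

Because the result goes to standard output, it can be reviewed first (for example squeeze_slashes nagios.cfg > nagios.cfg.new) before it replaces the original file.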
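4) Edit localhost.cfg
First run the hostname command to find the name of the machine being monitored; in this test environment it is bogon. Then open localhost.cfg and update host_name, members, and the related settings to match.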

[Figures: screenshots of the localhost.cfg edits]
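The host name substitution in localhost.cfg can be sketched as follows. This assumes a POSIX shell with awk and that the file still uses the stock name localhost; the function name rename_host is hypothetical. It handles only host_name, alias, and members lines whose value is exactly the old name (not comma-separated member lists).

```shell
# rename_host OLD NEW FILE
# Print FILE with OLD replaced by NEW on host_name, alias, and members lines
# whose value is exactly OLD. Other lines are printed unchanged.
rename_host() {
    awk -v old="$1" -v new="$2" '
        ($1 == "host_name" || $1 == "alias" || $1 == "members") && $2 == old {
            # RLENGTH is the length of the leading "  directive   " part,
            # so the original spacing is preserved.
            match($0, /^[ \t]*[a-z_]+[ \t]+/)
            $0 = substr($0, 1, RLENGTH) new substr($0, RLENGTH + 1 + length(old))
        }
        { print }
    ' "$3"
}
```

For instance, rename_host localhost "$(hostname)" localhost.cfg > localhost.cfg.new would produce a copy that uses the machine's own name.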

 

The contents of localhost.cfg are as follows:

###############################################################################
# LOCALHOST.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
# Last Modified: 05-31-2007
#
# NOTE: This config file is intended to serve as an *extremely* simple
# example of how you can create configuration entries to monitor
# the local (Linux) machine.
#
###############################################################################




###############################################################################
###############################################################################
#
# HOST DEFINITION
#
###############################################################################
###############################################################################

# Define a host for the local machine

define host{
    use                     linux-server    ; Name of host template to use
                                            ; This host definition will inherit all variables that are defined
                                            ; in (or inherited by) the linux-server host template definition.
    host_name               bogon
    alias                   bogon
    address                 127.0.0.1
}



###############################################################################
###############################################################################
#
# HOST GROUP DEFINITION
#
###############################################################################
###############################################################################

# Define an optional hostgroup for Linux machines

define hostgroup{
    hostgroup_name          linux-servers   ; The name of the hostgroup
    alias                   Linux Servers   ; Long name of the group
    members                 bogon           ; Comma separated list of hosts that belong to this group
}



###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################


# Define a service to "ping" the local machine

define service{
    use                     local-service   ; Name of service template to use
    host_name               bogon
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}


# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
    use                     local-service   ; Name of service template to use
    host_name               bogon
    service_description     Root Partition
    check_command           check_local_disk!20%!10%!/
}



# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 20 users, critical
# if > 50 users.

define service{
    use                     local-service   ; Name of service template to use
    host_name               bogon
    service_description     Current Users
    check_command           check_local_users!20!50
}


# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 users.

define service{
    use                     local-service   ; Name of service template to use
    host_name               bogon
    service_description     Total Processes
    check_command           check_local_procs!250!400!RSZDT
}



# Define a service to check the load on the local machine.

define service{
    use                     local-service   ; Name of service template to use
    host_name               bogon
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.
}



# Define a service to check the swap usage the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free

define service{
    use                     local-service   ; Name of service template to use
    host_name               bogon
    service_description     Swap Usage
    check_command           check_local_swap!20!10
}



# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
    use                     local-service   ; Name of service template to use
    host_name               bogon
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
}



# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
    use                     local-service   ; Name of service template to use
    host_name               bogon
    service_description     HTTP
    check_command           check_http
    notifications_enabled   0
}

 

 

 

With the basic configuration done, start the Apache and Nagios services.

    Start Apache

    [root@bogon conf]# /usr/local/apache/bin/apachectl start

    Start Nagios

    [root@bogon conf]# service nagios start
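Before running the start commands above, it can help to let Nagios check the edited files with its -v (verify configuration) option. The wrapper below is a minimal sketch, assuming a POSIX shell and the install paths used in this post; the function name verify_nagios_config is hypothetical.

```shell
# verify_nagios_config [NAGIOS_BIN] [MAIN_CFG]
# Run the Nagios pre-flight check and report whether it passed.
# Both arguments default to the install paths used in this post.
verify_nagios_config() {
    bin="${1:-/usr/local/nagios/bin/nagios}"
    cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    if "$bin" -v "$cfg" >/dev/null; then
        echo "configuration OK"
    else
        echo "configuration has errors; run $bin -v $cfg for details" >&2
        return 1
    fi
}
```

Running the check directly (nagios -v followed by the path to nagios.cfg) prints the individual warnings and errors, which is the quickest way to find a mistyped host name or a stray character in a path.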

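As the figures show, Nagios can now monitor the server's load, the number of logged-in users, the HTTP service, the SSH service, and so on.

[Figures: Nagios web interface showing the monitored host and its services]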

 

 

 

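Configuration problems

All sorts of odd problems come up while configuring Nagios. I will gradually collect the ones I have run into here; problems I have not met myself are not included.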

 

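Problem 1: With Nagios configured and the Apache and Nagios services started, pages such as Hosts and Services show garbled output, as in the figure below:

[Figure: garbled page output]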

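This happens because Apache does not have CGI script execution enabled. Go to the directory that holds Apache's main configuration file, edit httpd.conf, and uncomment the first two lines below. Restarting Apache afterwards resolves the problem.

#LoadModule cgid_module modules/mod_cgid.so

#LoadModule alias_module modules/mod_alias.so

#LoadModule actions_module modules/mod_actions.so   (not yet confirmed whether this one is needed)

Then add the following as the last line of the file, which fixes garbled Chinese characters:

AddDefaultCharset utf-8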
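Uncommenting the cgid_module and alias_module lines can also be done with a filter. This is a minimal sketch, assuming a POSIX shell with sed and that the lines look like the ones quoted above; the function name enable_cgi_modules is hypothetical, and the unconfirmed actions_module line is deliberately left alone.

```shell
# enable_cgi_modules FILE
# Print FILE with the leading "#" removed from the LoadModule lines for
# cgid_module and alias_module. All other lines are printed unchanged.
enable_cgi_modules() {
    sed -e 's/^#[[:space:]]*\(LoadModule[[:space:]]\{1,\}cgid_module[[:space:]]\)/\1/' \
        -e 's/^#[[:space:]]*\(LoadModule[[:space:]]\{1,\}alias_module[[:space:]]\)/\1/' "$1"
}
```

As with the other sketches, the output can be written to a new file and compared with httpd.conf before it replaces the original and Apache is restarted.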

Problem 2: Clicking the Map page produces the following error:

Not Found

The requested URL /nagios/cgi-bin/statusmap.cgi was not found on this server.

This error occurs because the gd-devel package is not installed; installing gd-devel fixes it.
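A quick way to confirm the diagnosis is to check whether statusmap.cgi was built at all. The sketch below assumes a POSIX shell and that the CGIs were installed under /usr/local/nagios/sbin, which matches the install prefix used in this post but is not stated in it; the function name has_statusmap is hypothetical.

```shell
# has_statusmap [CGI_DIR]
# Succeed if statusmap.cgi exists in CGI_DIR.
# CGI_DIR defaults to /usr/local/nagios/sbin (an assumption, see above).
has_statusmap() {
    [ -f "${1:-/usr/local/nagios/sbin}/statusmap.cgi" ]
}
```

If the file is missing, installing gd-devel alone is probably not enough: the status map CGI is only compiled when the GD library is found at build time, so Nagios most likely has to be configured and built again afterwards.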

Problem 3: Error: Could not open command file '/usr/local/nagios/var/rw/nagios.cmd'

nagios.cfg has the following to say about this file:

# EXTERNAL COMMAND FILE
# This is the file that Nagios checks for external command requests.
# It is also where the command CGI will write commands that are submitted
# by users, so it must be writeable by the user that the web server
# is running as (usually 'nobody').  Permissions should be set at the
# directory level instead of on the file, as the file is deleted every
# time its contents are processed.

The key point is that the user Apache runs as must have write permission. The permission should be set on the directory, because the file is deleted each time its contents are processed.

command_file=/usr/local/nagios/var/rw/nagios.cmd

First, check which user the Apache processes run as; on my machine it is daemon.

#ps -ef | grep http

root 50297 1 0 21:42 ? 00:00:00 /usr/local/apache//bin/httpd -k start
daemon 50298 50297 0 21:42 ? 00:00:00 /usr/local/apache//bin/httpd -k start
daemon 50299 50297 0 21:42 ? 00:00:00 /usr/local/apache//bin/httpd -k start
daemon 50300 50297 0 21:42 ? 00:00:00 /usr/local/apache//bin/httpd -k start
daemon 50301 50297 0 21:42 ? 00:00:00 /usr/local/apache//bin/httpd -k start
daemon 50425 50297 0 21:43 ? 00:00:00 /usr/local/apache//bin/httpd -k start
root 50909 3194 0 22:02 pts/1 00:00:00 grep http

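Note that this means the user of the child (worker) processes, not the parent process started by root.

What next? If the nagios process runs as user nagios with group nagios, then run the following (-a appends nagios to the supplementary groups of daemon instead of replacing them):

usermod -a -G nagios daemon
chmod g+s /path/to/nagiosdir/var/rw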
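After the usermod step above, it is worth checking that the group change actually applied. The helper below is a minimal sketch, assuming a POSIX shell; the function name user_in_group is hypothetical.

```shell
# user_in_group USER GROUP
# Succeed if USER belongs to GROUP (as primary or supplementary group).
user_in_group() {
    for g in $(id -Gn "$1" 2>/dev/null); do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}
```

On the machine described here, user_in_group daemon nagios should succeed once the change is in place. Running processes keep their old group list, so Apache needs a restart before the new membership takes effect.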