1. The Puppet master/agent model
Puppet accepts requests from agent clients by running the puppetmaster service on the master node; in /etc/puppet/manifests/site.pp, the classes each agent should apply are selected by the client's FQDN. The first time the puppet daemon starts, it initializes its runtime environment automatically, creating a local CA along with the server-side certificates, keys, and related files. Once initialization completes, puppet listens on the configured socket and waits for client connection requests. By default, the certificates and keys live under /var/lib/puppet/ssl/.
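For example, a node definition in site.pp keyed on an agent's FQDN looks like the following minimal sketch (the hostname and class name here are hypothetical; the real manifests for this lab are built later in the walkthrough):

node 'web1.example.com' {
    # everything this host should get comes from the named class
    include webserver
}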
About MCollective
MCollective is a scheduler: it addresses the performance and speed degradation caused by many puppet agents contacting the master at the same time. It can classify nodes by their attributes and run different tasks against different classes of nodes. It is also a control terminal from which you can drive both clients and servers, so the puppet agent no longer needs to run on a timer.
MCollective also uses a client/server architecture, with client and server communicating through middleware (a message broker).
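MCollective itself is not installed in this walkthrough, but to give a feel for the model, here are two example commands, assuming MCollective and its puppet agent plugin had been deployed (both are assumptions, not part of this lab):

mco ping                 # discover all nodes reachable through the middleware
mco puppet runonce       # trigger an immediate puppet run on matching nodes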
Puppet architecture and clustering
Puppet is usually deployed in a client/server architecture; once the number of agents grows large, the master runs into performance problems.
Common clustering schemes:
puppet + nginx
puppet + passenger + apache
How a Puppet cluster is built
A puppetmaster cluster:
An Active/Active high-availability cluster, spreading the agents' request load across multiple puppetmasters
A reverse proxy that distributes requests for port 8140 among multiple puppetmasters (see the sketch below)
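As a rough illustration of the reverse-proxy approach, a minimal nginx sketch follows. The upstream addresses and back-end ports are hypothetical, and a production front end would also have to terminate SSL and forward the client-certificate verification headers to the masters:

upstream puppetmasters {
    # hypothetical back-end masters listening on an internal port
    server 192.168.30.116:18140;
    server 192.168.30.117:18140;
}
server {
    listen 8140;
    location / {
        # SSL termination and client-cert headers omitted for brevity
        proxy_pass http://puppetmasters;
    }
}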
[Figure: schematic of the Puppet master/agent model]
2. Lab environment
192.168.30.116 OS:CentOS 6.4 x86_64 node1.luojianlong.com
192.168.30.117 OS:CentOS 6.4 x86_64 node2.luojianlong.com
192.168.30.119 OS:CentOS 6.4 x86_64 node3.luojianlong.com
Required packages:
puppet-2.7.23-1.el6.noarch.rpm
puppet-server-2.7.23-1.el6.noarch.rpm
facter-1.7.3-1.el6.x86_64.rpm
puppet-dashboard-1.2.23-1.el6.noarch.rpm
mysql-5.5.33-linux2.6-x86_64.tar.gz
First, install the master on node1.
# Set the hosts file on every node
[root@node1 ~]# cat /etc/hosts
192.168.30.116 node1.luojianlong.com
192.168.30.117 node2.luojianlong.com
192.168.30.119 node3.luojianlong.com
# Upgrade facter
[root@node1 ~]# rpm -Uvh facter-1.7.3-1.el6.x86_64.rpm
# Configure the EPEL repository
[root@node1 ~]# cat /etc/yum.repos.d/epel.repo
[epel]
name=epel
baseurl=http://mirrors.sohu.com/fedora-epel/6/$basearch/
gpgcheck=1
gpgkey=http://mirrors.sohu.com/fedora-epel/RPM-GPG-KEY-EPEL-6
# Install puppet and puppet-server
[root@node1 ~]# yum -y localinstall puppet-2.7.23-1.el6.noarch.rpm
[root@node1 ~]# yum -y localinstall puppet-server-2.7.23-1.el6.noarch.rpm
Install the puppet agent on node2 and node3.
[root@node2 ~]# rpm -Uvh facter-1.7.3-1.el6.x86_64.rpm
[root@node2 ~]# yum -y localinstall puppet-2.7.23-1.el6.noarch.rpm
[root@node3 ~]# rpm -Uvh facter-1.7.3-1.el6.x86_64.rpm
[root@node3 ~]# yum -y localinstall puppet-2.7.23-1.el6.noarch.rpm
Create and configure a module on node1.
[root@node1 ~]# mkdir -pv /etc/puppet/modules/nginx/{manifests,files,lib,templates,tests,spec}
mkdir: created directory `/etc/puppet/modules/nginx'
mkdir: created directory `/etc/puppet/modules/nginx/manifests'
mkdir: created directory `/etc/puppet/modules/nginx/files'
mkdir: created directory `/etc/puppet/modules/nginx/lib'
mkdir: created directory `/etc/puppet/modules/nginx/templates'
mkdir: created directory `/etc/puppet/modules/nginx/tests'
mkdir: created directory `/etc/puppet/modules/nginx/spec'
[root@node1 ~]# puppet module list
/etc/puppet/modules
└── nginx (???)
/usr/share/puppet/modules (no modules installed)
# Define init.pp in the nginx module
[root@node1 ~]# vi /etc/puppet/modules/nginx/manifests/init.pp
class nginx {
  package {'nginx':
    ensure => installed,
  }
}
# Define nginx_web.pp
[root@node1 ~]# vi /etc/puppet/modules/nginx/manifests/nginx_web.pp
class nginx::nginx_web inherits nginx {
  file {'/etc/nginx/nginx.conf':
    ensure  => file,
    source  => 'puppet:///modules/nginx/nginx-web.conf',
    mode    => '0644',
    owner   => 'root',
    group   => 'root',
    notify  => Service['nginx'],
    require => Package['nginx'],
  }
  service {'nginx':
    ensure => running,
  }
}
# Prepare the source file
[root@node1 ~]# cp /tmp/nginx.conf /etc/puppet/modules/nginx/files/nginx-web.conf
# Create site.pp to assign the class defined above
[root@node1 ~]# vi /etc/puppet/manifests/site.pp
node 'node2.luojianlong.com' {
  include nginx::nginx_web
}
node 'node3.luojianlong.com' {
  include nginx::nginx_web
}
The first start of the puppet master process can be done in the foreground (non-daemonized), with verbose output so the initialization can be observed. The run below walks through creating the local CA, the local host (acting as the puppet server) requesting a certificate from that CA, obtaining the certificate, and the CA removing the signed certificate request; after that, the service process starts and is ready to accept agent connections. Adding the --debug option to the command below produces even more detailed output.
[root@node1 ~]# puppet master --verbose --no-daemonize
info: Creating a new SSL key for ca
info: Creating a new SSL certificate request for ca
info: Certificate Request fingerprint (md5): E0:74:ED:BA:83:EC:6E:A7:1A:1F:89:B1:CC:81:C3:CE
notice: Signed certificate request for ca
notice: Rebuilding inventory file
info: Creating a new certificate revocation list
info: Creating a new SSL key for node1.luojianlong.com
info: Creating a new SSL certificate request for node1.luojianlong.com
info: Certificate Request fingerprint (md5): 05:F1:37:DE:6E:13:CA:32:46:5B:07:2A:05:DE:D1:12
notice: node1.luojianlong.com has a waiting certificate request
notice: Signed certificate request for node1.luojianlong.com
notice: Removing file Puppet::SSL::CertificateRequest node1.luojianlong.com at '/var/lib/puppet/ssl/ca/requests/node1.luojianlong.com.pem'
notice: Removing file Puppet::SSL::CertificateRequest node1.luojianlong.com at '/var/lib/puppet/ssl/certificate_requests/node1.luojianlong.com.pem'
notice: Starting Puppet master version 2.7.23
Note: if the puppet client process was previously started under another hostname, or initialization was completed before for any reason, the existing certificates will not match this run. In that case, empty the /var/lib/puppet/ssl/ directory before retrying the initialization.
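For example, on a node whose SSL state must be discarded:

[root@node2 ~]# service puppet stop
[root@node2 ~]# rm -rf /var/lib/puppet/ssl/*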
If the test run above completes without problems, stop the foreground process and start puppetmaster as a daemon. On CentOS 6 this is usually done with the following commands.
[root@node1 ~]# service puppetmaster start
Starting puppetmaster:                                     [  OK  ]
[root@node1 ~]# chkconfig puppetmaster on
Starting the puppet clients
On first start, a puppet agent requests a certificate from its designated puppet server and then completes subsequent connection requests. For the same reason as before, and for testing purposes, the first agent nodes joining the puppet deployment can be run in the foreground to observe their initialization, as the commands below show.
[root@node2 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose
info: Creating a new SSL key for node2.luojianlong.com
info: Caching certificate for ca
info: Creating a new SSL certificate request for node2.luojianlong.com
info: Certificate Request fingerprint (md5): 11:56:36:0D:A5:92:11:69:AC:66:46:1B:86:D9:B4:ED
[root@node3 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose
info: Creating a new SSL key for node3.luojianlong.com
info: Caching certificate for ca
info: Creating a new SSL certificate request for node3.luojianlong.com
info: Certificate Request fingerprint (md5): A3:70:BF:52:F9:11:DA:0F:09:8B:35:C6:FC:EB:87:14
At this point, use the puppet cert command on the server to manage the clients' certificate requests: the --list option shows the clients waiting for their certificates to be signed, --sign signs the certificate of a named node, and combining --sign with --all signs every pending request at once.
[root@node1 ~]# puppet cert --list
  "node2.luojianlong.com" (11:56:36:0D:A5:92:11:69:AC:66:46:1B:86:D9:B4:ED)
  "node3.luojianlong.com" (A3:70:BF:52:F9:11:DA:0F:09:8B:35:C6:FC:EB:87:14)
[root@node1 ~]# puppet cert --sign node2.luojianlong.com
notice: Signed certificate request for node2.luojianlong.com
notice: Removing file Puppet::SSL::CertificateRequest node2.luojianlong.com at '/var/lib/puppet/ssl/ca/requests/node2.luojianlong.com.pem'
[root@node1 ~]# puppet cert --sign node3.luojianlong.com
notice: Signed certificate request for node3.luojianlong.com
notice: Removing file Puppet::SSL::CertificateRequest node3.luojianlong.com at '/var/lib/puppet/ssl/ca/requests/node3.luojianlong.com.pem'
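Signing node by node is shown above; to sign every pending request in one step, the two options can be combined:

[root@node1 ~]# puppet cert --sign --all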
Once an agent node receives its signed certificate, it prints information similar to the following.
[root@node2 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose
info: Creating a new SSL key for node2.luojianlong.com
info: Caching certificate for ca
info: Creating a new SSL certificate request for node2.luojianlong.com
info: Certificate Request fingerprint (md5): 11:56:36:0D:A5:92:11:69:AC:66:46:1B:86:D9:B4:ED
info: Caching certificate for node2.luojianlong.com
notice: Starting Puppet client version 2.7.23
info: Caching certificate_revocation_list for ca
info: Caching catalog for node2.luojianlong.com
info: Applying configuration version '1389325340'
notice: /Stage[main]/Nginx/Package[nginx]/ensure: created
notice: /Stage[main]/Nginx::Nginx_web/Service[nginx]/ensure: ensure changed 'stopped' to 'running'
info: Creating state file /var/lib/puppet/state/state.yaml
notice: Finished catalog run in 10.22 seconds
[root@node3 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose
info: Creating a new SSL key for node3.luojianlong.com
info: Caching certificate for ca
info: Creating a new SSL certificate request for node3.luojianlong.com
info: Certificate Request fingerprint (md5): A3:70:BF:52:F9:11:DA:0F:09:8B:35:C6:FC:EB:87:14
info: Caching certificate for node3.luojianlong.com
notice: Starting Puppet client version 2.7.23
info: Caching certificate_revocation_list for ca
info: Caching catalog for node3.luojianlong.com
info: Applying configuration version '1389325340'
notice: /Stage[main]/Nginx/Package[nginx]/ensure: created
notice: /Stage[main]/Nginx::Nginx_web/Service[nginx]/ensure: ensure changed 'stopped' to 'running'
info: Creating state file /var/lib/puppet/state/state.yaml
notice: Finished catalog run in 17.83 seconds
Once the agent-side steps above work, store the value previously passed via --server in the agent's configuration file and run the puppet agent as a daemon. The configuration file is /etc/puppet/puppet.conf, and the server directive belongs in its [main] section. With that in place, puppet can be started as a service.
[root@node2 ~]# vi /etc/puppet/puppet.conf
server = node1.luojianlong.com
[root@node3 ~]# vi /etc/puppet/puppet.conf
server = node1.luojianlong.com
[root@node2 ~]# service puppet start
Starting puppet:                                           [  OK  ]
[root@node3 ~]# service puppet start
Starting puppet:                                           [  OK  ]
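For clarity, a minimal sketch of where the directive sits inside /etc/puppet/puppet.conf:

[main]
    # point this agent at the puppetmaster on node1
    server = node1.luojianlong.com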
Test from the clients again.
[root@node2 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose --test
info: Caching catalog for node2.luojianlong.com
info: Applying configuration version '1389325340'
notice: Finished catalog run in 0.97 seconds
[root@node3 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose --test
info: Caching catalog for node3.luojianlong.com
info: Applying configuration version '1389325340'
notice: Finished catalog run in 0.95 seconds
The output above shows that the agents can now establish connections to the master normally.
Check whether nginx was installed and started on node2 and node3.
[root@node2 ~]# rpm -q nginx
nginx-1.0.15-5.el6.x86_64
[root@node2 ~]# ps aux | grep nginx
root     19233  0.0  0.0  96432  1968 ?      Ss   12:18   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx    19234  0.0  0.0  96780  2612 ?      S    12:18   0:00 nginx: worker process
root     19515  0.0  0.0 103248   820 pts/0  S+   12:22   0:00 grep nginx
[root@node3 ~]# rpm -q nginx
nginx-1.0.15-5.el6.x86_64
[root@node3 ~]# ps aux | grep nginx
root      3082  0.0  0.0  96432  1968 ?      Ss   12:18   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx     3083  0.0  0.0  96780  2612 ?      S    12:18   0:00 nginx: worker process
root      3242  0.0  0.0 103248   824 pts/0  S+   12:22   0:00 grep nginx
nginx was installed and started as expected.
Automatic certificate signing
The master can be set to sign all certificate requests automatically. All that is needed is to create an autosign.conf file under /etc/puppet and point to it from /etc/puppet/puppet.conf.
[root@node1 ~]# cat > /etc/puppet/autosign.conf << EOF > *.luojianlong.com > EOF
[root@node1 ~]# vi /etc/puppet/puppet.conf
# Add a [master] section
[master]
    autosign = /etc/puppet/autosign.conf
[root@node1 ~]# service puppetmaster restart
Stopping puppetmaster:                                     [  OK  ]
Starting puppetmaster:                                     [  OK  ]
With this, requests from any machine in luojianlong.com are signed automatically. The puppet agent checks in with the master every half hour by default; to change the interval, set runinterval in the [agent] section of the client's /etc/puppet/puppet.conf (the value is in seconds; the default corresponds to the 30-minute check-in) and restart puppet.
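For example, a minimal sketch of shortening the interval to 10 minutes on an agent:

[root@node2 ~]# vi /etc/puppet/puppet.conf
[agent]
    runinterval = 600
[root@node2 ~]# service puppet restart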
Install and configure puppet-dashboard on node1:
[root@node1 ~]# yum -y install rubygem-rake ruby-mysql
[root@node1 ~]# yum localinstall puppet-dashboard-1.2.23-1.el6.noarch.rpm -y
[root@node1 ~]# gem install rake
Install MySQL on node1.
[root@node1 ~]# tar zxvf mysql-5.5.33-linux2.6-x86_64.tar.gz -C /usr/local/
[root@node1 ~]# ln -s /usr/local/mysql-5.5.33-linux2.6-x86_64 /usr/local/mysql
[root@node1 ~]# cd /usr/local/mysql
[root@node1 mysql]# useradd -r mysql
[root@node1 mysql]# mkdir /mydata/data -p
[root@node1 mysql]# chown -R root.mysql ./*
[root@node1 mysql]# chown -R mysql.mysql /mydata/data/
[root@node1 mysql]# cp support-files/mysql.server /etc/rc.d/init.d/mysqld
[root@node1 mysql]# chkconfig --add mysqld
[root@node1 mysql]# chkconfig mysqld on
[root@node1 mysql]# cp support-files/my-large.cnf /etc/my.cnf
[root@node1 mysql]# ./scripts/mysql_install_db --user=mysql --datadir=/mydata/data
[root@node1 mysql]# vi /etc/profile.d/mysql.sh
export PATH=/usr/local/mysql/bin:$PATH
[root@node1 mysql]# . /etc/profile.d/mysql.sh
[root@node1 mysql]# vi /etc/my.cnf
datadir = /mydata/data
innodb_file_per_table = 1
[root@node1 mysql]# service mysqld start
Starting MySQL..... SUCCESS!
Create the database and grant privileges.
mysql> create database dashboard character set utf8;
Query OK, 1 row affected (0.00 sec)
mysql> grant all privileges on dashboard.* to 'dashboard'@'localhost' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
Edit the production section of /usr/share/puppet-dashboard/config/database.yml.
[root@node1 ~]# vi /usr/share/puppet-dashboard/config/database.yml
production:
  host: 127.0.0.1
  database: dashboard
  username: dashboard
  password: 123456
  encoding: utf8
  adapter: mysql
[root@node1 ~]# cd /usr/share/puppet-dashboard/
[root@node1 puppet-dashboard]# rake gems:refresh_specs
# Create the tables the dashboard needs in its database
[root@node1 puppet-dashboard]# rake RAILS_ENV=production db:migrate
Test whether the server works.
[root@node1 ~]# /usr/share/puppet-dashboard/script/server -e production
=> Booting WEBrick
=> Rails 2.3.17 application starting on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
[2014-01-10 12:37:34] INFO  WEBrick 1.3.1
[2014-01-10 12:37:34] INFO  ruby 1.8.7 (2011-06-30) [x86_64-linux]
[2014-01-10 12:37:34] INFO  WEBrick::HTTPServer#start: pid=20641 port=3000
Open http://192.168.30.116:3000 in a browser.
Configure the puppet master and agents to send reports.
[root@node1 ~]# vi /etc/puppet/puppet.conf
# Add to the [master] section
reports = store, http
reporturl = http://192.168.30.116:3000/reports/upload
[root@node1 ~]# service puppetmaster restart
Stopping puppetmaster:                                     [  OK  ]
Starting puppetmaster:                                     [  OK  ]
[root@node2 ~]# vi /etc/puppet/puppet.conf
# Add to the [agent] section
report = true
[root@node2 ~]# service puppet restart
Stopping puppet:                                           [  OK  ]
Starting puppet:                                           [  OK  ]
# Do the same on node3: add the setting and restart puppet
Then start the dashboard in the background.
[root@node1 ~]# /usr/share/puppet-dashboard/script/server -e production -d
Open http://192.168.30.116:3000/ in a browser again.
If you see information like "# pending tasks" with a number greater than 0, the dashboard is receiving reports normally; whenever a task is delayed, it is recorded in the dashboard.
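Note that the dashboard queues incoming reports as background jobs; if the pending count only grows, a worker process needs to drain the queue. A minimal sketch, assuming the delayed_job rake tasks that puppet-dashboard 1.2 ships with:

[root@node1 ~]# cd /usr/share/puppet-dashboard
[root@node1 puppet-dashboard]# rake RAILS_ENV=production jobs:work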
Implementing puppet kick
By default, the puppet client talks to the server every 30 minutes, but sometimes we want the server to push urgent tasks to the clients right away. That is what puppet kick is for (known as puppetrun before puppet 2.6).
Edit /etc/puppet/puppet.conf on the client.
[root@node2 ~]# vi /etc/puppet/puppet.conf
# Add to the [agent] section
listen = true
# Edit (or create) /etc/puppet/namespaceauth.conf
[root@node2 ~]# vi /etc/puppet/namespaceauth.conf
[puppetrunner]
    allow *.luojianlong.com
Edit the auth.conf file.
[root@node2 ~]# vi /etc/puppet/auth.conf
# Add the following lines
path /run
method save
allow node1.luojianlong.com
[root@node2 ~]# service puppet restart
Stopping puppet:                                           [  OK  ]
Starting puppet:                                           [  OK  ]
[root@node2 ~]# netstat -anptl | grep ruby
tcp        0      0 0.0.0.0:8139        0.0.0.0:*           LISTEN      27053/ruby
Repeat the same steps on node3.
Run the command on the server.
[root@node1 ~]# puppet kick -a --host=node2.luojianlong.com
Triggering node2.luojianlong.com
Getting status
status is success
node2.luojianlong.com finished with exit code 0
Finished
[root@node1 ~]# puppet kick -a --host=node3.luojianlong.com
Triggering node3.luojianlong.com
Getting status
status is success
node3.luojianlong.com finished with exit code 0
Finished
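puppet kick can also trigger several nodes in one invocation: --host may be repeated, and --parallel caps how many nodes are kicked concurrently. A sketch (not part of the run above):

[root@node1 ~]# puppet kick --parallel 2 --host node2.luojianlong.com --host node3.luojianlong.com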
The push works as expected.