Prerequisites
Configure hosts
Add the following entries to /etc/hosts on every host:
192.168.245.105 scm-node1
192.168.245.106 scm-node2
192.168.245.107 scm-node3
Set hostnames
On 192.168.245.105, run:
sudo hostnamectl --static --transient set-hostname scm-node1
On 192.168.245.106, run:
sudo hostnamectl --static --transient set-hostname scm-node2
On 192.168.245.107, run:
sudo hostnamectl --static --transient set-hostname scm-node3
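The change can be verified on each host. Note that hostnamectl is only available on CentOS 7; on CentOS 6, set the hostname in /etc/sysconfig/network instead:
hostname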
Disable the firewall
CentOS-6
On scm-node2 and scm-node3, run:
sudo chkconfig iptables off
sudo service iptables stop
CentOS-7
On scm-node2 and scm-node3, run:
sudo systemctl disable firewalld
sudo systemctl stop firewalld
Disable SELinux
On scm-node2 and scm-node3, run:
sudo sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config
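The sed edit only takes effect after a reboot; to switch SELinux to permissive immediately for the current boot and confirm the state:
sudo setenforce 0
getenforce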
Install software
CDH and MySQL are already installed on scm-node2 and scm-node3.
Other configuration
NTP time synchronization and passwordless SSH trust between the hosts are already set up.
Install NFS
Online installation
On scm-node1, run:
sudo yum -y install nfs-utils rpcbind
Offline installation
CentOS-6 packages
keyutils-1.4-5.el6.x86_64.rpm
libevent-1.4.13-4.el6.x86_64.rpm
libgssglue-0.1-11.el6.x86_64.rpm
libtirpc-0.2.1-15.el6.x86_64.rpm
nfs-utils-1.2.3-78.el6_10.1.x86_64.rpm
nfs-utils-lib-1.1.5-13.el6.x86_64.rpm
python-argparse-1.2.1-2.1.el6.noarch.rpm
rpcbind-0.2.0-16.el6.x86_64.rpm
CentOS-7 packages
gssproxy-0.7.0-17.el7.x86_64.rpm
keyutils-1.5.8-3.el7.x86_64.rpm
libbasicobjects-0.1.1-29.el7.x86_64.rpm
libevent-2.0.21-4.el7.x86_64.rpm
libini_config-1.3.1-29.el7.x86_64.rpm
libcollection-0.7.0-29.el7.x86_64.rpm
libpath_utils-0.2.1-29.el7.x86_64.rpm
libnfsidmap-0.25-19.el7.x86_64.rpm
libref_array-0.1.5-29.el7.x86_64.rpm
libtirpc-0.2.4-0.10.el7.x86_64.rpm
libverto-libevent-0.2.5-4.el7.x86_64.rpm
nfs-utils-1.3.0-0.54.el7.x86_64.rpm
quota-nls-4.01-17.el7.noarch.rpm
rpcbind-0.2.0-44.el7.x86_64.rpm
quota-4.01-17.el7.x86_64.rpm
tcp_wrappers-7.6-77.el7.x86_64.rpm
Install all RPM packages
On scm-node1, change to the directory containing the packages and run:
sudo rpm -ivh *.rpm
Parameter reference
ro: the export is read-only;
rw: the export is readable and writable;
all_squash: map every client user to the anonymous user/group;
no_all_squash (default): match client users against local users first, and map to the anonymous user/group only when no match is found;
root_squash (default): map the client root user to the anonymous user/group;
no_root_squash: the client root user keeps root privileges;
anonuid=<UID>: local UID to use for anonymous access; defaults to nfsnobody (65534);
anongid=<GID>: local GID to use for anonymous access; defaults to nfsnobody (65534);
secure (default): clients may only connect from TCP/IP ports below 1024;
insecure: clients may connect from TCP/IP ports above 1024;
sync: write data to the memory buffer and to disk synchronously; slower, but guarantees consistency;
async: keep data in the memory buffer and write it to disk only when necessary;
wdelay (default): look for related write operations and execute them together, which improves efficiency;
no_wdelay: execute write operations immediately; should be combined with sync;
subtree_check (default): if the export is a subdirectory, the NFS server also checks the permissions of its parent directory;
no_subtree_check: skip the parent-directory check even when the export is a subdirectory, which improves efficiency;
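As an illustration of how these options combine, a typical /etc/exports entry (the path and client pattern here are placeholders; the real entries used by this setup appear later) would be:
/media/example scm-node*(rw,async,no_root_squash,no_subtree_check)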
Start NFS
On scm-node1, run:
sudo service rpcbind start
sudo service nfs start
sudo chkconfig rpcbind on
sudo chkconfig nfs on
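To confirm the services came up, the RPC programs registered with rpcbind can be listed on scm-node1; showmount prints the export list (still empty at this point, since the exports are configured later):
sudo rpcinfo -p | grep -E 'nfs|mountd'
sudo showmount -e localhost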
Install corosync+pacemaker
Online installation
On scm-node2 and scm-node3, run:
sudo yum install -y corosync pacemaker
Offline installation
CentOS-6 packages
corosync-1.4.7-6.el6.x86_64.rpm
corosynclib-1.4.7-6.el6.x86_64.rpm
ConsoleKit-0.4.1-6.el6.x86_64.rpm
ConsoleKit-libs-0.4.1-6.el6.x86_64.rpm
avahi-libs-0.6.25-17.el6.x86_64.rpm
cifs-utils-4.8.1-20.el6.x86_64.rpm
clusterlib-3.0.12.1-84.el6.x86_64.rpm
cman-3.0.12.1-84.el6.x86_64.rpm
cvs-1.11.23-16.el6.x86_64.rpm
cyrus-sasl-md5-2.1.23-15.el6_6.2.x86_64.rpm
dbus-1.2.24-9.el6.x86_64.rpm
dmidecode-2.12-7.el6.x86_64.rpm
eggdbus-0.6-3.el6.x86_64.rpm
fence-agents-4.0.15-13.el6_9.2.x86_64.rpm
fence-virt-0.2.3-24.el6.x86_64.rpm
gettext-0.17-18.el6.x86_64.rpm
gnutls-2.12.23-22.el6.x86_64.rpm
gnutls-utils-2.12.23-22.el6.x86_64.rpm
hal-0.5.14-14.el6.x86_64.rpm
hal-info-20090716-5.el6.noarch.rpm
hal-libs-0.5.14-14.el6.x86_64.rpm
hdparm-9.43-4.el6.x86_64.rpm
ipmitool-1.8.15-2.el6.x86_64.rpm
libgomp-4.4.7-23.el6.x86_64.rpm
libqb-0.17.1-2.el6.x86_64.rpm
libtalloc-2.1.5-1.el6_7.x86_64.rpm
libtdb-1.3.8-3.el6_8.2.x86_64.rpm
libtevent-0.9.26-2.el6_7.x86_64.rpm
libvirt-client-0.10.2-64.el6.x86_64.rpm
libxslt-1.1.26-2.el6_3.1.x86_64.rpm
libibverbs-1.1.8-4.el6.x86_64.rpm
libnl-1.1.4-2.el6.x86_64.rpm
librdmacm-1.0.21-0.el6.x86_64.rpm
lm_sensors-libs-3.1.1-17.el6.x86_64.rpm
modcluster-0.16.2-35.el6.x86_64.rpm
nc-1.84-24.el6.x86_64.rpm
net-snmp-utils-5.5-60.el6.x86_64.rpm
net-snmp-libs-5.5-60.el6.x86_64.rpm
numactl-2.0.9-2.el6.x86_64.rpm
oddjob-0.30-6.el6.x86_64.rpm
openais-1.1.1-7.el6.x86_64.rpm
openaislib-1.1.1-7.el6.x86_64.rpm
pacemaker-1.1.18-3.el6.x86_64.rpm
pacemaker-cli-1.1.18-3.el6.x86_64.rpm
pacemaker-cluster-libs-1.1.18-3.el6.x86_64.rpm
pacemaker-libs-1.1.18-3.el6.x86_64.rpm
parted-2.1-29.el6.x86_64.rpm
pciutils-3.1.10-4.el6.x86_64.rpm
perl-Net-Telnet-3.03-11.el6.noarch.rpm
perl-TimeDate-1.16-13.el6.noarch.rpm
pexpect-2.3-6.el6.noarch.rpm
pm-utils-1.2.5-11.el6.x86_64.rpm
polkit-0.96-11.el6.x86_64.rpm
pyOpenSSL-0.13.1-2.el6.x86_64.rpm
python-suds-0.4.1-3.el6.noarch.rpm
quota-3.17-23.el6.x86_64.rpm
rdma-6.9_4.1-3.el6.noarch.rpm
resource-agents-3.9.5-46.el6.x86_64.rpm
ricci-0.16.2-87.el6.x86_64.rpm
samba-common-3.6.23-51.el6.x86_64.rpm
samba-winbind-3.6.23-51.el6.x86_64.rpm
samba-winbind-clients-3.6.23-51.el6.x86_64.rpm
sg3_utils-1.28-13.el6.x86_64.rpm
sg3_utils-libs-1.28-13.el6.x86_64.rpm
tcp_wrappers-7.6-58.el6.x86_64.rpm
telnet-0.17-48.el6.x86_64.rpm
yajl-1.0.7-3.el6.x86_64.rpm
CentOS-7 packages
corosync-2.4.3-2.el7_5.1.x86_64.rpm
corosynclib-2.4.3-2.el7_5.1.x86_64.rpm
bc-1.06.95-13.el7.x86_64.rpm
cifs-utils-6.2-10.el7.x86_64.rpm
cups-libs-1.6.3-35.el7.x86_64.rpm
libldb-1.2.2-1.el7.x86_64.rpm
libtalloc-2.1.10-1.el7.x86_64.rpm
libtevent-0.9.33-2.el7.x86_64.rpm
libtdb-1.3.15-1.el7.x86_64.rpm
libwbclient-4.7.1-9.el7_5.x86_64.rpm
libcgroup-0.41-15.el7.x86_64.rpm
libxslt-1.1.28-5.el7.x86_64.rpm
libqb-1.0.1-6.el7.x86_64.rpm
pacemaker-cluster-libs-1.1.18-11.el7_5.3.x86_64.rpm
perl-TimeDate-2.30-2.el7.noarch.rpm
pacemaker-1.1.18-11.el7_5.3.x86_64.rpm
pacemaker-cli-1.1.18-11.el7_5.3.x86_64.rpm
psmisc-22.20-15.el7.x86_64.rpm
samba-common-4.7.1-9.el7_5.noarch.rpm
resource-agents-3.9.5-124.el7.x86_64.rpm
samba-common-libs-4.7.1-9.el7_5.x86_64.rpm
pacemaker-libs-1.1.18-11.el7_5.3.x86_64.rpm
samba-client-libs-4.7.1-9.el7_5.x86_64.rpm
Install all RPM packages
On scm-node2 and scm-node3, change to the directory containing the packages and run:
sudo rpm -ivh *.rpm
Install crmsh
Download crmsh
CentOS-6 download URL
http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/noarch/
CentOS-7 download URL
http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/noarch/
Offline installation
CentOS-6 packages
crmsh-3.0.0-6.1.noarch.rpm
crmsh-scripts-3.0.0-6.1.noarch.rpm
python-parallax-1.0.1-28.1.noarch.rpm
python-lxml-2.2.3-1.1.el6.x86_64.rpm
python-six-1.9.0-2.el6.noarch.rpm
python-dateutil-1.4.1-7.el6.noarch.rpm
redhat-rpm-config-9.0.3-51.el6.centos.noarch.rpm
CentOS-7 packages
libxslt-1.1.28-5.el7.x86_64.rpm
python-dateutil-1.5-7.el7.noarch.rpm
python-lxml-3.2.1-4.el7.x86_64.rpm
python-parallax-1.0.1-29.1.noarch.rpm
crmsh-scripts-3.0.0-6.2.noarch.rpm
crmsh-3.0.0-6.2.noarch.rpm
Install all RPM packages
On scm-node2 and scm-node3, change to the directory containing the packages and run:
sudo rpm -ivh *.rpm
Configure the corosync+pacemaker cluster
Configure corosync
Configure version 1.x (CentOS-6)
On scm-node2, run:
sudo vi /etc/corosync/corosync.conf
Contents of /etc/corosync/corosync.conf:
compatibility: whitetank
totem {
    version: 2
    secauth: off
    interface {
        member {
            memberaddr: scm-node2
        }
        member {
            memberaddr: scm-node3
        }
        ringnumber: 0
        bindnetaddr: scm-node2
        mcastport: 5405
    }
    transport: udpu
}
logging {
    fileline: off
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
service {
    name: pacemaker
    ver: 0
    use_mgmtd: yes
}
On scm-node3, run:
sudo vi /etc/corosync/corosync.conf
Contents of /etc/corosync/corosync.conf:
compatibility: whitetank
totem {
    version: 2
    secauth: off
    interface {
        member {
            memberaddr: scm-node2
        }
        member {
            memberaddr: scm-node3
        }
        ringnumber: 0
        bindnetaddr: scm-node3
        mcastport: 5405
    }
    transport: udpu
}
logging {
    fileline: off
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
service {
    name: pacemaker
    ver: 0
    use_mgmtd: yes
}
Configure version 2.x (CentOS-7)
On scm-node2 and scm-node3, run:
sudo vi /etc/corosync/corosync.conf
Contents of /etc/corosync/corosync.conf:
totem {
    version: 2
    secauth: off
    cluster_name: cmf
    transport: udpu
}
nodelist {
    node {
        ring0_addr: scm-node2
        nodeid: 1
    }
    node {
        ring0_addr: scm-node3
        nodeid: 2
    }
}
logging {
    fileline: off
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
quorum {
    provider: corosync_votequorum
    two_node: 1
}
Start the cluster
Start version 1.x (CentOS-6)
Start the service:
sudo service corosync start
Enable at boot:
sudo chkconfig corosync on
Start version 2.x (CentOS-7)
Start the services (corosync first, since pacemaker depends on it):
sudo systemctl start corosync
sudo systemctl start pacemaker
Enable at boot:
sudo systemctl enable corosync
sudo systemctl enable pacemaker
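Once the services are up on both nodes, membership and ring status can be verified with either version (run on either node; corosync-cfgtool ships with corosync, crm_mon with pacemaker):
sudo corosync-cfgtool -s
sudo crm_mon -1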
Configure the cluster
Disable quorum checking
On scm-node2, run:
sudo crm configure property no-quorum-policy=ignore
With only two nodes the cluster cannot reach a quorum once one node fails: the surviving node alone does not hold a majority of votes, and under the default policy it would refuse to run resources. Changing the policy to ignore lets the healthy node take over.
Disable STONITH
On scm-node2, run:
sudo crm configure property stonith-enabled=false
This setup has no STONITH device, so fencing has to be disabled.
Change the default stickiness
On scm-node2, run:
sudo crm configure rsc_defaults resource-stickiness=100
Some environments require that resources move between nodes as rarely as possible: a move usually means a period of downtime, and for heavy services such as an Oracle database that period can be long. To support this, Pacemaker has the concept of resource stickiness, which controls how strongly a service (resource) prefers to keep running on the node where it currently is; think of it as the "cost" of the downtime a move incurs. To place resources optimally, Pacemaker defaults this value to 0. A different stickiness can be set per resource, but changing the default is usually enough.
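For reference, a per-resource override can be set through crmsh's resource meta subcommand (a sketch; the value 200 is arbitrary and the resource is one defined later in this guide):
sudo crm resource meta cloudera-scm-server set resource-stickiness 200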
Verify the configuration
On scm-node2 and scm-node3, run:
sudo crm_verify -L -V
If the command prints nothing, the configuration is correct.
Common operations
Check cluster status
On scm-node2, run:
sudo crm status
Show resource configuration
On scm-node2, run:
sudo crm configure show
List resource agent classes
On scm-node2, run:
sudo crm ra classes
Move a resource
On scm-node2, run:
sudo crm resource move cloudera-scm-server scm-node3
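move works by adding a location constraint that pins the resource to the target node; once the move has completed, the constraint can be cleared so the cluster is again free to place the resource (a sketch using crmsh's unmove subcommand):
sudo crm resource unmove cloudera-scm-server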
Define a clone
Define a clone resource with 2 instances:
sudo crm configure clone mysql-cluster mysql clone-max=2 clone-node-max=2 notify=true
Stop the clone:
sudo crm resource stop mysql-cluster
Delete the clone:
sudo crm configure delete mysql-cluster
Notes:
1. mysql-cluster is a custom clone name.
2. The primitive mysql resource must be defined first; see 6.1.2.
Define a resource group
Defining a group ensures that mysql and cloudera-scm-server run on the same node:
sudo crm configure group server-group mysql cloudera-scm-server
Stop the group:
sudo crm resource stop server-group
Delete the group:
sudo crm configure delete server-group
Note: server-group is a custom group name.
Define resource constraints
Define a colocation constraint so that mysql and cloudera-scm-server must run on the same node:
sudo crm configure colocation mysql-with-cloudera-scm-server inf: mysql cloudera-scm-server
Define an ordering constraint so that cloudera-scm-server starts only after mysql:
sudo crm configure order mysql_before_cloudera-scm-server mandatory: mysql cloudera-scm-server
Define a location constraint pinning the vip resource to scm-node2:
sudo crm configure location vip_pref_node2 vip inf: scm-node2
Delete the constraints:
sudo crm configure delete mysql-with-cloudera-scm-server
sudo crm configure delete mysql_before_cloudera-scm-server
sudo crm configure delete vip_pref_node2
Note: mysql-with-cloudera-scm-server, mysql_before_cloudera-scm-server, and vip_pref_node2 are custom constraint names.
Delete a cluster resource
sudo crm resource stop vip
sudo crm configure delete vip
Note: a resource must be stopped before it can be deleted.
Configure the CDH HA cluster
Configure NFS
Configure mysql
On scm-node1, create the directory:
sudo mkdir -p /media/mysql
On scm-node1, set permissions on the directory:
sudo chmod 666 /media/mysql
On scm-node1, configure the export:
sudo vi /etc/exports
Add to /etc/exports:
/media/mysql scm-node*(rw,async,no_root_squash,no_subtree_check)
Note: scm-node* allows any host whose hostname starts with scm-node to access /media/mysql.
Apply the configuration:
sudo exportfs -r
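The active export list, with the effective options, can then be confirmed on scm-node1:
sudo exportfs -v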
On scm-node2 and scm-node3, mount the directory:
sudo vi /etc/fstab
Append to /etc/fstab:
scm-node1:/media/mysql /var/lib/mysql nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0
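With the fstab entry in place, the share can be mounted and verified on scm-node2 and scm-node3 (mount -a applies all fstab entries):
sudo mount -a
df -h /var/lib/mysql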
Configure cloudera-scm-server
On scm-node1, create the directory:
sudo mkdir -p /media/cloudera-scm-server
On scm-node1, set permissions on the directory:
sudo chmod 666 /media/cloudera-scm-server
Configure the export:
sudo vi /etc/exports
Add to /etc/exports:
/media/cloudera-scm-server scm-node*(rw,async,no_root_squash,no_subtree_check)
Apply the configuration:
sudo exportfs -r
On scm-node2 and scm-node3, set ownership on the directory:
sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-scm-server
On scm-node2 and scm-node3, mount the directory:
sudo vi /etc/fstab
Append to /etc/fstab:
scm-node1:/media/cloudera-scm-server /var/lib/cloudera-scm-server nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0
Configure cloudera-scm-agent
On scm-node1, create the directories:
sudo mkdir -p /media/cloudera-scm-agent
sudo mkdir -p /media/cloudera-host-monitor
sudo mkdir -p /media/cloudera-scm-eventserver
sudo mkdir -p /media/cloudera-service-monitor
On scm-node1, set permissions on the directories:
sudo chmod 666 /media/cloudera-scm-agent
sudo chmod 666 /media/cloudera-host-monitor
sudo chmod 666 /media/cloudera-scm-eventserver
sudo chmod 666 /media/cloudera-service-monitor
Configure the exports:
sudo vi /etc/exports
Add to /etc/exports:
/media/cloudera-scm-agent scm-node*(rw,async,no_root_squash,no_subtree_check)
/media/cloudera-host-monitor scm-node*(rw,async,no_root_squash,no_subtree_check)
/media/cloudera-scm-eventserver scm-node*(rw,async,no_root_squash,no_subtree_check)
/media/cloudera-service-monitor scm-node*(rw,async,no_root_squash,no_subtree_check)
Apply the configuration:
sudo exportfs -r
On scm-node2 and scm-node3, set ownership on the directories:
sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-scm-agent
sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-host-monitor
sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-scm-eventserver
sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-service-monitor
On scm-node2 and scm-node3, mount the directories:
sudo vi /etc/fstab
Append to /etc/fstab:
scm-node1:/media/cloudera-scm-agent /var/lib/cloudera-scm-agent nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0
scm-node1:/media/cloudera-host-monitor /var/lib/cloudera-host-monitor nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0
scm-node1:/media/cloudera-scm-eventserver /var/lib/cloudera-scm-eventserver nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0
scm-node1:/media/cloudera-service-monitor /var/lib/cloudera-service-monitor nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0
In the CDH admin console, do the following:
1. Stop the Cloudera Management Service, then delete it.
2. Disable the HTTP Referer check.
Add resources
Add the VIP resource
On scm-node2, run:
sudo crm configure primitive vip ocf:heartbeat:IPaddr2 params ip='192.168.245.165' op monitor interval=5s timeout=20s on-fail=restart
Note: vip is a custom resource name, and the IP must be on the same subnet as the hosts.
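After the resource is created, the address should appear on the active node (a quick check, run on scm-node2):
ip addr show | grep 192.168.245.165
sudo crm status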
Add the mysql resource
On scm-node2 and scm-node3, run (CentOS-6):
sudo chkconfig mysql off
On scm-node2 and scm-node3, run (CentOS-7):
sudo systemctl disable mysqld
On scm-node2, run (CentOS-6):
sudo crm configure primitive mysql lsb:mysql op monitor interval=20s timeout=100s on-fail=restart
On scm-node2, run (CentOS-7):
sudo crm configure primitive mysql systemd:mysqld op monitor interval=20s timeout=100s on-fail=restart
Add the cloudera-scm-server resource
On scm-node2 and scm-node3, run (CentOS-6):
sudo chkconfig cloudera-scm-server off
On scm-node2 and scm-node3, run (CentOS-7):
sudo systemctl disable cloudera-scm-server
On scm-node2 and scm-node3, edit db.properties:
sudo vi /etc/cloudera-scm-server/db.properties
Change in /etc/cloudera-scm-server/db.properties:
com.cloudera.cmf.db.host=192.168.245.165
On scm-node2, add the cloudera-scm-server resource:
sudo crm configure primitive cloudera-scm-server lsb:cloudera-scm-server op monitor interval=20s timeout=40s on-fail=restart
Add the cloudera-scm-agent resource
On scm-node2 and scm-node3, run (CentOS-6):
sudo chkconfig cloudera-scm-agent off
On scm-node2 and scm-node3, run (CentOS-7):
sudo systemctl disable cloudera-scm-agent
On scm-node2 and scm-node3, create the directory:
sudo mkdir -p /usr/lib/ocf/resource.d/cm
On scm-node2 and scm-node3, create the file:
sudo vi /usr/lib/ocf/resource.d/cm/cloudera-scm-agent
Contents of /usr/lib/ocf/resource.d/cm/cloudera-scm-agent (CentOS-6):
#!/bin/sh
#######################################################################
# CM Agent OCF script
#######################################################################
#######################################################################
# Initialization:
: ${__OCF_ACTION=$1}
OCF_SUCCESS=0
OCF_ERROR=1
OCF_ERR_UNIMPLEMENTED=3   # OCF exit code for unimplemented actions (used by the default case below)
OCF_STOPPED=7
#######################################################################
meta_data() {
    cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Cloudera Manager Agent" version="1.0">
<version>1.0</version>
<longdesc lang="en">
This OCF agent handles simple monitoring, start, stop of the Cloudera
Manager Agent, intended for use with Pacemaker/corosync for failover.
</longdesc>
<shortdesc lang="en">Cloudera Manager Agent OCF script</shortdesc>
<parameters />
<actions>
<action name="start" timeout="20" />
<action name="stop" timeout="20" />
<action name="monitor" timeout="20" interval="10" depth="0"/>
<action name="meta-data" timeout="5" />
</actions>
</resource-agent>
END
}
#######################################################################
agent_usage() {
    cat <<END
usage: $0 {start|stop|monitor|meta-data}
Cloudera Manager Agent HA OCF script - used for managing Cloudera Manager
Agent and managed processes lifecycle for use with Pacemaker.
END
}
agent_start() {
    service cloudera-scm-agent start
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}
agent_stop() {
    service cloudera-scm-agent hard_stop_confirmed
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}
agent_monitor() {
    # Monitor _MUST!_ differentiate correctly between running
    # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
    # That is THREE states, not just yes/no.
    service cloudera-scm-agent status
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_STOPPED
}
case $__OCF_ACTION in
meta-data)  meta_data
            exit $OCF_SUCCESS
            ;;
start)      agent_start;;
stop)       agent_stop;;
monitor)    agent_monitor;;
usage|help) agent_usage
            exit $OCF_SUCCESS
            ;;
*)          agent_usage
            exit $OCF_ERR_UNIMPLEMENTED
            ;;
esac
rc=$?
exit $rc
Contents of /usr/lib/ocf/resource.d/cm/cloudera-scm-agent (CentOS-7):
#!/bin/sh
#######################################################################
# CM Agent OCF script
#######################################################################
#######################################################################
# Initialization:
: ${__OCF_ACTION=$1}
OCF_SUCCESS=0
OCF_ERROR=1
OCF_ERR_UNIMPLEMENTED=3   # OCF exit code for unimplemented actions (used by the default case below)
OCF_STOPPED=7
#######################################################################
meta_data() {
    cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Cloudera Manager Agent" version="1.0">
<version>1.0</version>
<longdesc lang="en">
This OCF agent handles simple monitoring, start, stop of the Cloudera
Manager Agent, intended for use with Pacemaker/corosync for failover.
</longdesc>
<shortdesc lang="en">Cloudera Manager Agent OCF script</shortdesc>
<parameters />
<actions>
<action name="start" timeout="20" />
<action name="stop" timeout="20" />
<action name="monitor" timeout="20" interval="10" depth="0"/>
<action name="meta-data" timeout="5" />
</actions>
</resource-agent>
END
}
#######################################################################
agent_usage() {
    cat <<END
usage: $0 {start|stop|monitor|meta-data}
Cloudera Manager Agent HA OCF script - used for managing Cloudera Manager
Agent and managed processes lifecycle for use with Pacemaker.
END
}
agent_start() {
    service cloudera-scm-agent start
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}
agent_stop() {
    service cloudera-scm-agent next_stop_hard
    service cloudera-scm-agent stop
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}
agent_monitor() {
    # Monitor _MUST!_ differentiate correctly between running
    # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
    # That is THREE states, not just yes/no.
    service cloudera-scm-agent status
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_STOPPED
}
case $__OCF_ACTION in
meta-data)  meta_data
            exit $OCF_SUCCESS
            ;;
start)      agent_start;;
stop)       agent_stop;;
monitor)    agent_monitor;;
usage|help) agent_usage
            exit $OCF_SUCCESS
            ;;
*)          agent_usage
            exit $OCF_ERR_UNIMPLEMENTED
            ;;
esac
rc=$?
exit $rc
On scm-node2 and scm-node3, set permissions on the file:
sudo chmod 770 /usr/lib/ocf/resource.d/cm/cloudera-scm-agent
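Before handing the script to Pacemaker, it can be exercised by hand exactly the way the cluster will call it (exit code 0 means running, 7 means cleanly stopped):
sudo /usr/lib/ocf/resource.d/cm/cloudera-scm-agent monitor; echo $?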
On scm-node2 and scm-node3, edit config.ini:
sudo vi /etc/cloudera-scm-agent/config.ini
Change in /etc/cloudera-scm-agent/config.ini:
server_host=192.168.245.165
lib_dir=/var/lib/cloudera-scm-agent
On scm-node2, add the cloudera-scm-agent resource:
sudo crm configure primitive cloudera-scm-agent ocf:cm:cloudera-scm-agent op monitor interval=20s timeout=40s on-fail=restart
Note: the cloudera-scm-agent resource exists mainly so that the Cloudera Management Service fails over as well; it can be omitted if that is not needed.
Define constraints
On scm-node2, define the CDH group:
sudo crm configure group cdh-group vip mysql cloudera-scm-server cloudera-scm-agent
On scm-node2, define the startup-order constraints:
sudo crm configure order vip_before_mysql mandatory: vip mysql
sudo crm configure order mysql_before_cloudera-scm-server mandatory: mysql cloudera-scm-server
sudo crm configure order cloudera-scm-server_before_cloudera-scm-agent mandatory: cloudera-scm-server cloudera-scm-agent
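Finally, failover can be rehearsed by putting the active node into standby, watching the cdh-group resources move to the other node, and then bringing the node back (a sketch assuming scm-node2 is currently active):
sudo crm node standby scm-node2
sudo crm status
sudo crm node online scm-node2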