Purpose of this document
Cloudera provides three ways to install a CDH cluster: parcels, packages, and tarballs. Parcels are the officially recommended and most widely used method, and a cluster is usually installed through the Cloudera Manager (CM) web UI. This document covers another officially supported method: installing CDH from packages, that is, rpm packages, without using CM.
Environment:
·All installation steps are run as the root user
·CDH version: 5.10.0
·Operating system: RedHat 7.2
·CM is not used
·The CDH cluster is installed on three nodes
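Pre-installation preparation
2.1 Server settings
Some preparation is required before installing a CDH cluster. The environment used here was already prepared as follows:
1. hosts and hostname are configured correctly
2. IPv6 is disabled and a static IP is configured
3. SELinux is disabled
4. The firewall is turned off
5. swappiness is set to 1
6. Transparent huge pages are disabled
7. NTP time synchronization is configured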
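Items 3 to 6 can be spot-checked without changing anything on the host. The following is a minimal, read-only sketch; the `show` helper is a hypothetical name introduced only for this illustration, and the paths are the usual Linux locations for these settings.

```shell
# Read-only checks for prerequisites 3-6; nothing on the host is modified.
# show LABEL COMMAND...: print the first line the command outputs, or a fallback.
show() {
  label=$1; shift
  out=$("$@" 2>/dev/null | head -n 1)
  if [ -n "$out" ]; then
    echo "$label $out"
  else
    echo "$label (not available)"
  fi
}

show "SELinux mode:"          getenforce
show "firewalld state:"       systemctl is-active firewalld
show "vm.swappiness:"         cat /proc/sys/vm/swappiness
show "transparent hugepages:" cat /sys/kernel/mm/transparent_hugepage/enabled
```

On a prepared node one would expect SELinux to report Disabled or Permissive, firewalld to be inactive, swappiness to be 1, and the transparent huge page setting to show `[never]` as the selected value.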
2.2 Configuring a local yum repository
1. Download the required rpm packages from the official archive:
http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.10.0/RPMS/
Download all of the rpm packages listed there to the server, then verify in a browser that the directory is served over HTTP.
2. Run the createrepo command
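A minimal sketch of this step, assuming the rpm files were placed in `/var/www/html/cdh_rpm` (a guess based on the `baseurl` used in the repo file, not a path stated in this walkthrough). The `run` helper and its `DRY_RUN` switch are hypothetical and exist only so the snippet prints the commands instead of executing them by default.

```shell
# Assumption: directory holding the downloaded rpm files, served by the web server.
REPO_DIR=${REPO_DIR:-/var/www/html/cdh_rpm}

# run CMD...: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

run yum -y install createrepo   # tool that builds the repository metadata
run createrepo "$REPO_DIR"      # writes the metadata to $REPO_DIR/repodata/
```

Setting `DRY_RUN=0` as root runs the commands for real.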
3. Create the repo file under /etc/yum.repos.d/
[rpmrepo]
name = rpm_repo
baseurl = http://192.168.0.178/cdh_rpm/
enabled = true
gpgcheck = false
4. Run yum to check that the local repository is configured correctly
yum clean all
yum repolist
The output confirms that the local yum repository built from the downloaded rpm packages was set up successfully.
CDH component installation
3.1 ZooKeeper
1. Install ZooKeeper on all nodes
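A minimal sketch of the install command, assuming the CDH 5 package names `zookeeper` and `zookeeper-server`; the package list is an assumption rather than something stated in this walkthrough, and the `run` helper with its `DRY_RUN` switch is purely illustrative.

```shell
# run CMD...: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Assumed package names; repeat on each of the three nodes.
run yum -y install zookeeper zookeeper-server
```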
2. Create the data directory and change its owner
mkdir -p /var/lib/zookeeper
chown -R zookeeper /var/lib/zookeeper
3. Edit the configuration file /etc/zookeeper/conf/zoo.cfg
maxClientCnxns=60
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
dataLogDir=/var/lib/zookeeper
minSessionTimeout=4000
maxSessionTimeout=40000
server.1=cdh178.macro.com:3181:4181
server.2=cdh177.macro.com:3181:4181
server.3=cdh176.macro.com:3181:4181
Save the changes and sync the file to all nodes
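One way to sync a file is a small scp loop. This is a sketch under assumptions: it is run as root on cdh178.macro.com, passwordless SSH to the other two hosts is available, and `sync_file`, `OTHER_NODES`, and `run` are hypothetical names introduced for illustration.

```shell
# Assumption: the file is edited on cdh178; these are the remaining hosts from zoo.cfg.
OTHER_NODES=${OTHER_NODES:-"cdh177.macro.com cdh176.macro.com"}

# run CMD...: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# sync_file PATH: copy PATH to the same location on every other node.
sync_file() {
  for node in $OTHER_NODES; do
    run scp "$1" "root@$node:$1"
  done
}

sync_file /etc/zookeeper/conf/zoo.cfg
```

The same helper can be reused for any configuration file that has to be identical on all nodes.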
4. On every node, create the myid file and change its owner
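Each node's myid must match the N of its `server.N=` line in zoo.cfg. The sketch below derives that number from the configuration file; `myid_for` is a hypothetical helper, and the snippet only prints the id, leaving the actual write as commented commands.

```shell
# myid_for HOST [CFG]: print the N of the "server.N=HOST:..." line in zoo.cfg.
# Note: dots in HOST are treated as regex wildcards, which is fine for a sketch.
myid_for() {
  host=$1
  cfg=${2:-/etc/zookeeper/conf/zoo.cfg}
  sed -n "s/^server\.\([0-9][0-9]*\)=$host:.*/\1/p" "$cfg"
}

# Print the id that would be written for this host.
id=$(myid_for "$(hostname -f)" 2>/dev/null)
echo "myid for this host: ${id:-<host not listed in zoo.cfg>}"

# To apply, as root on each node:
#   echo "$id" > /var/lib/zookeeper/myid
#   chown zookeeper /var/lib/zookeeper/myid
```

With the zoo.cfg shown earlier, cdh178.macro.com maps to 1, cdh177.macro.com to 2, and cdh176.macro.com to 3.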
5. Start ZooKeeper on all nodes
/usr/lib/zookeeper/bin/zkServer.sh start
Check the status on every node; all three nodes started successfully
/usr/lib/zookeeper/bin/zkServer.sh status
ZooKeeper is now installed
3.2 HDFS
1. Install the packages HDFS requires on all nodes. Because there are only three nodes, every node runs a DataNode
yum -y install hadoop hadoop-hdfs hadoop-client hadoop-doc hadoop-debuginfo hadoop-hdfs-datanode
2. Install the NameNode and SecondaryNameNode on one node
yum -y install hadoop-hdfs-namenode hadoop-hdfs-secondarynamenode
3. Create the data directories and set their owner and permissions
Create the DataNode directory on all nodes
mkdir -p /data0/dfs/dn
chown -R hdfs:hadoop /data0/dfs/dn
chmod 700 /data0/dfs/dn
Create the data directories on the NameNode and SecondaryNameNode node
mkdir -p /data0/dfs/nn
chown -R hdfs:hadoop /data0/dfs/nn
chmod 700 /data0/dfs/nn
mkdir -p /data0/dfs/snn
chown -R hdfs:hadoop /data0/dfs/snn
chmod 700 /data0/dfs/snn
4. Edit the configuration files
/etc/hadoop/conf/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cdh178.macro.com:8020</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>1</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.DeflateCodec,org.apache.hadoop.io.compress.SnappyCodec,org.apache.hadoop.io.compress.Lz4Codec</value>
  </property>
</configuration>
/etc/hadoop/conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data0/dfs/nn</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data0/dfs/dn</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address</name>
    <value>cdh178.macro.com:8022</value>
  </property>
  <property>
    <name>dfs.https.address</name>
    <value>cdh178.macro.com:9871</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>cdh178.macro.com:50090</value>
  </property>
  <property>
    <name>dfs.https.port</name>
    <value>9871</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>cdh178.macro.com:9870</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///data0/dfs/snn</value>
  </property>
</configuration>
5. Save the modified configuration files and sync them to all nodes
6. Format the NameNode
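Formatting is done once, on the NameNode host only, with the standard `hdfs namenode -format` command run as the hdfs user. The sketch below adds a guard that skips formatting when the metadata directory already contains a `current` subdirectory; `format_namenode`, `NN_DIR`, and `run` are hypothetical names, and the snippet prints the command by default instead of executing it.

```shell
# dfs.namenode.name.dir from hdfs-site.xml, without the file:// prefix.
NN_DIR=${NN_DIR:-/data0/dfs/nn}

# run CMD...: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# format_namenode: format only if the metadata directory has not been initialized.
format_namenode() {
  if [ -d "$NN_DIR/current" ]; then
    echo "skip: $NN_DIR/current exists, the NameNode looks already formatted"
    return 0
  fi
  run sudo -u hdfs hdfs namenode -format
}

format_namenode
```

The guard matters because formatting an existing NameNode discards the file system metadata.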
7. Start HDFS: run the NameNode and SecondaryNameNode commands on the node where those roles are installed, and the DataNode commands on every node
systemctl start hadoop-hdfs-namenode
systemctl start hadoop-hdfs-secondarynamenode
systemctl start hadoop-hdfs-datanode
systemctl status hadoop-hdfs-namenode
systemctl status hadoop-hdfs-secondarynamenode
systemctl status hadoop-hdfs-datanode
8. Create the /tmp directory in HDFS and set its permissions, then list it with the hadoop command to confirm it was created
sudo -u hdfs hadoop fs -mkdir /tmp
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
9. Open the NameNode web UI (cdh178.macro.com:9870, as set by dfs.namenode.http-address)
HDFS is now installed
3.3 Yarn
1. Install the Yarn packages: the ResourceManager and JobHistory Server on one node, and the NodeManager on the other two nodes
On the ResourceManager and JobHistory Server node:
yum -y install hadoop-yarn hadoop-yarn-resourcemanager hadoop-mapreduce-historyserver hadoop-yarn-proxyserver hadoop-mapreduce
On the NodeManager nodes:
yum -y install hadoop-yarn hadoop-yarn-nodemanager hadoop-mapreduce
2. Create the directories and set their owner and permissions
Create the local directories on all nodes
mkdir -p /data0/yarn/nm
chown yarn:hadoop /data0/yarn/nm
mkdir -p /data0/yarn/container-logs
chown yarn:hadoop /data0/yarn/container-logs
Create the logs directory in HDFS
sudo -u hdfs hdfs dfs -mkdir /tmp/logs
sudo -u hdfs hdfs dfs -chown mapred:hadoop /tmp/logs
sudo -u hdfs hdfs dfs -chmod 1777 /tmp/logs
Create the /user/history directory in HDFS
sudo -u hdfs hdfs dfs -mkdir -p /user
sudo -u hdfs hdfs dfs -chmod 777 /user
sudo -u hdfs hdfs dfs -mkdir -p /user/history
sudo -u hdfs hdfs dfs -chown mapred:hadoop /user/history
sudo -u hdfs hdfs dfs -chmod 1777 /user/history
sudo -u hdfs hdfs dfs -mkdir -p /user/history/done
sudo -u hdfs hdfs dfs -mkdir -p /user/history/done_intermediate
sudo -u hdfs hdfs dfs -chown -R mapred:hadoop /user/history
sudo -u hdfs hdfs dfs -chmod 771 /user/history/done
sudo -u hdfs hdfs dfs -chmod 1777 /user/history/done_intermediate
3. Edit the configuration files
/etc/hadoop/conf/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>file:///data0/yarn/nm</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>file:///data0/yarn/container-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>cdh178.macro.com:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>cdh178.macro.com:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>cdh178.macro.com:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>cdh178.macro.com:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>cdh178.macro.com:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address</name>
    <value>cdh178.macro.com:8090</value>
  </property>
</configuration>
/etc/hadoop/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>cdh178.macro.com:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>cdh178.macro.com:19888</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.https.address</name>
    <value>cdh178.macro.com:19890</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.admin.address</name>
    <value>cdh178.macro.com:10033</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>
</configuration>
/etc/hadoop/conf/core-site.xml (only the added properties are shown)
<property>
  <name>hadoop.proxyuser.mapred.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.mapred.hosts</name>
  <value>*</value>
</property>
4. Save the configuration files and sync them to all nodes
5. Start the Yarn services
Start the MapReduce JobHistory Server on the JobHistory Server node
/etc/init.d/hadoop-mapreduce-historyserver start
Start the ResourceManager on the ResourceManager node
systemctl start hadoop-yarn-resourcemanager
systemctl status hadoop-yarn-resourcemanager
Start the NodeManager on the NodeManager nodes
systemctl start hadoop-yarn-nodemanager
systemctl status hadoop-yarn-nodemanager
6. Open the Yarn web UIs
Yarn management page (cdh178.macro.com:8088, as set by yarn.resourcemanager.webapp.address)
JobHistory management page (cdh178.macro.com:19888, as set by mapreduce.jobhistory.webapp.address)
View the nodes that are online
7. Run the MapReduce example program
The example is run as the root user, so first create a home directory for root in HDFS
sudo -u hdfs hdfs dfs -mkdir /user/root
sudo -u hdfs hdfs dfs -chown root:root /user/root
Run the MapReduce example; it completes successfully
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 5 5
Yarn is now installed
3.4 Spark
1. Install the packages Spark requires
yum install spark-core spark-master spark-worker spark-history-server spark-python
2. Create the directories and set their owner and permissions
sudo -u hdfs hadoop fs -mkdir /user/spark
sudo -u hdfs hadoop fs -mkdir /user/spark/applicationHistory
sudo -u hdfs hadoop fs -chown -R spark:spark /user/spark
sudo -u hdfs hadoop fs -chmod 1777 /user/spark/applicationHistory
3. Edit the configuration file /etc/spark/conf/spark-defaults.conf
spark.eventLog.enabled=true
spark.eventLog.dir=hdfs://cdh178.macro.com:8020/user/spark/applicationHistory
spark.yarn.historyServer.address=http://cdh178.macro.com:18088
4. Start spark-history-server
systemctl start spark-history-server
systemctl status spark-history-server
Open the web UI (http://cdh178.macro.com:18088, as set in spark-defaults.conf)
5. Sync the modified configuration file to all nodes
6. Start Spark
Start spark-master on the Master node
systemctl start spark-master
systemctl status spark-master
Start spark-worker on all nodes
systemctl start spark-worker
systemctl status spark-worker
7. Test Spark
Spark is now installed
3.5 Hive
1. Before installing Hive, install MySQL as the metastore database and create the database and user that the service needs:
create database metastore default character set utf8;
CREATE USER 'hive'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
FLUSH PRIVILEGES;
2. Install the Hive packages
Install hive-metastore on the NameNode node
yum -y install hive-metastore
Install the other required packages on all nodes
yum -y install hive hive-server2 hive-jdbc hive-hbase
3. Create the directories
Create the directories in HDFS and set their permissions and owner
sudo -u hdfs hadoop fs -mkdir /user/hive
sudo -u hdfs hadoop fs -chown hive:hive /user/hive
sudo -u hdfs hadoop fs -mkdir /user/hive/warehouse
sudo -u hdfs hadoop fs -chmod 1777 /user/hive/warehouse
sudo -u hdfs hadoop fs -chown hive:hive /user/hive/warehouse
4. Edit the configuration files
/etc/hive/conf/hive-site.xml
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://cdh178.macro.com:3306/metastore?useUnicode=true&amp;characterEncoding=UTF-8</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
  </property>
  <property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>cdh178.macro.com:8031</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>hive.exec.reducers.max</name>
    <value>1099</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.warehouse.subdir.inherit.perms</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.server.min.threads</name>
    <value>200</value>
  </property>
  <property>
    <name>hive.metastore.server.max.threads</name>
    <value>100000</value>
  </property>
  <property>
    <name>hive.metastore.client.socket.timeout</name>
    <value>3600</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>cdh178.macro.com,cdh177.macro.com,cdh176.macro.com</value>
  </property>
  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>
</configuration>
/etc/hadoop/conf/core-site.xml (only the added properties are shown)
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
5. Sync the configuration files to all nodes
6. Create a symlink to the MySQL driver jar in the Hive lib directory
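A minimal sketch of the symlink step, assuming the driver jar sits at `/usr/share/java/mysql-connector-java.jar` (where the `mysql-connector-java` rpm normally installs it) and that Hive's lib directory is `/usr/lib/hive/lib`. Both paths are assumptions, and `link_driver` and `run` are hypothetical helper names.

```shell
# Assumption: location of the MySQL JDBC driver jar on this host.
DRIVER_JAR=${DRIVER_JAR:-/usr/share/java/mysql-connector-java.jar}

# run CMD...: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# link_driver LIB_DIR: symlink the MySQL JDBC driver into LIB_DIR.
link_driver() {
  run ln -s "$DRIVER_JAR" "$1/mysql-connector-java.jar"
}

link_driver /usr/lib/hive/lib
```

The helper takes the target directory as an argument, so it works for any service that needs the same driver on its classpath.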
7. Start the Hive services
Start hive-metastore
systemctl start hive-metastore
systemctl status hive-metastore
Start hive-server2
systemctl start hive-server2
systemctl status hive-server2
8. Check that Hive works
Connecting to Hive and creating a table works
Inserting data works
Querying works
Hive is now installed
3.6 Oozie
1. Create the database and user that Oozie needs in MySQL
create database oozie default character set utf8;
CREATE USER 'oozie'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%';
FLUSH PRIVILEGES;
2. Install the Oozie packages
yum -y install oozie oozie-client
3. Configure Oozie
Configure Oozie to use Yarn
alternatives --set oozie-tomcat-deployment /etc/oozie/tomcat-conf.http
Edit the configuration file /etc/oozie/conf/oozie-site.xml
<property>
  <name>oozie.service.JPAService.jdbc.driver</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.url</name>
  <value>jdbc:mysql://cdh178.macro.com:3306/oozie</value>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.username</name>
  <value>oozie</value>
</property>
<property>
  <name>oozie.service.JPAService.jdbc.password</name>
  <value>password</value>
</property>
Create a symlink to the MySQL driver jar in the Oozie directory
4. Run the Oozie database tool
sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh create -run
5. Configure the Oozie web console
Download the ExtJS library to the server from:
https://archive.cloudera.com/gplextras/misc/ext-2.2.zip
Extract the downloaded archive into /var/lib/oozie
unzip ext-2.2.zip -d /var/lib/oozie/
6. Install the Oozie shared library in HDFS
sudo -u hdfs hadoop fs -mkdir /user/oozie
sudo -u hdfs hadoop fs -chown oozie:oozie /user/oozie
sudo oozie-setup sharelib create -fs hdfs://cdh178.macro.com:8020 -locallib /usr/lib/oozie/oozie-sharelib-yarn
7. Start the Oozie Server
systemctl start oozie
systemctl status oozie
8. Open the Oozie web UI
Oozie is now installed
3.7 Impala
1. Install the Impala packages
Install the Impala Catalog Server and Impala StateStore on one node
yum -y install impala-state-store impala-catalog
Install the other packages on all nodes
yum -y install impala impala-server
2. Copy the configuration files Impala needs into Impala's configuration directory
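A minimal sketch of the copy step, assuming the three files are hive-site.xml, core-site.xml, and hdfs-site.xml and that Impala reads its configuration from `/etc/impala/conf`. The file list and target directory are assumptions not stated in this walkthrough; `copy_impala_conf` and `run` are hypothetical names.

```shell
# Assumption: Impala's configuration directory as installed by the rpm packages.
IMPALA_CONF=${IMPALA_CONF:-/etc/impala/conf}

# run CMD...: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# copy_impala_conf: copy the Hive and Hadoop client configs next to Impala's own.
copy_impala_conf() {
  for f in /etc/hive/conf/hive-site.xml \
           /etc/hadoop/conf/core-site.xml \
           /etc/hadoop/conf/hdfs-site.xml; do
    run cp "$f" "$IMPALA_CONF/"
  done
}

copy_impala_conf
```

The copy has to be repeated on every node that runs an Impala daemon, and again whenever one of the source files changes.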
3. Install impala-shell
yum -y install impala-shell
4. Configuration required after installing Impala
Edit /etc/hadoop/conf/hdfs-site.xml to enable block location tracking and short-circuit reads
<property>
  <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/run/hdfs-sockets/dn</value>
</property>
<property>
  <name>dfs.client.file-block-storage-locations.timeout.millis</name>
  <value>10000</value>
</property>
Sync the configuration to all nodes
Restart the DataNode on every node (systemctl restart hadoop-hdfs-datanode)
Copy the modified hdfs-site.xml to Impala's configuration directory
5. Start the Impala services
Start the Impala StateStore and Impala Catalog Server
systemctl start impala-state-store
systemctl status impala-state-store
systemctl start impala-catalog
systemctl status impala-catalog
Start impala-server on all nodes
systemctl start impala-server
systemctl status impala-server
6. Test Impala
Connecting to Impala with impala-shell and running a query succeeds
Impala is now installed
3.8 Hue
1. Install the Hue packages
2. Configure the CDH components for Hue
·Configure Hue to access HDFS
1) Edit the configuration files
/etc/hadoop/conf/hdfs-site.xml
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
/etc/hadoop/conf/core-site.xml
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
/etc/hue/conf/hue.ini
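For hue.ini, the HDFS settings live in the `[hadoop]` section. The sketch below prints a fragment built from the NameNode address and HTTP port configured earlier (8020 and 9870); the exact values in the original hue.ini are not reproduced here, so treat these as assumptions. `hue_hdfs_fragment` and `NN_HOST` are hypothetical names.

```shell
# Assumption: NameNode host, matching fs.defaultFS and dfs.namenode.http-address.
NN_HOST=${NN_HOST:-cdh178.macro.com}

# hue_hdfs_fragment: print the [hadoop] settings to merge into /etc/hue/conf/hue.ini.
hue_hdfs_fragment() {
  cat <<EOF
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://$NN_HOST:8020
      webhdfs_url=http://$NN_HOST:9870/webhdfs/v1
EOF
}

hue_hdfs_fragment
```

The fragment is printed rather than written so it can be reviewed and merged into the existing section by hand.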
Sync the modified HDFS configuration files to all nodes
2) Restart the HDFS services
systemctl restart hadoop-hdfs-namenode
systemctl restart hadoop-hdfs-secondarynamenode
systemctl restart hadoop-hdfs-datanode
·Configure Hue to integrate with Hive
Edit the configuration file /etc/hue/conf/hue.ini
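Hue reaches Hive through HiveServer2, configured in the `[beeswax]` section of hue.ini. A minimal sketch, assuming HiveServer2 on cdh178.macro.com listens on its default port 10000 and that the Hive client configuration is in /etc/hive/conf; the host choice is an assumption, and `hue_hive_fragment` and `HS2_HOST` are hypothetical names.

```shell
# Assumption: any node running hive-server2 would do; cdh178 is used for illustration.
HS2_HOST=${HS2_HOST:-cdh178.macro.com}

# hue_hive_fragment: print the [beeswax] settings to merge into /etc/hue/conf/hue.ini.
hue_hive_fragment() {
  cat <<EOF
[beeswax]
  hive_server_host=$HS2_HOST
  hive_server_port=10000
  hive_conf_dir=/etc/hive/conf
EOF
}

hue_hive_fragment
```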
3. Create the database and user that Hue needs
create database hue default character set utf8;
CREATE USER 'hue'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON hue.* TO 'hue'@'%';
FLUSH PRIVILEGES;
4. Initialize the database
/usr/lib/hue/build/env/bin/hue syncdb
/usr/lib/hue/build/env/bin/hue migrate
5. Start the Hue service
systemctl start hue
systemctl status hue
6. Open the Hue web UI
Using Hive in Hue
Hue is now installed
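Summary
1. Installing a CDH cluster from rpm packages without CM means every setting has to be configured by hand, which is considerably more complex than installing with CM.
2. This method requires downloading all of the relevant rpm packages to a server and building a local yum repository from them.
3. The order in which services are installed matters; ZooKeeper must be installed first.
4. Because all configuration files are edited by hand, check the logs promptly when a service fails to start in order to track down the error.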