Building a Hadoop Cluster with Ambari

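  • Why Ambari
  • Pre-installation preparation
  • Deployment
  • Passwordless SSH between nodes
  • Setting up the nginx service
  • Creating the repo sources
  • Installing the MySQL service
  • Installing the Ambari service
  • Starting the Ambari service
  • Logging in to the Ambari UI for configuration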


Why Ambari

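Ambari is a top-level project of the Apache Software Foundation. It can provision, manage, and monitor clusters across the whole Hadoop ecosystem (for example Hive, HBase, Sqoop, and ZooKeeper), which makes Hadoop and related big data software easier to use.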

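Ambari itself is distributed software made up of two parts, Ambari Server and Ambari Agent. Through Ambari Server, users instruct the Ambari Agents to install the required software. Each Ambari Agent periodically reports the status of every software module on its machine to Ambari Server, and this status information is presented in the Ambari GUI so that users can see the state of the cluster and carry out maintenance accordingly.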

This guide uses release 2.7.3. Note that "Hadoop" is meant here in the broad sense: the whole Hadoop ecosystem (Hive, HBase, Sqoop, ZooKeeper, and so on) rather than Hadoop alone. In one sentence, Ambari is a tool for making Hadoop and related big data software easier to use.

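Ambari's main strengths are the following:

  • A step-by-step installation wizard simplifies cluster provisioning.
  • Key operational metrics are preconfigured, so the health of Hadoop Core (HDFS and MapReduce) and related projects (such as HBase, Hive, and HCatalog) can be checked directly.
  • Job and task execution can be visualized and analyzed, giving a better view of dependencies and performance.
  • Monitoring information is exposed through a complete RESTful API, which allows integration with existing operations tools.
  • The user interface is intuitive, so users can view information and control the cluster easily and effectively.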
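
The RESTful API mentioned in the list can be exercised with curl. The following is only a minimal sketch: it assumes the server host name (ambari01), port (8080), and default admin/admin account used later in this guide, and ambari_api_url is a hypothetical helper introduced purely for illustration.

```shell
# Hypothetical helper: build an Ambari REST API URL from host, port and resource path.
ambari_api_url() {
    host="$1"; port="$2"; resource="$3"
    printf 'http://%s:%s/api/v1/%s\n' "$host" "$port" "$resource"
}

# Print the URL that lists the clusters managed by this server.
ambari_api_url ambari01 8080 clusters
```

Once the server is running, the cluster list could be fetched with: curl -s -u admin:admin "$(ambari_api_url ambari01 8080 clusters)"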

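The latest CDH no longer has a community edition, which means that new versions of Cloudera Manager and CDH all require payment, a cost many small companies may be unable to bear. Switching to Ambari is one option for them. Being open source is Ambari's biggest advantage: it can be extended flexibly to integrate more data components, which also makes it attractive to companies that need customization and secondary development.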

GitHub repository: ambari

Pre-installation preparation

  1. Prepare three machines with the following specifications (adjust as needed):

Hostname  IP            CPU  Memory  Disk  Role
ambari01  10.180.13.62  16c  32g     300G  ambari-server/namenode01/datanode01
ambari02  10.180.13.25  16c  32g     300G  ambari-agent/namenode02/datanode02
ambari03  10.180.13.61  16c  32g     300G  ambari-agent/datanode03

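  2. Internet access
    All three machines need internet access, because some dependency packages are installed from the network during the Ambari installation.
  3. Prepare the HDP packages
    Since January 2021, new versions of Cloudera Manager and CDH require payment, so older package versions are used here: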

Name       Version       OS
HDP        3.0.0.0-1634  centos7.x
HDP-UTILS  1.1.0.22      centos7.x
HDP-GPL    3.0.0.0-1634  centos7.x
ambari     2.7.3.0       centos7.x

HDP package layout:

[root@ambari01 yum.repos.d]# tree /usr/share/nginx/html/hadoop/ -L 4
/usr/share/nginx/html/hadoop/
|-- ambari
|   `-- centos7
|       `-- 2.7.3.0-139
|           |-- ambari
|           |-- ambari.repo
|           |-- artifacts.txt
|           |-- build.id
|           |-- build_metadata.txt
|           |-- hotfix_index.html
|           |-- index.html
|           |-- private_index.html
|           |-- public_index.html
|           |-- repodata
|           |-- RPM-GPG-KEY
|           |-- smartsense
|           `-- tars
|-- HDP
|   `-- centos7
|       `-- 3.0.0.0-1634
|           |-- accumulo
|           |-- artifacts.txt
|           |-- atlas
|           |-- bigtop-jsvc
|           |-- bigtop-tomcat
|           |-- build.id
|           |-- build_metadata.txt
|           |-- datafu
|           |-- druid
|           |-- hadoop
|           |-- hbase
|           |-- HDP-3.0.0.0-1634-MAINT.xml
|           |-- HDP-3.0.0.0-1634.xml
|           |-- hdp.repo
|           |-- hdp-select
|           |-- hive
|           |-- hive_warehouse_connector
|           |-- hotfix_index.html
|           |-- index.html
|           |-- kafka
|           |-- knox
|           |-- livy
|           |-- oozie
|           |-- phoenix
|           |-- pig
|           |-- private_index.html
|           |-- public_index.html
|           |-- ranger
|           |-- repodata
|           |-- RPM-GPG-KEY
|           |-- shc
|           |-- spark2
|           |-- sqoop
|           |-- storm
|           |-- superset
|           |-- tez
|           |-- vrpms
|           |-- zeppelin
|           `-- zookeeper
|-- HDP-GPL
|   `-- centos7
|       `-- 3.0.0.0-1634
|           |-- hadooplzo
|           |-- hdp.gpl.repo
|           |-- repodata
|           |-- RPM-GPG-KEY
|           `-- vrpms
`-- HDP-UTILS
    `-- centos7
        `-- 1.1.0.22
            |-- hdp-utils.repo
            |-- openblas
            |-- repodata
            |-- RPM-GPG-KEY
            `-- snappy

54 directories, 20 files
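
Each of the four directories in this listing has to be a usable yum repository, which means it needs a repodata/repomd.xml file. As a rough sketch (check_repo_tree is a hypothetical helper, and the version directories are simply copied from the listing above), the layout could be checked like this:

```shell
# Report, for each repository directory under ROOT, whether repodata/repomd.xml exists.
# check_repo_tree is a hypothetical helper; it returns non-zero if any repo is incomplete.
check_repo_tree() {
    root="$1"
    status=0
    for repo in ambari/centos7/2.7.3.0-139 HDP/centos7/3.0.0.0-1634 \
                HDP-GPL/centos7/3.0.0.0-1634 HDP-UTILS/centos7/1.1.0.22; do
        if [ -f "$root/$repo/repodata/repomd.xml" ]; then
            echo "ok      $repo"
        else
            echo "missing $repo"
            status=1
        fi
    done
    return "$status"
}
```

Usage on ambari01: check_repo_tree /usr/share/nginx/html/hadoop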
  4. Local name resolution
    Add the entries below to /etc/hosts so that the host names resolve without DNS; do the same on all three nodes.
vi /etc/hosts
10.180.13.62 ambari01
10.180.13.25 ambari02
10.180.13.61 ambari03
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
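
Since the same entries have to be present on every node, a quick check can catch a node that was missed. The sketch below assumes nothing beyond the three host names above; missing_hosts is a hypothetical helper that prints every expected name that has no entry in a hosts file.

```shell
# Print each expected host name that has no (uncommented) entry in the given hosts file.
# missing_hosts is a hypothetical helper: missing_hosts FILE NAME...
missing_hosts() {
    hosts_file="$1"; shift
    for name in "$@"; do
        awk -v h="$name" '
            /^[[:space:]]*#/ { next }                      # skip comment lines
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" || echo "$name"
    done
}
```

Usage: missing_hosts /etc/hosts ambari01 ambari02 ambari03 (no output means all three names are present).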
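  5. Install ntp and openjdk
    Run the following on every node to install ntp and openjdk, then enable and start ntpd so the clocks stay in sync:
yum install ntp -y
yum install java-1.8.0-openjdk -y
systemctl enable ntpd
systemctl start ntpd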

Deployment

Passwordless SSH between nodes

[root@ambari01 yum.repos.d]# ssh-keygen
[root@ambari01 yum.repos.d]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@ambari02
[root@ambari01 yum.repos.d]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@ambari03
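
Before moving on, it may be worth confirming that the key-based login really works without a password prompt. The following is a sketch only: check_ssh is a hypothetical helper, and the SSH variable is an illustrative override hook (it defaults to the real ssh client). BatchMode=yes makes ssh fail instead of asking for a password.

```shell
# Try a non-interactive login to each node and report the result.
# check_ssh is a hypothetical helper; SSH is an optional override of the client command.
check_ssh() {
    rc=0
    for node in "$@"; do
        if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "root@$node" true; then
            echo "ok   $node"
        else
            echo "FAIL $node"
            rc=1
        fi
    done
    return "$rc"
}
```

Usage on ambari01: check_ssh ambari02 ambari03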

Setting up the nginx service

Deploy nginx (Apache httpd would work as well) so that the downloaded HDP packages can be accessed over HTTP. Extract the HDP packages into the
/usr/share/nginx/html/hadoop directory.

[root@ambari01 ~]# yum install -y nginx
[root@ambari01 ~]# vi /etc/nginx/nginx.conf

    server {
        listen       80;
        listen       [::]:80;
        server_name  _;
        location  / {
                root         /usr/share/nginx/html/hadoop;
                autoindex on;
        }
        #root         /usr/share/nginx/html; 
        index index.html index.htm;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        error_page 404 /404.html;
        location = /404.html {
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
        }
    }
[root@ambari01 ~]# nginx -s reload
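
After the reload, each repository should be reachable over HTTP. As a hedged sketch, the helper below (repomd_urls, a hypothetical name) only prints the metadata URL of each repository, using the version directories from the package layout shown earlier; probing them is left to curl.

```shell
# Print the repomd.xml URL of each local repository served by nginx on HOST.
# repomd_urls is a hypothetical helper; paths follow the package layout shown earlier.
repomd_urls() {
    host="$1"
    for path in ambari/centos7/2.7.3.0-139 HDP/centos7/3.0.0.0-1634 \
                HDP-GPL/centos7/3.0.0.0-1634 HDP-UTILS/centos7/1.1.0.22; do
        echo "http://$host/$path/repodata/repomd.xml"
    done
}
```

Usage: repomd_urls ambari01 | while read -r url; do curl -sf -o /dev/null "$url" || echo "unreachable: $url"; done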

Creating the repo sources

[root@ambari01 ~]# cd /etc/yum.repos.d/
[root@ambari01 yum.repos.d]# ls
ambari-hdp-1.repo  ambari.repo  CentOS-Base.repo  CentOS-Epel.repo  CentOS-x86_64-kernel.repo  hdp.repo  mysql-community.repo  mysql-community-source.repo
[root@ambari01 yum.repos.d]# vi ambari.repo
[ambari-2.7.3.0]
#json.url = http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json
name=ambari Version - ambari-2.7.3.0
baseurl=http://ambari01/ambari/centos7/2.7.3.0-139
gpgcheck=1
gpgkey=http://ambari01/ambari/centos7/2.7.3.0-139/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[root@ambari01 yum.repos.d]# vi hdp.repo
#VERSION_NUMBER=3.0.0.0-1634
[HDP-3.0]
name=HDP
baseurl=http://ambari01/HDP/centos7/3.0.0.0-1634
gpgcheck=1
gpgkey=http://ambari01/HDP/centos7/3.0.0.0-1634/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1


[HDP-UTILS-1.1.0.22]
name=HDP-UTILS
baseurl=http://ambari01/HDP-UTILS/centos7/1.1.0.22
gpgcheck=1
gpgkey=http://ambari01/HDP-UTILS/centos7/1.1.0.22/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-3.0-GPL]
name=HDP-GPL
baseurl=http://ambari01/HDP-GPL/centos7/3.0.0.0-1634
gpgcheck=1
gpgkey=http://ambari01/HDP-GPL/centos7/3.0.0.0-1634/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[root@ambari01 yum.repos.d]# yum clean all
[root@ambari01 yum.repos.d]# yum makecache fast

Copy the repo files to the other nodes and build the yum cache on each of them

[root@ambari01 yum.repos.d]# rsync -avzP /etc/yum.repos.d/ambari.repo /etc/yum.repos.d/hdp.repo root@ambari02:/etc/yum.repos.d/
[root@ambari01 yum.repos.d]# rsync -avzP /etc/yum.repos.d/ambari.repo /etc/yum.repos.d/hdp.repo root@ambari03:/etc/yum.repos.d/
[root@ambari01 yum.repos.d]# ssh root@ambari02 'yum clean all && yum makecache fast'
[root@ambari01 yum.repos.d]# ssh root@ambari03 'yum clean all && yum makecache fast'

Installing the MySQL service

MySQL provides the backing database for ambari-server and Hive.

yum localinstall https://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm
yum install mysql-community-server -y
systemctl start mysqld.service
systemctl status mysqld.service
# Install the MySQL JDBC driver for Java
yum install mysql-connector-java*
# Find the temporary root password
grep 'A temporary password is generated for root@localhost' /var/log/mysqld.log |tail -1

Configuring the database

# Reset the root password
mysql_secure_installation
# Create the databases:
MariaDB [(none)]> create database ambari default character set utf8;
Query OK, 1 row affected (0.00 sec) 
MariaDB [(none)]> grant all on ambari.* to ambari@localhost identified by 'Bigdata_123';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> grant all on ambari.* to ambari@'%' identified by 'Bigdata_123';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> create database hive default character set utf8;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]> grant all on hive.* to hive@localhost identified by 'Hive_123';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> grant all on hive.* to hive@'%' identified by 'Hive_123';

Installing the Ambari service

Deploy the ambari-server service on the ambari01 node:

yum install ambari-server -y
ambari-server setup

Walking through ambari-server setup

[root@ambari01 yum.repos.d] # ambari-server setup
Using python  /usr/bin/python
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)? y
Enter user account for ambari-server daemon (root):ambari  
Adjusting ambari-server permissions and ownership...
Checking firewall status...
Checking JDK...
[1] Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8
[2] Oracle JDK 1.7 + Java Cryptography Extension (JCE) Policy Files 7
[3] Custom JDK
==============================================================================
Enter choice (1): 3
WARNING: JDK must be installed on all hosts and JAVA_HOME must be valid on all hosts.
WARNING: JCE Policy files are required for configuring Kerberos security. If you plan to use Kerberos,please make sure JCE Unlimited Strength Jurisdiction Policy Files are valid on all hosts.
Path to JAVA_HOME: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre  # enter the JAVA_HOME path here
Validating JDK on Ambari Server...done.
Checking GPL software agreement...
GPL License for LZO: https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Enable Ambari Server to download and install GPL Licensed LZO packages [y/n] (n)? n
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)? y  
Configuring database...
==============================================================================
Choose one of the following options:
[1] - PostgreSQL (Embedded)
[2] - Oracle
[3] - MySQL / MariaDB
[4] - PostgreSQL
[5] - Microsoft SQL Server (Tech Preview)
[6] - SQL Anywhere
[7] - BDB
==============================================================================
Enter choice (1): 3
Hostname (localhost): 
Port (3306): 
Database name (ambari): 
Username (ambari): 
Enter Database Password (bigdata): 
Configuring ambari database...
WARNING: Before starting Ambari Server, you must copy the MySQL JDBC driver JAR file to /usr/share/java and set property "server.jdbc.driver.path=[path/to/custom_jdbc_driver]" in ambari.properties.
Press <enter> to continue.
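
One way to act on this warning is the --jdbc-db/--jdbc-driver form of ambari-server setup. The sketch below wraps it in a hypothetical helper (register_jdbc_driver) that refuses to run when the jar is absent. The jar path in the usage line is an assumption about where the mysql-connector-java package installs the driver and should be checked on the actual host.

```shell
# Hypothetical wrapper: register a MySQL JDBC driver jar with Ambari only if the file exists.
register_jdbc_driver() {
    jar="$1"
    if [ ! -f "$jar" ]; then
        echo "JDBC driver not found: $jar" >&2
        return 1
    fi
    ambari-server setup --jdbc-db=mysql --jdbc-driver="$jar"
}
```

Usage (path assumed, see above): register_jdbc_driver /usr/share/java/mysql-connector-java.jar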

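The warning above asks for the location of the JDBC driver file. The JAVA_HOME path entered earlier in the setup can be found by following the java symlinks: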

[root@ambari01 yum.repos.d]# ls -lh /usr/bin/java
lrwxrwxrwx 1 root root 22 Aug 12 21:50 /usr/bin/java -> /etc/alternatives/java
[root@ambari01 yum.repos.d]# ls -lh /etc/alternatives/java
lrwxrwxrwx 1 root root 73 Aug 12 21:50 /etc/alternatives/java -> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.302.b08-0.el7_9.x86_64/jre/bin/java

So the JAVA_HOME path is: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.302.b08-0.el7_9.x86_64/jre
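
The same lookup can be done in one step rather than by following each link by hand. This is a small sketch, assuming GNU readlink (as shipped with CentOS 7); java_home_of is a hypothetical helper name.

```shell
# Resolve every symlink of a java binary and strip the trailing /bin/java.
# java_home_of is a hypothetical helper; readlink -f is the GNU coreutils form.
java_home_of() {
    real=$(readlink -f "$1") || return 1
    echo "${real%/bin/java}"
}
```

Usage: java_home_of /usr/bin/java (with the symlinks shown above, this should print the jre directory of the installed OpenJDK).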

After pressing Enter, the setup continues with the following prompts:

Press <enter> to continue.
Configuring remote database connection properties...
WARNING: Before starting Ambari Server, you must run the following DDL against the database to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql
Proceed with configuring remote database connection properties [y/n] (y)?

At this step, use mysql to import the DDL script into the ambari database to create the tables:

mysql -uroot -p ambari < /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql

Starting the Ambari service

ambari-server start
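
The web UI may take a little while to come up after the start command returns. A generic retry loop is one way to wait for it; wait_until below is a hypothetical helper and not part of Ambari, and the URL in the usage line assumes the ambari01 host and the 8080 port described in the next section.

```shell
# Retry a command until it succeeds: wait_until TRIES DELAY_SECONDS COMMAND [ARGS...]
# wait_until is a hypothetical helper; it returns 1 if every attempt fails.
wait_until() {
    tries="$1"; delay="$2"; shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        # Sleep only when another attempt will follow.
        if [ "$tries" -gt 0 ]; then sleep "$delay"; fi
    done
    return 1
}
```

Usage: wait_until 30 5 curl -s -o /dev/null http://ambari01:8080/ (up to 30 attempts, 5 seconds apart).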

Logging in to the Ambari UI for configuration

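Once the service has started successfully, it listens on port 8080. Log in with a browser using the default account admin/admin; after logging in, the cluster still has to be configured.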


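If the installation cannot be completed, the cause is usually the repo configuration. When the repo sources are correct and the ambari database contains the records below, the installation succeeds; this was verified over several runs.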

mysql> select * from repo_definition;
+----+------------+-----------+--------------------+-----------------------------------------------+--------------+------------+-------------+---------+
| id | repo_os_id | repo_name | repo_id            | base_url                                      | distribution | components | unique_repo | mirrors |
+----+------------+-----------+--------------------+-----------------------------------------------+--------------+------------+-------------+---------+
| 25 |          9 | HDP       | HDP-3.0            | http://ambari01/HDP/centos7/3.0.0.0-1634/     | NULL         | NULL       |           0 | NULL    |
| 26 |          9 | HDP-UTILS | HDP-UTILS-1.1.0.22 | http://ambari01/HDP-UTILS/centos7/1.1.0.22/   | NULL         | NULL       |           0 | NULL    |
| 51 |          9 | HDP-GPL   | HDP-3.0-GPL        | http://ambari01/HDP-GPL/centos7/3.0.0.0-1634/ | NULL         | NULL       |           0 | NULL    |
+----+------------+-----------+--------------------+-----------------------------------------------+--------------+------------+-------------+---------+
3 rows in set (0.00 sec)
