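Spark Environment Setup
- Spark Environment Setup
- Download Spark
- Configuration
- Copy Spark to each node
- Start Spark
- Start the timelineserver service
- Add the following configuration to yarn-site.xml:
- Restart the YARN services
- Start the timelineserver service
- Verify
- spark-shell
- Spark on Hive configuration
- Connect with spark-sql
- Use the thriftserver service and connect with beeline
- Connect from code
- Configure Spark dynamic resource allocation
- Configuration
- Modify the YARN configuration
- Add the Spark configuration
- Start the thriftserver service
- Run a job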
Spark Environment Setup
Reference for the setup process:
For configuration, see the official documentation:
- https://spark.apache.org/docs/latest/configuration.html#yarn
- https://spark.apache.org/docs/latest/spark-standalone.html#cluster-launch-scripts
Download Spark
URL: https://www.apache.org/dyn/closer.lua/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz
After downloading, place the archive in the /home/spark directory.
The extraction step is omitted here.
Configuration
[hadoop@node1 ~]$ cd $SPARK_HOME/conf
# Configure the environment variables as in the Hadoop setup described earlier
# Copy the workers template and edit the list of worker nodes
[hadoop@node1 conf]$ cp workers.template workers
[hadoop@node1 conf]$ vim workers
#localhost
node2
node3
node4
[hadoop@node1 conf]$ cp spark-env.sh.template spark-env.sh
[hadoop@node1 conf]$ vim spark-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.302.b08-0.el7_9.x86_64/jre
#export SCALA_HOME=/usr/share/scala-2.11
export HADOOP_HOME=/home/hadoop/hadoop-3.3.1
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_MASTER_HOST=node1
export SPARK_LOCAL_DIRS=/data/spark/local/data
export SPARK_DRIVER_MEMORY=4g # driver memory
export SPARK_WORKER_CORES=8 # number of CPU cores
export SPARK_WORKER_MEMORY=28g # total memory a worker may use for Spark
export SPARK_EXECUTOR_MEMORY=2g
export SPARK_MASTER_WEBUI_PORT=9080
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_WEBUI_PORT=9090
export SPARK_WORKER_DIR=/data/spark/worker/data
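export SPARK_LOCAL_IP=192.168.111.49 # Important: this must be the node's own IP, reachable from the other nodes. Many guides online use 0.0.0.0 or 127.0.0.1, which only works when the NICs share one network segment or the setup is pseudo-distributed
# Rename the script so it is easier to tell apart (Hadoop also ships a start-all.sh)
[hadoop@node1 conf]$ mv $SPARK_HOME/sbin/start-all.sh $SPARK_HOME/sbin/start-spark.sh
Edit spark-config.sh
[hadoop@node1 conf]$ cd $SPARK_HOME/sbin
[hadoop@node1 sbin]$ vim spark-config.sh
# Add at the top of the file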
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.302.b08-0.el7_9.x86_64/jre
Copy Spark to each node
[hadoop@node1 ~]$ scp -r spark-3.1.2-bin-hadoop3.2/ node2:/home/hadoop/
[hadoop@node1 ~]$ scp -r spark-3.1.2-bin-hadoop3.2/ node3:/home/hadoop/
[hadoop@node1 ~]$ scp -r spark-3.1.2-bin-hadoop3.2/ node4:/home/hadoop/
# On each node, edit spark-env.sh
export SPARK_LOCAL_IP=192.168.111.50 # that node's own IP
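The three scp commands above follow one pattern, so they could be generated from the workers file instead of typed by hand. The sketch below is a dry run: it only prints the commands so they can be reviewed before anything is copied. The helper name print_distribute_cmds is hypothetical (it is not part of Spark), and the target directory /home/hadoop/ is taken from the commands above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one scp command per host listed in a workers file.
# print_distribute_cmds is a hypothetical helper, not part of Spark.
print_distribute_cmds() {
  # $1 = path to the workers file, $2 = Spark directory to copy
  # Skip blank lines and comment lines such as "#localhost".
  grep -Ev '^[[:space:]]*(#|$)' "$1" | while read -r host; do
    echo "scp -r $2 ${host}:/home/hadoop/"
  done
}

# Example with the worker list from this article, written to a temporary file.
workers_file="$(mktemp)"
printf '%s\n' '#localhost' node2 node3 node4 > "$workers_file"
print_distribute_cmds "$workers_file" spark-3.1.2-bin-hadoop3.2/
rm -f "$workers_file"
```

SPARK_LOCAL_IP still has to be set by hand in spark-env.sh on every node afterwards, because each node needs its own address.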
Start Spark
[hadoop@node1 sbin]$ ./start-spark.sh
starting org.apache.spark.deploy.master.Master, logging to /home/hadoop/spark-3.1.2-bin-hadoop3.2/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-node1.out
node4: starting org.apache.spark.deploy.worker.Worker, logging to /home/hadoop/spark-3.1.2-bin-hadoop3.2/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node4.out
node2: starting org.apache.spark.deploy.worker.Worker, logging to /home/hadoop/spark-3.1.2-bin-hadoop3.2/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node2.out
node3: starting org.apache.spark.deploy.worker.Worker, logging to /home/hadoop/spark-3.1.2-bin-hadoop3.2/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node3.out
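Start the timelineserver service
Reference:
Before Hadoop 2.4, job monitoring was limited to the Job History Server, which was built for MapReduce and lets users look up information about completed jobs. As more compute frameworks such as Spark and Tez were integrated with YARN, jobs running on those engines needed comparable monitoring, so the Hadoop developers designed a more general-purpose history service: the YARN Timeline Server.
Add the following configuration to yarn-site.xml:
<!-- Timeline service configuration starts here -->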
<property>
<name>yarn.timeline-service.enabled</name>
<value>true</value>
<description>Indicate to clients whether Timeline service is enabled or not. If enabled, the TimelineClient
library used by end-users will post entities and events to the Timeline server.
</description>
</property>
<property>
<name>yarn.timeline-service.hostname</name>
<value>node1</value>
<description>The hostname of the Timeline service web application.</description>
</property>
<property>
<name>yarn.timeline-service.address</name>
<value>node1:10200</value>
<description>Address for the Timeline server to start the RPC server.</description>
</property>
<property>
<name>yarn.timeline-service.webapp.address</name>
<value>node1:8188</value>
<description>The http address of the Timeline service web application.</description>
</property>
<property>
<name>yarn.timeline-service.webapp.https.address</name>
<value>node1:8190</value>
<description>The https address of the Timeline service web application.</description>
</property>
<property>
<name>yarn.timeline-service.handler-thread-count</name>
<value>10</value>
<description>Handler thread count to serve the client RPC requests.</description>
</property>
<property>
<name>yarn.timeline-service.http-cross-origin.enabled</name>
<value>false</value>
<description>Enables cross-origin support (CORS) for web services where cross-origin web response headers are
needed. For example, javascript making a web services request to the timeline server.
</description>
</property>
<property>
<name>yarn.timeline-service.http-cross-origin.allowed-origins</name>
<value>*</value>
<description>Comma separated list of origins that are allowed for web services needing cross-origin (CORS) support. Wildcards (*) and patterns allowed.
</description>
</property>
<property>
<name>yarn.timeline-service.http-cross-origin.allowed-methods</name>
<value>GET,POST,HEAD</value>
<description>Comma separated list of methods that are allowed for web services needing cross-origin (CORS) support.
</description>
</property>
<property>
<name>yarn.timeline-service.http-cross-origin.allowed-headers</name>
<value>X-Requested-With,Content-Type,Accept,Origin</value>
<description>Comma separated list of headers that are allowed for web services needing cross-origin (CORS) support.
</description>
</property>
<property>
<name>yarn.timeline-service.http-cross-origin.max-age</name>
<value>1800</value>
<description>The number of seconds a pre-flighted request can be cached for web services needing cross-origin (CORS) support.
</description>
</property>
<property>
<name>yarn.timeline-service.generic-application-history.enabled</name>
<value>true</value>
<description>Indicate to clients whether to query generic application data from timeline history-service or not. If not enabled then application data is queried only from Resource Manager.
This also tells the ResourceManager and clients whether the history service is enabled; if it is, the ResourceManager starts recording historical data that the history service can use.
</description>
</property>
<property>
<name>yarn.timeline-service.generic-application-history.store-class</name>
<value>org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore</value>
<description>Store class name for history store, defaulting to file system store</description>
</property>
<property>
<description>Store class name for timeline store.</description>
<name>yarn.timeline-service.store-class</name>
<value>org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore</value>
</property>
<property>
<description>Enable age off of timeline store data.</description>
<name>yarn.timeline-service.ttl-enable</name>
<value>true</value>
</property>
<property>
<description>Time to live for timeline store data in milliseconds.</description>
<name>yarn.timeline-service.ttl-ms</name>
<value>6048000000</value>
</property>
<property>
<name>hadoop.zk.address</name>
<value>node2:2181,node3:2181,node4:2181</value>
</property>
<property>
<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
<value>true</value>
<description>The setting that controls whether yarn system metrics is published on the timeline server or not by RM.
</description>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
<description>Fixes the problem of logs not being viewable.</description>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/data/hadoop/tmp/logs</value>
</property>
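The value of yarn.timeline-service.ttl-ms is given in milliseconds, which is hard to read at a glance. The small sketch below converts the value used above into days; the helper name ms_to_days is made up for this example. For comparison it also converts 604800000, which, to my knowledge, is the stock default in yarn-default.xml.

```shell
#!/usr/bin/env bash
# Convert a duration in milliseconds to whole days (integer division).
# ms_to_days is a hypothetical helper for illustration only.
ms_to_days() {
  echo $(( $1 / 1000 / 86400 ))
}

ms_to_days 6048000000   # value configured above: prints 70
ms_to_days 604800000    # one zero fewer: prints 7
```

So the configuration above keeps timeline data for 70 days. If 7 days was intended, the value needs one zero fewer.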
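Enable log aggregation
If the parameters below are not enabled, the web UI on port 8188 gets slower and slower.
yarn.log-aggregation-enable
Description: whether to enable log aggregation. When enabled, logs are saved to HDFS.
Default: false
yarn.log-aggregation.retain-seconds
Description: how long aggregated logs are kept on HDFS, in seconds.
Default: -1 (aggregated logs are never deleted). For example, 86400 keeps them for 24 hours.
yarn.log-aggregation.retain-check-interval-seconds
Description: how often to check the aggregated logs and delete the ones that have expired. If set to 0 or a negative value, one tenth of yarn.log-aggregation.retain-seconds is used.
Default: -1
yarn.nodemanager.remote-app-log-dir
Description: the HDFS directory that logs are moved to once an application finishes (only effective when log aggregation is enabled). Change it to the directory where the logs should be kept.
Default: /tmp/logs
yarn.nodemanager.remote-app-log-dir-suffix
Description: name of the subdirectory under the remote log directory (only effective when log aggregation is enabled).
Default: logs. Logs are moved to ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}.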
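The fallback rule for yarn.log-aggregation.retain-check-interval-seconds is easy to misread, so here is a minimal sketch of it as shell arithmetic. The function name effective_check_interval is hypothetical and only mirrors the rule as described above. It assumes retain-seconds is positive, that is, that retention is actually switched on.

```shell
#!/usr/bin/env bash
# Sketch of the rule described above: a check interval of 0 or less falls
# back to one tenth of yarn.log-aggregation.retain-seconds.
# effective_check_interval is a hypothetical helper, not a Hadoop command.
effective_check_interval() {
  # $1 = yarn.log-aggregation.retain-seconds (assumed positive)
  # $2 = yarn.log-aggregation.retain-check-interval-seconds
  if [ "$2" -le 0 ]; then
    echo $(( $1 / 10 ))
  else
    echo "$2"
  fi
}

effective_check_interval 86400 -1     # keep 24 hours, interval at default: prints 8640
effective_check_interval 604800 3600  # keep 7 days, explicit hourly check: prints 3600
```

With a 24-hour retention and the default interval, expired logs are checked for roughly every 2.4 hours.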
Restart the YARN services
Follow the ResourceManager and NodeManager restart procedure described earlier.
Start the timelineserver service
[hadoop@node1 sbin]$ yarn --daemon start timelineserver
Verify
Open: http://node1:8188/applicationhistory
spark-shell
Reference:
Spark on Hive configuration
See:
Connect with spark-sql
- Copy hive-site.xml to the $SPARK_HOME/conf/ directory
- Comment out the execution engine setting in hive-site.xml
<!-- <property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
-->
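- Copy the MySQL driver to the $SPARK_HOME/jars directory
- Copy core-site.xml and hdfs-site.xml to the $SPARK_HOME/conf directory (optional)
- If Hive tables use a compression format such as lzo or snappy, spark-defaults.conf must be configured accordingly
- In scripts (or on the command line), use spark-sql --master yarn instead of hive to speed up execution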
[hadoop@node3 conf]$ spark-sql --master yarn
2021-08-24 11:02:19,112 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2021-08-24 11:02:21,129 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
2021-08-24 11:02:21,794 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2021-08-24 11:02:38,502 WARN conf.HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
2021-08-24 11:02:38,503 WARN conf.HiveConf: HiveConf of name hive.stats.retries.wait does not exist
Spark master: yarn, Application Id: application_1629688359089_0008
spark-sql (default)> show databases;
namespace
caoyong_test
default
Time taken: 2.195 seconds, Fetched 2 row(s)
spark-sql (default)> use caoyong_test;
2021-08-24 11:03:02,369 WARN metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Response code
Time taken: 0.056 seconds
spark-sql (default)> show tables;
database tableName isTemporary
caoyong_test student false
Time taken: 0.235 seconds, Fetched 1 row(s)
spark-sql (default)> select * from student;
2021-08-24 11:03:19,703 WARN session.SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
id name
1005 zhangsan
1003 node3
1002 node2
101 beeline by spark
102 beeline by spark
1000 aoaoao
100 spark-sql
103 beeline by spark aoaoao
Time taken: 3.672 seconds, Fetched 8 row(s)
spark-sql (default)> insert into student values(111,"spark-sql --master yarn");
2021-08-24 11:04:09,037 WARN conf.HiveConf: HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist
2021-08-24 11:04:09,037 WARN conf.HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
2021-08-24 11:04:09,038 WARN conf.HiveConf: HiveConf of name hive.stats.retries.wait does not exist
Response code
Time taken: 2.724 seconds
spark-sql (default)> select * from student;
id name
1005 zhangsan
1003 node3
1002 node2
101 beeline by spark
102 beeline by spark
1000 aoaoao
111 spark-sql --master yarn
100 spark-sql
103 beeline by spark aoaoao
Time taken: 0.304 seconds, Fetched 9 row(s)
spark-sql (default)>
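The session above is interactive. For scripts, the spark-sql CLI also accepts a statement through -e (or a file through -f), in the same way as the Hive CLI. The sketch below only builds and prints such a command line so it can be checked first; the helper name spark_sql_cmd is hypothetical, and it does not escape single quotes inside the SQL text.

```shell
#!/usr/bin/env bash
# Sketch: print the command line for running one SQL statement through
# spark-sql on YARN without opening the interactive prompt.
# spark_sql_cmd is a hypothetical helper; it only prints, it does not run anything.
spark_sql_cmd() {
  # $1 = SQL text (single quotes inside the SQL are not handled in this sketch)
  printf "spark-sql --master yarn -e '%s'\n" "$1"
}

# Table name taken from the session above.
spark_sql_cmd "select * from caoyong_test.student"
```

Copy the printed line into a script on a node where spark-sql is installed, or replace -e '...' with -f followed by the path of a .sql file.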
Use the thriftserver service and connect with beeline
- In $SPARK_HOME/conf/hive-site.xml, configure:
<property>
<name>hive.server2.thrift.port</name>
<value>11240</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>node3</value>
</property>
<!-- These two settings are in fact already present in the configuration above -->
- Then run the following commands:
[hadoop@node3 ~]$ start-thriftserver.sh --master yarn
[hadoop@node3 ~]$ jps
6288 JournalNode
27985 Worker
4801 SparkSubmit
6856 NodeManager
4954 ExecutorLauncher
14203 Jps
24956 QuorumPeerMain
6605 DataNode
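As an alternative to editing hive-site.xml, the Spark documentation describes passing the bind host and port to start-thriftserver.sh with --hiveconf. The sketch below prints such a command for a given host and port. The helper name thriftserver_cmd is hypothetical, and the values node3 and 11240 are the ones from the configuration above.

```shell
#!/usr/bin/env bash
# Sketch: print a start-thriftserver.sh command that sets the bind host and
# port on the command line instead of in hive-site.xml.
# thriftserver_cmd is a hypothetical helper; it only prints the command.
thriftserver_cmd() {
  # $1 = bind host, $2 = listening port
  printf 'start-thriftserver.sh --master yarn --hiveconf hive.server2.thrift.bind.host=%s --hiveconf hive.server2.thrift.port=%s\n' "$1" "$2"
}

thriftserver_cmd node3 11240
```

This can be handy when the same hive-site.xml is shared by several nodes and only the bind host differs.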
- Log in with beeline
[hadoop@node2 conf]$ beeline
Beeline version 2.3.7 by Apache Hive
beeline> !connect jdbc:hive2://node2:11240
Connecting to jdbc:hive2://node2:11240
Enter username for jdbc:hive2://node2:11240: hadoop
Enter password for jdbc:hive2://node2:11240: **********
2021-08-24 11:56:42,404 INFO jdbc.Utils: Supplied authorities: node2:11240
2021-08-24 11:56:42,405 INFO jdbc.Utils: Resolved authority: node2:11240
Connected to: Spark SQL (version 3.1.2)
Driver: Hive JDBC (version 2.3.7)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://node2:11240> show databases;
+---------------+
| namespace |
+---------------+
| caoyong_test |
| default |
+---------------+
2 rows selected (0.871 seconds)
0: jdbc:hive2://node2:11240> use caoyong_test;
+---------+
| Result |
+---------+
+---------+
No rows selected (0.08 seconds)
0: jdbc:hive2://node2:11240> select * from student;
+-------+--------------------------+
| id | name |
+-------+--------------------------+
| 1005 | zhangsan |
| 1003 | node3 |
| 1002 | node2 |
| 101 | beeline by spark |
| 102 | beeline by spark |
| 1000 | aoaoao |
| 111 | spark-sql --master yarn |
| 100 | spark-sql |
| 103 | beeline by spark aoaoao |
+-------+--------------------------+
9 rows selected (3.956 seconds)
0: jdbc:hive2://node2:11240> insert into student values(2222,"node2");
+---------+
| Result |
+---------+
+---------+
No rows selected (2.088 seconds)
0: jdbc:hive2://node2:11240> select * from student;
+-------+--------------------------+
| id | name |
+-------+--------------------------+
| 1005 | zhangsan |
| 1003 | node3 |
| 1002 | node2 |
| 101 | beeline by spark |
| 102 | beeline by spark |
| 1000 | aoaoao |
| 111 | spark-sql --master yarn |
| 100 | spark-sql |
| 2222 | node2 |
| 103 | beeline by spark aoaoao |
+-------+--------------------------+
10 rows selected (0.366 seconds)
0: jdbc:hive2://node2:11240>
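The same connection can be made without the interactive prompt by using beeline's -u (JDBC URL), -n (user name) and -e (statement) options. The sketch below prints such a command. The helper name beeline_cmd is hypothetical, and it assumes the server does not require a password; otherwise add -p.

```shell
#!/usr/bin/env bash
# Sketch: print a non-interactive beeline command for the Spark thriftserver.
# beeline_cmd is a hypothetical helper; it only prints the command.
beeline_cmd() {
  # $1 = host, $2 = port, $3 = user name, $4 = SQL statement
  printf "beeline -u jdbc:hive2://%s:%s -n %s -e '%s'\n" "$1" "$2" "$3" "$4"
}

# Host, port and user taken from the session above.
beeline_cmd node2 11240 hadoop "show databases;"
```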
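Note: Spark's thriftserver has no built-in HA. If HA is required, the Spark source has to be modified so that the server registers itself with ZooKeeper.
Connect from code
- Add the hive-jdbc dependency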
<!-- https://mvnrepository.com/artifact/org.apache.hive/hive-jdbc -->
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>3.1.2</version>
</dependency>
- Implement the following code
import java.sql.*;

public class ClientTests {
    public static void main(String[] args) throws SQLException, ClassNotFoundException {
        Connection connection = null;
        PreparedStatement ps = null;
        ResultSet rs = null;
        try {
            // Class.forName("org.apache.hive.jdbc.HiveDriver");
            // Same URL as the one used with beeline; a Hive database name may follow the port (jdbc:hive2://host:port/dbname)
            connection = DriverManager.getConnection("jdbc:hive2://node4:11240", "hadoop", "******");
            connection.prepareStatement("use caoyong_test").execute();
            ps = connection.prepareStatement("SELECT min(auto_index) min_index,max(auto_index) max_index,count(*) count_,count(distinct mtm) mtm_count FROM ship_ib2 ");
            rs = ps.executeQuery();
            while (rs.next()) {
                System.out.println("min_idx:" + rs.getInt("min_index")
                        + ", max_index:" + rs.getInt("max_index")
                        + ", count_:" + rs.getInt("count_")
                        + ", mtm_count:" + rs.getInt("mtm_count")
                );
            }
        } finally {
            // Close resources in reverse order of creation
            if (rs != null) {
                rs.close();
            }
            if (ps != null) {
                ps.close();
            }
            if (connection != null) {
                connection.close();
            }
        }
    }
}
Running the code above produces the following output:
D:\tools\jdk1.8_271\bin\java.exe -javaagent:D:\tools\JetBrains\idear2021.2\lib\idea_rt.jar=52102:D:\tools\JetBrains\idear2021.2\bin -Dfile.encoding=UTF-8 -classpath ...
.1.3.jar;E:\m2\repository\commons-beanutils\commons-beanutils\1.9.4\commons-beanutils-1.9.4.jar;E:\m2\repository\org\apache\commons\commons-configuration2\2.1.1\commons-configuration2-2.1.1.jar;E:\m2\repository\org\apache\commons\commons-text\1.4\commons-text-1.4.jar;E:\m2\repository\com\google\re2j\re2j\1.1\re2j-1.1.jar;E:\m2\repository\com\google\protobuf\protobuf-java\2.5.0\protobuf-java-2.5.0.jar;E:\m2\repository\com\google\code\gson\gson\2.8.6\gson-2.8.6.jar;E:\m2\repository\org\apache\hadoop\hadoop-auth\3.3.1\hadoop-auth-3.3.1.jar;E:\m2\repository\com\nimbusds\nimbus-jose-jwt\9.8.1\nimbus-jose-jwt-9.8.1.jar;E:\m2\repository\com\github\stephenc\jcip\jcip-annotations\1.0-1\jcip-annotations-1.0-1.jar;E:\m2\repository\org\apache\kerby\kerb-simplekdc\1.0.1\kerb-simplekdc-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerb-client\1.0.1\kerb-client-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerby-config\1.0.1\kerby-config-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerb-common\1.0.1\kerb-common-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerb-crypto\1.0.1\kerb-crypto-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerb-util\1.0.1\kerb-util-1.0.1.jar;E:\m2\repository\org\apache\kerby\token-provider\1.0.1\token-provider-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerb-admin\1.0.1\kerb-admin-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerb-server\1.0.1\kerb-server-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerb-identity\1.0.1\kerb-identity-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerby-xdr\1.0.1\kerby-xdr-1.0.1.jar;E:\m2\repository\com\jcraft\jsch\0.1.55\jsch-0.1.55.jar;E:\m2\repository\org\apache\curator\curator-client\4.2.0\curator-client-4.2.0.jar;E:\m2\repository\org\apache\htrace\htrace-core4\4.1.0-incubating\htrace-core4-4.1.0-incubating.jar;E:\m2\repository\org\apache\commons\commons-compress\1.19\commons-compress-1.19.jar;E:\m2\repository\org\apache\kerby\kerb-core\1.0.1\kerb-core-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerby-pkix\1.0.1\kerby-pkix-1.0.1.jar;E:
\m2\repository\org\apache\kerby\kerby-asn1\1.0.1\kerby-asn1-1.0.1.jar;E:\m2\repository\org\apache\kerby\kerby-util\1.0.1\kerby-util-1.0.1.jar;E:\m2\repository\org\codehaus\woodstox\stax2-api\4.2.1\stax2-api-4.2.1.jar;E:\m2\repository\com\fasterxml\woodstox\woodstox-core\5.3.0\woodstox-core-5.3.0.jar;E:\m2\repository\dnsjava\dnsjava\2.1.7\dnsjava-2.1.7.jar;E:\m2\repository\org\apache\hadoop\hadoop-hdfs\3.3.1\hadoop-hdfs-3.3.1.jar;E:\m2\repository\org\eclipse\jetty\jetty-util-ajax\9.4.41.v20210516\jetty-util-ajax-9.4.41.v20210516.jar;E:\m2\repository\commons-daemon\commons-daemon\1.0.13\commons-daemon-1.0.13.jar;E:\m2\repository\org\fusesource\leveldbjni\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;E:\m2\repository\com\fasterxml\jackson\core\jackson-annotations\2.12.3\jackson-annotations-2.12.3.jar;E:\m2\repository\org\apache\orc\orc-core\1.5.12\orc-core-1.5.12.jar;E:\m2\repository\org\threeten\threeten-extra\1.5.0\threeten-extra-1.5.0.jar;E:\m2\repository\org\apache\hive\hive-common\2.3.7\hive-common-2.3.7.jar;E:\m2\repository\jline\jline\2.12\jline-2.12.jar;E:\m2\repository\com\tdunning\json\1.8\json-1.8.jar;E:\m2\repository\com\github\joshelser\dropwizard-metrics-hadoop-metrics2-reporter\0.1.2\dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar;E:\m2\repository\org\apache\hive\hive-metastore\2.3.7\hive-metastore-2.3.7.jar;E:\m2\repository\com\zaxxer\HikariCP\4.0.3\HikariCP-4.0.3.jar;E:\m2\repository\org\datanucleus\javax.jdo\3.2.0-m3\javax.jdo-3.2.0-m3.jar;E:\m2\repository\javax\transaction\transaction-api\1.1\transaction-api-1.1.jar;E:\m2\repository\org\apache\hive\hive-serde\2.3.7\hive-serde-2.3.7.jar;E:\m2\repository\org\apache\hive\hive-shims\2.3.7\hive-shims-2.3.7.jar;E:\m2\repository\org\apache\hive\shims\hive-shims-common\2.3.7\hive-shims-common-2.3.7.jar;E:\m2\repository\org\apache\hive\shims\hive-shims-0.23\2.3.7\hive-shims-0.23-2.3.7.jar;E:\m2\repository\org\apache\hive\shims\hive-shims-scheduler\2.3.7\hive-shims-scheduler-2.3.7.jar;E:\m2\repository\org\ap
ache\hive\hive-llap-common\2.3.7\hive-llap-common-2.3.7.jar;E:\m2\repository\org\apache\hive\hive-llap-client\2.3.7\hive-llap-client-2.3.7.jar;E:\m2\repository\org\apache\hive\hive-jdbc\3.1.2\hive-jdbc-3.1.2.jar;E:\m2\repository\org\apache\hive\hive-service\3.1.2\hive-service-3.1.2.jar;E:\m2\repository\org\apache\hive\hive-llap-server\3.1.2\hive-llap-server-3.1.2.jar;E:\m2\repository\org\apache\hive\hive-llap-tez\3.1.2\hive-llap-tez-3.1.2.jar;E:\m2\repository\org\apache\hive\hive-llap-common\3.1.2\hive-llap-common-3.1.2-tests.jar;E:\m2\repository\org\apache\hbase\hbase-hadoop2-compat\2.0.0-alpha4\hbase-hadoop2-compat-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\hbase-metrics\2.0.0-alpha4\hbase-metrics-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\hbase-metrics-api\2.0.0-alpha4\hbase-metrics-api-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\thirdparty\hbase-shaded-miscellaneous\1.0.1\hbase-shaded-miscellaneous-1.0.1.jar;E:\m2\repository\org\apache\yetus\audience-annotations\0.5.0\audience-annotations-0.5.0.jar;E:\m2\repository\org\apache\hbase\hbase-client\2.0.0-alpha4\hbase-client-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\thirdparty\hbase-shaded-protobuf\1.0.1\hbase-shaded-protobuf-1.0.1.jar;E:\m2\repository\org\apache\hbase\hbase-protocol-shaded\2.0.0-alpha4\hbase-protocol-shaded-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\hbase-protocol\2.0.0-alpha4\hbase-protocol-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\thirdparty\hbase-shaded-netty\1.0.1\hbase-shaded-netty-1.0.1.jar;E:\m2\repository\org\apache\htrace\htrace-core\3.2.0-incubating\htrace-core-3.2.0-incubating.jar;E:\m2\repository\org\jruby\jcodings\jcodings\1.0.18\jcodings-1.0.18.jar;E:\m2\repository\org\jruby\joni\joni\2.1.11\joni-2.1.11.jar;E:\m2\repository\org\apache\hbase\hbase-server\2.0.0-alpha4\hbase-server-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\hbase-http\2.0.0-alpha4\hbase-http-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\hbase-procedure\2.0.0-al
pha4\hbase-procedure-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\hbase-common\2.0.0-alpha4\hbase-common-2.0.0-alpha4-tests.jar;E:\m2\repository\org\apache\hbase\hbase-replication\2.0.0-alpha4\hbase-replication-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\hbase-prefix-tree\2.0.0-alpha4\hbase-prefix-tree-2.0.0-alpha4.jar;E:\m2\repository\org\glassfish\web\javax.servlet.jsp\2.3.2\javax.servlet.jsp-2.3.2.jar;E:\m2\repository\org\glassfish\javax.el\3.0.1-b12\javax.el-3.0.1-b12.jar;E:\m2\repository\javax\ws\rs\javax.ws.rs-api\2.0.1\javax.ws.rs-api-2.0.1.jar;E:\m2\repository\com\lmax\disruptor\3.3.6\disruptor-3.3.6.jar;E:\m2\repository\org\apache\hadoop\hadoop-distcp\2.7.1\hadoop-distcp-2.7.1.jar;E:\m2\repository\org\apache\hbase\hbase-mapreduce\2.0.0-alpha4\hbase-mapreduce-2.0.0-alpha4.jar;E:\m2\repository\org\apache\hbase\hbase-common\2.0.0-alpha4\hbase-common-2.0.0-alpha4.jar;E:\m2\repository\com\github\stephenc\findbugs\findbugs-annotations\1.3.9-1\findbugs-annotations-1.3.9-1.jar;E:\m2\repository\org\apache\hbase\hbase-hadoop-compat\2.0.0-alpha4\hbase-hadoop-compat-2.0.0-alpha4.jar;E:\m2\repository\javax\servlet\jsp\javax.servlet.jsp-api\2.3.1\javax.servlet.jsp-api-2.3.1.jar;E:\m2\repository\net\sf\jpam\jpam\1.1\jpam-1.1.jar;E:\m2\repository\org\eclipse\jetty\jetty-runner\9.3.20.v20170531\jetty-runner-9.3.20.v20170531.jar;E:\m2\repository\org\eclipse\jetty\jetty-plus\9.4.41.v20210516\jetty-plus-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\jetty-annotations\9.4.41.v20210516\jetty-annotations-9.4.41.v20210516.jar;E:\m2\repository\javax\annotation\javax.annotation-api\1.3.2\javax.annotation-api-1.3.2.jar;E:\m2\repository\org\ow2\asm\asm-commons\9.0\asm-commons-9.0.jar;E:\m2\repository\org\ow2\asm\asm-tree\9.0\asm-tree-9.0.jar;E:\m2\repository\org\ow2\asm\asm-analysis\9.0\asm-analysis-9.0.jar;E:\m2\repository\org\eclipse\jetty\jetty-jaas\9.4.41.v20210516\jetty-jaas-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\websocket\websocket-server\9
.4.41.v20210516\websocket-server-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\websocket\websocket-common\9.4.41.v20210516\websocket-common-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\websocket\websocket-api\9.4.41.v20210516\websocket-api-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\websocket\websocket-client\9.4.41.v20210516\websocket-client-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\jetty-client\9.4.41.v20210516\jetty-client-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\websocket\websocket-servlet\9.4.41.v20210516\websocket-servlet-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\jetty-jndi\9.4.41.v20210516\jetty-jndi-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\apache-jsp\9.4.41.v20210516\apache-jsp-9.4.41.v20210516.jar;E:\m2\repository\org\eclipse\jetty\toolchain\jetty-schemas\3.1.2\jetty-schemas-3.1.2.jar;E:\m2\repository\org\eclipse\jetty\apache-jstl\9.4.41.v20210516\apache-jstl-9.4.41.v20210516.jar;E:\m2\repository\org\apache\taglibs\taglibs-standard-spec\1.2.5\taglibs-standard-spec-1.2.5.jar;E:\m2\repository\org\apache\taglibs\taglibs-standard-impl\1.2.5\taglibs-standard-impl-1.2.5.jar;E:\m2\repository\org\jamon\jamon-runtime\2.3.1\jamon-runtime-2.3.1.jar;E:\m2\repository\org\apache\hive\hive-service-rpc\3.1.2\hive-service-rpc-3.1.2.jar;E:\m2\repository\org\apache\hive\hive-classification\3.1.2\hive-classification-3.1.2.jar;E:\m2\repository\org\apache\httpcomponents\httpcore\4.4.14\httpcore-4.4.14.jar;E:\m2\repository\org\apache\curator\curator-framework\2.12.0\curator-framework-2.12.0.jar;E:\m2\repository\org\apache\hive\hive-upgrade-acid\3.1.2\hive-upgrade-acid-3.1.2.jar com.lenovo.ai.bigdata.spark.hive.client.ClientTests
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/E:/m2/repository/ch/qos/logback/logback-classic/1.2.3/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/E:/m2/repository/org/slf4j/slf4j-log4j12/1.7.30/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
11:18:43.396 [main] DEBUG org.apache.hive.jdbc.Utils - Resolved authority: node4:11240
11:18:44.165 [main] DEBUG org.apache.thrift.transport.TSaslTransport - opening transport org.apache.thrift.transport.TSaslClientTransport@371a67ec
11:18:44.172 [main] DEBUG org.apache.thrift.transport.TSaslClientTransport - Sending mechanism name PLAIN and initial response of length 18
11:18:44.174 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: Writing message with status START and payload length 5
11:18:44.175 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: Writing message with status COMPLETE and payload length 18
11:18:44.175 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: Start message handled
11:18:44.175 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: Main negotiation loop complete
11:18:44.176 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: SASL Client receiving last message
11:18:44.177 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: Received message with status COMPLETE and payload length 0
11:18:44.204 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 144
11:18:44.314 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 109
11:18:44.475 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 134
11:18:44.482 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 109
11:18:44.492 [main] DEBUG org.apache.hive.jdbc.logs.InPlaceUpdateStream$EventNotifier - progress bar is complete
11:18:44.497 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 104
11:18:44.551 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 53
11:18:44.559 [main] DEBUG org.apache.hive.jdbc.logs.InPlaceUpdateStream$EventNotifier - progress bar is complete
11:18:44.571 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 102
11:18:44.575 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 112
11:18:44.617 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 237
11:18:44.622 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 109
11:18:44.623 [main] DEBUG org.apache.hive.jdbc.logs.InPlaceUpdateStream$EventNotifier - progress bar is complete
11:18:44.623 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 104
11:18:49.625 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 53
11:18:49.625 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 104
11:18:54.627 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 53
11:18:54.627 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 104
11:18:59.628 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 53
11:18:59.629 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 104
11:19:04.631 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 53
11:19:04.631 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 104
11:19:09.633 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 53
11:19:09.634 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 104
11:19:14.636 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 53
11:19:14.636 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 104
11:19:17.979 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 53
11:19:17.979 [main] DEBUG org.apache.hive.jdbc.logs.InPlaceUpdateStream$EventNotifier - progress bar is complete
11:19:17.980 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 102
11:19:17.982 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 277
11:19:17.983 [main] DEBUG org.apache.hive.jdbc.logs.InPlaceUpdateStream$EventNotifier - progress bar is complete
11:19:17.983 [main] DEBUG org.apache.hive.jdbc.logs.InPlaceUpdateStream$EventNotifier - progress bar is complete
11:19:17.990 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 112
11:19:18.000 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 183
min_idx:1, max_index:588578233, count_:184826995, mtm_count:4735
11:19:18.049 [main] DEBUG org.apache.hive.jdbc.logs.InPlaceUpdateStream$EventNotifier - progress bar is complete
11:19:18.050 [main] DEBUG org.apache.hive.jdbc.logs.InPlaceUpdateStream$EventNotifier - progress bar is complete
11:19:18.050 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 112
11:19:18.052 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 159
11:19:18.058 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 96
11:19:18.062 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 42
11:19:18.073 [main] DEBUG org.apache.thrift.transport.TSaslTransport - writing data length: 83
11:19:18.307 [main] DEBUG org.apache.thrift.transport.TSaslTransport - CLIENT: reading data length: 40
Process finished with exit code 0
The executed job is visible in the cluster job list on node1:
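Configuring Spark dynamic resource allocation
When SQL is executed through the Thrift server described above, http://node1:8088/cluster shows that executors are not released properly: after the SQL finishes, the YarnCoarseGrainedExecutorBackend processes do not exit on their own, so cluster resources stay occupied for nothing. This has to be fixed. The solution is to enable Spark dynamic resource allocation, so that the YarnCoarseGrainedExecutorBackend processes are released once a job has finished.
See: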
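Note: for a Spark application, resources are a major factor in execution efficiency. If a long-running service holds several executors but has no tasks to run on them while other applications are short of resources, those resources are wasted and the scheduling is poor.
Dynamic resource allocation addresses exactly this case: it adds or removes executors according to the application's current task load, so resources are allocated on demand and the Spark system as a whole stays healthier.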
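As a rough illustration of the idea in the note above, the sketch below computes how many executors a given backlog of tasks would call for, clamped between a minimum and a maximum. It is a simplified model, not Spark's actual scheduling code: it assumes each task occupies one core, and the function name `target_executors` and its arguments are hypothetical.

```shell
# target_executors TASKS CORES_PER_EXECUTOR MIN MAX
# Simplified model (assumption: one task per core); not Spark's real algorithm.
target_executors() (
  tasks=$1 cores=$2 min=$3 max=$4
  # Round up: executors needed to run all tasks at the same time.
  need=$(( (tasks + cores - 1) / cores ))
  # Clamp the result to the configured range.
  [ "$need" -lt "$min" ] && need=$min
  [ "$need" -gt "$max" ] && need=$max
  echo "$need"
)

target_executors 20 6 1 4   # backlog of 20 tasks, 6 cores per executor
target_executors 0 6 1 4    # idle application: falls back to the minimum
```

With 6 cores per executor and a range of 1 to 4 executors, a backlog of 20 tasks asks for 4 executors, and an idle application drops back to 1.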
Configuration steps
Modify the YARN configuration
First, configure YARN so that it supports Spark's shuffle service.
Edit yarn-site.xml on every node in the cluster:
<!-- modify the existing property -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark_shuffle</value>
</property>
<!-- add the following properties -->
<property>
<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
<value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
<name>spark.shuffle.service.port</name>
<value>7337</value>
</property>
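Copy the Spark YARN shuffle jar (spark-<version>-yarn-shuffle.jar, found under $SPARK_HOME/yarn in the binary distribution) into $HADOOP_HOME/share/hadoop/yarn/lib/ on every node, then restart YARN: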
[hadoop@node1 ~]$ yarn --workers --daemon stop nodemanager
[hadoop@node1 ~]$ yarn --daemon stop resourcemanager
[hadoop@node2 ~]$ yarn --daemon stop resourcemanager
[hadoop@node1 ~]$ yarn --daemon start resourcemanager
[hadoop@node2 ~]$ yarn --daemon start resourcemanager
[hadoop@node1 ~]$ yarn --workers --daemon start nodemanager
Add the Spark configuration
Edit $SPARK_HOME/conf/spark-defaults.conf and add the following parameters:
# Enable the external shuffle service
spark.shuffle.service.enabled true
# Shuffle service port (7337 is the default); must match the value in yarn-site.xml
spark.shuffle.service.port 7337
# Enable dynamic resource allocation
spark.dynamicAllocation.enabled true
# Minimum number of executors per application
spark.dynamicAllocation.minExecutors 1
# Maximum number of executors per application
spark.dynamicAllocation.maxExecutors 30
spark.dynamicAllocation.schedulerBacklogTimeout 1s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 5s
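Because the shuffle service port has to be identical in yarn-site.xml and spark-defaults.conf, a quick consistency check can catch a mismatch before the services are restarted. The following is a minimal sketch, assuming the XML uses the usual `<name>`/`<value>` layout shown above and that spark-defaults.conf separates key and value with whitespace. The helper names `xml_prop`, `conf_prop`, and `ports_match` are hypothetical.

```shell
# xml_prop FILE NAME: print the <value> that follows <name>NAME</name>.
xml_prop() (
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")   # strip the tags, keep the value
      print
      exit
    }' "$1"
)

# conf_prop FILE KEY: print the value of KEY in a spark-defaults.conf style file.
conf_prop() (
  awk -v key="$2" '$1 == key { print $2; exit }' "$1"
)

# ports_match YARN_SITE SPARK_DEFAULTS: succeed if both files agree on the port.
ports_match() (
  [ "$(xml_prop "$1" spark.shuffle.service.port)" = \
    "$(conf_prop "$2" spark.shuffle.service.port)" ]
)
```

A possible invocation, using the paths from this setup: `ports_match $HADOOP_CONF_DIR/yarn-site.xml $SPARK_HOME/conf/spark-defaults.conf && echo "ports match"`.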
Start the Thrift server
# --master yarn is important: without it the application and its jobs are not visible from the command line or the web UI.
# If resources allow, raise executor memory and cores as far as possible to increase task parallelism.
# spark.dynamicAllocation.initialExecutors=1 starts the server with a single executor.
$SPARK_HOME/sbin/start-thriftserver.sh \
--master yarn \
--executor-memory 6g \
--executor-cores 6 \
--driver-memory 3g \
--driver-cores 2 \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.shuffle.service.enabled=true \
--conf spark.dynamicAllocation.initialExecutors=1 \
--conf spark.dynamicAllocation.minExecutors=1 \
--conf spark.dynamicAllocation.maxExecutors=4 \
--conf spark.dynamicAllocation.executorIdleTimeout=30s \
--conf spark.dynamicAllocation.schedulerBacklogTimeout=10s
After startup, three additional services appear on the machine.
On the command line, YARN now lists one additional long-running application:
[hadoop@node4 yarn]$ yarn application -list
2021-08-27 09:24:41,846 INFO client.AHSProxy: Connecting to Application History server at node1/192.168.111.49:10200
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1629977232578_0003 Thrift JDBC/ODBC Server SPARK hadoop default RUNNING UNDEFINED 10% http://node4:4040
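To pick the Thrift server's application ID out of that listing for later commands (for example `yarn application -status <id>`), a small filter like the following may help. It is a sketch that assumes the application keeps the default name `Thrift JDBC/ODBC Server` shown above. The function name `thrift_app_id` is made up for illustration.

```shell
# thrift_app_id: read `yarn application -list` output on stdin and print
# the ID of the first application named "Thrift JDBC/ODBC Server".
thrift_app_id() {
  awk '/Thrift JDBC\/ODBC Server/ { print $1; exit }'
}

# Demonstration on the sample line from the listing above.
printf '%s\n' \
  'application_1629977232578_0003 Thrift JDBC/ODBC Server SPARK hadoop default RUNNING UNDEFINED 10% http://node4:4040' \
  | thrift_app_id
```

On a live cluster the same filter would be fed from the real command: `yarn application -list | thrift_app_id`.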
The new application also appears on the Applications page:
The AM (ApplicationMaster) detail page shows the single executor that was started:
Running a job
Log in with beeline and run a job, as follows:
[hadoop@node2 conf]$ beeline
Beeline version 2.3.7 by Apache Hive
beeline> use caoyong_test;
No current connection
beeline> !connect jdbc:hive2://node4:11240
Connecting to jdbc:hive2://node4:11240
Enter username for jdbc:hive2://node4:11240: hadoop
Enter password for jdbc:hive2://node4:11240: **********
2021-08-27 09:21:44,517 INFO jdbc.Utils: Supplied authorities: node4:11240
2021-08-27 09:21:44,517 INFO jdbc.Utils: Resolved authority: node4:11240
Connected to: Spark SQL (version 3.1.2)
Driver: Hive JDBC (version 2.3.7)
Transaction isolation: TRANSACTION_REPEATABLE_READ
1: jdbc:hive2://node4:11240> SELECT min(auto_index) min_index,max(auto_index) max_index,count(*) count_,count(distinct mtm) mtm_count from caoyong_test.ship_ib;
+------------+------------+------------+------------+
| min_index | max_index | count_ | mtm_count |
+------------+------------+------------+------------+
| 1 | 588578233 | 184826995 | 4735 |
+------------+------------+------------+------------+
1 row selected (31.119 seconds)
1: jdbc:hive2://node4:11240>
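The query covers about 180 million rows (184,826,995) and takes 31.119 s, which is faster than a single SQL Server instance (42.725 s).
The AM detail page shows that three executors were added dynamically.
The job detail page shows the DAG stages, the total number of tasks, and so on.
After 30 s of idle time, the dynamically added executors are released, as shown below:
Other frequently used commands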
nohup hive --service hiveserver2 > /home/hadoop/hive-3.1.2/logs/hive.log 2>&1
nohup hiveserver2 > $HIVE_HOME/logs/hive.log 2>&1 &
!connect jdbc:hive2://node3:11240
!connect jdbc:hive2://node2:2181,node3:2181,node4:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hive_zk
!connect jdbc:hive2://$aa
spark-submit --master yarn --deploy-mode cluster --driver-memory 1g --executor-memory 512m --class com.lenovo.ai.bigdata.spark.WordCount bigdata-0.0.1-SNAPSHOT.jar /demo/input/ /demo/output-spark/
spark-submit --master local[2] --class com.lenovo.ai.bigdata.spark.WordCount bigdata-0.0.1-SNAPSHOT.jar ./ ./output
spark-submit --master yarn --deploy-mode cluster --driver-memory 1g --executor-memory 512m --class com.lenovo.ai.bigdata.spark.hive.HiveOnSparkTest bigdata-0.0.1-SNAPSHOT.jar
mapred --daemon start historyserver
mapred --daemon start timelineserver
create table student(id int,name string);
insert into student values(1005,"zhangsan");
nohup hiveserver2 > $HIVE_HOME/logs/hive.log 2>&1
spark-sql --master yarn --deploy-mode cluster
hdfs:///demo/input/hive/kv1.txt
Starting a job with dynamic resource allocation:
spark-submit --master yarn --driver-memory 6g --executor-memory 6g --conf spark.dynamicAllocation.initialExecutors=10 --conf spark.dynamicAllocation.minExecutors=1 --conf spark.dynamicAllocation.executorIdleTimeout=30s --conf spark.dynamicAllocation.schedulerBacklogTimeout=10s --conf spark.executor.memoryOverhead=2g --conf spark.dynamicAllocation.enabled=true --name import_pdp_data_test_job app.jar "{\"job\":\"import_dim_pdp_test_job\",\"test\":true}"
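Hand-escaping the JSON argument in the command above is easy to get wrong, since a single unescaped quote silently changes what the application receives. One option is to build the string with `printf` and single quotes. The sketch below assumes the job name contains no quotes or backslashes. The function name `job_args` is hypothetical, and the two fields simply mirror the argument used above.

```shell
# job_args JOB_NAME TEST_FLAG: print the JSON argument for the job.
# Assumption: JOB_NAME needs no JSON escaping; TEST_FLAG is true or false.
job_args() {
  printf '{"job":"%s","test":%s}' "$1" "$2"
}

job_args import_dim_pdp_test_job true
```

The result can then be passed as one quoted argument, for example `spark-submit ... app.jar "$(job_args import_dim_pdp_test_job true)"`.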