Table of Contents

  • Extract the installation package
  • Edit the configuration
  • User environment variables
  • Edit flink-conf.yaml
  • Edit masters
  • Edit workers
  • Edit zoo.cfg
  • A little bug: the web UI is unreachable
  • Distribute to the cluster
  • Running on YARN
  • Plain YARN (per-job) startup
  • YARN session mode
  • Update 2021-03-08
  • Flink expects HADOOP_CLASSPATH to be set
  • Java shell utility in exec mode cannot find environment variables


Extract the installation package

tar -zxvf flink-1.12.1-bin-scala_2.11.tgz


Edit the configuration

User environment variables

vim ~/.bash_profile

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-11.b12.el7.x86_64/jre
export HADOOP_HOME=/app/hadoop-2.7.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export FLINK_HOME=/app/flink-1.12.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$FLINK_HOME/bin
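HADOOP_HOME is exported before HADOOP_CONF_DIR so that $HADOOP_HOME is already defined when it is referenced. After saving, reload the profile and verify the variables took effect (a quick check, using the paths above):

source ~/.bash_profile

# each of these should print the expected path / version
echo $FLINK_HOME
echo $HADOOP_CONF_DIR
hadoop version
which flink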



Edit flink-conf.yaml

The first three entries only need to be uncommented; the fourth dedicates a subtree to Flink so the ZooKeeper ensemble can be shared. Effectively this means zkCluster01:2181,zkCluster02:2181,zkCluster03:2181/flink; the extra path is needed because the ZooKeeper cluster serves not just Flink but also Kafka and other services.

high-availability: zookeeper

high-availability.storageDir: hdfs://cluster/flinkha/

high-availability.zookeeper.quorum: zkCluster01:2181,zkCluster02:2181,zkCluster03:2181

high-availability.zookeeper.path.root: /flink
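Once the HA cluster is up, you can confirm the shared-ensemble layout from any ZooKeeper node. A hedged example using the standard zkCli.sh client ($ZOOKEEPER_HOME is a hypothetical variable pointing at your ZooKeeper installation; the znode names under /flink vary by Flink version):

$ZOOKEEPER_HOME/bin/zkCli.sh -server zkCluster01:2181

# Flink's HA metadata lives under its own root, isolated from kafka etc.
ls /flink
ls /flink/default    # one subtree per Flink cluster-id ("default" unless configured)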



Edit masters

e3base01:8081

Here we designate e3base01 as the master node, on port 8081.
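Since ZooKeeper HA is enabled, conf/masters can also list a standby JobManager. A hedged variant (using e3base02 as the standby is an assumption for illustration, not part of this deployment):

e3base01:8081
e3base02:8081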



Edit workers

In older versions this file was called slaves; entries do not carry a port.

e3base02

e3base03



Edit zoo.cfg

The main changes here are the following parameters:

# ZooKeeper quorum peers
server.1=zkCluster01:2888:3888
server.2=zkCluster02:2888:3888
server.3=zkCluster03:2888:3888
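With the configuration done, the cluster is started from the master node in the usual way:

cd /app/flink-1.12.1
./bin/start-cluster.sh

# the web UI should then answer at http://e3base01:8081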



A little bug: the web UI is unreachable

Startup appears normal with no errors reported, yet the cluster has not actually come up.

Don't panic; check the startup logs under flink/logs:

2021-03-03 16:40:15,488 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Shutting StandaloneSessionClusterEntrypoint down with application status FAILED. Diagnostics java.io.IOException: Could not create FileSystem for highly available storage path (hdfs:/flink/ha/default)
        at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:92)
        at org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:76)
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:115)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:332)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:290)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:223)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:178)
        at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:569)
        at org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:59)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded. For a full list of supported file systems, please see https://ci.apache.org/projects/flink/flink-docs-stable/ops/filesystems/.
        at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:531)
        at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:408)
        at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274)
        at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:89)
        ... 10 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
        at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:55)
        at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:527)
        ... 13 more

The cause is a missing dependency for talking to HDFS: flink-shaded-hadoop-2-uber-2.7.5-7.0.jar, which can be downloaded from the official site.

Just pick the build matching your Hadoop version.
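One way to fetch and install it (assuming Maven Central still hosts this shaded build; the jar goes into Flink's lib directory on every node):

cd $FLINK_HOME/lib
wget https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.7.5-7.0/flink-shaded-hadoop-2-uber-2.7.5-7.0.jar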


Restart, and the web UI is reachable again.



Distribute to the cluster

Use scp -r to copy the installation to the other two hosts, e3base02 and e3base03.
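For example (assuming the same /app layout and passwordless ssh between the hosts):

scp -r /app/flink-1.12.1 e3base02:/app/
scp -r /app/flink-1.12.1 e3base03:/app/

# remember to add the same ~/.bash_profile entries on both hosts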


Running on YARN

Plain YARN (per-job) startup

Each job gets its own environment:

flink run -m yarn-cluster -yjm 1024m -ytm 1024m -ynm test /app/flink-1.12.1/examples/batch/WordCount.jar
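Here -yjm and -ytm set the JobManager and TaskManager memory, and -ynm names the YARN application. The resulting application can be watched and, if necessary, killed with the standard YARN CLI:

yarn application -list                    # shows the "test" application while it runs
yarn application -kill application_xxxx   # stop it early if necessary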



YARN session mode

In YARN session mode we first start a yarn-session, which is essentially a long-running YARN application whose resource footprint stays fixed. When we submit a job to this session with flink run and the session lacks free capacity, the job waits until other resources are released. If the yarn-session is killed, every job running in it stops.



First, create the yarn session:

Start a yarn session with 8 GB of TaskManager memory and 32 slots:

./bin/yarn-session.sh -tm 8192 -s 32

At this point it fails because of the YARN cluster's configuration:

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
 at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:425) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:606) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:860) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131]
 at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131]
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754) ~[flink-shaded-hadoop-2-uber-2.7.5-7.0.jar:2.7.5-7.0]
 at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:860) [flink-dist_2.11-1.12.1.jar:1.12.1]
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The number of requested virtual cores per node 32 exceeds the maximum number of virtual cores 8 available in the Yarn Cluster. Please note that the number of virtual cores is set to the number of task slots by default unless configured in the Flink config with 'yarn.containers.vcores'.
 at org.apache.flink.yarn.YarnClusterDescriptor.isReadyForDeployment(YarnClusterDescriptor.java:338) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:534) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:418) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
 ... 7 more



Per the error message, we just lower the slot count to 8 or fewer:

./bin/yarn-session.sh -tm 8192 -s 2
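Alternatively, as the error message hints, the vcore request can be capped explicitly so the 32 slots are kept. A hedged sketch (yarn-session.sh accepts -D dynamic properties, but this variant was not tested in this deployment):

./bin/yarn-session.sh -tm 8192 -s 32 -Dyarn.containers.vcores=8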

The session then shows up as a running application in the YARN web UI.


How do we use it? When running a new job, just submit with the session application's applicationId:

./bin/flink run -m yarn-cluster -yid application_xxxx ./examples/batch/WordCount.jar

Here application_xxxx is the id of the session application.
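If you lose track of that id, the YARN CLI lists it, and re-attaching to the session lets you stop it cleanly (the stop command is typed at the attached session's prompt):

yarn application -list                        # find the session's application id

./bin/yarn-session.sh -id application_xxxx    # re-attach to the running session
# then type "stop" at its prompt to shut the session down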




Update 2021-03-08

Flink expects HADOOP_CLASSPATH to be set

------------------------------------------------------------
 The program finished with the following exception:

java.lang.IllegalStateException: No Executor found. Please make sure to export the HADOOP_CLASSPATH environment variable or have hadoop in your classpath. For more information refer to the "Deployment" section of the official Apache Flink documentation.
        at org.apache.flink.yarn.cli.FallbackYarnSessionCli.isActive(FallbackYarnSessionCli.java:41)
        at org.apache.flink.client.cli.CliFrontend.validateAndGetActiveCommandLine(CliFrontend.java:1240)
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:234)
        at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1058)
        at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1136)
        at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1136)



The flink submission fails because, starting with version 1.11.0, Flink supports Hadoop 3.0.0 and later and no longer ships the "flink-shaded-hadoop-*" jars; instead, integration with the YARN cluster is done by setting YARN_CONF_DIR or HADOOP_CONF_DIR together with the HADOOP_CLASSPATH environment variable.

The concrete steps are:

  1. Make sure a Hadoop cluster is installed, version at least Hadoop 2.4.1
  2. Add the following environment variables:
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_CLASSPATH=`hadoop classpath`

Here hadoop classpath invokes Hadoop's own shell command, which directly prints the classpath value.
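A quick way to confirm the variable is populated after reloading the profile:

source ~/.bash_profile
hadoop classpath                               # prints the classpath hadoop computes
echo $HADOOP_CLASSPATH | tr ':' '\n' | head    # same value, one entry per line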



Java shell utility in exec mode cannot find environment variables

Before executing the shell command, always run source ~/.bash_profile once first to load the environment variables.
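The underlying issue is that commands launched via Runtime.exec / ProcessBuilder run in a non-login, non-interactive shell, which never reads ~/.bash_profile. A minimal sketch of the two workarounds at the shell level (the WordCount job is reused here purely as an example command):

# source the profile explicitly before the real command...
bash -c 'source ~/.bash_profile && flink run -m yarn-cluster ./examples/batch/WordCount.jar'

# ...or ask for a login shell (-l), which reads ~/.bash_profile on its own
bash -lc 'flink run -m yarn-cluster ./examples/batch/WordCount.jar'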