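This article analyzes the RPC implementation between Flink nodes, based on Flink 1.8:

  • The main RPC-related interfaces
  • How RPC nodes communicate with each other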

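In older Flink versions, each node handled RPC by implementing the FlinkActor interface: it received Actor messages and ran different business logic depending on the message type. This coupled the business flow to the specific communication component and made it hard to replace that component later (for example, with Netty). Flink therefore introduced RPC calls: components call each other through gateways, which hide the details of the communication component and decouple the two.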

Main RPC-related interfaces

  • RpcEndpoint
  • RpcService
  • RpcGateway

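RpcEndpoint: the base class for remote procedure calls

RpcEndpoint is the base class of Flink's RPC endpoints. Every distributed component that offers remote procedure calls must extend RpcEndpoint, and an RpcEndpoint's functionality is backed by an RpcService.

Subclasses of RpcEndpoint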


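In the class hierarchy, the subclasses of RpcEndpoint fall into just four kinds of component: Dispatcher, JobMaster, ResourceManager, and TaskExecutor. In other words, only these four components have RPC capability, because only they need it.

These are also Flink's four major components, and they rely on RPC to communicate with one another. (The underlying communication component is currently still Akka.)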

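RpcService: the RPC service provider

RpcService is a member variable of RpcEndpoint. It provides the RPC service for the RpcEndpoint and connects to remote servers. It has only one implementation, AkkaRpcService, which shows that Flink's communication layer is currently still Akka.

An RpcService is used to start an RpcEndpoint and to connect to one. Connecting to an RPC server returns an RpcGateway, which can be used to call remote procedures.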
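
To make the "start an endpoint, connect to it, get a gateway back" contract more concrete, here is a minimal in-process sketch. It is not Flink code: MiniRpcService, EchoGateway, EchoEndpoint, and the "mini://echo" address are hypothetical names made up for illustration, and a plain java.lang.reflect.Proxy stands in for a real transport.

```java
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Simplified, in-process model of the start/connect contract. Every type here is
// hypothetical; none of them is a Flink class.
final class MiniRpcSketch {

    // The gateway: the only thing a caller knows about the remote component.
    interface EchoGateway {
        CompletableFuture<String> echo(String message);
    }

    // The endpoint: the component that actually implements the gateway.
    static final class EchoEndpoint implements EchoGateway {
        @Override
        public CompletableFuture<String> echo(String message) {
            return CompletableFuture.completedFuture("echo: " + message);
        }
    }

    // Stand-in for an RPC service: starts endpoints and connects to them.
    static final class MiniRpcService {
        private final Map<String, Object> endpoints = new ConcurrentHashMap<>();

        // "Starts" an endpoint by registering it under an address.
        void startServer(String address, Object endpoint) {
            endpoints.put(address, endpoint);
        }

        // Connects to an address and returns a gateway proxy for it.
        <G> CompletableFuture<G> connect(String address, Class<G> gatewayType) {
            Object target = endpoints.get(address);
            if (target == null) {
                CompletableFuture<G> failed = new CompletableFuture<>();
                failed.completeExceptionally(
                        new IllegalStateException("No endpoint at " + address));
                return failed;
            }
            // A real transport would serialize the call and send it over the wire;
            // here the proxy simply forwards it to the registered object.
            Object proxy = Proxy.newProxyInstance(
                    gatewayType.getClassLoader(),
                    new Class<?>[] {gatewayType},
                    (self, method, args) -> method.invoke(target, args));
            return CompletableFuture.completedFuture(gatewayType.cast(proxy));
        }
    }

    public static void main(String[] args) {
        MiniRpcService rpcService = new MiniRpcService();
        rpcService.startServer("mini://echo", new EchoEndpoint());

        // The caller only ever sees the gateway interface, never the endpoint class.
        EchoGateway gateway = rpcService.connect("mini://echo", EchoGateway.class).join();
        System.out.println(gateway.echo("hello").join());
    }
}
```

The caller depends only on EchoGateway. How the call reaches the endpoint is entirely the service's business, which is the property the gateway design is after.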

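Flink's four major components (Dispatcher, JobMaster, ResourceManager, and TaskExecutor) are all RpcEndpoint implementations, so constructing any of them requires an RpcService.

For example, the first parameter of the JobMaster constructor is the RpcService: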

public JobMaster(
			RpcService rpcService,
			JobMasterConfiguration jobMasterConfiguration,
			ResourceID resourceId,
			JobGraph jobGraph,
			HighAvailabilityServices highAvailabilityService,
			SlotPoolFactory slotPoolFactory,
			SchedulerFactory schedulerFactory,
			JobManagerSharedServices jobManagerSharedServices,
			HeartbeatServices heartbeatServices,
			JobManagerJobMetricGroupFactory jobMetricGroupFactory,
			OnCompletionActions jobCompletionActions,
			FatalErrorHandler fatalErrorHandler,
			ClassLoader userCodeLoader){}

All RpcService instances are created through the createRpcService method of the AkkaRpcServiceUtils utility class.

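RpcGateway: the gateway for RPC calls

The main sub-interfaces of RpcGateway are FencedRpcGateway and TaskExecutorGateway. The gateway interfaces of Flink's four major components build on these, so that Dispatcher, JobMaster, ResourceManager, and TaskExecutor can each be called over RPC through its own gateway.

  • RpcGateway is the RPC gateway interface, the common gateway of all RPC components; each component's gateway defines that component's RPC interface
  • The commonly seen ones are the per-component gateways, such as JobMasterGateway, DispatcherGateway, ResourceManagerGateway, and TaskExecutorGateway
  • Each component class holds, as member variables, the gateways of the other components it needs to talk to, which makes RPC calls convenient

Take JobMaster as an example: JobMaster implements the JobMasterGateway interface, which defines the following methods: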

public interface JobMasterGateway extends
	CheckpointCoordinatorGateway,
	FencedRpcGateway<JobMasterId>,
	KvStateLocationOracle,
	KvStateRegistryGateway {

	/**
	 * Cancels the currently executing job (interacts with TaskExecutorGateway).
	 */
	CompletableFuture<Acknowledge> cancel(@RpcTimeout Time timeout);

	/**
	 * Stops the currently executing job (interacts with TaskExecutorGateway).
	 */
	CompletableFuture<Acknowledge> stop(@RpcTimeout Time timeout);

	/**
	 * Changes the parallelism of the running job (interacts with TaskExecutorGateway).
	 */
	CompletableFuture<Acknowledge> rescaleJob(
		int newParallelism,
		RescalingBehaviour rescalingBehaviour,
		@RpcTimeout Time timeout);

	/**
	 * Changes the parallelism of the given operators (interacts with TaskExecutorGateway).
	 */
	CompletableFuture<Acknowledge> rescaleOperators(
		Collection<JobVertexID> operators,
		int newParallelism,
		RescalingBehaviour rescalingBehaviour,
		@RpcTimeout Time timeout);

	CompletableFuture<Acknowledge> updateTaskExecutionState(
			final TaskExecutionState taskExecutionState);

	CompletableFuture<SerializedInputSplit> requestNextInputSplit(
			final JobVertexID vertexID,
			final ExecutionAttemptID executionAttempt);

	CompletableFuture<ExecutionState> requestPartitionState(
			final IntermediateDataSetID intermediateResultId,
			final ResultPartitionID partitionId);

	CompletableFuture<Acknowledge> scheduleOrUpdateConsumers(
			final ResultPartitionID partitionID,
			@RpcTimeout final Time timeout);

	CompletableFuture<Acknowledge> disconnectTaskManager(ResourceID resourceID, Exception cause);

	/**
	 * Disconnects from the ResourceManager (interacts with ResourceManager).
	 */
	void disconnectResourceManager(
		final ResourceManagerId resourceManagerId,
		final Exception cause);

	/**
	 * Offers the given slots to the job manager. The response contains the set of accepted slots.
	 *
	 * @param taskManagerId identifying the task manager
	 * @param slots         to offer to the job manager
	 * @param timeout       for the rpc call
	 * @return Future set of accepted slots.
	 */
	CompletableFuture<Collection<SlotOffer>> offerSlots(
			final ResourceID taskManagerId,
			final Collection<SlotOffer> slots,
			@RpcTimeout final Time timeout);

	void failSlot(final ResourceID taskManagerId,
			final AllocationID allocationId,
			final Exception cause);

	CompletableFuture<RegistrationResponse> registerTaskManager(
			final String taskManagerRpcAddress,
			final TaskManagerLocation taskManagerLocation,
			@RpcTimeout final Time timeout);

	void heartbeatFromTaskManager(
		final ResourceID resourceID,
		final AccumulatorReport accumulatorReport);

	/**
	 * Sends heartbeat request from the resource manager.
	 *
	 * @param resourceID unique id of the resource manager
	 */
	void heartbeatFromResourceManager(final ResourceID resourceID);

	CompletableFuture<JobDetails> requestJobDetails(@RpcTimeout Time timeout);

	CompletableFuture<JobStatus> requestJobStatus(@RpcTimeout Time timeout);

	CompletableFuture<ArchivedExecutionGraph> requestJob(@RpcTimeout Time timeout);

	CompletableFuture<String> triggerSavepoint(
		@Nullable final String targetDirectory,
		final boolean cancelJob,
		@RpcTimeout final Time timeout);

	CompletableFuture<OperatorBackPressureStatsResponse> requestOperatorBackPressureStats(JobVertexID jobVertexId);

	void notifyAllocationFailure(AllocationID allocationID, Exception cause);

	CompletableFuture<Object> updateGlobalAggregate(String aggregateName, Object aggregand, byte[] serializedAggregationFunction);
}

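The methods defined in JobMasterGateway above have two kinds of return type: void and CompletableFuture.

  • void: the call is a one-way notification. Another component (for example the ResourceManager or a TaskExecutor, as in the heartbeat methods) triggers an action on the JobMaster and does not wait for a result;
  • CompletableFuture: the call is a request that expects an answer. The caller receives a future that completes once the JobMaster has handled the request, and the JobMaster's implementation of such a method often calls the gateways of other components in turn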
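
The difference between the two return types can be sketched with a gateway proxy that looks at the method's return type: void calls are enqueued and forgotten, while CompletableFuture calls are enqueued together with a future that is completed with the reply or fails after a timeout (the role that @RpcTimeout plays in the interface above). This is a simplified model and not Flink's actual invocation handler; CounterGateway, CounterEndpoint, and MailboxHandler are hypothetical names, and a single-threaded executor stands in for the endpoint's mailbox. It needs Java 9 or later because of orTimeout.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Simplified model of one-way versus request-response dispatch. All names are
// hypothetical; this is not Flink's invocation handler.
final class TellVsAskSketch {

    interface CounterGateway {
        void increment(int delta);              // one-way: the caller gets nothing back
        CompletableFuture<Integer> current();   // request-response: the caller gets a future
    }

    // The endpoint's state is only touched by the single mailbox thread.
    static final class CounterEndpoint implements CounterGateway {
        private int value;

        @Override
        public void increment(int delta) {
            value += delta;
        }

        @Override
        public CompletableFuture<Integer> current() {
            return CompletableFuture.completedFuture(value);
        }
    }

    static final class MailboxHandler implements InvocationHandler {
        private final Object endpoint;
        private final long timeoutMillis;
        // One daemon thread plays the role of the endpoint's mailbox (FIFO order).
        private final ExecutorService mailbox = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "mailbox");
            thread.setDaemon(true);
            return thread;
        });

        MailboxHandler(Object endpoint, long timeoutMillis) {
            this.endpoint = endpoint;
            this.timeoutMillis = timeoutMillis;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            if (method.getDeclaringClass() == Object.class) {
                return method.invoke(this, args);   // toString, hashCode, equals
            }
            if (method.getReturnType() == void.class) {
                // "tell": enqueue the call and return immediately.
                mailbox.execute(() -> {
                    try {
                        method.invoke(endpoint, args);
                    } catch (ReflectiveOperationException e) {
                        // One-way call: there is no caller waiting for this failure.
                    }
                });
                return null;
            }
            // "ask": enqueue the call and hand back a future for the reply.
            // This sketch assumes every non-void gateway method returns a CompletableFuture.
            CompletableFuture<Object> reply = new CompletableFuture<>();
            mailbox.execute(() -> {
                try {
                    CompletableFuture<?> result =
                            (CompletableFuture<?>) method.invoke(endpoint, args);
                    result.whenComplete((value, error) -> {
                        if (error != null) {
                            reply.completeExceptionally(error);
                        } else {
                            reply.complete(value);
                        }
                    });
                } catch (ReflectiveOperationException | RuntimeException e) {
                    reply.completeExceptionally(e);
                }
            });
            return reply.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    static CounterGateway connect(CounterEndpoint endpoint, long timeoutMillis) {
        return (CounterGateway) Proxy.newProxyInstance(
                CounterGateway.class.getClassLoader(),
                new Class<?>[] {CounterGateway.class},
                new MailboxHandler(endpoint, timeoutMillis));
    }

    public static void main(String[] args) {
        CounterGateway gateway = connect(new CounterEndpoint(), 1_000);
        gateway.increment(2);   // returns at once, processed later by the mailbox thread
        gateway.increment(3);
        System.out.println(gateway.current().join());   // waits for the reply
    }
}
```

Because the mailbox processes calls in the order they were enqueued, the reply to current() already reflects both earlier increments, even though the caller never waited for them.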

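Summary:

Earlier versions communicated across nodes directly on top of Akka. Flink 1.8 instead defines a gateway for each component according to its business needs, so that components call each other through RPC, although the underlying layer is still Akka. The benefit is that the gateways keep Akka-specific code out of the components themselves, separating business logic from the communication mechanism and making it easier to switch to a different one later, such as Netty.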
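
The decoupling described in the summary can be illustrated with a small sketch in which the business logic depends only on a gateway interface and the transport behind it is swapped without touching that logic. Everything here is hypothetical (SlotGateway, Transport, SlotPool, schedule), and neither transport uses Akka or Netty: one runs the call on the caller's thread, the other hands it to a thread pool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Sketch of swapping the transport behind a gateway. All names are hypothetical
// and neither transport is a real network stack.
final class SwappableTransportSketch {

    // The only type the business logic depends on.
    interface SlotGateway {
        CompletableFuture<Integer> requestSlots(int wanted);
    }

    // How a call reaches the other side; this is the part that can be replaced.
    interface Transport {
        CompletableFuture<Integer> call(Supplier<Integer> remoteWork);
    }

    // Transport 1: run the work directly on the caller's thread.
    static Transport sameThread() {
        return remoteWork -> CompletableFuture.completedFuture(remoteWork.get());
    }

    // Transport 2: hand the work to the common thread pool.
    static Transport threadPool() {
        return remoteWork -> CompletableFuture.supplyAsync(remoteWork);
    }

    // The "remote" component that owns the slots.
    static final class SlotPool {
        private int free;

        SlotPool(int free) {
            this.free = free;
        }

        synchronized int take(int wanted) {
            int granted = Math.min(wanted, free);
            free -= granted;
            return granted;
        }
    }

    // Builds a gateway on top of whichever transport is supplied.
    static SlotGateway gatewayOver(Transport transport, SlotPool pool) {
        return wanted -> transport.call(() -> pool.take(wanted));
    }

    // Business logic: it sees the gateway and nothing about the transport.
    static CompletableFuture<String> schedule(SlotGateway gateway, int wanted) {
        return gateway.requestSlots(wanted)
                .thenApply(granted -> "granted " + granted + " of " + wanted);
    }

    public static void main(String[] args) {
        SlotGateway direct = gatewayOver(sameThread(), new SlotPool(2));
        SlotGateway pooled = gatewayOver(threadPool(), new SlotPool(2));

        // The same business code runs unchanged over both transports.
        System.out.println(schedule(direct, 3).join());
        System.out.println(schedule(pooled, 3).join());
    }
}
```

Replacing the transport only changes the argument passed to gatewayOver; schedule is never edited, which is the kind of separation the gateways are meant to give the Flink components.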