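In Flink per-job mode, TaskManager startup is triggered when YARN calls back the onContainersAllocated method. Tracing from there shows that the TaskManager's entry class is YarnTaskExecutorRunner, so let us continue analyzing the TaskManager startup source code from YarnTaskExecutorRunner.main.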

  1. The TaskManager startup entry point is YarnTaskExecutorRunner.main, shown below:
public static void main(String[] args) {
    EnvironmentInformation.logEnvironmentInfo(LOG, "YARN TaskExecutor runner", args);
    SignalHandler.register(LOG);
    JvmShutdownSafeguard.installAsShutdownHook(LOG);
    // Start the TaskManager
    runTaskManagerSecurely(args);
}
  2. Trace the call stack:
runTaskManagerSecurely(args);
TaskManagerRunner.runTaskManagerProcessSecurely(Preconditions.checkNotNull(configuration));

The call stack shows that in Flink per-job mode the TaskManager also ends up in TaskManagerRunner, the same as in Standalone mode.

  3. Continue tracing TaskManagerRunner.runTaskManagerProcessSecurely(Preconditions.checkNotNull(configuration)):
/**
 * runTaskManager is the startup entry method
 */
exitCode = SecurityUtils.getInstalledContext().runSecured(
        () -> runTaskManager(configuration, pluginManager)
);
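The call above hands the startup logic to a security context as a callable and takes the result as the process exit code. The following is a minimal sketch of that shape only; SecurityContext, NoOpContext, and runWorker are hypothetical stand-ins written for illustration, not the Flink classes themselves.

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-ins that only mimic the shape of the call above. */
class RunSecuredSketch {

    /** Runs a callable inside some security context and returns its result. */
    interface SecurityContext {
        <T> T runSecured(Callable<T> securedCallable) throws Exception;
    }

    /** A context that adds no security at all and simply invokes the callable. */
    static class NoOpContext implements SecurityContext {
        @Override
        public <T> T runSecured(Callable<T> securedCallable) throws Exception {
            return securedCallable.call();
        }
    }

    /** Placeholder for the real startup logic; returns a process exit code. */
    static int runWorker(String configuration) {
        return configuration.isEmpty() ? 1 : 0;
    }

    public static void main(String[] args) throws Exception {
        SecurityContext context = new NoOpContext();
        // The startup logic is passed as a lambda and executed by the context.
        int exitCode = context.runSecured(() -> runWorker("demo-config"));
        System.out.println("exit code: " + exitCode);
    }
}
```

A context backed by a real login (Kerberos, for example) would wrap the same callable in its own privileged scope; the caller's code would not change.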
  4. Continue tracing runTaskManager(configuration, pluginManager):
// createTaskExecutorService is the factory that instantiates the TaskExecutor object
taskManagerRunner = new TaskManagerRunner(
                configuration,
                pluginManager,
                TaskManagerRunner::createTaskExecutorService);

// The code above only instantiates the object; the start() call below initializes the various resources
taskManagerRunner.start();

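As you can see, a TaskManagerRunner is created and then started here. The method reference TaskManagerRunner::createTaskExecutorService is passed in as a factory, and the TaskExecutor is instantiated inside it when the factory is invoked.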
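To make the "construct first, create the service on start" idea concrete, here is a small sketch of passing a factory as a method reference. Runner, ServiceFactory, Service, and createService are all hypothetical names chosen for illustration; the sketch assumes the factory is invoked from start(), which matches the description of the start method in this article.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins; none of these are Flink classes. */
class DeferredFactorySketch {

    /** Plays the role of the factory handed to the runner's constructor. */
    @FunctionalInterface
    interface ServiceFactory {
        Service create(String config);
    }

    interface Service {
        void start();
    }

    /** Records what happened, in order, so the sequence is visible. */
    static final List<String> events = new ArrayList<>();

    static class Runner {
        private final String config;
        private final ServiceFactory factory;
        private Service service; // stays null until start() runs

        Runner(String config, ServiceFactory factory) {
            this.config = config;
            this.factory = factory;
            events.add("runner constructed");
        }

        void start() {
            service = factory.create(config); // the factory is only invoked here
            service.start();
        }
    }

    /** Static factory method, passed around as a method reference. */
    static Service createService(String config) {
        events.add("service created for " + config);
        return () -> events.add("service started");
    }

    public static void main(String[] args) {
        Runner runner = new Runner("demo-config", DeferredFactorySketch::createService);
        runner.start();
        events.forEach(System.out::println);
    }
}
```

Passing the factory rather than the finished object lets the runner decide when the service is built, and lets tests swap in a different factory.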

4.1 It is worth tracing TaskManagerRunner::createTaskExecutorService in more depth:

public static TaskExecutorService createTaskExecutorService(
        Configuration configuration,
        ResourceID resourceID,
        RpcService rpcService,
        HighAvailabilityServices highAvailabilityServices,
        HeartbeatServices heartbeatServices,
        MetricRegistry metricRegistry,
        BlobCacheService blobCacheService,
        boolean localCommunicationOnly,
        ExternalResourceInfoProvider externalResourceInfoProvider,
        WorkingDirectory workingDirectory,
        FatalErrorHandler fatalErrorHandler
) throws Exception {
    // Instantiate the TaskExecutor
    final TaskExecutor taskExecutor = startTaskManager( // instantiates TaskManagerServices internally
                    configuration,
                    resourceID,
                    rpcService,
                    highAvailabilityServices,
                    heartbeatServices,
                    metricRegistry,
                    blobCacheService,
                    localCommunicationOnly,
                    externalResourceInfoProvider,
                    workingDirectory,
                    fatalErrorHandler);

    // Return the TaskExecutor wrapped in a TaskExecutorService adapter
    return TaskExecutorToServiceAdapter.createFor(taskExecutor);
}
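The last line of the method wraps the TaskExecutor in an adapter so the runner only sees a narrow service interface. Below is a minimal sketch of that adapter idea under stated assumptions: WorkerService, Worker, and WorkerToServiceAdapter are hypothetical names, and the two interface methods were chosen only for illustration.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-ins illustrating the adapter idea; not Flink classes. */
class AdapterSketch {

    /** Narrow lifecycle view that a runner would depend on. */
    interface WorkerService {
        void start();
        CompletableFuture<Void> getTerminationFuture();
    }

    /** Component with a richer API than the runner needs. */
    static class Worker {
        private final CompletableFuture<Void> terminated = new CompletableFuture<>();
        boolean started;

        void start() { started = true; }
        void shutDown() { terminated.complete(null); }
        CompletableFuture<Void> terminationFuture() { return terminated; }
    }

    /** Adapts a Worker to the WorkerService interface by delegating each call. */
    static class WorkerToServiceAdapter implements WorkerService {
        private final Worker worker;

        private WorkerToServiceAdapter(Worker worker) { this.worker = worker; }

        static WorkerToServiceAdapter createFor(Worker worker) {
            return new WorkerToServiceAdapter(worker);
        }

        @Override
        public void start() { worker.start(); }

        @Override
        public CompletableFuture<Void> getTerminationFuture() {
            return worker.terminationFuture();
        }
    }

    public static void main(String[] args) {
        Worker worker = new Worker();
        WorkerService service = WorkerToServiceAdapter.createFor(worker);
        service.start();
        System.out.println("started: " + worker.started);
        worker.shutDown();
        System.out.println("terminated: " + service.getTerminationFuture().isDone());
    }
}
```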

4.2 The TaskExecutor is created first. Tracing further into startTaskManager() shows:

/**
 * Instantiate TaskManagerServices, which in turn instantiates many of the components the Flink runtime depends on
 */
TaskManagerServices taskManagerServices = TaskManagerServices.fromConfiguration(
                taskManagerServicesConfiguration,
                taskExecutorBlobService.getPermanentBlobService(),
                taskManagerMetricGroup.f1,
                ioExecutor,
                fatalErrorHandler,
                workingDirectory);

/**
 * Instantiate the TaskExecutor
 */
new TaskExecutor(
        rpcService,
        taskManagerConfiguration,
        highAvailabilityServices,
        taskManagerServices,
        externalResourceInfoProvider,
        heartbeatServices,
        taskManagerMetricGroup.f0,
        metricQueryServiceAddress,
        taskExecutorBlobService,
        fatalErrorHandler,
        new TaskExecutorPartitionTrackerImpl(taskManagerServices.getShuffleEnvironment())
);

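This method does two main things:

  • Instantiate TaskManagerServices, which includes:
      • the task event dispatcher taskEventDispatcher
      • the I/O manager ioManager
      • the shuffle component shuffleEnvironment (instantiated and started)
      • the state service kvStateService (instantiated and started)
      • taskSlotTable, which manages the mapping between slots and tasks
      • jobTable, which manages information about running jobs
      • the state manager taskStateManager
      • the changelog manager changelogStoragesManager

  • Instantiate TaskExecutor, which:
      • registers the heartbeat services
          • TM with JM
          • TM with RM
      • initializes components such as taskExecutorServices, taskSlotTable, jobTable, localStateStoresManager, shuffleEnvironment, kvStateService, and resourceManagerLeaderRetriever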
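The heartbeat services mentioned above let the TM notice when the JM or RM stops responding. As a rough illustration of the idea only, the sketch below tracks the last heartbeat time per target and reports targets whose heartbeat is overdue. HeartbeatSketch and its methods are hypothetical, and timestamps are passed in explicitly so the behavior is deterministic; a real implementation would schedule these checks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical heartbeat bookkeeping; not Flink's heartbeat implementation. */
class HeartbeatSketch {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    HeartbeatSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Starts monitoring a target, treating "now" as its first heartbeat. */
    void monitorTarget(String targetId, long nowMillis) {
        lastHeartbeat.put(targetId, nowMillis);
    }

    /** Records a heartbeat; heartbeats from unmonitored targets are ignored. */
    void receiveHeartbeat(String targetId, long nowMillis) {
        lastHeartbeat.computeIfPresent(targetId, (id, previous) -> nowMillis);
    }

    /** Returns the sorted ids of targets whose last heartbeat is older than the timeout. */
    List<String> timedOutTargets(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        HeartbeatSketch heartbeats = new HeartbeatSketch(50);
        heartbeats.monitorTarget("JM", 0);
        heartbeats.monitorTarget("RM", 0);
        heartbeats.receiveHeartbeat("RM", 40); // only the RM reports in
        System.out.println(heartbeats.timedOutTargets(60));
    }
}
```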
  5. Continue tracing the start method:
public void start() throws Exception {
    synchronized (lock) {
        // Initialize the various services
        startTaskManagerRunnerServices();
        // Start the TaskManager service
        taskExecutorService.start();
    }
}
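  • The services initialized by startTaskManagerRunnerServices() include:
      • rpcService
      • the thread pool executor
      • the high-availability services highAvailabilityServices
      • the heartbeat services heartbeatServices
      • the metrics registry metricRegistry
      • the blob transfer service blobCacheService
      • taskExecutorService, whose implementation initializes the TaskExecutor internally
  • taskExecutorService.start(); ultimately calls taskExecutor.start();, which in turn invokes the TaskExecutor's onStart() method.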
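The ordering described above (initialize the runner's services under a lock, then start the endpoint, whose start() triggers an onStart() hook) can be sketched as follows. Endpoint, Worker, and Runner are hypothetical classes, and the hook is called synchronously here for simplicity, which is an assumption of this sketch rather than a statement about how Flink dispatches it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical lifecycle sketch; the hook is invoked synchronously for simplicity. */
class LifecycleSketch {

    /** Records the order in which lifecycle steps run. */
    static final List<String> events = new ArrayList<>();

    /** Base class whose start() triggers the subclass hook exactly once. */
    static abstract class Endpoint {
        private boolean started;

        final void start() {
            if (started) {
                return; // repeated calls are ignored
            }
            started = true;
            onStart();
        }

        protected abstract void onStart();
    }

    static class Worker extends Endpoint {
        @Override
        protected void onStart() {
            events.add("worker onStart");
        }
    }

    static class Runner {
        private final Object lock = new Object();
        private Worker worker;

        void start() {
            synchronized (lock) {
                startRunnerServices(); // services first
                worker.start();        // then the endpoint, which calls onStart()
            }
        }

        private void startRunnerServices() {
            events.add("runner services initialized");
            worker = new Worker();
        }
    }

    public static void main(String[] args) {
        new Runner().start();
        events.forEach(System.out::println);
    }
}
```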
  6. Trace the TaskExecutor.onStart() method:
public void onStart() throws Exception {
    try {
        // Start the various services, connect to the RM, register, and so on
        startTaskExecutorServices();
    } catch (Throwable t) { /* error handling omitted in this excerpt */ }
    // Start the registration timeout
    startRegistrationTimeout();
}
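The registration timeout started at the end of onStart() guards against a TM that never manages to register. One common way to implement such a guard is to tag each armed timeout with a fresh id and ignore any callback whose id is no longer current. The sketch below assumes that approach; the class, its fields, and the manual fireTimeouts() trigger are hypothetical and replace a real delayed scheduler so the example stays deterministic.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.UUID;

/** Hypothetical token-based timeout guard; a queue stands in for a delayed scheduler. */
class RegistrationTimeoutSketch {
    private final Queue<Runnable> pendingTimeouts = new ArrayDeque<>();
    private UUID currentTimeoutId; // null means no timeout is armed
    boolean timedOut;

    /** Arms a new timeout and remembers its id as the current one. */
    void startRegistrationTimeout() {
        final UUID id = UUID.randomUUID();
        currentTimeoutId = id;
        pendingTimeouts.add(() -> onTimeout(id));
    }

    /** Called once registration succeeds; pending callbacks become stale. */
    void stopRegistrationTimeout() {
        currentTimeoutId = null;
    }

    private void onTimeout(UUID id) {
        if (id.equals(currentTimeoutId)) {
            timedOut = true; // a real system would report a fatal error here
        }
    }

    /** Simulates the delay elapsing by running every pending callback. */
    void fireTimeouts() {
        Runnable callback;
        while ((callback = pendingTimeouts.poll()) != null) {
            callback.run();
        }
    }

    public static void main(String[] args) {
        RegistrationTimeoutSketch registered = new RegistrationTimeoutSketch();
        registered.startRegistrationTimeout();
        registered.stopRegistrationTimeout(); // registration succeeded in time
        registered.fireTimeouts();
        System.out.println("timed out after success: " + registered.timedOut);

        RegistrationTimeoutSketch stuck = new RegistrationTimeoutSketch();
        stuck.startRegistrationTimeout();
        stuck.fireTimeouts(); // nobody stopped the timeout
        System.out.println("timed out without success: " + stuck.timedOut);
    }
}
```

The id check means a stale callback cannot fail a worker that registered just before the deadline, without needing to cancel the scheduled task.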
  7. Trace startTaskExecutorServices():
private void startTaskExecutorServices() throws Exception {
    try {
        // start by connecting to the ResourceManager
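        /**
         * Register / listen:
         * connect to the ResourceManager, register, maintain heartbeats, and watch for RM changes
         * DefaultLeaderRetrievalService == resourceManagerLeaderRetriever
         * new ResourceManagerLeaderListener(): the listener that is notified of the ResourceManager address
         */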
        resourceManagerLeaderRetriever.start(new ResourceManagerLeaderListener());

        // tell the task slot table who's responsible for the task slot actions
        taskSlotTable.start(new SlotActionsImpl(), getMainThreadExecutor());

        // start the job leader service
        jobLeaderService.start(getAddress(), getRpcService(), haServices, new JobLeaderListenerImpl());

        fileCache = new FileCache(taskManagerConfiguration.getTmpDirectories(), taskExecutorBlobService.getPermanentBlobService());

        tryLoadLocalAllocationSnapshots();
    } catch (Exception e) {
        handleStartTaskExecutorServicesException(e);
    }
}

The taskSlotTable is started here, and tryLoadLocalAllocationSnapshots() then relies on the taskSlotTable to allocate slots to jobs based on the locally stored allocation snapshots.

  8. The TM registers with the RM. The call chain is as follows:
resourceManagerLeaderRetriever.start(new ResourceManagerLeaderListener());
StandaloneLeaderRetrievalService.start(LeaderRetrievalListener listener)
listener.notifyLeaderAddress(leaderAddress, leaderId);
notifyOfNewResourceManagerLeader(leaderAddress, ResourceManagerId.fromUuidOrNull(leaderSessionID));
reconnectToResourceManager()
tryConnectToResourceManager();
connectToResourceManager(); // connect to the RM
resourceManagerConnection.start();
createNewRegistration();
generateRegistration()
return new TaskExecutorToResourceManagerConnection.ResourceManagerRegistration(
        log,
        rpcService,
        getTargetAddress(),
        getTargetLeaderId(),
        retryingRegistrationConfiguration,
        taskExecutorRegistration
);
register()
invokeRegistration(gateway, fencingToken, timeoutMillis);
resourceManager.registerTaskExecutor(taskExecutorRegistration, timeout);
registerTaskExecutorInternal(taskExecutorGateway, taskExecutorRegistration);
return new TaskExecutorRegistrationSuccess(registration.getInstanceID(), resourceId, clusterInformation);
onRegistrationSuccess(result.getSuccess()); // called after registration succeeds
registrationListener.onRegistrationSuccess(this, success);
establishResourceManagerConnection(
        resourceManagerGateway,
        resourceManagerId,
        taskExecutorRegistrationId,
        clusterInformation);
resourceManagerGateway.sendSlotReport(
        getResourceID(),
        taskExecutorRegistrationId,
        taskSlotTable.createSlotReport(getResourceID()),
        taskManagerConfiguration.getRpcTimeout());
slotManager.registerTaskManager(
        workerTypeWorkerRegistration,
        slotReport,
        workerTypeWorkerRegistration.getTotalResourceProfile(),
        workerTypeWorkerRegistration.getDefaultSlotResourceProfile())

The call chain here is fairly long, but the overall idea is simply that the TM registers with the RM.
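The chain above boils down to three steps: the leader listener is notified of the RM address, the TM retries registration until the RM accepts it, and on success it sends a slot report. The sketch below compresses that flow into a few classes. Everything in it is a hypothetical stand-in (including the address "rm-host:6123" and the retry limit), and it runs synchronously, whereas the real chain is asynchronous.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical stand-ins; names echo the call chain but none are Flink classes. */
class RegistrationFlowSketch {

    interface LeaderListener {
        void notifyLeaderAddress(String leaderAddress, UUID leaderId);
    }

    /** Retrieval service with a fixed leader: it notifies the listener as soon as it starts. */
    static class FixedLeaderRetrieval {
        private final String leaderAddress;
        private final UUID leaderId = UUID.randomUUID();

        FixedLeaderRetrieval(String leaderAddress) {
            this.leaderAddress = leaderAddress;
        }

        void start(LeaderListener listener) {
            listener.notifyLeaderAddress(leaderAddress, leaderId);
        }
    }

    /** Resource manager that declines the first few registration attempts. */
    static class ResourceManager {
        private int declinesLeft;
        final List<String> registered = new ArrayList<>();
        final List<String> slotReports = new ArrayList<>();

        ResourceManager(int declines) {
            this.declinesLeft = declines;
        }

        boolean registerTaskExecutor(String resourceId) {
            if (declinesLeft > 0) {
                declinesLeft--;
                return false;
            }
            registered.add(resourceId);
            return true;
        }

        void sendSlotReport(String resourceId, int freeSlots) {
            slotReports.add(resourceId + ":" + freeSlots);
        }
    }

    /** Worker that registers on leader notification and then reports its slots. */
    static class Worker implements LeaderListener {
        private final String resourceId;
        private final int numberOfSlots;
        private final int maxAttempts;
        private final ResourceManager resourceManager;
        int attempts;

        Worker(String resourceId, int numberOfSlots, int maxAttempts, ResourceManager resourceManager) {
            this.resourceId = resourceId;
            this.numberOfSlots = numberOfSlots;
            this.maxAttempts = maxAttempts;
            this.resourceManager = resourceManager;
        }

        @Override
        public void notifyLeaderAddress(String leaderAddress, UUID leaderId) {
            while (attempts < maxAttempts) {
                attempts++;
                if (resourceManager.registerTaskExecutor(resourceId)) {
                    // Registration succeeded: tell the RM which slots are available.
                    resourceManager.sendSlotReport(resourceId, numberOfSlots);
                    return;
                }
            }
            throw new IllegalStateException("Could not register at " + leaderAddress);
        }
    }

    public static void main(String[] args) {
        ResourceManager rm = new ResourceManager(2);
        Worker worker = new Worker("tm-1", 4, 5, rm);
        new FixedLeaderRetrieval("rm-host:6123").start(worker);
        System.out.println(worker.attempts + " attempts, reports: " + rm.slotReports);
    }
}
```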

  9. The slot allocation calls made from tryLoadLocalAllocationSnapshots():
// 1 
allocateSlotForJob(
        slotAllocationSnapshot.getJobId(),
        slotAllocationSnapshot.getSlotID(),
        slotAllocationSnapshot.getAllocationId(),
        slotAllocationSnapshot.getResourceProfile(),
        slotAllocationSnapshot.getJobTargetAddress()
);
// 2 
allocateSlot(slotId, jobId, allocationId, resourceProfile);
// 3 
taskSlotTable.allocateSlot(
                    slotId.getSlotNumber(),
                    jobId,
                    allocationId,
                    resourceProfile,
                    taskManagerConfiguration.getSlotTimeout())
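The last call hands the allocation to the slot table, which records which job and allocation own which slot. As a rough picture of that bookkeeping, here is a minimal sketch of a slot table keyed by slot index. SlotTableSketch and its rules (repeating the same allocation is accepted, a conflicting one is rejected) are assumptions made for illustration, and it leaves out resource profiles and slot timeouts entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical slot bookkeeping; resource profiles and timeouts are left out. */
class SlotTableSketch {

    /** Owner of an allocated slot. */
    static final class Slot {
        final String jobId;
        final String allocationId;

        Slot(String jobId, String allocationId) {
            this.jobId = jobId;
            this.allocationId = allocationId;
        }
    }

    private final int numberOfSlots;
    private final Map<Integer, Slot> allocated = new HashMap<>();

    SlotTableSketch(int numberOfSlots) {
        this.numberOfSlots = numberOfSlots;
    }

    /**
     * Allocates the slot at the given index. Repeating the same allocation
     * returns true; a slot held by a different allocation returns false.
     */
    boolean allocateSlot(int index, String jobId, String allocationId) {
        if (index < 0 || index >= numberOfSlots) {
            throw new IllegalArgumentException("No slot with index " + index);
        }
        Slot existing = allocated.get(index);
        if (existing != null) {
            return existing.jobId.equals(jobId) && existing.allocationId.equals(allocationId);
        }
        allocated.put(index, new Slot(jobId, allocationId));
        return true;
    }

    int freeSlots() {
        return numberOfSlots - allocated.size();
    }

    public static void main(String[] args) {
        SlotTableSketch table = new SlotTableSketch(2);
        System.out.println(table.allocateSlot(0, "job-a", "alloc-1"));
        System.out.println(table.allocateSlot(0, "job-b", "alloc-2"));
        System.out.println("free slots: " + table.freeSlots());
    }
}
```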