YarnClientSchedulerBackend — SchedulerBackend for YARN in Client Deploy Mode
YarnClientSchedulerBackend is the SchedulerBackend for Spark on YARN in client deploy mode.
Note: client deploy mode is the default deploy mode of Spark on YARN.
YarnClientSchedulerBackend is a YarnSchedulerBackend that comes with just two custom implementations of methods from the SchedulerBackend Contract: start and stop.
YarnClientSchedulerBackend uses the client internal attribute to submit a Spark application when it starts up and then waits for the Spark application until it has exited, either successfully or due to some failure.
To initialize a YarnClientSchedulerBackend, Spark passes in a TaskSchedulerImpl and a SparkContext (only SparkContext is used directly in this class, while TaskSchedulerImpl is passed on to the supertype, YarnSchedulerBackend).
YarnClientSchedulerBackend belongs to the org.apache.spark.scheduler.cluster package.
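As a rough sketch (based on the description above rather than the verbatim Spark source, so details such as the private[spark] modifier may differ between Spark versions), the class declaration looks along these lines:

package org.apache.spark.scheduler.cluster

import org.apache.spark.SparkContext
import org.apache.spark.scheduler.TaskSchedulerImpl

// Simplified sketch: only SparkContext is used directly in this class,
// while TaskSchedulerImpl is passed on to the supertype.
private[spark] class YarnClientSchedulerBackend(
    scheduler: TaskSchedulerImpl,
    sc: SparkContext)
  extends YarnSchedulerBackend(scheduler, sc) {
  // start and stop are the two custom implementations (described below)
}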
Tip: Enable DEBUG logging level for org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend logger to see what happens inside. Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend=DEBUG

Refer to Logging.
client Internal Attribute
The client private attribute is an instance of Client that YarnClientSchedulerBackend creates when it starts and uses to submit the Spark application.
client is also used to monitor the Spark application when YarnClientSchedulerBackend waits for the application.
Starting YarnClientSchedulerBackend (start method)
start is part of the SchedulerBackend Contract. It is executed when TaskSchedulerImpl starts.
start(): Unit
It creates the internal client object and submits the Spark application to YARN ResourceManager. After the application is deployed to YARN and running, it starts the internal monitorThread state monitor thread. In the meantime it also calls the supertype's start.
start sets spark.driver.appUIAddress to the appUIAddress of SparkUI (but only if Spark's web UI is enabled).
With DEBUG log level enabled you should see the following DEBUG message in the logs:
DEBUG YarnClientSchedulerBackend: ClientArguments called with: --arg [hostport]
Note: hostport is spark.driver.host and spark.driver.port separated by :, e.g. 192.168.99.1:64905.
It then creates an instance of ClientArguments (using --arg [hostport] arguments).
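A minimal sketch of this step (assuming the internal ClientArguments class; its exact constructor differs between Spark versions) could look as follows:

// Sketch only: compose hostport from the driver's host and port
val driverHost = conf.get("spark.driver.host")
val driverPort = conf.get("spark.driver.port")
val hostport = driverHost + ":" + driverPort
logDebug(s"ClientArguments called with: --arg $hostport")
val args = new ClientArguments(Array("--arg", hostport))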
It sets the parent’s totalExpectedExecutors to the initial number of executors.
Caution: FIXME Why is this part of subtypes since they both set it to the same value?
It creates a Client object using the instance of ClientArguments and SparkConf.
The parent's YarnSchedulerBackend.bindToYarn method is called with the current application id (being the result of calling Client.submitApplication) and None for the optional attemptId.
The parent’s YarnSchedulerBackend.start is called.
waitForApplication is then executed; it blocks until the application is running or a SparkException is thrown.
If spark.yarn.credentials.file is defined, YarnSparkHadoopUtil.get.startExecutorDelegationTokenRenewer(conf) is called.
Caution: FIXME Why? What does startExecutorDelegationTokenRenewer do?
A MonitorThread object is created (using asyncMonitorApplication) and started to asynchronously monitor the currently running application.
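Putting the steps above together, a simplified sketch of start (following the prose in this section, not the verbatim Spark source; initialNumberOfExecutors is a hypothetical name for the computed value) looks roughly as follows:

override def start(): Unit = {
  // Advertise the web UI address, but only if the web UI is enabled
  sc.ui.foreach { ui => conf.set("spark.driver.appUIAddress", ui.appUIAddress) }

  // Build ClientArguments with --arg [hostport] as in the sketch above
  val args = new ClientArguments(Array("--arg", hostport))
  totalExpectedExecutors = initialNumberOfExecutors  // hypothetical name

  client = new Client(args, conf)
  bindToYarn(client.submitApplication(), None)

  super.start()
  waitForApplication()

  if (conf.contains("spark.yarn.credentials.file")) {
    YarnSparkHadoopUtil.get.startExecutorDelegationTokenRenewer(conf)
  }

  monitorThread = asyncMonitorApplication()
  monitorThread.start()
}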
stop
stop is part of the SchedulerBackend Contract.
It stops the internal helper objects, i.e. monitorThread and client, and "announces" the stop to other services through Client.reportLauncherState. In the meantime it also calls the supertype's stop.
stop makes sure that the internal client has already been created (i.e. it is not null), but not necessarily started.
stop stops the internal monitorThread using the MonitorThread.stopMonitor method.
It then "announces" the stop using Client.reportLauncherState(SparkAppHandle.State.FINISHED).
Later, it passes the call on to the supertype's stop and, once the supertype's stop has finished, it calls YarnSparkHadoopUtil.stopExecutorDelegationTokenRenewer followed by stopping the internal client.
Eventually, when all went fine, you should see the following INFO message in the logs:
INFO YarnClientSchedulerBackend: Stopped
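A simplified sketch of stop, again following the description above rather than the verbatim Spark source (SparkAppHandle comes from org.apache.spark.launcher):

override def stop(): Unit = {
  // client must have been created by start (though not necessarily started)
  assert(client != null, "Attempted to stop this scheduler before starting it!")

  // Stop the state monitor thread first
  if (monitorThread != null) {
    monitorThread.stopMonitor()
  }

  // "Announce" the stop to other services through the launcher backend
  client.reportLauncherState(SparkAppHandle.State.FINISHED)

  super.stop()
  YarnSparkHadoopUtil.get.stopExecutorDelegationTokenRenewer()
  client.stop()
  logInfo("Stopped")
}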
Waiting For Spark Application (waitForApplication method)
waitForApplication(): Unit
waitForApplication is an internal (private) method that waits until the current application is running (using Client.monitorApplication).
If the application has FINISHED, FAILED, or has been KILLED, a SparkException is thrown with the following message:
Yarn application has already ended! It might have been killed or unable to launch application master.
You should see the following INFO message in the logs for the RUNNING state:
INFO YarnClientSchedulerBackend: Application [appId] has started running.
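A sketch of waitForApplication along the lines described above (appId is the application id registered via bindToYarn; YarnApplicationState comes from the Hadoop YARN API; the exact monitorApplication signature may differ between Spark versions):

private def waitForApplication(): Unit = {
  // Block until YARN reports the application as RUNNING (or a terminal state)
  val (state, _) = client.monitorApplication(appId.get, returnOnRunning = true)
  if (state == YarnApplicationState.FINISHED ||
      state == YarnApplicationState.FAILED ||
      state == YarnApplicationState.KILLED) {
    throw new SparkException("Yarn application has already ended! " +
      "It might have been killed or unable to launch application master.")
  }
  if (state == YarnApplicationState.RUNNING) {
    logInfo(s"Application ${appId.get} has started running.")
  }
}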
asyncMonitorApplication
asyncMonitorApplication(): MonitorThread
The asyncMonitorApplication internal method creates a separate daemon MonitorThread named "Yarn application state monitor".
Note: asyncMonitorApplication does not start the daemon thread.
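A minimal sketch of asyncMonitorApplication (note that the thread is only created here, not started):

private def asyncMonitorApplication(): MonitorThread = {
  // Create, but do not start, the daemon monitoring thread
  val t = new MonitorThread
  t.setName("Yarn application state monitor")
  t.setDaemon(true)
  t
}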
MonitorThread
The MonitorThread internal class monitors a Spark application deployed to YARN in client deploy mode.
When started, it calls the blocking Client.monitorApplication (with no application reports printed out to the console, i.e. logApplicationReport is disabled).
Note: Client.monitorApplication is a blocking operation and hence it is wrapped in MonitorThread to be executed in a separate thread.
When the call to Client.monitorApplication has finished, it is assumed that the application has exited. You should see the following ERROR message in the logs:
ERROR Yarn application has already exited with state [state]!
That leads to stopping the current SparkContext (using SparkContext.stop).
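A sketch of the MonitorThread class as described above (simplified; the actual Spark source has additional guards, e.g. around interrupting the thread):

private class MonitorThread extends Thread {
  // Runs the blocking Client.monitorApplication call in a separate thread
  override def run(): Unit = {
    try {
      val (state, _) = client.monitorApplication(appId.get, logApplicationReport = false)
      logError(s"Yarn application has already exited with state $state!")
      sc.stop()  // stop the current SparkContext
    } catch {
      case _: InterruptedException => logInfo("Interrupting monitor thread")
    }
  }

  def stopMonitor(): Unit = {
    this.interrupt()
  }
}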