ExecutorRunnable
ExecutorRunnable starts a YARN container with CoarseGrainedExecutorBackend. If the external shuffle service is used, it is set in the ContainerLaunchContext as a service under the name spark_shuffle.
Note: Despite the name, ExecutorRunnable is no longer a java.lang.Runnable after SPARK-12447.
Tip: Enable INFO logging level for the org.apache.spark.deploy.yarn.ExecutorRunnable logger to see what happens inside. Add the following line to your log4j configuration:
log4j.logger.org.apache.spark.deploy.yarn.ExecutorRunnable=INFO
Refer to Logging.
prepareEnvironment Method
Caution: FIXME
Creating ExecutorRunnable Instance
ExecutorRunnable(
  container: Container,
  conf: Configuration,
  sparkConf: SparkConf,
  masterAddress: String,
  slaveId: String,
  hostname: String,
  executorMemory: Int,
  executorCores: Int,
  appId: String,
  securityMgr: SecurityManager,
  localResources: Map[String, LocalResource])
YarnAllocator creates an instance of ExecutorRunnable when launching Spark executors in allocated YARN containers.
A single ExecutorRunnable is created with the YARN container to run a Spark executor in.
The input conf (Hadoop’s Configuration), sparkConf, and masterAddress correspond directly to the constructor arguments of YarnAllocator.
The input slaveId comes from the internal counter in YarnAllocator.
The input hostname is the host of the YARN container.
The input executorMemory and executorCores are passed in by YarnAllocator and originate from the spark.executor.memory and spark.executor.cores configuration settings.
The input appId, securityMgr, and localResources are the same ones that YarnAllocator was created with.
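The mapping from allocator state to ExecutorRunnable arguments described above can be illustrated with a small, self-contained model. This is only a sketch: ExecutorLaunchSpec and AllocatorSketch are hypothetical names that do not exist in Spark, and starting the counter so that the first id is "1" is an assumption made for the example.

```scala
object ExecutorIdSketch {
  // Hypothetical stand-in for the subset of ExecutorRunnable arguments discussed above.
  final case class ExecutorLaunchSpec(
      masterAddress: String,
      slaveId: String,
      hostname: String,
      executorMemory: Int,
      executorCores: Int,
      appId: String)

  // Hypothetical, heavily simplified model of the allocator side (not Spark's YarnAllocator).
  final class AllocatorSketch(
      masterAddress: String,
      executorMemory: Int,
      executorCores: Int,
      appId: String) {

    // Internal counter that slaveId values are derived from.
    private var executorIdCounter = 0

    // Builds the arguments for one allocated container running on containerHost.
    def specFor(containerHost: String): ExecutorLaunchSpec = {
      executorIdCounter += 1
      ExecutorLaunchSpec(
        masterAddress,
        executorIdCounter.toString,
        containerHost,
        executorMemory,
        executorCores,
        appId)
    }
  }
}
```

Each call to specFor yields a fresh slaveId while the remaining values are shared across all executors of the application.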
Running ExecutorRunnable — run Method
When run is called, you should see the following INFO message in the logs:
INFO ExecutorRunnable: Starting Executor Container
It ultimately starts CoarseGrainedExecutorBackend in the container.
Starting CoarseGrainedExecutorBackend in Container — startContainer Method
startContainer(): java.util.Map[String, ByteBuffer]
startContainer uses the NMClient API to start a CoarseGrainedExecutorBackend in a YARN container.
When startContainer is executed, you should see the following INFO message in the logs:
INFO ExecutorRunnable: Setting up ContainerLaunchContext
It then creates a YARN ContainerLaunchContext (which represents all of the information for the YARN NodeManager to launch a container) with the local resources and environment being the localResources and env, respectively, passed in to the ExecutorRunnable when it was created. It also sets security tokens.
It prepares the command to launch CoarseGrainedExecutorBackend with all the details as provided when the ExecutorRunnable was created.
You should see the following INFO message in the logs:
INFO ExecutorRunnable:
===============================================================================
YARN executor launch context:
env:
[key] -> [value]
...
command:
[commands]
===============================================================================
The command is set to the just-created ContainerLaunchContext.
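The shape of the launch-context log message can be approximated with plain string building. The sketch below is not Spark's logging code: LaunchContextLogSketch, the indentation, and the bar width of 79 characters are assumptions chosen for illustration.

```scala
object LaunchContextLogSketch {
  // Renders the environment and the command in a layout similar to the message above.
  def format(env: Seq[(String, String)], commands: Seq[String]): String = {
    val bar = "=" * 79 // assumed width
    val envLines = env.map { case (key, value) => s"    $key -> $value" }
    val lines =
      Seq(bar, "YARN executor launch context:", "  env:") ++
        envLines ++
        Seq("  command:", s"    ${commands.mkString(" ")}", bar)
    lines.mkString("\n")
  }
}
```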
It sets application ACLs using YarnSparkHadoopUtil.getApplicationAclsForYarn.
If the external shuffle service is used, it registers with the YARN shuffle service already started on the NodeManager. The external shuffle service is set in the ContainerLaunchContext as service data under the name spark_shuffle.
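YARN's ContainerLaunchContext accepts service data as a map from service name to ByteBuffer (via setServiceData). The sketch below only builds such a map and does not call YARN. ShuffleServiceDataSketch is a hypothetical name, and the payload (an optional secret encoded as UTF-8, or an empty buffer) is an assumption, since the text above does not say what the service data contains.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import java.util.Collections

object ShuffleServiceDataSketch {
  // Service name mentioned in the text.
  val ShuffleServiceName = "spark_shuffle"

  // Returns the map that could be handed to ContainerLaunchContext.setServiceData.
  // The payload is an assumption: an optional secret, or an empty buffer.
  def serviceData(
      shuffleServiceEnabled: Boolean,
      secret: Option[String]): java.util.Map[String, ByteBuffer] =
    if (!shuffleServiceEnabled) {
      Collections.emptyMap[String, ByteBuffer]()
    } else {
      val payload = secret
        .map(s => ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8)))
        .getOrElse(ByteBuffer.allocate(0))
      Collections.singletonMap(ShuffleServiceName, payload)
    }
}
```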
Ultimately, startContainer asks the YARN NodeManager to start the YARN container for the Spark executor (the container passed in when the ExecutorRunnable was created) using the ContainerLaunchContext.
If any exception occurs, a SparkException is thrown with the following message:
Exception while starting container [containerId] on host [hostname]
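The error handling described above amounts to wrapping any failure in an exception that names the container and the host. A minimal sketch, assuming a hypothetical LaunchException in place of SparkException and a by-name launch block in place of the real NMClient call:

```scala
object StartContainerSketch {
  // Hypothetical stand-in for org.apache.spark.SparkException.
  final class LaunchException(message: String, cause: Throwable)
      extends Exception(message, cause)

  // Runs the launch block and rethrows any exception with the message format shown above.
  def startContainer[A](containerId: String, hostname: String)(launch: => A): A =
    try launch
    catch {
      case e: Exception =>
        throw new LaunchException(
          s"Exception while starting container $containerId on host $hostname",
          e)
    }
}
```

Keeping the original exception as the cause preserves the underlying failure for diagnosis.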
Note: startContainer is exclusively called as a part of running ExecutorRunnable.
Preparing Command to Launch CoarseGrainedExecutorBackend — prepareCommand Method
prepareCommand(
  masterAddress: String,
  slaveId: String,
  hostname: String,
  executorMemory: Int,
  executorCores: Int,
  appId: String): List[String]
prepareCommand is a private method that prepares the command used to start the org.apache.spark.executor.CoarseGrainedExecutorBackend application in a YARN container. All the input parameters of prepareCommand end up on that command line: executorMemory as a JVM option and the others as command-line arguments of the CoarseGrainedExecutorBackend application.
The input executorMemory is in megabytes and becomes the -Xmx JVM option (with the m suffix).
It uses the optional spark.executor.extraJavaOptions for the JVM options.
If the optional SPARK_JAVA_OPTS environment variable is defined, it is added to the JVM options.
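The three sources of JVM options listed so far can be combined as in the sketch below. It is a simplified model, not Spark's code: ExecutorJvmOptionsSketch is a hypothetical name, the ordering simply follows the order of the text, and options are split on whitespace only, so quoted values containing spaces are not handled.

```scala
object ExecutorJvmOptionsSketch {
  // Naive whitespace split (assumption: no quoted options that contain spaces).
  private def splitOptions(options: String): List[String] =
    options.trim.split("\\s+").filter(_.nonEmpty).toList

  // executorMemory is in megabytes; the two optional inputs model
  // spark.executor.extraJavaOptions and the SPARK_JAVA_OPTS environment variable.
  def jvmOptions(
      executorMemory: Int,
      extraJavaOptions: Option[String],
      sparkJavaOpts: Option[String]): List[String] = {
    val heap = List(s"-Xmx${executorMemory}m")
    val extra = extraJavaOptions.toList.flatMap(splitOptions)
    val legacy = sparkJavaOpts.toList.flatMap(splitOptions)
    heap ++ extra ++ legacy
  }
}
```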
It uses the optional spark.executor.extraLibraryPath to set prefixEnv, using Client.getClusterPath.
Caution: FIXME Client.getClusterPath?
It sets -Dspark.yarn.app.container.log.dir=<LOG_DIR>
It sets the user classpath (using Client.getUserClasspath).
Caution: FIXME Client.getUserClasspath?
Finally, it creates the entire command to start org.apache.spark.executor.CoarseGrainedExecutorBackend with the following arguments:
- --driver-url being the input masterAddress
- --executor-id being the input slaveId
- --hostname being the input hostname
- --cores being the input executorCores
- --app-id being the input appId
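The argument list above can be modeled as a plain list of strings. The sketch covers only the main class and its arguments. The java executable, JVM options, classpath, and log redirection that a full command would need are left out, and ExecutorCommandSketch is a hypothetical name.

```scala
object ExecutorCommandSketch {
  val BackendClass = "org.apache.spark.executor.CoarseGrainedExecutorBackend"

  // Maps the prepareCommand inputs onto the command-line arguments listed above.
  def backendArguments(
      masterAddress: String,
      slaveId: String,
      hostname: String,
      executorCores: Int,
      appId: String): List[String] =
    List(
      BackendClass,
      "--driver-url", masterAddress,
      "--executor-id", slaveId,
      "--hostname", hostname,
      "--cores", executorCores.toString,
      "--app-id", appId)
}
```

For example, backendArguments("spark://driver", "1", "host-a", 2, "app-1") returns the class name followed by the five flag and value pairs, in the order listed above.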
Internal Registries and Counters
| Name | Description |
|---|---|
An instance of YARN’s YarnConfiguration. Created when |