attemptId: Option[ApplicationAttemptId] = None
YarnSchedulerBackend — Coarse-Grained Scheduler Backend for YARN
YarnSchedulerBackend
is an abstract CoarseGrainedSchedulerBackend for YARN that contains common logic for the client
and cluster
YARN scheduler backends, i.e. YarnClientSchedulerBackend and YarnClusterSchedulerBackend respectively.
YarnSchedulerBackend
is available in the RPC Environment as YarnScheduler
RPC Endpoint (or yarnSchedulerEndpointRef internally).
YarnSchedulerBackend
expects TaskSchedulerImpl and SparkContext to initialize itself.
It works for a single Spark application (as appId
of type ApplicationId
)
Caution
|
FIXME It may be a note for scheduler backends in general. |
attemptId Internal Attribute
attemptId
is the application attempt ID for this run of a Spark application. It is only available for cluster deploy mode.
It is explicitly set to None
when YarnClientSchedulerBackend starts (and bindToYarn is called).
It is set to the current attempt id (using YARN API’s ApplicationMaster.getAttemptId
) when YarnClusterSchedulerBackend starts (and bindToYarn is called).
Note
|
attemptId is exposed using applicationAttemptId which is a part of SchedulerBackend Contract.
|
applicationAttemptId
Note
|
applicationAttemptId is a part of SchedulerBackend Contract.
|
applicationAttemptId(): Option[String]
applicationAttemptId
returns the application attempt id of a Spark application.
Resetting YarnSchedulerBackend
Note
|
reset is a part of CoarseGrainedSchedulerBackend Contract.
|
reset
resets the parent CoarseGrainedSchedulerBackend
scheduler backend and ExecutorAllocationManager (accessible by SparkContext.executorAllocationManager
).
doRequestTotalExecutors
def doRequestTotalExecutors(requestedTotal: Int): Boolean
Note
|
doRequestTotalExecutors is a part of the CoarseGrainedSchedulerBackend Contract.
|
doRequestTotalExecutors
simply sends a blocking RequestExecutors message to YarnScheduler RPC Endpoint with the input requestedTotal
and the internal localityAwareTasks
and hostToLocalTaskCount
attributes.
Caution
|
FIXME The internal attributes are already set. When and how? |
Reference to YarnScheduler RPC Endpoint (yarnSchedulerEndpointRef attribute)
yarnSchedulerEndpointRef
is the reference to YarnScheduler RPC Endpoint.
totalExpectedExecutors
totalExpectedExecutors
is a value that is 0
initially when a YarnSchedulerBackend
instance is created but later changes when Spark on YARN starts (in client mode or cluster mode).
Note
|
After Spark on YARN is started, totalExpectedExecutors is initialized to a proper value.
|
It is used in sufficientResourcesRegistered.
Caution
|
FIXME Where is this used? |
Creating YarnSchedulerBackend Instance
When created, YarnSchedulerBackend
sets the internal minRegisteredRatio which is 0.8
when spark.scheduler.minRegisteredResourcesRatio is not set or the parent’s minRegisteredRatio.
totalExpectedExecutors is set to 0
.
It creates a YarnSchedulerEndpoint (as yarnSchedulerEndpoint
) and registers it as YarnScheduler with the RPC Environment.
It sets the internal askTimeout
Spark timeout for RPC ask operations using the SparkContext
constructor parameter.
It sets optional appId
(of type ApplicationId
), attemptId
(for cluster mode only and of type ApplicationAttemptId
).
It also creates SchedulerExtensionServices
object (as services
).
Caution
|
FIXME What is SchedulerExtensionServices ?
|
The internal shouldResetOnAmRegister flag is turned off.
sufficientResourcesRegistered
sufficientResourcesRegistered
checks whether totalRegisteredExecutors
is greater than or equals to totalExpectedExecutors
multiplied by minRegisteredRatio.
Note
|
It overrides the parent’s CoarseGrainedSchedulerBackend.sufficientResourcesRegistered. |
Caution
|
FIXME Where’s this used? |
minRegisteredRatio
minRegisteredRatio
is set when YarnSchedulerBackend
is created.
It is used in sufficientResourcesRegistered.
Starting the Backend (start method)
start
creates a SchedulerExtensionServiceBinding
object (using SparkContext
, appId
, and attemptId
) and starts it (using SchedulerExtensionServices.start(binding)
).
Note
|
A SchedulerExtensionServices object is created when YarnSchedulerBackend is initialized and available as services .
|
Ultimately, it calls the parent’s CoarseGrainedSchedulerBackend.start.
Note
|
|
Stopping the Backend (stop method)
stop
calls the parent’s CoarseGrainedSchedulerBackend.requestTotalExecutors (using (0, 0, Map.empty)
parameters).
Caution
|
FIXME Explain what 0, 0, Map.empty means after the method’s described for the parent.
|
It calls the parent’s CoarseGrainedSchedulerBackend.stop.
Ultimately, it stops the internal SchedulerExtensionServiceBinding
object (using services.stop()
).
Caution
|
FIXME Link the description of services.stop() here.
|
Recording Application and Attempt Ids (bindToYarn method)
bindToYarn(appId: ApplicationId, attemptId: Option[ApplicationAttemptId]): Unit
bindToYarn
sets the internal appId
and attemptId
to the value of the input parameters, appId
and attemptId
, respectively.
Note
|
start requires appId .
|
Internal Registries
shouldResetOnAmRegister flag
When YarnSchedulerBackend is created, shouldResetOnAmRegister
is disabled (i.e. false
).
shouldResetOnAmRegister
controls whether to reset YarnSchedulerBackend
when another RegisterClusterManager
RPC message arrives.
It allows resetting internal state after the initial ApplicationManager failed and a new one was registered.
Note
|
It can only happen in client deploy mode. |