Scheduler Backends

Spark comes with a pluggable backend mechanism called scheduler backend (aka backend scheduler) to support various cluster managers, e.g. Apache Mesos, Hadoop YARN or Spark’s own Spark Standalone and Spark local.

These cluster managers differ in their task scheduling modes and resource offer mechanisms, and Spark abstracts the differences away behind the SchedulerBackend Contract.

A scheduler backend is created and started as part of SparkContext’s initialization (when TaskSchedulerImpl is started - see Creating Scheduler Backend and Task Scheduler).

Caution
FIXME Image how it gets created with SparkContext in play here or in SparkContext doc.

Scheduler backends are started and stopped as part of TaskSchedulerImpl’s initialization and stopping.
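A simplified sketch of that wiring, modeled after SparkContext's local-mode code path (sc stands for the SparkContext being initialized; class names and constructor parameters are assumptions that differ between Spark versions and master URLs):

val scheduler = new TaskSchedulerImpl(sc)                  // created during SparkContext initialization
val backend = new LocalBackend(sc.getConf, scheduler, 2)   // 2 = total cores available in local mode
scheduler.initialize(backend)  // TaskSchedulerImpl keeps the backend reference
scheduler.start()              // starting the scheduler also starts the backend
// ...
scheduler.stop()               // stopping the scheduler stops the backend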

Being a scheduler backend in Spark assumes an Apache Mesos-like model in which "an application" gets resource offers as machines become available and can launch tasks on them. Once a scheduler backend obtains the resource allocation, it can start executors.

Tip
Understanding how Apache Mesos works can greatly improve your understanding of Spark.

SchedulerBackend Contract

Note
org.apache.spark.scheduler.SchedulerBackend is a private[spark] Scala trait in Spark.

Every SchedulerBackend has to follow the contract below (see the simplified trait sketch after the list):

  • Can be started (using start()) and stopped (using stop())

  • reviveOffers

  • Calculate default level of parallelism

  • killTask

  • Answers isReady() to report whether it is currently started or stopped. It returns true by default.

  • Knows the application id for a job (using applicationId()).
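Put together, a simplified sketch of the SchedulerBackend trait, with signatures and default behaviors as described on this page (the actual source may differ slightly between Spark versions):

private[spark] trait SchedulerBackend {
  def start(): Unit
  def stop(): Unit

  // asks the backend to (re)offer resources so that pending tasks can be launched
  def reviveOffers(): Unit

  // hint used by TaskScheduler for sizing jobs
  def defaultParallelism(): Int

  // unsupported unless a concrete backend overrides it
  def killTask(taskId: Long, executorId: String, interruptThread: Boolean): Unit =
    throw new UnsupportedOperationException

  def isReady(): Boolean = true

  def applicationId(): String
  def applicationAttemptId(): Option[String] = None
  def getDriverLogUrls: Option[Map[String, String]] = None
}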

Caution
FIXME applicationId() doesn’t accept an input parameter. How is Scheduler Backend related to a job and an application?
Caution
FIXME Screenshot the tab and the links

reviveOffers Method

Note
It is used in TaskSchedulerImpl (through its internal backend reference) when submitting tasks.

There are currently three custom implementations of reviveOffers available in Spark for the different clustering options.
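As a sketch of the usage mentioned in the note above, TaskSchedulerImpl ends task submission by asking the backend for resource offers (simplified; the real submitTasks does more bookkeeping around TaskSetManagers):

// inside TaskSchedulerImpl (simplified)
override def submitTasks(taskSet: TaskSet): Unit = {
  // ...create and register a TaskSetManager for the task set...
  backend.reviveOffers()  // ask the backend to offer resources so the tasks can be launched
}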

Default Level of Parallelism (defaultParallelism method)

defaultParallelism(): Int

Default level of parallelism is used by TaskScheduler as a hint for sizing jobs.

Note
It is used in TaskSchedulerImpl.defaultParallelism.

Refer to LocalBackend for local mode.

Refer to Default Level of Parallelism for CoarseGrainedSchedulerBackend.

Refer to Default Level of Parallelism for CoarseMesosSchedulerBackend.

No other custom implementations of defaultParallelism() exist.
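For illustration, a simplified sketch of two of those implementations (the exact expressions may differ per Spark version):

// LocalBackend (local mode): defaults to the total cores given to the single local executor
override def defaultParallelism(): Int =
  scheduler.conf.getInt("spark.default.parallelism", totalCores)

// CoarseGrainedSchedulerBackend: defaults to the total cores across registered executors, but at least 2
override def defaultParallelism(): Int =
  conf.getInt("spark.default.parallelism", math.max(totalCoreCount.get(), 2))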

Killing Task — killTask Method

killTask(taskId: Long, executorId: String, interruptThread: Boolean)

killTask throws an UnsupportedOperationException by default.
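A backend that supports killing tasks overrides it. For example, a coarse-grained backend may forward the request to its driver endpoint, which routes it to the executor running the task (a simplified sketch modeled after CoarseGrainedSchedulerBackend):

override def killTask(taskId: Long, executorId: String, interruptThread: Boolean): Unit = {
  // send a KillTask message to the driver endpoint that tracks executors
  driverEndpoint.send(KillTask(taskId, executorId, interruptThread))
}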

applicationAttemptId

applicationAttemptId(): Option[String] = None

applicationAttemptId returns the application attempt id of a Spark application.

It is currently only supported by the YARN cluster scheduler backend, as the YARN cluster manager supports multiple application attempts.
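A hypothetical sketch of such an override (attemptId and the way it is obtained from YARN are assumptions for illustration only):

// attemptId is assumed to hold the ApplicationAttemptId reported by YARN, if any
override def applicationAttemptId(): Option[String] =
  attemptId.map(_.toString)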

getDriverLogUrls

getDriverLogUrls: Option[Map[String, String]]

getDriverLogUrls returns no driver log URLs by default.

It is currently only supported by YarnClusterSchedulerBackend.

Available Implementations

Spark comes with the following scheduler backends:
