ExecutorsListener Spark Listener

ExecutorsListener is a SparkListener that tracks executors and their tasks in a Spark application. The information it collects is used by the Stage Details page, the Jobs tab, and the /allexecutors REST endpoint.

Table 1. ExecutorsListener Event Handlers

| Event Handler | Description |
|---|---|
| onApplicationStart | May create an entry for the driver in the executorToTaskSummary registry. |
| onExecutorAdded | May create an entry in the executorToTaskSummary registry. Also makes sure that the number of entries for dead executors does not exceed spark.ui.retainedDeadExecutors and removes the excess. Adds an entry to the executorEvents registry and removes the oldest entry if the number of entries exceeds spark.ui.timeline.executors.maximum. |
| onExecutorRemoved | Marks an executor dead in the executorToTaskSummary registry. Adds an entry to the executorEvents registry and removes the oldest entry if the number of entries exceeds spark.ui.timeline.executors.maximum. |
| onTaskStart | May create an entry for an executor in the executorToTaskSummary registry. |
| onTaskEnd | May create an entry for an executor in the executorToTaskSummary registry. |

ExecutorsListener requires a StorageStatusListener and a SparkConf when it is created.

Registries

Table 2. ExecutorsListener Registries

| Registry | Description |
|---|---|
| executorToTaskSummary | The lookup table of ExecutorTaskSummary per executor id. Used to build an ExecutorSummary for the /allexecutors REST endpoint and to display stdout and stderr logs in the Tasks and Aggregated Metrics by Executor sections of the Stage Details page. |
| executorEvents | A collection of SparkListenerEvents. Used to build the event timeline in the All Jobs and Details for Job pages. |
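As a rough mental model, the two registries can be pictured as an insertion-ordered map and a growable buffer. The sketch below is a simplification written for this page, not Spark's source: `TaskSummary`, `Event`, `Added`, `Removed` and `summaryFor` are hypothetical stand-ins for ExecutorTaskSummary, SparkListenerEvent and the lookup logic.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for ExecutorTaskSummary and SparkListenerEvent.
final case class TaskSummary(executorId: String, var isAlive: Boolean = true)
sealed trait Event
final case class Added(executorId: String) extends Event
final case class Removed(executorId: String) extends Event

// Insertion order is preserved, so the oldest entry is always the first one.
val executorToTaskSummary = mutable.LinkedHashMap.empty[String, TaskSummary]
val executorEvents = mutable.ListBuffer.empty[Event]

// Look up the summary for an executor id, creating it if it does not exist.
def summaryFor(id: String): TaskSummary =
  executorToTaskSummary.getOrElseUpdate(id, TaskSummary(id))

summaryFor("1")
summaryFor("2")
summaryFor("1") // the second lookup reuses the existing entry
executorEvents += Added("1")
executorEvents += Added("2")
```

Keeping insertion order is what makes "remove the oldest entry" cheap to express in the handlers described in the following sections.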

onApplicationStart Method

onApplicationStart(applicationStart: SparkListenerApplicationStart): Unit

onApplicationStart takes the driverLogs property from the input applicationStart (if defined) and finds the driver’s active StorageStatus (using the current StorageStatusListener). If the driver’s StorageStatus is found, onApplicationStart sets executorLogs of the driver’s entry in the executorToTaskSummary registry.

Table 3. ExecutorTaskSummary and SparkListenerApplicationStart Attributes

| ExecutorTaskSummary Attribute | SparkListenerApplicationStart Attribute |
|---|---|
| executorLogs | driverLogs (if defined) |
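The two "if defined" conditions can be sketched with a pair of nested optional lookups. This is a minimal illustration under stated assumptions: the case classes are hypothetical simplifications of Spark's types, the driver id `"driver"` and the log URL are assumed values, and the list of active storage statuses is passed in explicitly instead of coming from a StorageStatusListener.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's types.
final case class TaskSummary(executorId: String, var executorLogs: Map[String, String] = Map.empty)
final case class StorageStatus(executorId: String)
final case class ApplicationStart(driverLogs: Option[Map[String, String]])

val executorToTaskSummary = mutable.LinkedHashMap.empty[String, TaskSummary]

// Nothing happens unless driver logs are defined AND an active
// StorageStatus for the driver is found.
def onApplicationStart(
    event: ApplicationStart,
    activeStorageStatuses: Seq[StorageStatus],
    driverId: String = "driver"): Unit =
  for {
    logs <- event.driverLogs
    status <- activeStorageStatuses.find(_.executorId == driverId)
  } {
    val summary =
      executorToTaskSummary.getOrElseUpdate(status.executorId, TaskSummary(status.executorId))
    summary.executorLogs = logs
  }

val statuses = Seq(StorageStatus("driver"), StorageStatus("1"))

// No driver logs: no entry is created.
onApplicationStart(ApplicationStart(None), statuses)
val sizeWithoutLogs = executorToTaskSummary.size

// Driver logs defined: the driver's entry is created and its logs are set.
onApplicationStart(ApplicationStart(Some(Map("stdout" -> "http://host/stdout"))), statuses)
```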

onExecutorAdded Method

onExecutorAdded(executorAdded: SparkListenerExecutorAdded): Unit

onExecutorAdded looks up the executor (using the input executorAdded) in the internal executorToTaskSummary registry, creating a new entry if none is found, and sets the attributes of the entry.

Table 4. ExecutorTaskSummary and ExecutorInfo Attributes

| ExecutorTaskSummary Attribute | ExecutorInfo Attribute |
|---|---|
| executorLogs | logUrlMap |
| totalCores | totalCores |
| tasksMax | totalCores / spark.task.cpus |

onExecutorAdded adds the input executorAdded to the executorEvents collection. If the number of elements in the executorEvents collection is greater than spark.ui.timeline.executors.maximum, the first (oldest) event is removed.

onExecutorAdded removes the oldest dead executor from the executorToTaskSummary lookup table if the number of dead executors is greater than spark.ui.retainedDeadExecutors.
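The three steps of onExecutorAdded (set attributes, bound the timeline, bound the dead executors) can be sketched as follows. The types are hypothetical simplifications, and the values for spark.task.cpus, spark.ui.timeline.executors.maximum and spark.ui.retainedDeadExecutors are deliberately tiny, assumed values chosen to keep the example readable.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's types.
final case class ExecutorInfo(totalCores: Int, logUrlMap: Map[String, String])
final case class ExecutorAdded(executorId: String, info: ExecutorInfo)
final class TaskSummary(val executorId: String) {
  var executorLogs: Map[String, String] = Map.empty
  var totalCores: Int = 0
  var tasksMax: Int = 0
  var isAlive: Boolean = true
}

// Assumed settings (not Spark's defaults).
val taskCpus = 2              // spark.task.cpus
val maxTimelineEvents = 2     // spark.ui.timeline.executors.maximum
val retainedDeadExecutors = 1 // spark.ui.retainedDeadExecutors

val executorToTaskSummary = mutable.LinkedHashMap.empty[String, TaskSummary]
val executorEvents = mutable.ListBuffer.empty[ExecutorAdded]

def onExecutorAdded(event: ExecutorAdded): Unit = {
  val summary =
    executorToTaskSummary.getOrElseUpdate(event.executorId, new TaskSummary(event.executorId))
  summary.executorLogs = event.info.logUrlMap
  summary.totalCores = event.info.totalCores
  summary.tasksMax = event.info.totalCores / taskCpus // integer division

  // Bound the timeline: drop the oldest event when over the limit.
  executorEvents += event
  if (executorEvents.size > maxTimelineEvents) executorEvents.remove(0)

  // Bound the dead executors: drop the oldest dead entry when over the limit.
  val dead = executorToTaskSummary.values.filter(!_.isAlive).toList
  if (dead.size > retainedDeadExecutors) executorToTaskSummary.remove(dead.head.executorId)
}

(1 to 3).foreach(i => onExecutorAdded(ExecutorAdded(i.toString, ExecutorInfo(8, Map.empty))))

// Mark two executors dead by hand, then add a fourth executor.
executorToTaskSummary("1").isAlive = false
executorToTaskSummary("2").isAlive = false
onExecutorAdded(ExecutorAdded("4", ExecutorInfo(8, Map.empty)))
```

With 8 cores and 2 CPUs per task, every executor in the example can run at most 4 tasks at once. After the last call, the oldest dead executor ("1") has been evicted and only the two most recent events remain in the timeline.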

onExecutorRemoved Method

onExecutorRemoved(executorRemoved: SparkListenerExecutorRemoved): Unit

onExecutorRemoved adds the input executorRemoved to the executorEvents collection. It then removes the oldest event if the number of elements in the executorEvents collection is greater than spark.ui.timeline.executors.maximum.

Finally, onExecutorRemoved marks the executor as removed (inactive) in the executorToTaskSummary lookup table.
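A minimal sketch of this handler, assuming hypothetical simplified types and a tiny assumed timeline limit, might look like this. Note that the executor's entry is only flagged, not deleted.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's types.
final case class ExecutorRemoved(executorId: String, reason: String)
final class TaskSummary(val executorId: String) { var isAlive: Boolean = true }

val maxTimelineEvents = 2 // assumed value of spark.ui.timeline.executors.maximum
val executorToTaskSummary = mutable.LinkedHashMap.empty[String, TaskSummary]
val executorEvents = mutable.ListBuffer.empty[ExecutorRemoved]

def onExecutorRemoved(event: ExecutorRemoved): Unit = {
  // Record the event and keep the timeline bounded.
  executorEvents += event
  if (executorEvents.size > maxTimelineEvents) executorEvents.remove(0)
  // Flag the executor as dead; the entry itself stays in the registry.
  executorToTaskSummary.get(event.executorId).foreach(s => s.isAlive = false)
}

executorToTaskSummary("1") = new TaskSummary("1")
onExecutorRemoved(ExecutorRemoved("1", "lost worker"))
```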

onTaskStart Method

onTaskStart(taskStart: SparkListenerTaskStart): Unit

onTaskStart increments tasksActive for the executor (using the input SparkListenerTaskStart).

Table 5. ExecutorTaskSummary and SparkListenerTaskStart Attributes

| ExecutorTaskSummary Attribute | Description |
|---|---|
| tasksActive | Incremented for the executor identified by taskStart.taskInfo.executorId. |
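The bookkeeping is small enough to sketch in a few lines. The types below are hypothetical simplifications of SparkListenerTaskStart, TaskInfo and ExecutorTaskSummary.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's types.
final case class TaskInfo(executorId: String)
final case class TaskStart(taskInfo: TaskInfo)
final class TaskSummary(val executorId: String) { var tasksActive: Int = 0 }

val executorToTaskSummary = mutable.LinkedHashMap.empty[String, TaskSummary]

def onTaskStart(taskStart: TaskStart): Unit = {
  val eid = taskStart.taskInfo.executorId
  // The entry is created on demand if the executor is not known yet.
  val summary = executorToTaskSummary.getOrElseUpdate(eid, new TaskSummary(eid))
  summary.tasksActive += 1
}

onTaskStart(TaskStart(TaskInfo("1")))
onTaskStart(TaskStart(TaskInfo("1")))
```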

onTaskEnd Method

onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit

onTaskEnd uses the TaskInfo from the input taskEnd (if available).

Depending on the reason for SparkListenerTaskEnd, onTaskEnd does the following:

Table 6. onTaskEnd Behaviour per SparkListenerTaskEnd Reason

| SparkListenerTaskEnd Reason | onTaskEnd Behaviour |
|---|---|
| Resubmitted | Do nothing |
| ExceptionFailure | Increment tasksFailed |
| anything else | Increment tasksComplete |

tasksActive is decremented, but only when the number of active tasks for the executor is greater than 0.

Table 7. ExecutorTaskSummary and onTaskEnd Behaviour

| ExecutorTaskSummary Attribute | Description |
|---|---|
| tasksActive | Decremented if greater than 0. |
| duration | Uses taskEnd.taskInfo.duration. |
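The reason-dependent behaviour and the guarded decrement can be sketched as below. This is an illustration only: the types are hypothetical simplifications, and the sketch assumes that a Resubmitted task skips all further bookkeeping and that duration accumulates across tasks.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's types.
sealed trait TaskEndReason
case object Success extends TaskEndReason
case object Resubmitted extends TaskEndReason
final case class ExceptionFailure(description: String) extends TaskEndReason

final case class TaskInfo(executorId: String, duration: Long)
final case class TaskEnd(reason: TaskEndReason, taskInfo: Option[TaskInfo])
final class TaskSummary(val executorId: String) {
  var tasksActive: Int = 0
  var tasksFailed: Int = 0
  var tasksComplete: Int = 0
  var duration: Long = 0L
}

val executorToTaskSummary = mutable.LinkedHashMap.empty[String, TaskSummary]

// Nothing happens when the TaskInfo is not available.
def onTaskEnd(taskEnd: TaskEnd): Unit = taskEnd.taskInfo.foreach { info =>
  val summary =
    executorToTaskSummary.getOrElseUpdate(info.executorId, new TaskSummary(info.executorId))
  if (taskEnd.reason != Resubmitted) { // assumption: Resubmitted changes no counters
    taskEnd.reason match {
      case _: ExceptionFailure => summary.tasksFailed += 1
      case _                   => summary.tasksComplete += 1
    }
    // Guarded decrement: never drops below zero.
    if (summary.tasksActive > 0) summary.tasksActive -= 1
    summary.duration += info.duration
  }
}

// Pretend one task is currently running on executor "1".
executorToTaskSummary.getOrElseUpdate("1", new TaskSummary("1")).tasksActive = 1

onTaskEnd(TaskEnd(Success, Some(TaskInfo("1", 100L))))
onTaskEnd(TaskEnd(ExceptionFailure("boom"), Some(TaskInfo("1", 50L))))
onTaskEnd(TaskEnd(Resubmitted, Some(TaskInfo("1", 10L))))
onTaskEnd(TaskEnd(Success, None)) // no TaskInfo: ignored
```

In this run, the second event finds tasksActive already at 0, so the counter stays at 0 instead of going negative.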

If the TaskMetrics (in the input taskEnd) is available, the metrics are added to the taskSummary for the task’s executor.

Table 8. Task Metrics and Task Summary

| Task Summary Attribute | Task Metric |
|---|---|
| inputBytes | inputMetrics.bytesRead |
| inputRecords | inputMetrics.recordsRead |
| outputBytes | outputMetrics.bytesWritten |
| outputRecords | outputMetrics.recordsWritten |
| shuffleRead | shuffleReadMetrics.remoteBytesRead |
| shuffleWrite | shuffleWriteMetrics.bytesWritten |
| jvmGCTime | jvmGCTime |
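The mapping in Table 8 amounts to a field-by-field accumulation. The sketch below uses hypothetical case classes that mirror only the fields named in the table, and a hypothetical `add` method; it is not Spark's TaskMetrics API.

```scala
// Hypothetical, simplified stand-ins for Spark's TaskMetrics hierarchy.
final case class InputMetrics(bytesRead: Long, recordsRead: Long)
final case class OutputMetrics(bytesWritten: Long, recordsWritten: Long)
final case class ShuffleReadMetrics(remoteBytesRead: Long)
final case class ShuffleWriteMetrics(bytesWritten: Long)
final case class TaskMetrics(
    inputMetrics: InputMetrics,
    outputMetrics: OutputMetrics,
    shuffleReadMetrics: ShuffleReadMetrics,
    shuffleWriteMetrics: ShuffleWriteMetrics,
    jvmGCTime: Long)

final class TaskSummary {
  var inputBytes, inputRecords, outputBytes, outputRecords = 0L
  var shuffleRead, shuffleWrite, jvmGCTime = 0L

  // Add one finished task's metrics to the executor-level totals.
  def add(m: TaskMetrics): Unit = {
    inputBytes += m.inputMetrics.bytesRead
    inputRecords += m.inputMetrics.recordsRead
    outputBytes += m.outputMetrics.bytesWritten
    outputRecords += m.outputMetrics.recordsWritten
    shuffleRead += m.shuffleReadMetrics.remoteBytesRead
    shuffleWrite += m.shuffleWriteMetrics.bytesWritten
    jvmGCTime += m.jvmGCTime
  }
}

val summary = new TaskSummary
summary.add(TaskMetrics(InputMetrics(100L, 10L), OutputMetrics(40L, 4L),
  ShuffleReadMetrics(7L), ShuffleWriteMetrics(9L), 3L))
summary.add(TaskMetrics(InputMetrics(50L, 5L), OutputMetrics(10L, 1L),
  ShuffleReadMetrics(3L), ShuffleWriteMetrics(1L), 2L))
```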

Settings

Table 9. ExecutorsListener Spark Properties

| Name | Default Value | Description |
|---|---|---|
| spark.ui.timeline.executors.maximum | 1000 | The maximum number of entries in the executorEvents registry. |
| spark.ui.retainedDeadExecutors | 100 | The maximum number of dead executors in the executorToTaskSummary registry. |
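To illustrate how the defaults in Table 9 interact with user-supplied values, the sketch below resolves both properties from a plain map. `intSetting` and `props` are hypothetical and exist only for this example; a real application would set these properties through its Spark configuration.

```scala
// Hypothetical helper: resolves an integer property, falling back to a default.
def intSetting(props: Map[String, String], key: String, default: Int): Int =
  props.get(key).map(_.trim.toInt).getOrElse(default)

// Properties as a user might supply them (assumed values).
val props = Map("spark.ui.retainedDeadExecutors" -> "50")

// Not set by the user, so the default of 1000 applies.
val maxTimelineEvents = intSetting(props, "spark.ui.timeline.executors.maximum", 1000)
// Set by the user, so the default of 100 is overridden.
val retainedDeadExecutors = intSetting(props, "spark.ui.retainedDeadExecutors", 100)
```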
