ContextCleaner
ContextCleaner cleans up shuffles, RDDs, and broadcasts.
Caution: FIXME What does the above sentence really mean?
It runs a daemon thread named Spark Context Cleaner that cleans RDD, shuffle, and broadcast state (using the keepCleaning method).
Caution: FIXME Review keepCleaning
A ShuffleDependency registers itself for cleanup using the ContextCleaner.registerShuffleForCleanup method.
ContextCleaner is created with a reference to the Spark context it cleans up after.
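The cleanup-thread pattern described above can be illustrated in plain Java. This is a minimal sketch, not Spark's code: it assumes that tracked objects are held through weak references tied to a reference queue, and that a daemon thread drains the queue and runs the matching cleanup task. The class `CleanerSketch`, its `TaskRef` type, the `register`, `awaitCleaned`, and `cleaned` members, and the task labels are all hypothetical names chosen for illustration.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative sketch only: class and member names are hypothetical, not Spark's.
class CleanerSketch {

    // Weak reference that remembers which cleanup task belongs to the tracked object.
    static final class TaskRef extends WeakReference<Object> {
        final String task;

        TaskRef(Object referent, String task, ReferenceQueue<Object> queue) {
            super(referent, queue);
            this.task = task;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps the TaskRef objects strongly reachable until they have been processed.
    private final Set<TaskRef> pending = ConcurrentHashMap.newKeySet();
    // Stand-in for real cleanup work: records the tasks that were executed.
    final List<String> cleaned = new CopyOnWriteArrayList<>();
    private volatile boolean stopped = false;
    private final Thread thread = new Thread(this::keepCleaning, "Cleaner Sketch");

    void start() {
        thread.setDaemon(true); // a daemon thread does not keep the JVM alive
        thread.start();
    }

    // Registers an object; its task runs after the object becomes unreachable.
    TaskRef register(Object tracked, String task) {
        TaskRef ref = new TaskRef(tracked, task, queue);
        pending.add(ref);
        return ref;
    }

    // Loop of the cleaning thread: wait for enqueued references, run their tasks.
    private void keepCleaning() {
        while (!stopped) {
            try {
                TaskRef ref = (TaskRef) queue.remove(100); // null after 100 ms
                if (ref != null) {
                    pending.remove(ref);
                    cleaned.add(ref.task);
                }
            } catch (InterruptedException e) {
                return; // stop() interrupts the thread to end the loop
            }
        }
    }

    void stop() {
        stopped = true;
        thread.interrupt();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Helper for demos: waits until at least `count` tasks have run or time is up.
    boolean awaitCleaned(int count, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (cleaned.size() < count && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return cleaned.size() >= count;
    }
}
```

In this sketch a task runs only after the garbage collector has enqueued the weak reference, so the timing is not deterministic. For a quick experiment, calling `enqueue()` on the `TaskRef` returned by `register` simulates the collection of the tracked object.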
registerRDDForCleanup
Caution: FIXME
registerAccumulatorForCleanup
Caution: FIXME
stop Method
Caution: FIXME
Settings
- spark.cleaner.referenceTracking (default: true) controls whether ContextCleaner is enabled when a Spark context initializes.
- spark.cleaner.referenceTracking.blocking (default: true) controls whether the cleaning thread blocks on cleanup tasks (other than shuffle, which is controlled by the spark.cleaner.referenceTracking.blocking.shuffle parameter). It is true as a workaround to SPARK-3015 Removing broadcast in quick successions causes Akka timeout.
- spark.cleaner.referenceTracking.blocking.shuffle (default: false) controls whether the cleaning thread blocks on shuffle cleanup tasks. It is false as a workaround to SPARK-3139 Akka timeouts from ContextCleaner when cleaning shuffles.
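How the three settings and their defaults resolve can be sketched as follows. To keep the snippet runnable without Spark on the classpath, a plain `java.util.Properties` object stands in for the Spark configuration; the class `CleanerSettingsSketch` and its `defaults` and `flag` helpers are hypothetical and only mirror the "use the user's value, else the default" lookup. In a real application the same keys would be set on the Spark configuration, for example with `--conf key=value` on `spark-submit`.

```java
import java.util.Properties;

// Illustrative sketch: java.util.Properties stands in for the Spark configuration.
class CleanerSettingsSketch {
    static final String TRACKING = "spark.cleaner.referenceTracking";
    static final String BLOCKING = "spark.cleaner.referenceTracking.blocking";
    static final String BLOCKING_SHUFFLE = "spark.cleaner.referenceTracking.blocking.shuffle";

    // Defaults as listed in the Settings section.
    static Properties defaults() {
        Properties p = new Properties();
        p.setProperty(TRACKING, "true");
        p.setProperty(BLOCKING, "true");
        p.setProperty(BLOCKING_SHUFFLE, "false");
        return p;
    }

    // Returns the user's value for the key, falling back to the documented default.
    static boolean flag(Properties user, String key) {
        return Boolean.parseBoolean(user.getProperty(key, defaults().getProperty(key)));
    }
}
```

With an empty `Properties` object, `flag` returns true for the first two keys and false for the shuffle key; setting `spark.cleaner.referenceTracking.blocking.shuffle` to `"true"` flips only that last result.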