catalog: SessionCatalog
SessionState
SessionState is the state separation layer between sessions, including SQL configuration, tables, functions, UDFs, the SQL parser, and everything else that depends on a SQLConf.
It uses a SparkSession and manages its own SQLConf.
Note: Given the package org.apache.spark.sql.internal that SessionState belongs to, this one is truly internal. You’ve been warned.
Note: SessionState is a private[sql] class.
SessionState offers the following services:
- newHadoopConf to create a new Hadoop Configuration.
catalog Attribute
The catalog attribute points at the shared internal SessionCatalog for managing tables and databases.
SessionCatalog
SessionCatalog is a proxy between SparkSession and the underlying metastore, e.g. HiveSessionCatalog.
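As a rough illustration of what the session catalog keeps track of, the sketch below registers a temporary view and lists it through the public spark.catalog interface. It assumes a local SparkSession and that the public interface is backed by the session's SessionCatalog; the object name CatalogSketch and the view name people are made up for the example.

```scala
import org.apache.spark.sql.SparkSession

object CatalogSketch {
  // Registers a temporary view and returns the table names the catalog reports.
  def tempViewNames(spark: SparkSession): Seq[String] = {
    spark.range(3).createOrReplaceTempView("people") // hypothetical view name
    spark.catalog.listTables().collect().map(_.name).toSeq
  }
}
```

Going through spark.catalog avoids touching the private[sql] SessionState directly.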
Accessing Catalyst Query Optimizer — optimizer Attribute
optimizer: Optimizer
optimizer is a Spark session’s Catalyst query optimizer for logical query plans.
It is (lazily) set to SparkOptimizer, which adds extra optimization batches on top of the base Catalyst optimizer. It is created for the session-owned SessionCatalog, SQLConf, and ExperimentalMethods (as defined in the experimentalMethods attribute).
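One way to see ExperimentalMethods feeding the optimizer is to register an extra rule through the public spark.experimental handle and then inspect an optimized plan. This is only a sketch: NoopRule and OptimizerSketch are hypothetical names, and the rule deliberately changes nothing.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical rule that returns every plan unchanged; it only marks
// the place where a custom optimization would plug in.
object NoopRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

object OptimizerSketch {
  // Registers the extra rule and returns the optimized plan of a small query.
  def optimizedPlan(spark: SparkSession): LogicalPlan = {
    spark.experimental.extraOptimizations = Seq(NoopRule)
    spark.range(10).filter("id > 5").queryExecution.optimizedPlan
  }
}
```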
planner method
planner is the SparkPlanner for the current session.
Whenever called, planner returns a new SparkPlanner instance with the SparkContext of the current SparkSession, the SQLConf, and a collection of extra SparkStrategies (via experimentalMethods attribute).
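The extra strategies mentioned above can be supplied through spark.experimental.extraStrategies. The following is a minimal sketch with a hypothetical NoopStrategy that never matches, so planning falls through to the built-in strategies; PlannerSketch is likewise an illustrative name.

```scala
import org.apache.spark.sql.{SparkSession, Strategy}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.execution.SparkPlan

// Hypothetical strategy: returning Nil means "no physical plan from me",
// which leaves the work to the remaining strategies.
object NoopStrategy extends Strategy {
  override def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
}

object PlannerSketch {
  // Registers the extra strategy and returns the physical plan of a small query.
  def physicalPlan(spark: SparkSession): SparkPlan = {
    spark.experimental.extraStrategies = Seq(NoopStrategy)
    spark.range(10).queryExecution.sparkPlan
  }
}
```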
Preparing Logical Plan for Execution — executePlan Method
executePlan(plan: LogicalPlan): QueryExecution
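executePlan creates a QueryExecution for the input LogicalPlan in the current SparkSession. It does not run the query itself; the QueryExecution holds the plan's analysis, optimization, and physical planning phases, which are evaluated lazily.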
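Since SessionState is private[sql], a convenient way to look at a QueryExecution from user code is the queryExecution attribute of a Dataset. The sketch below assumes that attribute carries the same kind of object executePlan returns; ExecutePlanSketch is a hypothetical name.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.QueryExecution

object ExecutePlanSketch {
  // Prints the plan at each phase and returns the QueryExecution for inspection.
  def phases(spark: SparkSession): QueryExecution = {
    val qe = spark.range(5).filter("id % 2 = 0").queryExecution
    println(qe.logical)       // plan as written
    println(qe.analyzed)      // after resolving names and types
    println(qe.optimizedPlan) // after the Catalyst optimizer
    println(qe.executedPlan)  // physical plan prepared for execution
    qe
  }
}
```

None of the println calls launches a Spark job; they only force the corresponding planning phase.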
streamingQueryManager Attribute
streamingQueryManager: StreamingQueryManager
The streamingQueryManager attribute points at the shared StreamingQueryManager (used, for example, to start streaming queries in DataStreamWriter).
udf Attribute
udf: UDFRegistration
The udf attribute points at the shared UDFRegistration for a given Spark session.
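From user code, UDF registration is usually reached through spark.udf. The sketch below assumes that handle hands out the session's UDFRegistration described here; UdfSketch and the function name shout are invented for the example.

```scala
import org.apache.spark.sql.SparkSession

object UdfSketch {
  // Registers a session-scoped UDF and calls it from SQL.
  def shout(spark: SparkSession): String = {
    spark.udf.register("shout", (s: String) => s.toUpperCase + "!")
    spark.sql("SELECT shout('hello')").head().getString(0)
  }
}
```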
Creating New Hadoop Configuration — newHadoopConf Method
newHadoopConf(): Configuration
newHadoopConf returns a new Hadoop Configuration that it builds from SparkContext.hadoopConfiguration (through SparkSession) with all the settings of the session's SQLConf added.
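The behaviour can be approximated from user code with public APIs only. This is a sketch of the idea, not the actual implementation: HadoopConfSketch is a hypothetical helper, and it reads the session settings through spark.conf.getAll rather than the internal SQLConf.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql.SparkSession

object HadoopConfSketch {
  def hadoopConfFor(spark: SparkSession): Configuration = {
    // Copy constructor, so the shared SparkContext configuration stays untouched.
    val hadoopConf = new Configuration(spark.sparkContext.hadoopConfiguration)
    // Layer the session's runtime settings on top of the copy.
    spark.conf.getAll.foreach { case (k, v) => if (v != null) hadoopConf.set(k, v) }
    hadoopConf
  }
}
```

Working on a copy keeps per-session settings from leaking into the configuration shared by every session of the same SparkContext.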
Note: newHadoopConf is used by HiveSessionState (for HiveSessionCatalog), ScriptTransformation, ParquetRelation, StateStoreRDD, SessionState itself, and a few other places.
Caution: FIXME What is ScriptTransformation? StateStoreRDD?