SessionState

SessionState is the state separation layer between sessions, including SQL configuration, tables, functions, UDFs, the SQL parser, and everything else that depends on a SQLConf.
It uses a SparkSession and manages its own SQLConf.
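Although SessionState itself is not part of the public API, the per-session isolation it provides is visible through SparkSession. A minimal sketch, assuming a local spark-shell-like setup, showing that each session created with newSession() gets its own SessionState (and thus its own SQLConf):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sessionstate-demo")
  .getOrCreate()

// newSession() shares the SparkContext but gets its own SessionState,
// so SQL configuration changes are isolated per session.
val anotherSession = spark.newSession()

spark.conf.set("spark.sql.shuffle.partitions", "4")
anotherSession.conf.set("spark.sql.shuffle.partitions", "16")

println(spark.conf.get("spark.sql.shuffle.partitions"))          // 4
println(anotherSession.conf.get("spark.sql.shuffle.partitions")) // 16
```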
Note: Given the package org.apache.spark.sql.internal that SessionState belongs to, this one is truly internal. You’ve been warned.
Note: SessionState is a private[sql] class.
SessionState offers the following services (each described below):

- catalog, the SessionCatalog for managing tables and databases
- optimizer, the Catalyst query optimizer for logical query plans
- planner, the SparkPlanner for the current session
- executePlan, to create a QueryExecution for a LogicalPlan
- streamingQueryManager, the StreamingQueryManager for streaming queries
- udf, the UDFRegistration for the session
- newHadoopConf, to create a new Hadoop Configuration
catalog Attribute

catalog: SessionCatalog

catalog attribute points at the shared internal SessionCatalog for managing tables and databases.
SessionCatalog

SessionCatalog is a proxy between a SparkSession and the underlying metastore, e.g. HiveSessionCatalog.
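The internal SessionCatalog is not meant to be used directly; its state is, however, observable through the public spark.catalog interface. A short illustration, assuming a running SparkSession named spark:

```scala
// Register a temporary view so there is something in the SessionCatalog.
spark.range(5).createOrReplaceTempView("numbers")

// spark.catalog is the public Catalog interface backed by the session's
// internal SessionCatalog.
spark.catalog.listDatabases().show(truncate = false)
spark.catalog.listTables().show(truncate = false)   // includes the "numbers" temp view
```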
Accessing Catalyst Query Optimizer — optimizer Attribute

optimizer: Optimizer

optimizer is the Spark session’s Catalyst query optimizer for logical query plans.

It is (lazily) set to a SparkOptimizer (that adds additional optimization batches) created for the session-owned SessionCatalog, SQLConf, and ExperimentalMethods (as defined in the experimentalMethods attribute).
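One way to see the link to ExperimentalMethods from user code is spark.experimental.extraOptimizations, which feeds extra rule batches into the session's SparkOptimizer. A minimal sketch, assuming a SparkSession named spark; NoopRule is a made-up placeholder rule:

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// A do-nothing logical optimization rule, only to show where extra rules plug in.
object NoopRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

// Extra optimizations registered on ExperimentalMethods end up as an
// additional batch in the session's SparkOptimizer.
spark.experimental.extraOptimizations = Seq(NoopRule)

spark.range(3).filter("id > 1").explain(extended = true)
```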
planner Method

planner is the SparkPlanner for the current session.

Whenever called, planner returns a new SparkPlanner instance with the SparkContext of the current SparkSession, the SQLConf, and a collection of extra SparkStrategies (via the experimentalMethods attribute).
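The extra strategies mentioned above can be supplied from user code through spark.experimental.extraStrategies. A minimal sketch, assuming a SparkSession named spark; NoopStrategy is a made-up placeholder that never matches:

```scala
import org.apache.spark.sql.Strategy
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.execution.SparkPlan

// A strategy that matches nothing; real strategies would return physical
// plans for the logical operators they know how to plan.
object NoopStrategy extends Strategy {
  override def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
}

// Every SparkPlanner created by planner picks these up.
spark.experimental.extraStrategies = Seq(NoopStrategy)
```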
Preparing Logical Plan for Execution — executePlan Method

executePlan(plan: LogicalPlan): QueryExecution

executePlan creates a QueryExecution for the input LogicalPlan in the current SparkSession.
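The resulting QueryExecution is what a Dataset exposes as queryExecution, so its stages can be inspected from the public side. A short illustration, assuming a SparkSession named spark:

```scala
val df = spark.range(10).filter("id % 2 = 0")

// queryExecution is the QueryExecution created for the Dataset's logical plan.
val qe = df.queryExecution
println(qe.logical)        // the logical plan the Dataset was built with
println(qe.optimizedPlan)  // after the Catalyst optimizer
println(qe.executedPlan)   // the physical plan chosen by the planner
```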
streamingQueryManager Attribute

streamingQueryManager: StreamingQueryManager

streamingQueryManager attribute points at the shared StreamingQueryManager (e.g. to start streaming queries in DataStreamWriter).
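From user code, the same StreamingQueryManager is available as spark.streams. A small sketch, assuming Spark 2.2+ (for the rate source and Trigger) and a SparkSession named spark:

```scala
import org.apache.spark.sql.streaming.Trigger

// Start a throwaway streaming query; DataStreamWriter.start goes through
// the session's StreamingQueryManager.
val query = spark.readStream
  .format("rate")
  .load()
  .writeStream
  .format("console")
  .trigger(Trigger.ProcessingTime("5 seconds"))
  .start()

// spark.streams is the public handle to the same manager.
spark.streams.active.foreach(q => println(s"${q.id}: ${q.status}"))

query.stop()
```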
udf Attribute

udf: UDFRegistration

udf attribute points at the shared UDFRegistration for a given Spark session.
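The same UDFRegistration is exposed publicly as spark.udf. A short illustration, assuming a SparkSession named spark; shout is a made-up function name:

```scala
// Register a scalar UDF with the session's UDFRegistration.
spark.udf.register("shout", (s: String) => s.toUpperCase + "!")

// The UDF is now usable in SQL expressions within this session.
spark.range(3).selectExpr("shout(cast(id as string))").show()
```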
Creating New Hadoop Configuration — newHadoopConf Method

newHadoopConf(): Configuration

newHadoopConf returns a new Hadoop Configuration that it builds from SparkContext.hadoopConfiguration (through SparkSession) with all the session’s configuration settings added.
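A conceptual sketch only, not Spark's actual implementation: the idea is to copy the SparkContext-level Hadoop configuration and overlay the session's settings, so callers get a per-session Configuration they can mutate safely. newHadoopConfSketch is a hypothetical helper:

```scala
import org.apache.hadoop.conf.Configuration

// Hypothetical helper mirroring the idea behind newHadoopConf.
def newHadoopConfSketch(
    base: Configuration,
    sessionSettings: Map[String, String]): Configuration = {
  val hadoopConf = new Configuration(base)  // copy; never mutate the shared one
  sessionSettings.foreach { case (k, v) => hadoopConf.set(k, v) }
  hadoopConf
}
```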
Note: newHadoopConf is used by HiveSessionState (for HiveSessionCatalog), ScriptTransformation, ParquetRelation, StateStoreRDD, and SessionState itself, among a few other places.
Caution: FIXME What is ScriptTransformation? StateStoreRDD?