SessionState

SessionState is the state separation layer between sessions, including SQL configuration, tables, functions, UDFs, the SQL parser, and everything else that depends on a SQLConf.

It uses a SparkSession and manages its own SQLConf.

Note
Given the package org.apache.spark.sql.internal that SessionState belongs to, this one is truly internal. You’ve been warned.
Note
SessionState is a private[sql] class.

SessionState offers the following services:

catalog Attribute

catalog: SessionCatalog

catalog attribute points at the shared internal SessionCatalog for managing tables and databases.

It is used to create the shared analyzer and optimizer.

SessionCatalog

SessionCatalog is a proxy between SparkSession and the underlying metastore. HiveSessionCatalog, for example, is the SessionCatalog for a Hive metastore.

analyzer Attribute

analyzer: Analyzer

analyzer is the Analyzer for the current Spark SQL session.

Accessing Catalyst Query Optimizer — optimizer Attribute

optimizer: Optimizer

optimizer is a Spark session’s Catalyst query optimizer for logical query plans.

It is (lazily) set to SparkOptimizer (that adds additional optimization batches). It is created for the session-owned SessionCatalog, SQLConf, and ExperimentalMethods (as defined in experimentalMethods attribute).

experimentalMethods

experimentalMethods is…​

sqlParser Attribute

sqlParser is…​

planner Method

planner is the SparkPlanner for the current session.

Whenever called, planner returns a new SparkPlanner instance with the SparkContext of the current SparkSession, the SQLConf, and a collection of extra SparkStrategies (via experimentalMethods attribute).
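
The difference between the lazily created optimizer and the planner that is created anew on every call can be sketched in plain Scala. This is a minimal stand-alone illustration, not Spark's code: Planner, Optimizer, SessionStateSketch, and the extraStrategies field are hypothetical stand-ins for SparkPlanner, SparkOptimizer, SessionState, and the strategies exposed through experimentalMethods.

```scala
// Stand-alone sketch: the classes below are hypothetical stand-ins, not Spark's own.
object SessionServicesSketch {
  // Stand-in for SparkPlanner; records the extra strategies it was built with.
  final class Planner(val extraStrategies: Seq[String])

  // Stand-in for SparkOptimizer.
  final class Optimizer

  final class SessionStateSketch {
    // Stand-in for the extra strategies exposed through experimentalMethods.
    var extraStrategies: Seq[String] = Seq.empty

    // lazy val: created on first access, then the same instance is reused.
    lazy val optimizer: Optimizer = new Optimizer

    // def: every call builds a new planner from the current extra strategies.
    def planner: Planner = new Planner(extraStrategies)
  }
}
```

Under these assumptions, a strategy registered after one planner was obtained shows up in the next planner that is requested, while the optimizer instance stays the same for the lifetime of the session state.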

Preparing Logical Plan for Execution — executePlan Method

executePlan(plan: LogicalPlan): QueryExecution

executePlan creates a QueryExecution for the input LogicalPlan in the current SparkSession.

refreshTable Method

refreshTable is…​

addJar Method

addJar is…​

analyze Method

analyze is…​

streamingQueryManager Attribute

streamingQueryManager: StreamingQueryManager

streamingQueryManager attribute points at the shared StreamingQueryManager (e.g. to start streaming queries in DataStreamWriter).

udf Attribute

udf: UDFRegistration

udf attribute points at the shared UDFRegistration for a given Spark session.

Creating New Hadoop Configuration — newHadoopConf Method

newHadoopConf(): Configuration

newHadoopConf returns a new Hadoop Configuration that it builds from SparkContext.hadoopConfiguration (through SparkSession) with all the settings of the session's SQLConf added.
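
The copy-then-overlay idea can be sketched without Spark or Hadoop on the classpath. This is only an illustration under stated assumptions: mutable Maps stand in for Hadoop's Configuration, sessionSettings stands in for the session's own settings, and session values are assumed to win when a key appears in both. The names NewHadoopConfSketch, Conf, shared, and sessionSettings are hypothetical.

```scala
import scala.collection.mutable

// Stand-alone sketch: mutable Maps stand in for Hadoop's Configuration.
object NewHadoopConfSketch {
  type Conf = mutable.Map[String, String]

  // Build a fresh configuration: copy the shared settings first,
  // then add every session setting on top (assumed to take precedence).
  def newHadoopConf(shared: Conf, sessionSettings: Map[String, String]): Conf = {
    val copy: Conf = mutable.Map.empty[String, String]
    copy ++= shared
    sessionSettings.foreach { case (k, v) => copy(k) = v }
    copy
  }
}
```

Because the result is a copy, changes made through it do not leak back into the shared configuration in this sketch.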

Note
newHadoopConf is used by HiveSessionState (for HiveSessionCatalog), ScriptTransformation, ParquetRelation, StateStoreRDD, SessionState itself, and a few other places.
Caution
FIXME What is ScriptTransformation? StateStoreRDD?
