QueryPlanner
QueryPlanner transforms a LogicalPlan through a chain of GenericStrategy objects to produce a physical execution plan, i.e. a SparkPlan in the case of SparkPlanner or the Hive-specific SparkPlanner.
QueryPlanner Contract
QueryPlanner contract defines the following operations:
strategies
strategies is an abstract method that returns a collection of GenericStrategy objects (which the plan method uses).
plan
plan(plan: LogicalPlan) returns an Iterator[PhysicalPlan] whose elements are the results of applying each GenericStrategy object from the strategies collection to the input plan.
collectPlaceholders
collectPlaceholders returns a collection of pairs, each made of a physical plan and its corresponding LogicalPlan.
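To make the strategies and plan operations more concrete, here is a minimal, self-contained sketch of the contract. It is not Spark's source code: LogicalPlan, PhysicalPlan, Scan, Limit, ScanExec, LimitExec, ScanStrategy, LimitStrategy and ToyPlanner are hypothetical stand-ins invented for illustration. The sketch also leaves out collectPlaceholders and simply plans child nodes by calling plan recursively.

```scala
object QueryPlannerSketch {
  // Hypothetical stand-in types; Spark's real plan classes are much richer tree nodes.
  sealed trait LogicalPlan
  final case class Scan(table: String) extends LogicalPlan
  final case class Limit(n: Int, child: LogicalPlan) extends LogicalPlan

  sealed trait PhysicalPlan
  final case class ScanExec(table: String) extends PhysicalPlan
  final case class LimitExec(n: Int, child: PhysicalPlan) extends PhysicalPlan

  // A strategy returns zero or more candidate physical plans for a logical plan.
  trait GenericStrategy {
    def apply(plan: LogicalPlan): Seq[PhysicalPlan]
  }

  abstract class QueryPlanner {
    // Abstract: concrete planners decide which strategies to use and in what order.
    def strategies: Seq[GenericStrategy]

    // Applies every strategy to the input plan and concatenates the candidates lazily.
    def plan(plan: LogicalPlan): Iterator[PhysicalPlan] =
      strategies.iterator.flatMap(_.apply(plan))
  }

  // Handles Scan nodes only; returns no candidates for anything else.
  object ScanStrategy extends GenericStrategy {
    def apply(plan: LogicalPlan): Seq[PhysicalPlan] = plan match {
      case Scan(table) => Seq(ScanExec(table))
      case _           => Nil
    }
  }

  class ToyPlanner extends QueryPlanner { planner =>
    // Handles Limit nodes and asks the enclosing planner to plan the child
    // (a simplification assumed for this sketch).
    object LimitStrategy extends GenericStrategy {
      def apply(plan: LogicalPlan): Seq[PhysicalPlan] = plan match {
        case Limit(n, child) => planner.plan(child).map(LimitExec(n, _)).toList
        case _               => Nil
      }
    }

    def strategies: Seq[GenericStrategy] = Seq(LimitStrategy, ScanStrategy)
  }
}
```

With these toy definitions, new QueryPlannerSketch.ToyPlanner().plan(Limit(5, Scan("t"))) yields a single candidate, LimitExec(5, ScanExec("t")), because only LimitStrategy matches the top-level node.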
SparkStrategies
SparkStrategies is an abstract base QueryPlanner that produces a SparkPlan.
SparkStrategies merely serves as a "container" with concrete SparkStrategy objects, e.g. FileSourceStrategy, SpecialLimits, JoinSelection, StatefulAggregationStrategy, Aggregation, InMemoryScans, StreamingRelationStrategy, etc.
Note: Strategy is a type alias of SparkStrategy that is defined in org.apache.spark.sql package object.
Note: SparkPlanner is the one and only concrete implementation of SparkStrategies available.
Caution: FIXME What is singleRowRdd for?
Hive-Specific SparkPlanner for HiveSessionState
The HiveSessionState class uses a custom anonymous SparkPlanner for its planner method (part of the SessionState contract).
The custom anonymous SparkPlanner uses Strategy objects defined in HiveStrategies.
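The pattern described above (a "container" of strategies plus an anonymous planner that mixes in extra ones) can be sketched roughly as follows. This is a toy model, not the actual Spark or Hive code: Strategy is reduced to a named value, the classes only borrow the names SparkStrategies, SparkPlanner and HiveStrategies, and the selection and ordering of strategies are assumptions made for illustration.

```scala
object HivePlannerSketch {
  // Simplified stand-in: a strategy is reduced to its name here.
  final case class Strategy(name: String)

  // "Container" trait in the spirit of SparkStrategies: it only holds strategy values.
  trait SparkStrategies {
    val Aggregation: Strategy = Strategy("Aggregation")
    val JoinSelection: Strategy = Strategy("JoinSelection")
  }

  // The concrete planner chooses which contained strategies to use
  // (the order here is illustrative only).
  class SparkPlanner extends SparkStrategies {
    def strategies: Seq[Strategy] = Seq(Aggregation, JoinSelection)
  }

  // Container for Hive-only strategies, in the spirit of HiveStrategies.
  trait HiveStrategies { self: SparkPlanner =>
    val HiveTableScans: Strategy = Strategy("HiveTableScans")
  }

  // Anonymous SparkPlanner with the Hive strategies mixed in, similar in shape
  // to what a Hive-aware session state could return from its planner method.
  def hivePlanner: SparkPlanner = new SparkPlanner with HiveStrategies {
    override def strategies: Seq[Strategy] = HiveTableScans +: super.strategies
  }
}
```

In this sketch, HivePlannerSketch.hivePlanner.strategies contains the Hive-only strategy first, followed by the strategies inherited from the plain SparkPlanner.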