FileSourceStrategy
FileSourceStrategy is a Strategy that uses a PhysicalOperation to destructure and then optimize a LogicalPlan.
Tip
Enable INFO logging level for the org.apache.spark.sql.execution.datasources.FileSourceStrategy logger.
Add the following line to your log4j properties file:
log4j.logger.org.apache.spark.sql.execution.datasources.FileSourceStrategy=INFO
Refer to Logging.
Caution
FIXME
PhysicalOperation
PhysicalOperation is used to destructure a LogicalPlan into a tuple of (Seq[NamedExpression], Seq[Expression], LogicalPlan).
The following idiom is often used in Strategy implementations (e.g. HiveTableScans, InMemoryScans, DataSourceStrategy, FileSourceStrategy):
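def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
  // child is the plan left after the projections and predicates are stripped off
  case PhysicalOperation(projections, predicates, child) =>
    // do something with projections, predicates and child
    ???
  case _ => Nil
}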
Whenever PhysicalOperation is used in a pattern match against a LogicalPlan, its unapply extractor method is called.
unapply(plan: LogicalPlan): Option[ReturnType]
unapply uses the collectProjectsAndFilters method, which recursively destructures the input LogicalPlan.
Note
unapply is almost the collectProjectsAndFilters method itself (with some manipulations of the return value).
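The extractor mechanics described above can be sketched with toy types. This is a minimal illustration, not Spark's code: Plan, Relation, Project, ToyPhysicalOperation, collect and ToyStrategy are hypothetical stand-ins, and columns are plain strings rather than Catalyst expressions. Falling back to the child's output columns is an assumed example of the kind of adjustment unapply could make to the helper's return value.

```scala
// Toy stand-ins for Catalyst classes (hypothetical); columns are plain strings.
sealed trait Plan { def output: Seq[String] }
case class Relation(name: String, output: Seq[String]) extends Plan
case class Project(fields: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = fields
}

object ToyPhysicalOperation {
  type ReturnType = (Seq[String], Plan)

  // Hypothetical helper: reports the projection, if any, and the plan beneath it.
  // It only looks at a single Project node to keep the sketch short.
  private def collect(plan: Plan): (Option[Seq[String]], Plan) = plan match {
    case Project(fields, child) => (Some(fields), child)
    case other                  => (None, other)
  }

  // Invoked whenever a pattern reads `case ToyPhysicalOperation(fields, child) =>`.
  def unapply(plan: Plan): Option[ReturnType] = {
    val (fields, child) = collect(plan)
    // Reshape the helper's result: default to the child's output columns.
    Some((fields.getOrElse(child.output), child))
  }
}

object ToyStrategy {
  // Mirrors the Strategy idiom: match on the extractor, fall through otherwise.
  def describe(plan: Plan): String = plan match {
    case ToyPhysicalOperation(fields, child) =>
      s"scan ${child.output.size} columns, keep ${fields.mkString(", ")}"
    case _ => "unsupported"
  }
}
```

Because this toy unapply always returns Some, the fallback case is never reached here; it is kept only to mirror the shape of the Strategy idiom.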
collectProjectsAndFilters Method
collectProjectsAndFilters(plan: LogicalPlan):
(Option[Seq[NamedExpression]], Seq[Expression], LogicalPlan, Map[Attribute, Expression])
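collectProjectsAndFilters pattern matches on the input LogicalPlan and destructures it when it is a Project, Filter or BroadcastHint. Any other LogicalPlan ends the recursion with an otherwise empty result: no projections, no filters, an empty alias map, and the plan itself.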
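The recursion can be sketched as follows. This is a simplified model under stated assumptions, not the Spark implementation: ToyCollector and its nested Plan, Project, Filter, Hint and Relation classes are hypothetical, expressions are plain strings, and the alias map (the fourth element of the real return type) is left out.

```scala
object ToyCollector {
  // Hypothetical toy plan nodes; Hint plays the role of BroadcastHint.
  sealed trait Plan
  case class Project(fields: Seq[String], child: Plan) extends Plan
  case class Filter(condition: String, child: Plan) extends Plan
  case class Hint(child: Plan) extends Plan
  case class Relation(name: String) extends Plan

  // Returns (outermost projection if any, filters from innermost to outermost, remaining plan).
  def collectProjectsAndFilters(plan: Plan): (Option[Seq[String]], Seq[String], Plan) =
    plan match {
      case Project(fields, child) =>
        // An outer projection replaces whatever projection was found below it.
        val (_, filters, rest) = collectProjectsAndFilters(child)
        (Some(fields), filters, rest)
      case Filter(condition, child) =>
        // Accumulate this condition on top of the ones collected below.
        val (fields, filters, rest) = collectProjectsAndFilters(child)
        (fields, filters :+ condition, rest)
      case Hint(child) =>
        // Hints are transparent: keep descending.
        collectProjectsAndFilters(child)
      case other =>
        // Anything else stops the recursion with an empty result and the plan itself.
        (None, Nil, other)
    }
}
```

For a plan such as Project over Filter over Hint over Filter over Relation, the sketch returns the outer projection, both filter conditions, and the Relation as the remaining plan.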