[SPARK-27945][SQL] Minimal changes to support columnar processing #24795

revans2 · 2019-06-04T18:03:08Z

What changes were proposed in this pull request?

This is the first part of SPARK-27396. This is the minimum set of changes necessary to support a pluggable back end for columnar processing. Follow on JIRAs would cover removing some of the duplication between functionality in this patch and functionality currently covered by things like ColumnarBatchScan.

How was this patch tested?

I added in a new unit test to cover new code not really covered in other places.

I also did manual testing by implementing two plugins/extensions that take advantage of the new APIs to allow for columnar processing for some simple queries. One version runs on the CPU. The other version run on a GPU, but because it has unreleased dependencies I will not include a link to it yet.

The CPU version I would expect to add in as an example with other documentation in a follow on JIRA

This is contributed on behalf of NVIDIA Corporation.

tgravescs · 2019-06-04T18:06:39Z

ok to test

SparkQA · 2019-06-04T18:15:33Z

Test build #106161 has finished for PR 24795 at commit 40c8199.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

revans2 · 2019-06-04T18:18:10Z

Just as a high level overview of what the changes are.

The classes in vectorized, including ColumnVector, and ColumnarBatch were moved to catalyst from core so that expressions could get access to them. To be able to comply with conventions about releasing a ColumnVector's resources we also made ColumnVector reference counted, but restrict the usage to really just BoundReference.

Expression was updated to include optional columnar evaluation that is off by default, as was SparkPlan. A few expressions were updated to also do columnar evaluation, but these are things like literals that there really is only one way to do it.

Columnar.scala is where the core of the changes are. It adds in a new rule that runs on the physical plan during the execution stage. It will look at the plan and find sections that should be run columnar and will insert transitions to/from columnar formatted data to allow for the processing to happen. It can also take a user supplied set of rules to also modify the plan so that an extension can add in columnar versions of operations and expressions themselves.

I also updated the Extensions API to allow for the rules to be injected.

SparkQA · 2019-06-04T19:18:12Z

Test build #106162 has finished for PR 24795 at commit 484fb41.

This patch fails MiMa tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2019-06-05T01:04:26Z

cc @mengxr @rxin @rednaxelafx

dongjoon-hyun · 2019-06-05T03:13:51Z

cc @dbtsai

NOTICE-binary

rxin · 2019-06-05T14:01:05Z

@revans2 why do we need a ColumnarRule? It seems like you are bifurcating everything.

rxin · 2019-06-05T14:02:10Z

sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala

+ * inserting stages to build larger batches for more efficient processing, or stages that
+ * transition the data to/from an accelerator's memory.
+ */
+@Experimental


All apis in execution are non-public so there's no need to document "experimental" or "unstable" api here. They only have public visibility for debugging.

So how do we want to handle this situation where we are exposing some internal implementation details but with no real guarantees of stability? Not marking them as anything I felt gave the wrong impression that there was some guarantee.

They are all private. The package.scala file documents that. Also execution and catalyst packages are ignored in the public API docs.

Basically it's use at your own risk type of internal APIs.

revans2 · 2019-06-05T14:37:31Z

@rxin we don't need a ColumnarRule. It just made the code simpler. I see your point and I am happy to change it if you like. I can do separate registration for pre/post rules, or I can just have them be internal to ColumnarRule.

class ColumnarRule {
  def pre: Rule[SparkPlan] = plan => plan
  def post: Rule[SparkPlan] = plan => plan
}

SparkQA · 2019-06-05T16:13:06Z

Test build #106205 has finished for PR 24795 at commit 52e31ab.

This patch fails MiMa tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2019-06-06T13:45:11Z

Test build #106242 has finished for PR 24795 at commit 2a3287b.

This patch fails MiMa tests.
This patch merges cleanly.
This patch adds no public classes.

kiszk · 2019-06-06T15:09:33Z

sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ColumnVector.java

+   * Actually cleans up memory for this column vector. The column vector is really not usable after
+   * this.
+   */
+  protected abstract void doClose();


Can we use more self-declarative name? This is because columnVector is public API for developers who want to support their storage.

kiszk · 2019-06-06T18:58:48Z

sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala

+ * inserting stages to build larger batches for more efficient processing, or stages that
+ * transition the data to/from an accelerator's memory.
+ */
+class ColumnarRule {


Are pre and post appropriate naming, in particular, pre? IIUC correctly, pre is a mandatory function for columnar processing and post is an optional function to optimize columnars or to clean up resources. To me, a pair of pre and post looks dual. (Of course, I am not a native English speaker).

In addition, can you create a test case using ColunarRule (for example, a simplified version of this, but use pre and post)? It would help understanding of reviewers and can ensure the behavior of this API.

For me pre and post made since in relation to the comments, but perhaps not standalone. I'll see if I can make them more descriptive. I'll also add in a test.

SparkQA · 2019-06-06T20:00:10Z

Test build #106255 has finished for PR 24795 at commit 31d7248.

This patch fails MiMa tests.
This patch merges cleanly.
This patch adds no public classes.

rxin · 2019-06-07T12:42:29Z

@revans2 I still feel this is pretty invasive. I think you can still accomplish your goal but reduce the scope quite significantly.

If I understand your use case clearly, what you want to do is to build some processing logic completely outside Spark. What you really need there is quite small (not very different from an uber UDF interface):

The ability to inspect and transform Catalyst plans, to create your own plans, which already exists (catalyst rules / strategies).
The ability for operators to read columnar input, which somewhat exists at the scan level for Parquet and ORC but is a bit hacky at the moment. It can be generalized to support arbitrary operators.
A conversion operator that generates ColumnarBatches.

I think that's all you need?

There is no need to define column specific rules, or create a columnar expression interface. Those expressions are entirely outside Spark, and don't really need to be Spark Expression's.

I also don't understand why you added reference counting to these column vectors. Are you doing that because you might have other things outside Spark that wants to hold these vectors? If that's the case, I think those code outside Spark should just copy, rather than holding onto existing vectors. Otherwise, Spark can no longer reuse these vectors, and also would create bigger memory management challenges (why is Spark generating so many vectors that it's not releasing?).

revans2 · 2019-06-07T15:33:07Z

@rxin

I am confused on exactly what you are proposing.

The ability to inspect and transform Catalyst plans, to create your own plans, which already exists

That I understand and there are some large technical burdens with using the current set of APIs, but I can get into that more if needed.

The ability for operators to read columnar input, which somewhat exists at the scan level for Parquet and ORC but is a bit hacky at the moment. It can be generalized to support arbitrary operators.

This is the one that I don't understand what you are proposing. The only API even remotely associated with SparkPlan where columnar formatted data can leave the plan is

spark/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala

Lines 77 to 82 in 5cdc506

    
             /** 
        
              * Returns all the RDDs of InternalRow which generates the input rows. 
        
              * 
        
              * @note Right now we support up to two RDDs 
        
              */ 
        
             def inputRDDs(): Seq[RDD[InternalRow]]

Which is a part of the CodeGenSupport trait. To look at it you would have no idea at all that it ever could actually hold ColumnarBatches instead of InternalRows. This hack is really just a way so that a leaf node in the plan which pulls in ColumnarBatchScan can delegate to code generation the conversion between batches and rows. I'm not sure how that in particular can be generalized into something that would allow for columnar input to arbitrary operators.

As such are you suggesting then that we keep the columnar execution path for SparkPlan like in this patch, but not the Expressions? Are you suggesting instead that we don't change SparkPlan and have it so that each operator can output an RDD[ColumnarBatch], but type erase it to an RDD[InternalRow]? We would still need some kind of flag or something else to indicate that this is happening in the child and a Rule of some kind to make sure that the transitions were inserted at the appropriate places.

rxin · 2019-06-07T15:49:50Z

I'm suggesting breaking this into two changes: first change is columnar interface (this physical plan perhaps takes columns and generates columns); second change is create a converter operator.

You shouldn't need the optimizer changes or the reference counting changes, or the expression changes...

revans2 · 2019-06-07T16:56:08Z

Yes, the reference counting and the expression changes could technically be removed. The reference counting is there to be able to support columnar expressions.

In reality I believe that none of this patch is technically required to make columnar processing work as a plugin. The reason I picked this level of abstraction, and adding in new rules that modify the physical plan was to try and reduce the amount of redundant code between a plugin and Spark SQL itself to the point that it would be preferable to use the plugin API over forking Spark. Every other implementation of columnar processing on Spark has gone the route of forking because the existing extension APIs, thought technically sufficient, require any plugin to re-implement large parts of query planning and execution.

For example, user defined rules for mapping a logical to a physical plan run prior to the built in rules. This means that if I add in a rule to replace a projection with a columnar enabled version the built in rule that creates the FileScanExec and does predicate push down into Parquet or Orc will not match the projection I just mapped. Predicate push down is then silently disabled. We ran into this issue when we tried to use the existing extension APIs.

Another example is ShuffleExchangeExec. The best performance optimization is to not do something, so if we can keep the data columnar through an exchange we have eliminated one columnar to row translation and one row to columnar translation. This is a part of the requirements in the SPIP. Replacing that in the logical to physical plan translation is possible, although difficult. But it also has the unintended consequence of the ReuseExchange rule not matching our alternative ShuffleExchangeExec and without full access to the physical plan I don't think we could re-implement that in the Logical to physical plan translation.

I picked a very specific place in the life cycle of the plan to inject these rules because it was as close to the final physical plan as possible so there were the fewest number of transformation afterwards. Even then the code generation rule afterwards still required some modifications.

I think dropping the changes to the Expressions is doable, but not having a way to modify the physical plan very late in the process would require us to really rethink the direction we are taking.

SparkQA · 2019-06-07T17:51:17Z

Test build #106279 has finished for PR 24795 at commit a388060.

This patch fails MiMa tests.
This patch merges cleanly.
This patch adds no public classes.

revans2 · 2019-06-10T18:13:10Z

@rxin

I removed the columnar changes to Expression and also reference counting from ColumnVector. Please take another look.

SparkQA · 2019-06-10T18:23:49Z

Test build #106357 has finished for PR 24795 at commit ca6bb76.

This patch fails to generate documentation.
This patch merges cleanly.
This patch adds no public classes.

BryanCutler

Changes look good, thanks! I'm still wondering if the conf org.apache.spark.example.columnar.enabled or similar is meant to be per plugin or is there going to be a global conf to turn off all columnar processing?

BryanCutler · 2019-06-18T17:13:04Z

sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala

+    // Empty
+  }
+
+


nit: can you use 1 newline here and below instead of 2?

revans2 · 2019-06-18T18:55:39Z

@BryanCutler Sorry about missing that question.

Yes, the assumption is that each plugin could/would add their own configs. We don't want to globally disable columnar support because we will eventually use it in regular operation, like with batch data being read from parquet or the transformations for pandas or R.

SparkQA · 2019-06-18T22:12:53Z

Test build #106636 has finished for PR 24795 at commit dc53e83.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

kiszk · 2019-06-20T15:25:50Z

sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala

+
+        TaskContext.get().addTaskCompletionListener[Unit] { _ =>
+          if (cb != null) {
+            cb.close


nit: cb.close()?

kiszk · 2019-06-20T15:33:55Z

sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala

+import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.catalyst.{FunctionIdentifier, TableIdentifier}
-import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionInfo, Literal}
+import org.apache.spark.sql.catalyst.expressions.{Add, Alias, AttributeReference, AttributeSeq, BoundReference, Expression, ExpressionInfo, ExprId, Literal, NamedExpression}


nit: how about import org.apache.spark.sql.catalyst.expressions._. It looks too long

kiszk · 2019-06-20T15:34:11Z

sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala

 import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.execution.{SparkPlan, SparkStrategy}
-import org.apache.spark.sql.types.{DataType, IntegerType, StructType}
+import org.apache.spark.sql.execution.{ColumnarRule, ColumnarToRowExec, ProjectExec, RowToColumnarExec, SparkPlan, SparkStrategy}


revans2 · 2019-06-20T16:15:27Z

@BryanCutler @kiszk @rxin

I think I have addressed all of the review comments so far. I am happy to do more if there are more issues, but I really would like to keep this moving and try to get some of the follow on work up as well. Please let me know what else is needed to get this merged in.

kiszk · 2019-06-20T16:48:50Z

sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala

+      val df = data.selectExpr("vals + 1")
+      // Verify that both pre and post processing of the plan worked.
+      val found = df.queryExecution.executedPlan.collect {
+        case rep: ReplacedRowToColumnarExec => 1


How about the following? The sum will return the unique value that corresponds to the combination.

case rep: ... => 1 case proj: ... => 4 case c2r: ... => 13

SparkQA · 2019-06-20T19:38:15Z

Test build #106732 has finished for PR 24795 at commit 8443b29.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

BryanCutler

LGTM as a first step, what will be the next task in the epic?

rxin · 2019-06-21T04:36:36Z

sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala

+import org.apache.spark.sql.vectorized.{ColumnarBatch, ColumnVector}
+
+/**
+ * Holds a user defined rule that can be used to inject columnar implementations of various


I still don't understand - why do we need ColumnarRule, i.e. any rule that's column specific?

@rxin
As I explained before there needs to be at least one rule that operates on the entire physical SparkPlan after the exchanges have been inserted into the plan.

Having two rules, one that runs prior to inserting columnar transitions and one that runs after it makes it so a plugin does not have to duplicate code for inserting columnar transitions and provides a cleaner API for anyone trying to insert columnar processing, but that is relatively minor.

If you want me to split ColumnarRule up and rename the individual parts to not be columnar specific I am happy to do it. If you want me to make it a single rule that runs prior to code generation I can make that work too. Just let me know which of the two changes you would prefer, and I will make it happen.

revans2 · 2019-06-21T12:37:50Z

@BryanCutler the next step is to remove ColumnarBatchScan and replace it with the injected columnar transitions. I have a patch for that ready to go as soon as this patch goes in.

After that, I would look at how to update the code that uses arrow vectors to communicate with Python/R/.net, etc and how best to update them to commonize as much of the code as possible.

SparkQA · 2019-06-21T17:28:27Z

Test build #106769 has finished for PR 24795 at commit 7f1753d.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tgravescs

+1, I'll wait a bit longer to commit in case anyone else wants to review.

cloud-fan · 2019-07-01T02:53:44Z

sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java

  public final int appendStruct(boolean isNull) {
    if (isNull) {
-      appendNull();
+      // This is the same as appendNull but without the assertion for struct types


why is this necessary?

Because appendNull itself has an assertion that you are not appending a struct. So any call to appendStruct with isNull true would have failed.

cloud-fan · 2019-07-01T03:45:54Z

sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala

+   * Returns the result of evaluating this expression on the entire
+   * [[org.apache.spark.sql.vectorized.ColumnarBatch]]. The result of
+   * calling this may be a single [[org.apache.spark.sql.vectorized.ColumnVector]] or a scalar
+   * value. Scalar values typically happen if they are a part of the expression i.e. col("a") + 100.


I understand that this is in test not an API, but other people may look at this test to learn how to implement columnar operator, and I feel the current example is not that good.

IIUC, the goal is:

users can write a rule to replace an arbitrary SQL operator with a custom optimized columnar version

Spark automatically insert column-to-row and row-to-column operators around the columnar operator.

For 1, I think a pretty simple approach is, take in an expression tree, compile it to a columnar processor that can execute the expression tree in a columnar fashion. We don't need to create a ColumnarExpression, which seems over complicated to me as a column processor.

The main reason it is this way is that originally I had Expression support columnar as well but as a part of the review it changed so I made minimal changes to the example to match.
I do find it simpler from a development standpoint to have a one to one mapping. I can then write unit tests with just inputs and verify that the outputs match exactly. But yes there probably are simpler ways to do this depending on the columnar library you are using.

cloud-fan · 2019-07-01T12:54:47Z

Generally looks good, but I'm wondering if it's better to use Apache Arrow as the exchange format instead of the internal ColumnarBatch. I understand that this is a developer API so stability is not a concern here. I'm just afraid people may find ColumnarBatch hard to use when doing high-perf columnar operations, but they can't change it as it's in Spark.

tgravescs · 2019-07-01T14:21:30Z

If you look at the discuss and vote threads there was discussion on this and people were worried about it since it wasn't 1.0 and could potentially change, so we wanted to separate that out and not do at this point

HyukjinKwon · 2019-07-04T10:05:24Z

sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala

    (_: Seq[Expression]) => Literal(5, IntegerType))
 }

+case class CloseableColumnBatchIterator(itr: Iterator[ColumnarBatch],


Not a big deal but for my own understanding, why is it case class?

It doesn't have to be a case class, just made it so I didn't need to add in the new, but I am happy to change it if you want me to in a follow on PR.

HyukjinKwon · 2019-07-04T10:08:13Z

sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala

+  private def adjustColumnar(plan: SparkPlan, isColumnar: Boolean): Boolean =
+    // We are walking up the plan, so columnar starts when we transition to rows
+    // and ends when we transition to columns
+  plan match {


not a biggie but indentation here looks weird.

HyukjinKwon · 2019-07-04T10:20:42Z

Looks good to me in general too.

SPARK-27945: Minimal changes to support columnar processing

40c8199

Fixed test style issue

484fb41

dongjoon-hyun reviewed Jun 5, 2019

View reviewed changes

NOTICE-binary Outdated Show resolved Hide resolved

rxin reviewed Jun 5, 2019

View reviewed changes

Addressed some initial review comments

52e31ab

Merge branch 'master' into columnar-basic

2a3287b

kiszk reviewed Jun 6, 2019

View reviewed changes

removed unneeded change

31d7248

Addressed more review comments and added new test

a388060

revans2 added 2 commits June 10, 2019 10:24

Removed columnar changes to Expression

2edcdc3

Removed reference counting from ColumnVector

ca6bb76

BryanCutler reviewed Jun 18, 2019

View reviewed changes

Addressed review comments

dc53e83

kiszk reviewed Jun 20, 2019

View reviewed changes

Addressed review comments

8443b29

kiszk reviewed Jun 20, 2019

View reviewed changes

BryanCutler reviewed Jun 20, 2019

View reviewed changes

rxin reviewed Jun 21, 2019

View reviewed changes

Addressed review comments

7f1753d

tgravescs approved these changes Jun 26, 2019

View reviewed changes

asfgit closed this in c341de8 Jun 28, 2019

revans2 mentioned this pull request Jun 28, 2019

[SPARK-28213][SQL] Replace ColumnarBatchScan with equivilant from Columnar #25008

Closed

cloud-fan reviewed Jul 1, 2019

View reviewed changes

HyukjinKwon reviewed Jul 4, 2019

View reviewed changes

[SPARK-27945][SQL] Minimal changes to support columnar processing #24795

[SPARK-27945][SQL] Minimal changes to support columnar processing #24795

Uh oh!

Conversation

revans2 commented Jun 4, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this pull request?

How was this patch tested?

Uh oh!

tgravescs commented Jun 4, 2019

Uh oh!

SparkQA commented Jun 4, 2019

Uh oh!

revans2 commented Jun 4, 2019

Uh oh!

SparkQA commented Jun 4, 2019

Uh oh!

gatorsmile commented Jun 5, 2019

Uh oh!

dongjoon-hyun commented Jun 5, 2019

Uh oh!

Uh oh!

rxin commented Jun 5, 2019

Uh oh!

rxin Jun 5, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

revans2 commented Jun 5, 2019

Uh oh!

SparkQA commented Jun 5, 2019

Uh oh!

SparkQA commented Jun 6, 2019

Uh oh!

kiszk Jun 6, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kiszk Jun 6, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SparkQA commented Jun 6, 2019

Uh oh!

rxin commented Jun 7, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

revans2 commented Jun 7, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rxin commented Jun 7, 2019

Uh oh!

revans2 commented Jun 7, 2019

Uh oh!

SparkQA commented Jun 7, 2019

Uh oh!

revans2 commented Jun 10, 2019

Uh oh!

SparkQA commented Jun 10, 2019

Uh oh!

BryanCutler left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

revans2 commented Jun 18, 2019

Uh oh!

SparkQA commented Jun 18, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

revans2 commented Jun 4, 2019 •

edited

Loading

rxin Jun 5, 2019 •

edited

Loading

kiszk Jun 6, 2019 •

edited

Loading

kiszk Jun 6, 2019 •

edited

Loading

rxin commented Jun 7, 2019 •

edited

Loading

revans2 commented Jun 7, 2019 •

edited

Loading

kiszk Jun 20, 2019 •

edited

Loading

cloud-fan Jul 1, 2019 •

edited

Loading