-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-14019] [SQL] Remove noop SortOrder in Sort #11840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| * Removes no-op SortOrder from Sort | ||
| */ | ||
| object PruneSorts extends Rule[LogicalPlan] { | ||
| def apply(plan: LogicalPlan): LogicalPlan = plan transform { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just test whether the expressions are constant foldable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we can do it. Thanks!
|
Can you change the title to SPARK instead of Spark? |
|
I will let @cloud-fan review and merge this. |
|
Test build #53605 has finished for PR 11840 at commit
|
|
Test build #53608 has finished for PR 11840 at commit
|
|
Test build #53610 has finished for PR 11840 at commit
|
|
Thank you! |
|
LGTM, thanks, merging to master! |
#### What changes were proposed in this pull request?
This PR is to add a new Optimizer rule for pruning Sort if its SortOrder is no-op. In the phase of **Optimizer**, if a specific `SortOrder` does not have any reference, it has no effect on the sorting results. If `Sort` is empty, remove the whole `Sort`.
For example, in the following SQL query
```SQL
SELECT * FROM t ORDER BY NULL + 5
```
Before the fix, the plan is like
```
== Analyzed Logical Plan ==
a: int, b: int
Sort [(cast(null as int) + 5) ASC], true
+- Project [a#92,b#93]
+- SubqueryAlias t
+- Project [_1#89 AS a#92,_2#90 AS b#93]
+- LocalRelation [_1#89,_2#90], [[1,2],[1,2]]
== Optimized Logical Plan ==
Sort [null ASC], true
+- LocalRelation [a#92,b#93], [[1,2],[1,2]]
== Physical Plan ==
WholeStageCodegen
: +- Sort [null ASC], true, 0
: +- INPUT
+- Exchange rangepartitioning(null ASC, 5), None
+- LocalTableScan [a#92,b#93], [[1,2],[1,2]]
```
After the fix, the plan is like
```
== Analyzed Logical Plan ==
a: int, b: int
Sort [(cast(null as int) + 5) ASC], true
+- Project [a#92,b#93]
+- SubqueryAlias t
+- Project [_1#89 AS a#92,_2#90 AS b#93]
+- LocalRelation [_1#89,_2#90], [[1,2],[1,2]]
== Optimized Logical Plan ==
LocalRelation [a#92,b#93], [[1,2],[1,2]]
== Physical Plan ==
LocalTableScan [a#92,b#93], [[1,2],[1,2]]
```
cc rxin cloud-fan marmbrus Thanks!
#### How was this patch tested?
Added a test suite for covering this rule
Author: gatorsmile <[email protected]>
Closes apache#11840 from gatorsmile/sortElimination.
What changes were proposed in this pull request?
This PR is to add a new Optimizer rule for pruning Sort if its SortOrder is no-op. In the phase of Optimizer, if a specific
SortOrderdoes not have any reference, it has no effect on the sorting results. IfSortis empty, remove the wholeSort.For example, in the following SQL query
Before the fix, the plan is like
After the fix, the plan is like
cc @rxin @cloud-fan @marmbrus Thanks!
How was this patch tested?
Added a test suite for covering this rule