[SPARK-28108][SQL][test-hadoop3.2] Simplify OrcFilters #24910
| None | ||
| } else { | ||
| Some(Or(leftResultOptional.get, rightResultOptional.get)) | ||
| } |
This part can be simplified to:
for {
  lhs <- convertibleFiltersHelper(left, canPartialPushDown)
  rhs <- convertibleFiltersHelper(right, canPartialPushDown)
} yield Or(lhs, rhs)
| Some(other) | ||
| } else { | ||
| None | ||
| } |
This can be simplified to:
for (_ <- buildSearchArgument(dataTypeMap, other, newBuilder())) yield other
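Both suggestions rely on the same property of `Option` in a for-comprehension: the `yield` runs only when every generator is `Some`. The sketch below compares the verbose and concise forms on a toy filter type. The `Filter`, `Leaf` and `Or` types and all four method names here are hypothetical stand-ins, not Spark's classes, and the emptiness check in the verbose form is an assumption, since the condition is not visible in the quoted diff.

```scala
object OptionSketch {
  // Hypothetical stand-ins for the real filter classes.
  sealed trait Filter
  final case class Leaf(name: String) extends Filter
  final case class Or(left: Filter, right: Filter) extends Filter

  // Verbose form: explicit emptiness checks followed by .get (assumed condition).
  def orVerbose(l: Option[Filter], r: Option[Filter]): Option[Filter] =
    if (l.isEmpty || r.isEmpty) None else Some(Or(l.get, r.get))

  // Concise form: yields Some(Or(...)) only when both sides are defined.
  def orConcise(l: Option[Filter], r: Option[Filter]): Option[Filter] =
    for {
      lhs <- l
      rhs <- r
    } yield Or(lhs, rhs)

  // Verbose form of "keep `other` only if the check produced a value".
  def keepIfVerbose[A](check: Option[A], other: Filter): Option[Filter] =
    if (check.isDefined) Some(other) else None

  // Concise form: the checked value itself is discarded with `_`.
  def keepIfConcise[A](check: Option[A], other: Filter): Option[Filter] =
    for (_ <- check) yield other
}
```

The concise forms desugar to `flatMap`/`map` calls, so they avoid `.get` and cannot throw on an empty `Option`.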
Test build #106679 has finished for PR 24910 at commit
I think we can still benefit from some of the naming and code structure we decided on in the other PR. Will comment with 1-2 specific small suggestions tomorrow.
Test build #106684 has finished for PR 24910 at commit
I don't quite agree with this. The benefit of putting them together is that they share a similar procedure, so the code is easier to maintain and read when the two are combined. For example, when we want to support a new leaf predicate, we will always support it in both "building a convertible filter tree" and the "actual SearchArgument builder". One idea: we can separate the Then we can still remove
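The two procedures named in this comment can be illustrated with a toy model. This is a minimal sketch, not the code under review: the `Filter` hierarchy, `trim`, and `build` are hypothetical names, `build` returns a string instead of an ORC `SearchArgument`, and the handling of `And`, `Or`, and `Not` under `canPartialPushDown` is an assumption about the intended semantics.

```scala
object TwoPhaseSketch {
  // Hypothetical filter tree; `convertible` marks leaves the target format supports.
  sealed trait Filter
  final case class Leaf(name: String, convertible: Boolean) extends Filter
  final case class And(left: Filter, right: Filter) extends Filter
  final case class Or(left: Filter, right: Filter) extends Filter
  final case class Not(child: Filter) extends Filter

  // Phase 1: build a convertible filter tree, visiting each node once.
  def trim(f: Filter, canPartialPushDown: Boolean = true): Option[Filter] = f match {
    case And(l, r) =>
      (trim(l, canPartialPushDown), trim(r, canPartialPushDown)) match {
        case (Some(a), Some(b)) => Some(And(a, b))
        // Assumed: one side of a conjunction may be dropped when partial pushdown is allowed.
        case (Some(a), None) if canPartialPushDown => Some(a)
        case (None, Some(b)) if canPartialPushDown => Some(b)
        case _ => None
      }
    case Or(l, r) =>
      for {
        a <- trim(l, canPartialPushDown)
        b <- trim(r, canPartialPushDown)
      } yield Or(a, b)
    // Assumed: dropping conjuncts under a negation would change the result.
    case Not(c) => trim(c, canPartialPushDown = false).map(Not(_))
    case leaf @ Leaf(_, ok) => if (ok) Some(leaf) else None
  }

  // Phase 2: build from a tree that phase 1 already vetted, so no checks are needed.
  def build(f: Filter): String = f match {
    case And(l, r) => s"and(${build(l)}, ${build(r)})"
    case Or(l, r) => s"or(${build(l)}, ${build(r)})"
    case Not(c) => s"not(${build(c)})"
    case Leaf(n, _) => n
  }
}
```

In this model, supporting a new leaf predicate means touching both `trim` and `build`, which is the maintenance cost the comment points out; in exchange, each phase has a single return type and walks the tree once.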
Test build #106709 has finished for PR 24910 at commit
Test build #106708 has finished for PR 24910 at commit
retest this please.
Test build #106714 has finished for PR 24910 at commit
sql/core/v2.3.5/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
Test build #106802 has finished for PR 24910 at commit
retest this please.
Test build #106803 has finished for PR 24910 at commit
retest this please.
Test build #106808 has finished for PR 24910 at commit
thanks, merging to master!
What changes were proposed in this pull request?
In #24068, @IvanVergiliev fixes the issue that OrcFilters.createBuilder has exponential complexity in the height of the filter tree due to the way the check-and-build pattern is implemented.
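To see why a check-and-build pattern can be exponential in the tree height, the following simplified cost model counts node visits. It is a sketch under stated assumptions, not the `OrcFilters` code: `Filter`, `checkAndBuild`, `singlePass`, and `chain` are hypothetical names, and the model assumes that each child of a conjunction is converted once as a dry run and once for the real build.

```scala
object ComplexitySketch {
  // Hypothetical minimal filter tree.
  sealed trait Filter
  case object Leaf extends Filter
  final case class And(left: Filter, right: Filter) extends Filter

  // Check-and-build: each child is converted twice (dry run, then real build),
  // so the work doubles at every level. Returns the number of node visits.
  def checkAndBuild(f: Filter): Long = f match {
    case Leaf => 1L
    case And(l, r) =>
      val dryRun = checkAndBuild(l) + checkAndBuild(r)
      val real = checkAndBuild(l) + checkAndBuild(r)
      1L + dryRun + real
  }

  // Single pass: each node is visited exactly once.
  def singlePass(f: Filter): Long = f match {
    case Leaf => 1L
    case And(l, r) => 1L + singlePass(l) + singlePass(r)
  }

  // Left-deep chain of `And` nodes with the given height.
  def chain(height: Int): Filter =
    (1 to height).foldLeft(Leaf: Filter)((acc, _) => And(acc, Leaf))
}
```

For a left-deep chain of height h, this model gives 2^(h+2) - 3 visits for check-and-build against 2h + 1 for a single pass, so a chain of height 10 costs 4093 visits versus 21.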
Compared to the approach in #24068, I propose a simple solution for the issue:
ActionType, TrimUnconvertibleFilters and BuildSearchArgument in [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion #24068 can be dropped. The code is more readable.
How was this patch tested?
Run the benchmark provided in #24068:
Result:
Also verified with Unit tests.