[SPARK-25306][SQL] Avoid skewed filter trees to speed up createFilter in ORC
#22313
Conversation
Test build #95583 has finished for PR 22313 at commit
Could you review this PR, @gatorsmile and @cloud-fan?
```scala
  }
})

private def getOrBuildSearchArgumentWithNewBuilder(
```
Just a little question: is it possible to reuse code with https://github.com/apache/spark/pull/22313/files#diff-224b8cbedf286ecbfdd092d1e2e2f237R61?
@xuanyuanking, this already reuses `cacheExpireTimeout`.
For the cache value, `SearchArgument`, `SearchArgumentFactory`, and `Builder` are different classes (they only share the same names):

- Here, they come from `org.apache.hadoop.hive.ql.io.sarg.*`.
- There, they come from `org.apache.orc.storage.ql.io.sarg.*`.

The only exception I made is `FilterWithTypeMap`. I wanted to keep the two copies separate since it is also related to the cache key.
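For context, a minimal sketch of what a combined cache key like `FilterWithTypeMap` could look like; the field names here are assumptions, only the class name and its two-argument usage appear in this PR:

```scala
import org.apache.spark.sql.sources.Filter
import org.apache.spark.sql.types.DataType

// Hypothetical shape of the cache key: case-class equality covers both the
// pushed-down filter and the column-name-to-type map, so the same filter
// against differently-typed schemas never collides in the cache.
case class FilterWithTypeMap(filter: Filter, typeMap: Map[String, DataType])
```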
General question: Why do we use …
Thank you for the review, @kiszk. First, I don't want to hold on to the memory after query completion; if we do, it will be a regression. So, I wanted … Second, it's difficult to estimate a sufficient limit for the number of filters.
In short, …
```scala
if (cacheExpireTimeout > 0) {
  searchArgumentCache.get(FilterWithTypeMap(expression, dataTypeMap))
} else {
  buildSearchArgument(dataTypeMap, expression, SearchArgumentFactory.newBuilder())
```
When we set the timeout to zero on the cache, the loaded element can be removed immediately. Maybe we don't need to check the timeout like this, and we can simplify the code.
Ya, it's possible. But if we create a Guava loading cache and pass everything through the cache management logic in Guava, it means more overhead than this PR. In this PR, `spark.sql.orc.cache.sarg.timeout=0` means not creating the loading cache at all.
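To illustrate the point, a minimal sketch of the "skip Guava entirely when the timeout is zero" idea, assuming Guava is on the classpath; the `String` key/value types and `expensiveBuild` are placeholders, not the actual Spark code:

```scala
import java.util.concurrent.TimeUnit
import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}

object SargCacheSketch {
  // Placeholder for the expensive SearchArgument construction.
  def expensiveBuild(key: String): String = key.reverse

  val cacheExpireTimeout: Long = 10L // seconds; 0 disables caching entirely

  // Declared lazy, so the Guava cache is never constructed when the
  // timeout is zero and lookups bypass it completely.
  lazy val searchArgumentCache: LoadingCache[String, String] =
    CacheBuilder.newBuilder()
      .expireAfterAccess(cacheExpireTimeout, TimeUnit.SECONDS)
      .build(new CacheLoader[String, String] {
        override def load(key: String): String = expensiveBuild(key)
      })

  def getOrBuild(key: String): String =
    if (cacheExpireTimeout > 0) searchArgumentCache.get(key) // cached path
    else expensiveBuild(key)                                 // no cache-management overhead
}
```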
```scala
if (cacheExpireTimeout > 0) {
  // Build in a bottom-up manner
  getOrBuildSearchArgumentWithNewBuilder(dataTypeMap, newFilter)
}
```
Why do we need to cache all sub-filters? Don't we just need to cache the final conjunction?
Final conjunction? All sub-filter results will be cached in the end.
Do you know why …
This reverts commit ac06b0c.
Thank you for the review and advice, @cloud-fan. It turns out that my initial assessment was not enough. First of all, from the beginning, SPARK-2883 was designed as a recursive function like the following:

```scala
val tryLeft = buildSearchArgument(left, newBuilder)
val tryRight = buildSearchArgument(right, newBuilder)

val conjunction = for {
  _ <- tryLeft
  _ <- tryRight
  lhs <- buildSearchArgument(left, builder.startAnd())
  rhs <- buildSearchArgument(right, lhs)
} yield rhs.end()
```

However, before that, …
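To make the blow-up concrete, here is a rough, hypothetical cost model of this trial-then-build pattern (not Spark code): every `And` tries both children with a fresh builder before building them again for real, so each level of nesting roughly doubles the work:

```scala
// Toy filter tree, only to count work; not the Spark Filter hierarchy.
sealed trait F
case object Leaf extends F
case class AndF(l: F, r: F) extends F

// Each AndF visits both children twice: once for the trial with a fresh
// builder, once for the real build.
def visits(f: F): Long = f match {
  case Leaf => 1L
  case AndF(l, r) => 2 * (visits(l) + visits(r))
}

// Left-deep chain, as reduceOption(And) produces: And(And(And(f1, f2), f3), ...)
val skewed = (1 to 30).foldLeft[F](Leaf)((acc, _) => AndF(acc, Leaf))
println(visits(skewed)) // ~3 * 2^30 leaf visits for only 31 leaves

// A balanced tree over the same 31 leaves costs only about n^2 visits,
// which is why building a balanced tree fixes the issue.
```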
Test build #95637 has finished for PR 22313 at commit
```scala
val schema = new StructType(Array(StructField("a", IntegerType, nullable = true)))
val filters = (1 to 2000).map(LessThan("a", _)).toArray[Filter]
failAfter(2 seconds) {
  OrcFilters.createFilter(schema, filters)
```
This test looks tricky... It's a bad practice to assume some code will return in a certain time. Can we just add a microbenchmark for it?
Sure.

1. Something like the test code in the PR description, marked as `ignore(...)` instead of `test(...)` here?
2. Or do you want another test case in `FilterPushdownBenchmark`?
I'll choose (2), @cloud-fan.
```diff
 for {
   // Combines all convertible filters using `And` to produce a single conjunction
-  conjunction <- convertibleFilters.reduceOption(org.apache.spark.sql.sources.And)
+  conjunction <- buildTree(convertibleFilters)
```
Does Parquet have the same problem?
In Parquet, this is done as

```scala
filters
  .flatMap(ParquetFilters.createFilter(requiredSchema, _))
  .reduceOption(FilterApi.and)
```

Can we follow it?
For the first question, I don't think Parquet has the same issue, because Parquet uses `canMakeFilterOn` while ORC tries to build a full result (with a fresh builder) to check whether each filter is convertible.

For the second question:

- In ORC, we did the first half (`flatMap`) to compute `convertibleFilters`, but we can change it to `filters.filter`. I'll update it like that. Currently:

```scala
val convertibleFilters = for {
  filter <- filters
  _ <- buildSearchArgument(dataTypeMap, filter, SearchArgumentFactory.newBuilder())
} yield filter
```

- The second half, `reduceOption(FilterApi.and)`, was the original ORC code, which generated a skewed tree with exponential time complexity. We need to use `buildTree` instead; see the sketch after this list.
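For reference, a sketch of a balanced `buildTree` in the spirit of this PR; the merged implementation may differ in details:

```scala
import org.apache.spark.sql.sources.{And, Filter}

// Split the filters in half and conjoin the halves recursively, so the
// resulting tree has O(log n) depth instead of the O(n) depth produced
// by reduceOption(And).
def buildTree(filters: Seq[Filter]): Option[Filter] = filters match {
  case Seq() => None
  case Seq(filter) => Some(filter)
  case _ =>
    val (left, right) = filters.splitAt(filters.length / 2)
    for {
      l <- buildTree(left)
      r <- buildTree(right)
    } yield And(l, r)
}
```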
BTW, Parquet has another issue here due to `.reduceOption(FilterApi.and)`. When I ran a benchmark, Parquet seemed to be unable to handle 1000 filters, @cloud-fan.
```scala
withTempPath { dir =>
  val columns = (1 to width).map(i => s"id c$i")
  val df = spark.range(1).selectExpr(columns: _*)
  withTempTable("orcTable", "patquetTable") {
```
nit: a typo, patquetTable.
Oh, thanks!
Test build #95651 has finished for PR 22313 at commit
Test build #95652 has finished for PR 22313 at commit
retest this please
```diff
-} yield builder.build()
+buildTree(filters.filter(buildSearchArgument(dataTypeMap, _, newBuilder).isDefined))
+  .flatMap(buildSearchArgument(dataTypeMap, _, newBuilder))
+  .map(_.build)
```
Ah, I see what you mean now. Can we restore the previous version? That seems better. Sorry for the back and forth!
Sure. No problem, @cloud-fan. :)
Test build #95658 has finished for PR 22313 at commit
Test build #95665 has finished for PR 22313 at commit
Retest this please.
Test build #95669 has finished for PR 22313 at commit
Retest this please.
The previous failures are unrelated to this PR.
Test build #95680 has finished for PR 22313 at commit
Retest this please.
Test build #95685 has finished for PR 22313 at commit
thanks, merging to master!
@dongjoon-hyun please also update the title of the JIRA ticket, thanks!
Thank you, @cloud-fan. Sure. I'll update them.
Also, thank you for the review, @xuanyuanking, @kiszk, @viirya, @HyukjinKwon.
What changes were proposed in this pull request?

In both ORC data sources, the `createFilter` function has exponential time complexity due to its skewed filter tree generation. This PR aims to improve it by using a new `buildTree` function.

REPRODUCE
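A sketch of the reproduction, reconstructed from the regression test added in this PR; the `OrcFilters` import path is an assumption:

```scala
import org.apache.spark.sql.execution.datasources.orc.OrcFilters
import org.apache.spark.sql.sources.{Filter, LessThan}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

val schema = new StructType(Array(StructField("a", IntegerType, nullable = true)))
val filters = (1 to 2000).map(LessThan("a", _)).toArray[Filter]
// Before this PR, this call effectively never returns; after it, it completes quickly.
OrcFilters.createFilter(schema, filters)
```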
RESULT
AFTER THIS PR
How was this patch tested?
Pass the Jenkins with newly added test cases.