alamb commented Aug 24, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

While reviewing apache/datafusion#12135 from @itsjunetime, I noticed that the docs for RowFilter and ArrowPredicate could be improved.

What changes are included in this PR?

  1. Add link from ArrowPredicate back to RowFilter
  2. Refine the description of how RowFilter works

Are there any user-facing changes?

Improved documentation. No functional change.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Aug 24, 2024
@alamb alamb requested a review from tustvold August 24, 2024 11:45

alamb commented Aug 25, 2024

Thank you for the review @tustvold

@alamb alamb added the documentation Improvements or additions to documentation label Aug 25, 2024

alamb commented Aug 25, 2024

FYI @XiangpengHao -- these are the structures related to predicate pushdown. As the other docs explain, there are some tradeoffs when evaluating these predicates (among other things, I think the same columns may be decoded multiple times if multiple predicates refer to them).

There may be some optimization opportunity in this area -- if you get to the point where you might use this in your research, please let me know and I can write up some thoughts on how we could improve things here.

@alamb alamb merged commit f73dbc3 into apache:master Aug 25, 2024
@alamb alamb deleted the alamb/docs_predicate branch August 25, 2024 11:06