[core] support btree global index in paimon-common #6869
d60bf1b to a2134dd
Test failures are unrelated to this PR. I will try reopening it after all review comments are addressed.
@steFaiz Can you enable Actions in your repo too? This allows for double verification.
@JingsongLi Thanks! I've enabled all workflows in my forked repo.
I think we can just introduce another format. BTree Index File:
Thanks for the inspiration! I will try to refactor the code!
JingsongLi left a comment:
Looks good to me!
* upstream/master: (51 commits)
  [test] Fix unstable test: handle MiniCluster shutdown gracefully in collect method (apache#6913)
  [python] fix ray dataset not lazy loading issue when parallelism = 1 (apache#6916)
  [core] Refactor ExternalPathProviders abstraction
  [spark] fix Merge Into unstable tests (apache#6912)
  [core] Enable Entropy Inject for data file path to prevent being throttled by object storage (apache#6832)
  [iceberg] support millisecond timestamps in iceberg compatibility mode (apache#6352)
  [spark] Handle NPE for pushdown aggregate when a datasplit has a null max/min value (apache#6611)
  [test] Fix unstable case testLimitPushDown
  [core] Refactor row id pushdown to DataEvolutionFileStoreScan
  [spark] paimon-spark supports row id push down (apache#6697)
  [spark] Support compact_database procedure (apache#6328) (apache#6910)
  [lucene] Fix row count in IndexManifestEntry
  [test] Remove unstable test: AppendTableITCase.testFlinkMemoryPool
  [core] Refactor Global index writer and reader for Btree
  [core] Minor refactor to magic number into footer
  [core] Support btree global index in paimon-common (apache#6869)
  [spark] Optimize compact for data-evolution table, commit multiple times to avoid out of memory (apache#6907)
  [rest] Add fromSnapshot to rollback (apache#6905)
  [test] Fix unstable RowTrackingTestBase test
  [core] Simplify FileStoreCommitImpl to extract some classes (apache#6904)
  ...
Purpose
This PR is part of #6834 and provides the BTree global index capability in the paimon-common module.
File Format
The introduced BTree Index File is also based on the SST File Format, with slight differences from the LookupStore File, as illustrated below:

The file is composed of:
File Entry
The key-value pair of the internal SST File is as follows:

Tests
Please see:
API and Format
None
Documentation
Will be added in the future.