@gszadovszky

No description provided.

@gszadovszky gszadovszky requested review from rdblue and zivanfi July 31, 2018 17:12
@gszadovszky gszadovszky merged commit 43ac3e1 into apache:column-indexes Aug 3, 2018
zivanfi pushed a commit that referenced this pull request Oct 18, 2018
This is a squashed feature branch merge including the changes listed below. The detailed history can be found in the 'column-indexes' branch.

* PARQUET-1211: Column indexes: read/write API (#456)
* PARQUET-1212: Column indexes: Show indexes in tools (#479)
* PARQUET-1213: Column indexes: Limit index size (#480)
* PARQUET-1214: Column indexes: Truncate min/max values (#481)
* PARQUET-1364: Invalid row indexes for pages starting with nulls (#507)
* PARQUET-1310: Column indexes: Filtering (#509)
* PARQUET-1386: Fix issues of NaN and +-0.0 in case of float/double column indexes (#515)
* PARQUET-1389: Improve value skipping at page synchronization (#514)
* PARQUET-1381: Fix missing endRecord after merging columnIndex

private void repetitionLevel(int repetitionLevel) {
  repetitionLevelColumn.writeInteger(repetitionLevel);
  assert pageRowCount == 0 ? repetitionLevel == 0 : true : "Every page shall start on record boundaries";


What is the logic behind adding this verification here? I have encountered a situation where the value count is 0 but the repetition level is not 0. Is this situation itself normal? Why is this check needed after the column index change?

@zhaochengzhch, I think the message describes it. We require pages to start and end at record boundaries, so the repetition level must be 0 when the page row count is 0 (which means we are starting a page). If the repetition level is not 0 at this point, that requirement is broken, and column indexes depend on it.
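
To make the invariant concrete, here is a minimal sketch of the check in isolation. It is a toy model, not the parquet-mr writer: the class name `PageBoundarySketch`, the `flushPage()` method, and the choice to count a new row whenever the repetition level is 0 are all assumptions made for illustration. It also throws an explicit exception instead of using `assert`, because Java assertions are only evaluated when the JVM runs with `-ea`.

```java
// Toy model of the "pages start on record boundaries" rule.
// Hypothetical class for illustration; not the parquet-mr implementation.
final class PageBoundarySketch {
    // Number of records started in the current page (assumed bookkeeping).
    private int pageRowCount = 0;

    /** Accepts one repetition level and rejects a page that starts mid-record. */
    void writeRepetitionLevel(int repetitionLevel) {
        // Row count 0 means this is the first value of a new page, so it
        // must also be the first value of a record (repetition level 0).
        if (pageRowCount == 0 && repetitionLevel != 0) {
            throw new IllegalStateException("Every page shall start on record boundaries");
        }
        // Assumption of this sketch: repetition level 0 marks a new record.
        if (repetitionLevel == 0) {
            ++pageRowCount;
        }
    }

    /** Simulates closing the current page and starting an empty one. */
    void flushPage() {
        pageRowCount = 0;
    }

    int pageRowCount() {
        return pageRowCount;
    }

    public static void main(String[] args) {
        PageBoundarySketch writer = new PageBoundarySketch();
        writer.writeRepetitionLevel(0); // first record starts
        writer.writeRepetitionLevel(1); // repeated value within the first record
        writer.writeRepetitionLevel(0); // second record starts
        System.out.println(writer.pageRowCount()); // prints 2

        writer.flushPage();
        try {
            // A page break in the middle of a record would look like this.
            writer.writeRepetitionLevel(1);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model, a non-zero repetition level on an empty page means the previous page was cut in the middle of a record, which is exactly the case the assertion message describes.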
