@albertlockett
Contributor

Which issue does this PR close?

Rationale for this change

A regression was reported in issue #8404 which was introduced in #7585. This PR resolves the issue.

What changes are included in this PR?

The root cause was that ByteArrayDictionaryReader returns a new zero-length array of values if the record reader has already been consumed, but the repetition and definition level buffers were not being advanced in this early-return case.

    fn consume_batch(&mut self) -> Result<ArrayRef> {
        if self.record_reader.num_values() == 0 {
            // once the record_reader has been consumed, we've replaced its values with the default
            // variant of DictionaryBuffer (Offset). If `consume_batch` then gets called again, we
            // avoid using the wrong variant of the buffer by returning empty array.
            return Ok(new_empty_array(&self.data_type));
        }
        let buffer = self.record_reader.consume_record_data();
        let null_buffer = self.record_reader.consume_bitmap_buffer();
        let array = buffer.into_array(null_buffer, &self.data_type)?;
        self.def_levels_buffer = self.record_reader.consume_def_levels();
        self.rep_levels_buffer = self.record_reader.consume_rep_levels();
        self.record_reader.reset();
        Ok(array)
    }

The StructArrayReader reads the repetition and definition levels from the first child to determine the nullability of the struct array. When we returned the empty values buffer for the child without advancing the repetition and definition level buffers, the StructArrayReader saw a length mismatch between the empty child array and the non-empty nullability bitmask, which produced the error.

    if self.nullable {
        // calculate struct def level data
        // children should have consistent view of parent, only need to inspect first child
        let def_levels = self.children[0]
            .get_def_levels()
            .expect("child with nullable parents must have definition level");
        // calculate bitmap for current array
        let mut bitmap_builder = BooleanBufferBuilder::new(children_array_len);
        match self.children[0].get_rep_levels() {
            Some(rep_levels) => {
                // Sanity check
                assert_eq!(rep_levels.len(), def_levels.len());
                for (rep_level, def_level) in rep_levels.iter().zip(def_levels) {
                    if rep_level > &self.struct_rep_level {
                        // Already handled by inner list - SKIP
                        continue;
                    }
                    bitmap_builder.append(*def_level >= self.struct_def_level)
                }
            }
            None => {
                for def_level in def_levels {
                    bitmap_builder.append(*def_level >= self.struct_def_level)
                }
            }
        }
        if bitmap_builder.len() != children_array_len {
            return Err(general_err!("Failed to decode level data for struct array"));
        }
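
To make the length check concrete, here is a minimal standalone sketch of the same validity computation. The function `struct_validity`, its plain `Vec<bool>` bitmap, and the `String` error are hypothetical stand-ins for illustration, not the arrow-rs API.

```rust
/// Hypothetical stand-in for the struct reader's validity computation.
/// Uses a Vec<bool> in place of BooleanBufferBuilder and a String error
/// in place of ParquetError.
fn struct_validity(
    def_levels: &[i16],
    rep_levels: Option<&[i16]>,
    struct_def_level: i16,
    struct_rep_level: i16,
    children_array_len: usize,
) -> Result<Vec<bool>, String> {
    let mut validity = Vec::with_capacity(children_array_len);
    match rep_levels {
        Some(rep_levels) => {
            assert_eq!(rep_levels.len(), def_levels.len());
            for (rep, def) in rep_levels.iter().zip(def_levels) {
                // Entries belonging to an inner list are skipped.
                if *rep > struct_rep_level {
                    continue;
                }
                validity.push(*def >= struct_def_level);
            }
        }
        None => {
            for def in def_levels {
                validity.push(*def >= struct_def_level);
            }
        }
    }
    // One validity bit is expected per child array slot.
    if validity.len() != children_array_len {
        return Err("Failed to decode level data for struct array".to_string());
    }
    Ok(validity)
}

fn main() {
    // Levels and child array agree: three entries, three slots.
    let ok = struct_validity(&[1, 0, 1], None, 1, 0, 3);
    assert_eq!(ok, Ok(vec![true, false, true]));

    // Stale levels from an earlier batch paired with an empty child array.
    let err = struct_validity(&[1, 0, 1], None, 1, 0, 0);
    assert!(err.is_err());
}
```

The second call in `main` mirrors the failing situation described above: the levels still describe three entries while the child array is empty, so the length check fails.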

The fix is simple: always have ByteArrayDictionaryReader advance the repetition and definition level buffers when consume_batch is called, including in the early-return case.
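
The effect of the fix can be sketched with a simplified model. `MockRecordReader`, `MockDictReader`, and the `advance_levels_on_early_return` flag are hypothetical names for illustration only. The real reader also tracks repetition levels and returns `Option`-wrapped buffers, which this sketch omits.

```rust
/// Hypothetical, simplified record reader: values plus definition levels.
struct MockRecordReader {
    values: Vec<i32>,
    def_levels: Vec<i16>,
}

impl MockRecordReader {
    fn num_values(&self) -> usize {
        self.values.len()
    }
    fn consume_values(&mut self) -> Vec<i32> {
        std::mem::take(&mut self.values)
    }
    fn consume_def_levels(&mut self) -> Vec<i16> {
        std::mem::take(&mut self.def_levels)
    }
}

/// Hypothetical model of the dictionary reader's batch handling.
struct MockDictReader {
    record_reader: MockRecordReader,
    def_levels_buffer: Vec<i16>,
    /// false models the behaviour before the fix, true the behaviour after.
    advance_levels_on_early_return: bool,
}

impl MockDictReader {
    fn consume_batch(&mut self) -> Vec<i32> {
        if self.record_reader.num_values() == 0 {
            if self.advance_levels_on_early_return {
                // Refresh the level buffer even though no values are returned.
                self.def_levels_buffer = self.record_reader.consume_def_levels();
            }
            return Vec::new();
        }
        let values = self.record_reader.consume_values();
        self.def_levels_buffer = self.record_reader.consume_def_levels();
        values
    }

    fn get_def_levels(&self) -> &[i16] {
        &self.def_levels_buffer
    }
}

/// Returns (values returned, def levels visible) after a second consume_batch call.
fn levels_after_second_call(fixed: bool) -> (usize, usize) {
    let mut reader = MockDictReader {
        record_reader: MockRecordReader {
            values: vec![10, 20, 30],
            def_levels: vec![1, 1, 1],
        },
        def_levels_buffer: Vec::new(),
        advance_levels_on_early_return: fixed,
    };
    let first = reader.consume_batch();
    assert_eq!(first.len(), reader.get_def_levels().len());
    let second = reader.consume_batch();
    (second.len(), reader.get_def_levels().len())
}

fn main() {
    // Without the fix: empty array, but three stale levels remain visible.
    assert_eq!(levels_after_second_call(false), (0, 3));
    // With the fix: empty array and empty levels, so the lengths agree.
    assert_eq!(levels_after_second_call(true), (0, 0));
}
```

In this model the parent only sees consistent lengths on the second call when the level buffer is refreshed in the early-return path.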

Are these changes tested?

Yes, a new unit test, test_read_nullable_structs_with_binary_dict_as_first_child_column, was added. Before the changes introduced in this PR, it reproduced the issue.

Are there any user-facing changes?

No


self.def_levels_buffer = self.record_reader.consume_def_levels();
self.rep_levels_buffer = self.record_reader.consume_rep_levels();
self.record_reader.reset();

Thanks for the quick action on this.
During my debugging of the bug, I also came to the conclusion that advancing the rep and def level buffers would make sense, but I am lacking familiarity with these internals.

The docs for reset state that this should be called after consuming data. This isn't called in the shortcut case now. I am lacking a bit of context here, and the docs of GenericRecordReader aren't helping.

During my debug session, in the error case, num_values() and num_records() were both 0 already at that point. So resetting isn't doing much.
But I am wondering if we can get to a state where num_records > num_values.

Contributor


Perhaps reset could also be called before the early exit.

}

#[test]
fn test_read_nullable_structs_with_binary_dict_as_first_child_column() {

I verified that this test fails without the fix in parquet/src/arrow/array_reader/byte_array_dictionary.rs

Contributor

@alamb alamb left a comment


Thanks @albertlockett 🙏

@valkum can you please verify that this fix fixes the example file you have on
#8404 (comment)

Update: I verified

The test on #8404 passes with this PR:

warning: `parquet` (lib) generated 10 warnings (run `cargo fix --lib -p parquet` to apply 4 suggestions)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/rust_playground`
Read 1 rows

And fails on main:


thread 'main' panicked at src/main.rs:18:27:
called `Result::unwrap()` on an `Err` value: ParquetError("Parquet error: Failed to decode level data for struct array")
stack backtrace:
   0: __rustc::rust_begin_unwind
             at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:697:5

@alamb alamb requested a review from tustvold October 8, 2025 21:09
Contributor

@alamb alamb left a comment


This looks reasonable to me -- thank you @albertlockett and @valkum

I am not super familiar with this code, so maybe @tustvold could give it a look if he has some time.


valkum commented Oct 9, 2025

Just tested this PR with a larger file that caused this error. Works with this fix 👍

Contributor

alamb commented Oct 9, 2025

I'll plan to merge this PR in the next day or two unless anyone else would like a chance to review

Comment on lines +169 to +170
self.def_levels_buffer = self.record_reader.consume_def_levels();
self.rep_levels_buffer = self.record_reader.consume_rep_levels();
Contributor


I'm still trying to understand the issue and the fix. Is the following reasoning correct?

When struct array calls consume_batch on its children, the level buffers are set to the levels for the current batch. A subsequent call to consume_batch sees that num values is 0 and returns an empty array...but the level buffers still contain the levels from the already consumed batch. So this fix here is to ensure that on a subsequent call, the level buffers will be set to the empty ones from the consume_xxx_levels calls?

Contributor Author


Hey @etseidl exactly, that is the correct reasoning

Contributor

@etseidl etseidl left a comment


Thanks @albertlockett!

Contributor

etseidl commented Oct 14, 2025

Thanks @albertlockett @valkum @alamb!

@etseidl etseidl merged commit 77ca6dc into apache:main Oct 14, 2025
16 checks passed

Labels

parquet Changes to the parquet crate

Development

Successfully merging this pull request may close these issues.

Column with List(Struct) causes failed to decode level data for struct array (regression in 56)

4 participants