Describe the bug
Collecting an iterator into a BooleanArray produces unintuitive results if the upper bound reported by the iterator's size_hint overestimates the number of items it actually yields. At least, that is my understanding after looking at the code.
Is this intended behavior? If not, I could try to come up with a fix.
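To make "over estimation" concrete, here is a minimal std-only sketch (no Arrow involved) of what the filtered iterator from the reproduction below reports. `Iterator::filter` cannot know in advance how many items will pass, so it keeps the upper bound of the underlying iterator. The function names `filtered_size_hint` and `filtered_count` are made up for this illustration.

```rust
/// Returns the size hint of the filtered iterator used in the reproduction.
fn filtered_size_hint() -> (usize, Option<usize>) {
    let values = vec![Some(true), None, Some(true), Some(false)];
    // `filter` keeps the inner upper bound and resets the lower bound to 0.
    values.into_iter().filter(Option::is_some).size_hint()
}

/// Returns how many items the same iterator actually yields.
fn filtered_count() -> usize {
    let values = vec![Some(true), None, Some(true), Some(false)];
    values.into_iter().filter(Option::is_some).count()
}

fn main() {
    let (lower, upper) = filtered_size_hint();
    println!("size_hint = ({lower}, {upper:?}), yielded = {}", filtered_count());
}
```

The upper bound is 4 while only 3 items are yielded, which matches the one extra trailing null in the result.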
To Reproduce
Tested with Arrow v56.2.0 (via DataFusion 50)
The following test reproduces this:
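use arrow::array::BooleanArray;
use insta::assert_debug_snapshot;

#[test]
fn test_boolean_array_from() {
    // The filter drops the `None`, so only three items reach `collect`.
    let values = vec![Some(true), None, Some(true), Some(false)]
        .into_iter()
        .filter(Option::is_some)
        .collect::<BooleanArray>();
    // Actual (surprising) result: a fourth, null entry is appended.
    assert_debug_snapshot!(values, @r"
    BooleanArray
    [
        true,
        true,
        false,
        null,
    ]
    ");
}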
Expected behavior
I'd have expected the following Array (without the null):
BooleanArray
[
    true,
    true,
    false,
]
Additional context
The result of the "same" operation on an Int64Array:
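use arrow::array::Int64Array;
use insta::assert_debug_snapshot;

#[test]
fn test_int64_array_from() {
    // Same pipeline as above, but collecting into an Int64Array.
    let values = vec![Some(1), None, Some(2), Some(3)]
        .into_iter()
        .filter(Option::is_some)
        .collect::<Int64Array>();
    // No trailing null here: the array has exactly three entries.
    assert_debug_snapshot!(values, @r"
    PrimitiveArray<Int64>
    [
        1,
        2,
        3,
    ]
    ");
}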
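For illustration only, here is a std-only sketch of how a collector could end up with the trailing null seen in the BooleanArray case. This is not Arrow's code: `collect_with_upper_bound` and `demo` are hypothetical names, and the assumption (which I have not confirmed beyond reading the code) is that the length is taken from the size hint's upper bound rather than from the number of items actually yielded.

```rust
/// Hypothetical collector (not Arrow's implementation): sizes the output
/// from the iterator's upper bound instead of counting yielded items.
fn collect_with_upper_bound<I>(iter: I) -> Vec<Option<bool>>
where
    I: Iterator<Item = Option<bool>>,
{
    let (_, upper) = iter.size_hint();
    let len = upper.expect("iterator must report an upper bound");
    // Every slot starts out as null (`None`).
    let mut slots = vec![None; len];
    // Fill slots in order; `zip` stops once the iterator is exhausted.
    for (slot, item) in slots.iter_mut().zip(iter) {
        *slot = item;
    }
    // No truncation to the number of items seen, so unused slots stay null.
    slots
}

/// Runs the hypothetical collector on the input from the reproduction.
fn demo() -> Vec<Option<bool>> {
    let values = vec![Some(true), None, Some(true), Some(false)];
    collect_with_upper_bound(values.into_iter().filter(Option::is_some))
}

fn main() {
    println!("{:?}", demo());
}
```

Under that assumption, truncating the result to the number of items actually consumed would give the expected three-element array, which is what the Int64Array path appears to do.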