Skip to content

BitvSet intersect_with has incorrect behavior when called with an empty set #16542

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kaseyc opened this issue Aug 16, 2014 · 10 comments
Closed

Comments

@kaseyc
Copy link
Contributor

kaseyc commented Aug 16, 2014

The intersection of a BitvSet with an empty (returned from BitvSet::new()) set is incorrect, as it will return the original set. However, if the arguments are reversed (an empty BitvSet calls intersect_with with some set as a parameter), the correct result is returned.

Additionally, if the set returned by new is modified (e.g. add and remove a value), the correct result will be returned.

Examples:
http://is.gd/Aksaz6
http://is.gd/6zTo9R

@Gankra
Copy link
Contributor

Gankra commented Aug 16, 2014

Alright, so the bug is that intersect_with delegates to an internal method other_op, which iterates over other's underlying uints to set the bits of self. So if self occupies more uints than other, it doesn't actually touch those extra bits.

other_op has no documentation, so I'm not sure if intersect_with is using it wrong, or if other_op itself is wrong. Investigating further usage now.

@kaseyc
Copy link
Contributor Author

kaseyc commented Aug 16, 2014

That makes sense. I think other_op is wrong, or to be more precise it is flawed in a way that only intersect exposes (for union and difference, leaving the higher bits unchanged is technically correct). It looks like some people went through and cleaned up bitv.rs in July, so I think the breakage happened then, as the function worked properly before.

@Gankra
Copy link
Contributor

Gankra commented Aug 16, 2014

I guess one underlying question is if a BitVSet should be assumed to be all 1's or all 0's or undefined out of its known range. My intuition says all 0's, however the current implementation of BitV seems to be "undefined". For instance:

let mut a = BitvSet::from_bitv(bitv::from_bytes([0b00000000]));
let b = BitvSet::new();
println!("{}", a == b);

prints false.

@kaseyc
Copy link
Contributor Author

kaseyc commented Aug 17, 2014

I have a feeling that that is due to the mbits of the two being different, as BitvSet just derives Eq from Bitv:

https://github.com/rust-lang/rust/blob/master/src/libcollections/bitv.rs#L866

@Gankra
Copy link
Contributor

Gankra commented Aug 17, 2014

I suppose this could be interpreted as two BitVSets with differing (implementation defined!) sizes representing subsets of different universes, and are therefore unequal, but that's... stupid?

CC @aturon: Agree that BitVSet should be changed to behave as if it was a set over all integers, and therefore internally should behave like an infinite stream of 0's outside of its known bounds?

@kaseyc
Copy link
Contributor Author

kaseyc commented Aug 17, 2014

I agree with @gankro . A BitvSet is a set implemented using Bitv's, though they do not represent the same things, and it makes more sense for a BitvSet to act as if it is all 0's outside of the bounds.

@Gankra
Copy link
Contributor

Gankra commented Aug 17, 2014

Wait yeah I'm thinking of BitVSet too much in terms of BitV's. Of course it should be a set over all integers!

Although now I'm wondering if BitV's should behave as if they were 0's forever out of sight, instead of fail horribly on OOB accesses.

Edit: Probably not. Probably.

@Gankra
Copy link
Contributor

Gankra commented Aug 17, 2014

CC @apoelstra

@apoelstra
Copy link
Contributor

Hey all,

other_op is carried over from the old implementation, so I'm surprised that it's causing new problems now. My recommendation is to drop it entirely, it did cause me a fair bit of pain inferring its behaviour, which apparently I did wrong :).

The correct behaviour is definitely that BitvSet should assume all zeros outside of the range of its underlying Bitv. This was the behaviour I intended when I did the cleanup.

@kaseyc
Copy link
Contributor Author

kaseyc commented Aug 17, 2014

@gankro You rock!

@bors bors closed this as completed in 8c9bdda Aug 18, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants