-
Notifications
You must be signed in to change notification settings - Fork 20
Description
As promised in golang/go#54766 (comment), some issues I discovered with this design while creating the Go runtime version.
As a reminder, the language spec states: "The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If a map entry that has not yet been reached is removed during iteration, the corresponding iteration value will not be produced. If a map entry is created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0."
For what it's worth, your All just states "The map can be mutated during iteration, though there is no guarantee that the mutations will be visible to the iteration." which is less restrictive as written, but for the purposes of this issue I am assuming the spec definition.
Entries deleted after grow may be produced
Suppose:
- Start iteration
- Add new entries to the map, enough to trigger the map to grow.
- Delete an entry that existed prior to iteration start.
- Continue iteration; deleted entry may be produced.
Per the spec above, deleted entries must not be produced during iteration.
This occurs because iterations snapshots the groups in the current bucket during iteration.
Snapshotting makes handling ordering across resize/split simple. Resize will replace the groups, and the entries will be shuffled in the new groups. By continuing to iterate over the old groups, we don't need to worry about the shuffling. This takes advantage of "If a map entry is created during iteration, that entry may be produced during the iteration or may be skipped." because new entries will go into the new groups and iteration will skip them entirely.
But deletion will also target only the new groups, so iteration will still see the deleted entries.
In the runtime, iteration still snapshots the groups. I handle this problem by making iteration detect if the bucket (called a table in the runtime) has grown, and if so we consult the grown table to see if the entry still exists before returning the entry. Things get complicated for NaNs, but otherwise it is straightforward.
Entries updated after grow may be stale
Same issue as above, except instead of deleting an entry after grow, you change its value. Because iteration is using the old groups snapshot it won't see the new value and will return the old one instead.
The fix is the same as above, but it is important to return the value from the new groups, not just check existence.
While writing this bug report I realized that the spec doesn't explicitly guarantee that iteration won't produce stale entries. I think that is an oversight in the spec and I'll look into fixing it.
Rehash in place shuffles groups
Resize and split allocate new groups, so iteration snapshots work fine. However, rehash in place uses the same groups (hence "in place"), reordering the entries to eliminate tombstones.
If this occurs in the middle of iteration on a bucket, the entry order will be completely shuffled and iteration may miss entries or produce duplicate entries.
I did not find a good solution to this problem and eliminated rehashing in place from the runtime implementation. An alternative would be "same size resize", which allocates new groups of the same size, copying the entries over to eliminate tombstones. I have not done this in the runtime, but it is a potential follow-up.