Skip to content

[std]: Use XXHash64 instead FNV-1a #1581

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 12 commits into from
Closed

[std]: Use XXHash64 instead FNV-1a #1581

wants to merge 12 commits into from

Conversation

MaxGraey
Copy link
Member

@MaxGraey MaxGraey commented Dec 14, 2020

This experiment with XXHash 64-bit hashes

See 32-bit here: #1580

FNV-1a:

  • Poor uniform distribution. Only up to 14 bits can be used for the hash table (up to 2 ^ 14 hash table slots) without degradation
  • Bad avalanche effect
  • Speed: 4.5 cyclers per byte

XXHash:

  • Excellent uniform distribution

  • Near-perfect avalanche effect

  • Speed: ~1.7 cyclers per byte

  • I've read the contributing guidelines

@MaxGraey MaxGraey requested a review from dcodeIO December 21, 2020 21:20
@@ -58,7 +58,7 @@ export class Map<K,V> {

// buckets referencing their respective first entry, usize[bucketsMask + 1]
private buckets: ArrayBuffer = new ArrayBuffer(INITIAL_CAPACITY * <i32>BUCKET_SIZE);
private bucketsMask: u32 = INITIAL_CAPACITY - 1;
private bucketsMask: u64 = INITIAL_CAPACITY - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused by the u32/u64 changes in Map and Set. For instance, u64 isn't really necessary here given that ArrayBuffer capacity is capped, but also may make sense given that hashes are 64-bit now, not sure. But then, further down below, we still have rehash(newBucketsMask: u32) or halfBucketsMask = <u32>(this.bucketsMask >> 1). Now, just using usize as a workaround also doesn't seem quite right, hmm. A strange mix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tied just truncate u64 result to u32 inside hashes but it looks less optimal with more u64 -> u32 conversions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes me a bit in favor of the 32-bit variant for now. Would you agree with a plan like, let's use 32-bit now, and once we have Memory64 support, revisit 64-bit? Iirc that's about what you suggested earlier as well (sorry).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I'm ok with that. Will sync XXH32

v = (v ^ <u32>load<u8>(changetype<usize>(key) + i)) * FNV_PRIME;
// @ts-ignore: decorator
@inline
function hashStr(key: string): u64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a reference for the code in this function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

github-actions bot commented Dec 6, 2023

This PR has been automatically marked as stale because it has not had recent activity. It will be closed in one week if no further activity occurs. Thank you for your contributions!

@github-actions github-actions bot added the stale label Dec 6, 2023
Copy link

This PR has been automatically closed due to lack of recent activity, but feel free to reopen it as long as you merge in the main branch afterwards.

@github-actions github-actions bot closed this Dec 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants