Skip to content

Unexpected performance of hash maps #785

Open
@jannisteunissen

Description

@jannisteunissen

Today I started exploring the stdlib (which I think is a great effort!) and I did a performance test of the hash map implementation. The test I did is rather simple: generate keys (with duplicates), and then add a key if it is not yet present, and remove a key if it is already present. The code is copied below. I compared performance against another Fortran hash map implementation I once made (https://github.com/jannisteunissen/ffhash), which was based on khash from https://github.com/attractivechaos/klib

With standard compile flags (-O2) and a test size of 5 million, on an Intel i7-1185G7 laptop CPU, I got the following numbers:

open_hashmap_type from stdlib, using fnv_1_hasher:
Elapsed time (s) 0.2749E+01
Entries/s 0.1819E+07

while the other code (ffhash, murmur3 hasher)

Elapsed time (s) 0.2541E+00
Entries/s 0.1968E+08

A factor 10 seems a bit much. I would expect some overhead due to the class-type interface, but not this much. Perhaps this benchmark can be useful in improving the performance.

program open_hashmap_benchmark
  use stdlib_kinds, only: dp, int8, int32
  use stdlib_hashmaps, only : open_hashmap_type
  use stdlib_hashmap_wrappers

  implicit none

  type(open_hashmap_type)     :: map
  integer, parameter          :: n_max = 5*1000*1000
  integer                     :: n
  integer(int32), allocatable :: keys(:)
  integer, allocatable        :: key_counts(:)
  real(dp)                    :: t_start, t_end
  real(dp), allocatable       :: r_uniform(:)
  type(key_type)              :: key
  logical                     :: present
  integer(int8)               :: int32_as_int8(4)

  call map%init(fnv_1_hasher, slots_bits=10)

  allocate(keys(n_max), r_uniform(n_max))

  call random_number(r_uniform)
  keys = nint(r_uniform * n_max * 0.25_dp)

  call cpu_time(t_start)
  do n = 1, n_max
     int32_as_int8 = transfer(keys(n), int32_as_int8)
     call set(key, int32_as_int8)

     call map%key_test(key, present)

     if (present) then
        call map%remove(key)
     else
        call map%map_entry(key)
     end if
  end do
  call cpu_time(t_end)

  write(*, "(A,E12.4)") "Elapsed time (s) ", t_end - t_start
  write(*, "(A,E12.4)") "Entries/s        ", n_max/(t_end - t_start)
  write(*, "(A,I12)")   "n_occupied       ", map%entries()
  write(*, "(A,I12)")   "n_buckets        ", map%num_slots()

  ! Count number of keys that occur an odd number of times
  allocate(key_counts(minval(keys):maxval(keys)))
  key_counts = 0
  do n = 1, n_max
     key_counts(keys(n)) = key_counts(keys(n)) + 1
  end do
  n = sum(iand(key_counts, 1))

  if (n /= map%entries()) then
     error stop "FAILED"
  else
     print *, "PASSED"
  end if

  ! Clean up allocated storage
  deallocate(key_counts)

end program open_hashmap_benchmarka

The benchmark I compared against is https://github.com/jannisteunissen/ffhash/blob/master/example_benchmark.f90

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions