Performance regression since v0.32-beta.16 for debug builds with profile overrides #1048

@hanna-kruppe

First off, thank you for wasmi and congrats on the recent 0.32 release! I recently swapped out wasmtime for wasmi 0.32-beta.16 in a certain project (not yet public) and was quite happy with it. The build got faster and smaller, the project got more portable, and wasmi's balance of startup latency + wasm execution speed worked much better for that project than wasmtime's (even with winch, which was already an improvement over cranelift). Many tests in that project compile a medium-sized wasm module and run it for a short but nontrivial amount of time, and switching to wasmi v0.32.0-beta.16 made those tests run faster.

Unfortunately and surprisingly, when I tried to update to v0.32.0-beta.18 and later to v0.32.0, I found that it got 5x to 6x slower in the configuration I care about the most: building my project in the dev/test profile but enabling optimizations for wasmi and wasmi_core (via profile overrides). I've managed to minimize it down to a 1 KiB wasm module and a fairly trivial embedding: wasmi-slow-repro.tar.gz. In that tarball:

  • The compiled wasm module is included for completeness but ought to be reproducible
  • The two host-* crates do the same thing with different wasmi versions: instantiate the guest and run its sole export
  • compare.sh builds everything and runs it through hyperfine

I would expect the performance to be the same for both wasmi versions, but in the dev profile (as exercised by the script) it differs:

Benchmark 1: host-beta16/target/debug/host
  Time (mean ± σ):      67.8 ms ±   2.9 ms    [User: 66.9 ms, System: 0.9 ms]
  Range (min … max):    65.3 ms …  76.0 ms    40 runs

Benchmark 2: host-newer/target/debug/host
  Time (mean ± σ):     382.9 ms ±  19.7 ms    [User: 381.9 ms, System: 1.0 ms]
  Range (min … max):   369.1 ms … 434.2 ms    10 runs

Summary
  host-beta16/target/debug/host ran
    5.65 ± 0.38 times faster than host-newer/target/debug/host

This is on x86_64-unknown-linux-gnu, Rust 1.77.1, Intel i7-6700K CPU. Again note that wasmi and wasmi_core are compiled with optimizations in the debug profile. Removing the opt-level = 2 lines from the respective Cargo.toml files makes both programs much slower (both take ca. 1.7s on my machine). Building them with --release instead makes them perform the same, but that's of little use to me if I can't figure out how to get the same performance without building my entire project in release mode. I've tried various tweaks to the profile overrides, without success. I've also tried profiling, but all I can see is that 99% of the time is spent in wasmi's interpreter loop.
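
For reference, the profile overrides described above look roughly like this in each host crate's Cargo.toml. This is a sketch reconstructed from the description (dev profile, opt-level = 2 for wasmi and wasmi_core only); the exact contents of the files in the tarball may differ:

```toml
# Everything else in the dev profile keeps its default settings (opt-level = 0).

# Optimize only the interpreter crates.
[profile.dev.package.wasmi]
opt-level = 2

[profile.dev.package.wasmi_core]
opt-level = 2
```

Per-package overrides like these apply only to the named packages, so any other dependency (including crates that wasmi itself depends on) is still built with the dev profile's default settings.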
