rustc -O results in larger stack frames than no-opt #39791
Comments
@yzarubin The second one is just a difference in what the built-in HashMap does on top of what C++ unordered_map does, which has been discussed several times before. You might want to see some previous discussions here https://users.rust-lang.org/t/hashmap-performance/6476 and the FAQ here https://www.rust-lang.org/en-US/faq.html#why-are-rusts-hashmaps-slow.
@RReverser Thanks for the comment. I was aware that Rust's HashMap was slower than unordered_map, but why is there such a difference in stack space? I'd expect that Rust code to work up to |
@yzarubin Sure, that's why I said "the second one" only :)
C++'s hash function for integers is trivial - I think it may be a no-op? SipHash is very much nontrivial and if it's inlined you'd see some amount of increased stack use. If |
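To make the hashing cost concrete, here is a minimal sketch of a `HashMap` that skips SipHash for integer keys. `IdentityHasher`, `IdentityMap`, and `build_squares` are hypothetical names made up for this illustration; only `Hasher` and `BuildHasherDefault` come from the standard library. It gives up the HashDoS resistance of the default hasher, so treat it as an experiment rather than a recommendation.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical pass-through hasher: a u64 key is used as its own hash,
/// roughly what a trivial integer hash would do.
#[derive(Default)]
struct IdentityHasher(u64);

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    // Fallback for keys that are not fed in as a single u64.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = self.0.rotate_left(8) ^ u64::from(b);
        }
    }

    // u64 keys land here, so no mixing work is done at all.
    fn write_u64(&mut self, n: u64) {
        self.0 = n;
    }
}

type IdentityMap = HashMap<u64, u64, BuildHasherDefault<IdentityHasher>>;

/// Fills a map with i -> i * i for i in 0..n.
fn build_squares(n: u64) -> IdentityMap {
    let mut map = IdentityMap::default();
    for i in 0..n {
        map.insert(i, i * i);
    }
    map
}

fn main() {
    let map = build_squares(1000);
    println!("{:?}", map.get(&12));
}
```

Swapping the hasher changes how much code can be inlined into the caller, which is the part relevant to stack use here; whether it actually shrinks the frame in the reported program is not something this sketch demonstrates.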
@sfackler I'm not sure inlining should increase stack usage; if anything it should decrease it, since at the very least return addresses can be eliminated from the stack, and ideally some local variables can be kept in registers too. So it appears to me that this is a bug.
In addition, changing HashMap to BTreeMap had little to no impact on performance.
@RReverser if the functions are not inlined, they are not consuming stack at the point the |
@sfackler D'oh, I didn't actually notice it recurses. Fair point.
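The point about recursion can be sketched as follows. A helper with a large local buffer is either inlined into a recursive function, so its buffer becomes part of every recursive frame, or kept out of line, so its frame is popped before the recursive call happens. All names here are hypothetical, the 512-byte buffer is an arbitrary stand-in for a hasher's state, and the printed sizes depend on the compiler version and optimization level, so no particular numbers should be expected.

```rust
use std::hint::black_box;

// Hypothetical helper with a 512-byte scratch buffer.
#[inline(always)]
fn scratch_inlined(x: u64) -> u64 {
    let mut buf = [0u64; 64];
    for (i, slot) in buf.iter_mut().enumerate() {
        *slot = x.wrapping_add(i as u64);
    }
    // black_box keeps the optimizer from deleting the buffer.
    let buf = black_box(buf);
    buf[(x % 64) as usize]
}

// Same work, but forced to live in its own stack frame.
#[inline(never)]
fn scratch_outlined(x: u64) -> u64 {
    scratch_inlined(x)
}

// Records the address of a local in each frame so frame spacing can be estimated.
fn recurse_inlined(depth: u64, addrs: &mut Vec<usize>) -> u64 {
    let marker = 0u8;
    addrs.push(black_box(&marker) as *const u8 as usize);
    let v = scratch_inlined(depth);
    if depth == 0 { v } else { v + recurse_inlined(depth - 1, addrs) }
}

fn recurse_outlined(depth: u64, addrs: &mut Vec<usize>) -> u64 {
    let marker = 0u8;
    addrs.push(black_box(&marker) as *const u8 as usize);
    let v = scratch_outlined(depth);
    if depth == 0 { v } else { v + recurse_outlined(depth - 1, addrs) }
}

// Distance between the markers of the two outermost frames.
fn bytes_per_frame(addrs: &[usize]) -> usize {
    addrs[0].abs_diff(addrs[1])
}

fn main() {
    let (mut a, mut b) = (Vec::new(), Vec::new());
    let x = recurse_inlined(10, &mut a);
    let y = recurse_outlined(10, &mut b);
    assert_eq!(x, y);
    println!("inlined helper:  ~{} bytes per frame", bytes_per_frame(&a));
    println!("outlined helper: ~{} bytes per frame", bytes_per_frame(&b));
}
```

If the inlined variant reports the larger spacing, that matches the explanation above: the helper's locals are paid for once per recursion level instead of once at a time.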
Looks like some change in 1.17 made |
I'm marking as E-easy since the initial patch to file a PR should involve adding the two annotations to this method here: https://github.com/rust-lang/rust/blob/master/src/libstd/collections/hash/map.rs#L757.
I'll take this on, if that's alright.
I've sent a PR for the changes.
Add annotations to the resize fn #39791 This adds the `inline(never)` and `cold` annotations to the HashMap::resize function.
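For readers unfamiliar with the two attributes, here is a minimal sketch of the pattern on a toy container. `TinyStack` and its `grow` method are hypothetical and are not the standard library's `HashMap::resize`; the sketch only shows where `#[inline(never)]` and `#[cold]` go and why: the rarely taken growth path stays in its own function, so its locals are not merged into the frame of the hot caller.

```rust
/// Hypothetical growable stack used only to illustrate the attributes.
struct TinyStack {
    items: Vec<u32>,
    cap: usize,
}

impl TinyStack {
    fn new() -> Self {
        TinyStack { items: Vec::new(), cap: 0 }
    }

    /// Hot path: small, and a reasonable candidate for inlining.
    #[inline]
    fn push(&mut self, value: u32) {
        if self.items.len() == self.cap {
            self.grow();
        }
        self.items.push(value);
    }

    /// Rare path: kept out of line and hinted as unlikely to be called.
    #[inline(never)]
    #[cold]
    fn grow(&mut self) {
        let new_cap = if self.cap == 0 { 4 } else { self.cap * 2 };
        self.items.reserve_exact(new_cap - self.items.len());
        self.cap = new_cap;
    }
}

fn main() {
    let mut stack = TinyStack::new();
    for i in 0..10 {
        stack.push(i);
    }
    println!("len = {}, cap = {}", stack.items.len(), stack.cap);
}
```

Both attributes are hints to the compiler about code generation; they do not change what the program computes.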
@Mark-Simulacrum This should have been closed by #43093.
I am rewriting some C++ code in Rust, and ran into an issue where certain code compiled with optimizations actually results in larger stack frames than the same code compiled without optimizations, which limits how deep it can recurse.
The code in question:
On my machine, this code runs fine when compiled with `rustc`, but results in a stack overflow when compiled with `rustc -O`. From my testing, optimizations result in a stack frame twice the size of no-opt. I suspect it has something to do with the hashmap usage, but I haven't dug deeper as I wanted to see if this is a known issue first.
Another comment I'd like to make is that both the -O and no-opt variants seem to perform much worse than the equivalent C++ compiled with clang.
The equivalent C++11 program:
When compiled on my machine (Darwin 14.5.0) with `g++ -std=c++11 -O3`, it works for `N` up to 100000, which is almost 100x better than Rust.