rustc -O results in larger stack frames than no-opt #39791

Closed
@yzarubin

I am rewriting some C++ code in Rust and ran into an issue where certain code, when compiled with optimizations, ends up with larger stack frames than the same code compiled without optimizations. This leads to poorer recursive performance: the optimized build overflows the stack at a shallower recursion depth.

The code in question:

use std::collections::*;
static N: usize = 1800;

fn go(i: usize, a: i64, b: i64, memo: &mut Vec<HashMap<i64, HashMap<i64, i64>>>) -> i64 {
  if i == N { return 0 }

  let ans = go(i + 1, a + 2, b - 1, memo);
  memo[i].entry(a).or_insert(HashMap::new()).insert(b, ans);

  return ans ^ a ^ b;
}

fn main() {
  let mut memo = vec![HashMap::<i64, HashMap<i64, i64>>::new(); N];
  let k = go(0, 0, 0, &mut memo);
  println!("{}", k);
}

On my machine, this code runs fine when compiled with plain rustc, but it overflows the stack when compiled with rustc -O. From my testing, the optimized build uses a stack frame twice the size of the unoptimized one. I suspect it has something to do with the HashMap usage, but I haven't dug deeper because I wanted to see whether this is a known issue first.
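One way to check the frame-size claim is to record the address of a local variable in each recursive call and look at the distance between consecutive frames. The following is only a sketch: `go_probe`, `frame_size_estimate`, `Memo`, and `DEPTH` are names made up for illustration, and the extra `addrs` argument and `marker` local change the frame slightly, so the number is an approximation of what `go` itself uses. Building it once with rustc and once with rustc -O lets the two values be compared.

```rust
use std::collections::HashMap;
use std::hint::black_box;

// Hypothetical small depth, used only for measurement (not the N from the issue).
const DEPTH: usize = 64;

type Memo = Vec<HashMap<i64, HashMap<i64, i64>>>;

// Same shape as `go`, but each call also records the address of one of its locals.
#[inline(never)]
fn go_probe(i: usize, a: i64, b: i64, memo: &mut Memo, addrs: &mut Vec<usize>) -> i64 {
    let marker = 0u8;
    // black_box keeps the optimizer from removing the local whose address we record.
    addrs.push(black_box(&marker) as *const u8 as usize);
    if i == DEPTH {
        return 0;
    }
    let ans = go_probe(i + 1, a + 2, b - 1, memo, addrs);
    memo[i].entry(a).or_insert(HashMap::new()).insert(b, ans);
    ans ^ a ^ b
}

// Distance in bytes between the marker locals of the first two frames.
fn frame_size_estimate() -> usize {
    let mut memo: Memo = vec![HashMap::new(); DEPTH];
    let mut addrs = Vec::with_capacity(DEPTH + 1);
    go_probe(0, 0, 0, &mut memo, &mut addrs);
    addrs[0].abs_diff(addrs[1])
}

fn main() {
    println!("approx. bytes per frame: {}", frame_size_estimate());
}
```

The absolute value depends on the compiler version and target, so only the ratio between the two builds is meaningful here.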

Another comment I'd like to make is that both the -O and no-opt variants seem to perform much worse than the equivalent C++ compiled with clang.

The equivalent C++11 program:

#include <cstdio>
#include <vector>
#include <unordered_map>

using namespace std;

typedef long long ll;
ll N = 100000;

ll go(ll i, ll a, ll b, vector<unordered_map<ll, unordered_map<ll, ll>>> &memo) {
  if (i == N) return 0;

  auto ans = go(i + 1, a + 2, b - 1, memo);
  memo[i][a][b] = ans;

  return ans ^ a ^ b;
}

int main () {
  vector<unordered_map<ll, unordered_map<ll, ll>>> memo(N);
  auto k = go(0, 0, 0, memo);
  printf("%lld\n", k);
  return 0;
}

When compiled on my machine (Darwin 14.5.0) with g++ -std=c++11 -O3, it works for N up to 100000, which is roughly 55x the N = 1800 used in the Rust program above.
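Independent of the frame-size question, the Rust version can be run at the same depth as the C++ one by moving the recursion onto a thread with an explicitly sized stack, using `std::thread::Builder::stack_size`. This is a workaround sketch, not a fix for the issue: `run_with_big_stack` is a hypothetical helper, `go` takes the depth `n` as an extra parameter instead of reading the static `N`, and the 512 MiB figure is an assumption that may need adjusting for a given compiler version and optimization level.

```rust
use std::collections::HashMap;
use std::thread;

type Memo = Vec<HashMap<i64, HashMap<i64, i64>>>;

// Same recursion as in the issue, with the depth passed in as `n` (assumption).
fn go(i: usize, n: usize, a: i64, b: i64, memo: &mut Memo) -> i64 {
    if i == n {
        return 0;
    }
    let ans = go(i + 1, n, a + 2, b - 1, memo);
    memo[i].entry(a).or_insert(HashMap::new()).insert(b, ans);
    ans ^ a ^ b
}

// Runs the recursion on a worker thread whose stack is `stack_bytes` large.
fn run_with_big_stack(n: usize, stack_bytes: usize) -> i64 {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || {
            let mut memo: Memo = vec![HashMap::new(); n];
            go(0, n, 0, 0, &mut memo)
        })
        .expect("failed to spawn worker thread")
        .join()
        .expect("worker thread panicked")
}

fn main() {
    // 512 MiB is an assumed, deliberately generous stack size.
    let k = run_with_big_stack(100_000, 512 * 1024 * 1024);
    println!("{}", k);
}
```

The stack is reserved as virtual memory, so the large size mostly costs address space rather than resident memory. It does not change how much stack each call to `go` consumes.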

rustc --version --verbose
rustc 1.15.1 (021bd294c 2017-02-08)
binary: rustc
commit-hash: 021bd294c039bd54aa5c4aa85bcdffb0d24bc892
commit-date: 2017-02-08
host: x86_64-apple-darwin
release: 1.15.1
LLVM version: 3.9

g++ --version
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin14.5.0

Labels

E-easy: Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue.
I-slow: Issue: Problems and improvements with respect to performance of generated code.
