Description
I was playing with http://rust.godbolt.org/ and noticed something odd: Rust (on stable, beta and nightly alike) seems to prevent loop unrolling and/or constant propagation optimizations.
Here is the original minimal code with which I can reproduce the issue:
pub fn g0() -> bool {
    vec![1,2,3].contains(&2)
}
So it basically creates a vector out of constant values with constant length & capacity, and then searches for a constant value within it, which is a perfect case for constant propagation. And yet the generated assembly looks like:
example::g0:
        push rbp
        mov rbp, rsp
        push rbx
        push rax
        mov edi, 12
        mov esi, 4
        call __rust_allocate@PLT
        test rax, rax
        je .LBB0_6
        movabs rcx, 8589934593
        mov qword ptr [rax], rcx
        mov dword ptr [rax + 8], 3
        xor ecx, ecx
.LBB0_2:
        cmp rcx, 12
        je .LBB0_3
        mov bl, 1
        cmp dword ptr [rax + rcx], 2
        lea rcx, [rcx + 4]
        jne .LBB0_2
        jmp .LBB0_5
.LBB0_3:
        xor ebx, ebx
.LBB0_5:
        mov esi, 12
        mov edx, 4
        mov rdi, rax
        call __rust_deallocate@PLT
        mov eax, ebx
        add rsp, 8
        pop rbx
        pop rbp
        ret
.LBB0_6:
        call alloc::oom::oom@PLT
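Read back into Rust, the assembly above does roughly the following. This is only a hand-written interpretation of the listing, not compiler output, and the name g0_as_compiled is made up for illustration:

```rust
// Hypothetical reconstruction of what the g0 assembly does at run time.
pub fn g0_as_compiled() -> bool {
    // __rust_allocate(12, 4), then the constants 1, 2, 3 are stored.
    let v: Vec<i32> = vec![1, 2, 3];
    let mut found = false;
    let mut i = 0;
    // The assembly compares a byte offset against the constant 12,
    // i.e. the length (3 elements * 4 bytes) is already known.
    while i != 3 {
        if v[i] == 2 {
            found = true;
            break;
        }
        i += 1;
    }
    // __rust_deallocate(ptr, 12, 4) happens when v is dropped here.
    found
}
```

So the allocation size, the contents and the loop bound are all constants, yet the search itself still runs as a loop at run time.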
I tried writing a plain "dumb" loop instead, with an assert + static loop bounds:
pub fn g1() -> bool {
    let v = vec![1,2,3];
    assert!(v.len() == 3);
    for i in 0..3 {
        if v[i] == 2 {
            return true
        }
    }
    return false
}
But the output assembly is almost 100% the same (even though the assert was successfully removed by DCE, so apparently constant propagation works as expected).
However, unrolling this same loop by hand as the next step seems to suddenly enable the optimization:
pub fn g2() -> bool {
    let v = vec![1,2,3];
    assert!(v.len() == 3);
    if v[0] == 2 {
        return true
    }
    if v[1] == 2 {
        return true
    }
    if v[2] == 2 {
        return true
    }
    return false
}
compiles to
example::g2:
        push rbp
        mov rbp, rsp
        mov al, 1
        pop rbp
        ret
as originally expected.
Initially I thought this was a missing attribute on the Rust allocation functions that keeps LLVM from reasoning about pointer contents. But after replacing the allocator with the system one, replacing vectors with pure static-length slices ([i32; 3]), and finally unrolling the loop by hand as above, I figured that's not the issue; otherwise either 1) slices would still be optimized or 2) the unrolled loop would still not be.
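For illustration, a minimal sketch of what such an array-based variant can look like, assuming the same values and the same search as in g0. The function name g3 is hypothetical; the exact variants tested are in the playground link.

```rust
// Hypothetical array-based counterpart of g0: no heap allocation,
// and the length is part of the type.
pub fn g3() -> bool {
    let a: [i32; 3] = [1, 2, 3];
    a.contains(&2)
}
```

Since no allocator call is involved here at all, any missed optimization in a variant like this cannot be blamed on allocator attributes.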
I would be happy to help fix whatever is preventing these optimizations, but so far I don't really understand what's happening here or where to look for the potential bug.
Online playground URL with all these examples (plus slices): https://godbolt.org/g/TRQgg8