-
Notifications
You must be signed in to change notification settings - Fork 13.6k
Closed
Labels
A-LLVMArea: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.A-codegenArea: Code generationArea: Code generationC-enhancementCategory: An issue proposing an enhancement or a PR with one.Category: An issue proposing an enhancement or a PR with one.C-optimizationCategory: An issue highlighting optimization opportunities or PRs implementing suchCategory: An issue highlighting optimization opportunities or PRs implementing suchE-needs-testCall for participation: An issue has been fixed and does not reproduce, but no test has been added.Call for participation: An issue has been fixed and does not reproduce, but no test has been added.I-slowIssue: Problems and improvements with respect to performance of generated code.Issue: Problems and improvements with respect to performance of generated code.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.Relevant to the compiler team, which will review and decide on the PR/issue.
Description
If a signed integer is divided by a power of two using Euclidean division, it is equivalent to an arithmetic shift, but this is not caught. For example
pub fn foo(a: i32) -> i32 {
a.div_euclid(4)
}
produces the assembly code
example::foo:
mov eax, edi
sar eax, 31
shr eax, 30
add eax, edi
mov ecx, eax
sar ecx, 2
and eax, -4
sub edi, eax
sar edi, 31
lea eax, [rdi + rcx]
ret
while it could simply be
example::foo:
mov eax, edi
sar eax, 2
ret
The LLVM IR is
define i32 @_ZN7example3foo17h28d3a223d67ddccdE(i32 %a) unnamed_addr #0 !dbg !6 {
start:
%q.i = sdiv i32 %a, 4, !dbg !9
%_11.i = srem i32 %a, 4, !dbg !16
%_11.lobit.i = ashr i32 %_11.i, 31, !dbg !18
%.0.i = add nsw i32 %_11.lobit.i, %q.i, !dbg !18
ret i32 %.0.i, !dbg !19
}
attributes #0 = { norecurse nounwind nonlazybind readnone uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
which I interpret to be something like
pub fn foo_ir(a: i32) -> i32 {
let q = a / 4;
let r = a % 4;
let sign_r = if r < 0 { -1 } else { 0 };
sign_r + q
}
(I did check that this function produces the same LLVM IR.)
I don't know LLVM enough to know whether there is some kind of flooring division intrinsic that could be used to optimize this, or whether this a missed optimization in the LLVM side and LLVM can recognize that pattern into an arithmetic shift.
fanzier and MSxDOS
Metadata
Metadata
Assignees
Labels
A-LLVMArea: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.A-codegenArea: Code generationArea: Code generationC-enhancementCategory: An issue proposing an enhancement or a PR with one.Category: An issue proposing an enhancement or a PR with one.C-optimizationCategory: An issue highlighting optimization opportunities or PRs implementing suchCategory: An issue highlighting optimization opportunities or PRs implementing suchE-needs-testCall for participation: An issue has been fixed and does not reproduce, but no test has been added.Call for participation: An issue has been fixed and does not reproduce, but no test has been added.I-slowIssue: Problems and improvements with respect to performance of generated code.Issue: Problems and improvements with respect to performance of generated code.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.Relevant to the compiler team, which will review and decide on the PR/issue.