-
Notifications
You must be signed in to change notification settings - Fork 15.1k
Closed
Labels
llvm:instcombineCovers the InstCombine, InstSimplify and AggressiveInstCombine passesCovers the InstCombine, InstSimplify and AggressiveInstCombine passesmissed-optimization
Description
Alive2 proof: https://alive2.llvm.org/ce/z/7SnGmc
Description:
define i32 @src(i32 noundef %val) {
entry:
%div = udiv i32 %val, 10
%shl = shl i32 %div, 4
%rem = urem i32 %val, 10
%add = or disjoint i32 %shl, %rem
ret i32 %add
}
can be folded to:
define i32 @tgt(i32 noundef %val) {
entry:
%div = udiv i32 %val, 10
%reass.mul = mul nuw i32 %div, 6
%i = add i32 %reass.mul, %val
ret i32 %i
}
This is a simple example. or disjoint
can be replaced with add nsw nuw
, shl 4
can be replaced with equivalent mul 16
, and constants here can be replaced with other simple constants: https://alive2.llvm.org/ce/z/qmYLyJ
I don't replace constants with arguments because I haven't figure out completetly what condition is needed for this fold yet.
Real-world motivation
This snippet of IR is derived from linux/lib/bcd.c (after O3 pipeline).
The example above is a reduced version. If you're interested in the original suboptimal IR and optimal IR, see also:https://godbolt.org/z/hra9qYbY7
Let me know if you can confirm that it's an optimization opportunity, thanks.
Metadata
Metadata
Assignees
Labels
llvm:instcombineCovers the InstCombine, InstSimplify and AggressiveInstCombine passesCovers the InstCombine, InstSimplify and AggressiveInstCombine passesmissed-optimization