Description
Compiling for AArch64.
-Oz = 5.86% bigger with outliner
-O2 = 6.78% bigger with outliner
(To be fair this is a tiny benchmark)
This was done by compiling the LLVM test suite using LNT and looking at the size.__text results.
Reproducing
The IR for the worst size-increased file in network-patricia @ -O2 is here: https://godbolt.org/z/KGfEaMafb
Compiling this to assembly and using asm-printer remarks...
$ ~/llvm-project/build/bin/clang -S -O2 -Rpass-analysis=asm-printer /tmp/patricia-stripped.ll -o /dev/null
remark: <unknown>:0:0: 128 instructions in function [-Rpass-analysis=asm-printer]
$ ~/llvm-project/build/bin/clang -S -O2 -Rpass-analysis=asm-printer -mllvm -ir-outliner /tmp/patricia-stripped.ll -o /dev/null
remark: <unknown>:0:0: 144 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 6 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 5 instructions in function [-Rpass-analysis=asm-printer]
Baseline: 128 instructions
Outliner: 155 instructions
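The totals above are just the sum of the per-function counts in the asm-printer remarks. A minimal sketch of that tally, assuming the remark format shown in the output above (the helper name `total_instructions` is hypothetical and not part of any LLVM tool):

```python
import re

# Matches asm-printer remark lines such as:
#   remark: <unknown>:0:0: 144 instructions in function [-Rpass-analysis=asm-printer]
# ASSUMPTION: every remark of interest follows exactly this shape.
REMARK_RE = re.compile(r"remark: .*?: (\d+) instructions in function")


def total_instructions(remark_output: str) -> int:
    """Sum the per-function instruction counts found in remark output."""
    return sum(int(m.group(1)) for m in REMARK_RE.finditer(remark_output))


# Remark lines copied from the two runs above.
baseline = (
    "remark: <unknown>:0:0: 128 instructions in function [-Rpass-analysis=asm-printer]\n"
)
outlined = (
    "remark: <unknown>:0:0: 144 instructions in function [-Rpass-analysis=asm-printer]\n"
    "remark: <unknown>:0:0: 6 instructions in function [-Rpass-analysis=asm-printer]\n"
    "remark: <unknown>:0:0: 5 instructions in function [-Rpass-analysis=asm-printer]\n"
)

print(total_instructions(baseline))  # 128
print(total_instructions(outlined))  # 155 (144 + 6 + 5)
```

The same helper can be pointed at clang's stderr from either command to compare other files.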
I think there are a couple of issues here. One of them probably reduces down to this: https://godbolt.org/z/Pfq8oK68P
$ ~/llvm-project/build/bin/clang -S -O2 -Rpass-analysis=asm-printer /tmp/test.ll -o /dev/null
remark: <unknown>:0:0: 7 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 7 instructions in function [-Rpass-analysis=asm-printer]
$ ~/llvm-project/build/bin/clang -S -O2 -mllvm -ir-outliner -Rpass-analysis=asm-printer /tmp/test.ll -o /dev/null
remark: <unknown>:0:0: 4 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 4 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 7 instructions in function [-Rpass-analysis=asm-printer]
Baseline: 14 instructions total
Outliner: 15 instructions total
Analysis
Using llvm-remark-size-diff on network-patricia at -O2, we can see that the only difference is that main increased by 13 instructions, and two outlined functions were added.
++ > outlined_ir_func_0, 5 instrs, 0 stack B
++ > outlined_ir_func_1, 5 instrs, 0 stack B
== > main, 13 instrs, 0 stack B
### Summary ###
Total change:
instruction count: 23 (6.78%)
stack byte usage: None
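As a sanity check on the summary, the per-function deltas should add up to the reported total, and the percentage pins down the baseline size. A rough sketch, assuming the percentage is the total delta divided by the baseline instruction count (that interpretation is an assumption here, not something stated in the output):

```python
# Per-function deltas copied from the llvm-remark-size-diff output above.
deltas = {"outlined_ir_func_0": 5, "outlined_ir_func_1": 5, "main": 13}
total_delta = sum(deltas.values())

reported_total = 23
reported_pct = 6.78

# ASSUMPTION: reported_pct == 100 * total_delta / baseline, rounded to 2 places.
# Search for every integer baseline consistent with that rounding.
candidates = [
    n for n in range(1, 10_000)
    if round(100 * total_delta / n, 2) == reported_pct
]

print(total_delta == reported_total)  # True
print(candidates)                     # [339]
```

Under that assumption, the benchmark has 339 instructions without the outliner and 362 with it.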
Then I just recompiled with clang -save-temps and yoinked out the bitcode.