[IROutliner] MultiSource/Benchmarks/MiBench/network-patricia is larger with IR outliner enabled #55395

@ornata

Compiling for AArch64.

-Oz = 5.86% bigger with outliner
-O2 = 6.78% bigger with outliner

(To be fair this is a tiny benchmark)

These numbers come from compiling the LLVM test suite using LNT and comparing the size.__text results.

Reproducing

The IR for the worst size-increased file in network-patricia @ -O2 is here: https://godbolt.org/z/KGfEaMafb

Compiling this to assembly and using asm-printer remarks...

$ ~/llvm-project/build/bin/clang -S -O2 -Rpass-analysis=asm-printer /tmp/patricia-stripped.ll -o /dev/null
remark: <unknown>:0:0: 128 instructions in function [-Rpass-analysis=asm-printer]

$ ~/llvm-project/build/bin/clang -S -O2 -Rpass-analysis=asm-printer -mllvm -ir-outliner /tmp/patricia-stripped.ll -o /dev/null
remark: <unknown>:0:0: 144 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 6 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 5 instructions in function [-Rpass-analysis=asm-printer]

Baseline: 128 instructions
Outliner: 155 instructions
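
For anyone wanting to reproduce the totals above without adding them up by hand, here is a minimal sketch that sums the per-function counts from the asm-printer remark output. The helper name `total_instructions` and the regular expression are illustrative assumptions, not part of any LLVM tool. The sample text is copied from the two runs above.

```python
import re

# Matches remark lines of the form shown above, e.g.
# "remark: <unknown>:0:0: 128 instructions in function [-Rpass-analysis=asm-printer]"
# (assumed format; adjust if your clang prints a real file location).
REMARK_RE = re.compile(r"remark: .*?: (\d+) instructions in function")


def total_instructions(remark_output: str) -> int:
    """Sum the per-function instruction counts found in remark output."""
    return sum(int(m.group(1)) for m in REMARK_RE.finditer(remark_output))


# Remark output copied from the baseline run above.
baseline = """\
remark: <unknown>:0:0: 128 instructions in function [-Rpass-analysis=asm-printer]
"""

# Remark output copied from the -mllvm -ir-outliner run above.
outlined = """\
remark: <unknown>:0:0: 144 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 6 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 5 instructions in function [-Rpass-analysis=asm-printer]
"""

base_total = total_instructions(baseline)
outl_total = total_instructions(outlined)
print(f"Baseline: {base_total} instructions")
print(f"Outliner: {outl_total} instructions ({outl_total - base_total:+d})")
```

In practice the remark text would be captured from clang's stderr rather than pasted in as a string.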

I think there are a couple of issues here. One of them probably reduces to this: https://godbolt.org/z/Pfq8oK68P

$ ~/llvm-project/build/bin/clang -S -O2 -Rpass-analysis=asm-printer /tmp/test.ll -o /dev/null 
remark: <unknown>:0:0: 7 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 7 instructions in function [-Rpass-analysis=asm-printer]

$ ~/llvm-project/build/bin/clang -S -O2 -mllvm -ir-outliner -Rpass-analysis=asm-printer /tmp/test.ll -o /dev/null
remark: <unknown>:0:0: 4 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 4 instructions in function [-Rpass-analysis=asm-printer]
remark: <unknown>:0:0: 7 instructions in function [-Rpass-analysis=asm-printer]

Baseline: 14 instructions total
Outliner: 15 instructions total

Analysis

Using llvm-remark-size-diff on network-patricia at -O2, we can see that the only difference is that main increased by 13 instructions, and two outlined functions were added.

++ > outlined_ir_func_0, 5 instrs, 0 stack B
++ > outlined_ir_func_1, 5 instrs, 0 stack B
== > main, 13 instrs, 0 stack B

### Summary ###
Total change: 
 instruction count: 23 (6.78%)
 stack byte usage: None

Then I just recompiled with clang -save-temps and yoinked out the bitcode.

cc @AndrewLitteken
