[StackColoring] Change the StackColoring logic + enables it to handle spills #143800

Ralender · 2025-06-11T22:49:35Z

Here is a patch set I have been working on to reduce stack usage.
It starts with a bunch if trivial NFC-like commits, that can be easily handled separately
Then a replacement of the StackColoring algorithm, with all the original analysis and assumption kept.
Finally changes to allow the new algo in StackColoring to operate on spill slots

I know that lifetime annotations are a mess, so I tried to not change any of the assumptions made by the old StackColoring.

here is some examples of the current stack coloring's shortcoming:

void use(...);
void example(int test) {
  int e[4];
  int a, b, c, d;
  if (test)
    return use(&e);
  else
    return use(&a, &b, &c, &d);
}

one of the int will be merged with the array, and thats it.
when all 4 ints could share the space with the array

void bar(char *, int);
void foo(bool var) {
A: {
  char z[4096];
  bar(z, 0);
}

  char *p;
  char x[4096];
  char y[4096];
  if (var) {
    p = x;
  } else {
    bar(y, 1);
    p = y + 1024;
  }
B:
  bar(p, 2);
}

The current StackColoring is not capable of re-using the same storage for all 3 of these arrays
Even if its the example given in the comments of StackColoring.

Both those issues are fixed by the Algorithm change

StackColoring had TODO's about this issue and handling spill slots at the same time as regular slots

This patch set adds 2 options used to do the testing.
-mllvm -new-stack-coloring, this swaps out the algorithm for stack-coloring.
-mllvm -merged-stack-coloring, this moves stack-coloring after regalloc instead of stack-slot-coloring

I have not yet updated tests because all of the work is behind options
and didn't added golden tests YET because the code has kept changing. But I obviously will add tests. before any merging.

Testing Done on this patch:

Basic correctness checked on:
- x86_64-linux-gnu
- i386-linux-gnu
- x86_64-windows-pc
- aarch64-linux-gnu
- arm-linux-gnueabihf
- riscv64-linux-gnu
- I need to do basic correctness checking on targets where the stack grows up.
Deeper testing and benchmarking for x86_64-linux-gnu:
- Passes the correctness part of the llvm-test-suite
- All Benchmark and example presented.

Benchmark results (mean):
Note: all benchmarks are done with release with assert build, on the llvm-test-suite
With max-candidates=unlimited, because it doesn't seem to have much impact on compile-time.

exec time: 0%
compile-time: 0%
binary size: +0.1%
I suspect this is caused by some frequently accessed slot not being close to the sp,
leading to larger variants of the instruction being used. But I need to investigate
prologepilog.NumBytesStackSpace: -1.5% and no regressions

Note: The cleanup of the git history was only partial for now, its intended mostly help review.

github-actions · 2025-06-11T22:51:55Z

✅ With the latest revision this PR passed the C/C++ code formatter.

thanm · 2025-06-11T22:57:28Z

I am willing to take a look, but for various reasons I won't be able to dig in for a couple of months (I am away from my work computers). Hope you can make progress without my eyeballs for now. Thanks.

efriedma-quic · 2025-06-11T23:29:13Z

The cleanup commits look reasonable; please submit them separately.

The IR representation of lifetimes has some pretty significant issues (see #45725 (comment) , #51838, #132085, #43674). Will this cause those issues to be exposed more frequently?

Ralender · 2025-06-12T10:48:30Z

The cleanup commits look reasonable; please submit them separately.

Yes,

The IR representation of lifetimes has some pretty significant issues (see #45725 (comment) , #51838, #132085, #43674). Will this cause those issues to be exposed more frequently?

Here are my thoughts on the problems in general, I think there is 2 distinct issues:
Pointer comparisons self-consistency:
The only viable solution I see is tagging allocas, that need to stay distinct from each other and ensuring alias analysis and StackColoring respect it. and whenever the IR folds a icmp based on distinct provenance, it tags the allocas as distinct from each other. this keeps folding icmp simple. and it would not be legal to remove a tag from an alloca under any circumstances. I/We need to experiment, but it may not have much of negative impact.

IR transformation moving accesses out of lifetime, causing UB:
I have only seen it for loads, so maybe loads outside of lifetime could be defined as poison instead of UB, as has been proposed.
maybe an other solution is to say that lifetime annotation only have meaning for captures, and accesses outside of it are legal. alias analysis could still assume that a call outside of lifetime cannot access an escaped pointer, but potential load or store outside of lifetime would cause extension of the lifetime. Captures scopes cannot be changed by optimizations and everything else should analyzable even if we may need to be pessimist when multiple objects could be accessed by a load. the consequence is more complex analysis to figure out the real lifetime, and being pessimist while doing so may cause some regressions.

About the question, original question:
The old and new StackColoring should always agree on whether 2 slots can overlap or not.
I checked it at some point by having an assert during development (it was removed since).
That said, the new algorithm reduces stack space but getting more overlaps between slots.
So some slots that would not have overlapped with the old algo because of different decisions may overlap now.
Spills are unaffected by all the lifetime issues.

…loring

Without it it can lead to crazy situtation, when passing a LiveRange* to a function expecting a const LiveRange& the LiveRange* is converted to bool and a new empty LiveRange is created.

…o use NumDigits

…neInstructionElim

…+ deal with direct fallout

Ralender · 2025-06-22T12:17:39Z

After a bit of heuristics tuning, I got rid of the regression in average on code-size
here are the new results:

exec-time: 0%
compile-time: 0%
code-size: -0.1%
prologepilog.NumBytesStackSpace: -1.4% and no regressions

Ralender requested review from arsenm, nikic, thanm and efriedma-quic June 11, 2025 22:49

Ralender self-assigned this Jun 11, 2025

Ralender added llvm:codegen llvm:optimizations labels Jun 11, 2025

Ralender mentioned this pull request Jun 12, 2025

[NFC] Various Cleanup in StackColoring, StackSlotColoring, LiveStacks #143931

Open

Ralender added 18 commits June 19, 2025 23:06

[NFC][StackColoring] Remove unused member for StackColoring

f9c024d

[NFC][StackColoring] Use LiveRange instead of LiveInterval in StackCo…

48f34ec

…loring

[NFC] make constructor explicit LiveRange

8338498

Without it it can lead to crazy situtation, when passing a LiveRange* to a function expecting a const LiveRange& the LiveRange* is converted to bool and a new empty LiveRange is created.

[NFC] MachineFrameInfo::print add a bit more informations

5aeb0f0

[NFC][StackSlotColoring] Remove dead code

0ee9f61

[NFC][StackColoring] Use block numbers instead of maps

a352f1e

[NFC] Move NumDigits to MathExtras.h and update some users of log10 t…

5a1a346

…o use NumDigits

[NFC] Improve debug output of StackColoring

353a1e5

[NFC][LiveStacks] Use vectors instead of map and unordred_map

3c5f135

[NFC][CodeGen] Cleanup lifetime in StackColoring instead of DeadMachi…

0ff8b6c

…neInstructionElim

[CodeGen] Add option to move StackColoring after register allocation …

eef9b6e

…+ deal with direct fallout

Add new StackColoring algo

fb19fd0

Start rebuild lifetimes for spill slots

09de6f3

Fix bug + add comments + reduce ammount of debug prints

78e9bca

[NFC] Make StackColoring debug mode more concise

74054b5

[NFC] Cleanup + comments

709b55a

Cleanup the Diff

0472ecd

format StackColoring.cpp

df2fb92

Update selection heristics to avoid code-size regression in average

f7ae304

Ralender force-pushed the NewStackColoring branch from 595d5af to f7ae304 Compare June 22, 2025 12:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[StackColoring] Change the StackColoring logic + enables it to handle spills #143800

[StackColoring] Change the StackColoring logic + enables it to handle spills #143800

Uh oh!

Ralender commented Jun 11, 2025

Uh oh!

github-actions bot commented Jun 11, 2025 •

edited

Loading

Uh oh!

thanm commented Jun 11, 2025

Uh oh!

efriedma-quic commented Jun 11, 2025

Uh oh!

Ralender commented Jun 12, 2025 •

edited

Loading

Uh oh!

Ralender commented Jun 22, 2025

Uh oh!

Uh oh!

[StackColoring] Change the StackColoring logic + enables it to handle spills #143800

Are you sure you want to change the base?

[StackColoring] Change the StackColoring logic + enables it to handle spills #143800

Uh oh!

Conversation

Ralender commented Jun 11, 2025

Uh oh!

github-actions bot commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thanm commented Jun 11, 2025

Uh oh!

efriedma-quic commented Jun 11, 2025

Uh oh!

Ralender commented Jun 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Ralender commented Jun 22, 2025

Uh oh!

Uh oh!

github-actions bot commented Jun 11, 2025 •

edited

Loading

Ralender commented Jun 12, 2025 •

edited

Loading