Description
LLVM currently assumes that if a function preserves the value of a register, this register can be safely accessed from a landing pad even if the function unwinds.
For example (x86_64):
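#include <stdio.h>

// f() uses the Microsoft x64 calling convention, under which rdi is callee-saved.
__attribute__((ms_abi, noinline)) void f() { throw 1; }

struct Dropper {
    int x;
    ~Dropper() { printf("Dropper{%d}\n", x); }
};

__attribute__((noinline)) void test(int arg) {
    Dropper dropper{arg};  // the destructor runs in the cleanup pad when f() unwinds
    f();
}

int main() try { test(1); } catch (...) {
}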
f() is marked as ms_abi, where rdi is callee-saved, so the value of arg can be read out from rdi after invoking f(). That's what LLVM does both if f() returns and in test's cleanup pad.
Unfortunately, that's incorrect behavior. According to the Itanium C++ ABI, the landing pad can only rely on the registers that are callee-saved by the base ABI. For Linux, "base ABI" is the SysV ABI, where rdi is caller-saved, so the unwinding library is not required to restore rdi.
While LLVM's libunwind goes beyond the EH ABI requirements and restores rdi, libgcc indeed doesn't (which I'd argue is compliant with the standard). When compiled against libgcc, the above code produces Dropper{random garbage} instead of Dropper{1}.
I've tested this on clang 18.1.8 from the Arch repos, built against libgcc. It compiles the test function from above to
│ test(int)
1200 │ 53 push rbx
1201 │ 48 83 ec 20 sub rsp, 0x20
1205 │ 89 fe mov esi, edi
1207 │ e8 84 ff ff ff call f()
120c │ 48 89 c3 mov rbx, rax
120f │ 48 8d 3d ee 0d 00 00 lea rdi, [rel _IO_stdin_used+0x4]
1216 │ 31 c0 xor eax, eax
1218 │ e8 13 fe ff ff call printf@plt
121d │ 48 89 df mov rdi, rbx
1220 │ e8 5b fe ff ff call _Unwind_Resume@plt
1225 │ 66 66 2e 0f 1f 84 00 data16 cs nop word [rax+rax]
│ 00 00 00 00
which erroneously assumes rsi is retained rather than rdi (rsi is likewise callee-saved under ms_abi but caller-saved under SysV), but it's the same bug anyway.