Description
This bug was originally discovered at ClangBuiltLinux/linux#1215 by attempting to build Linux with Clang+LTO. The following is a copy of the analysis from ClangBuiltLinux/linux#1215 (comment):
Here's a hand-minimized repro:
void f(long);
__attribute__((noinline)) static void fun(long x) {
  f(x + 1);
}
void repro(void) {
  fun(({
  label:
    (long)&&label;
  }));
}
$ clang -O2 -flto -c repro.c -o repro.i
$ ld.lld repro.i
ld.lld: error: Never resolved function from blockaddress (...)
Relevant part of the LLVM IR, generated with:
$ clang -O2 -flto -c repro.c -fno-discard-value-names -S
...
define dso_local void @repro() #0 {
entry:
  br label %label
label: ; preds = %entry
  tail call fastcc void @fun()
  ret void
}
define internal fastcc void @fun() unnamed_addr #1 {
entry:
  tail call void @f(i64 add (i64 ptrtoint (i8* blockaddress(@repro, %label) to i64), i64 1)) #3
  ret void
}
...
We can see clang figured out that the first argument to fun() is always the same, so it inlined the value of the argument (which is the address of label in repro).
Searching LLVM code for the source of the error message gives
(there is another error with the same string, but building lld with a change to the error message shows that this is the one that triggers). Reading through this code, we see that BitcodeReader lazily reads an LLVM bitcode file. The reader appears to initially return placeholder objects that need to be "materialized" on demand.
The code has some special handling of blockaddress IR constants, as a blockaddress that appears within a function F may reference a basic block from a different function G (as is the case in the IR for the bug repro).
When materializing a function with a blockaddress(Fn, Fn_BB), the code needs to separately handle the cases where Fn either has or has not yet been materialized. That logic appears here:
llvm-project/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
Lines 2901 to 2927 in 4b55329
If Fn has been materialized (that is, Fn->empty() is false), then the reader creates a BlockAddress referring to the appropriate Function/BasicBlock object. If Fn has not been materialized, then the reader creates a placeholder BasicBlock for the BlockAddress and adds Fn to a queue of functions to be materialized. When Fn is materialized, it will make sure to use the precreated BasicBlock object.
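The two cases described above can be sketched with a toy model. This is only an illustration of the decision, not LLVM's code: ToyFunction, ToyReader, and resolveBlockAddress are hypothetical names, and basic blocks are reduced to strings.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for llvm::Function (hypothetical, not the real class).
struct ToyFunction {
    std::string name;
    std::vector<std::string> blocks;  // stays empty until the body is read
    bool empty() const { return blocks.empty(); }
};

// Toy stand-in for the reader state that tracks forward references.
struct ToyReader {
    // Placeholder blocks created for functions whose bodies are not read yet.
    std::map<ToyFunction*, std::vector<std::string>> fwdRefs;
    // Functions that must be materialized to resolve those placeholders.
    std::deque<ToyFunction*> fwdRefQueue;

    // Handle blockaddress(fn, block). Returns true if it was resolved
    // immediately, false if a placeholder was created instead.
    bool resolveBlockAddress(ToyFunction& fn, const std::string& block) {
        if (!fn.empty()) {
            // Body already present: refer to the real block directly.
            return true;
        }
        // Body not present: record a placeholder and enqueue fn once.
        auto& pending = fwdRefs[&fn];
        if (pending.empty()) fwdRefQueue.push_back(&fn);
        pending.push_back(block);
        return false;
    }
};
```

Note that the sketch, like the description above, uses "has no basic blocks" as its only evidence that a function has not been read yet.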
After adding some debug prints, it appears that lld's use of BitcodeReader breaks this lazy blockaddress handling. In the repro case, the linker does materialize repro before it attempts to materialize fun. However, before it attempts to materialize fun, it steals the basic blocks out of repro's Function object here:
llvm-project/llvm/lib/Linker/IRMover.cpp
Lines 1116 to 1117 in 4b55329
This leaves repro with no basic blocks, so the Fn->empty() check mentioned above now returns true, which makes the code think that Fn (repro) has not yet been materialized. That code path enqueues repro to be materialized a second time. When that happens, this code:
llvm-project/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
Lines 162 to 163 in 4b55329
detects (using isMaterializable() instead of empty()) that repro has already been materialized, and errors out.
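The failing sequence can be replayed in a small toy model. Everything here is a simplified assumption for illustration: ToyFunction, materialize, stealBody, and reproduceFailure are hypothetical names, and the real reader and linker are far more involved.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for llvm::Function (hypothetical, not the real class).
struct ToyFunction {
    std::vector<std::string> blocks;
    bool materializable = true;  // true until the body has been read once
    bool empty() const { return blocks.empty(); }
};

// Read the body once. A second request fails, mirroring the check that
// looks at "is this still materializable" rather than at empty().
void materialize(ToyFunction& fn, std::vector<std::string> body) {
    if (!fn.materializable)
        throw std::runtime_error("Never resolved function from blockaddress");
    fn.blocks = std::move(body);
    fn.materializable = false;
}

// Models the linker moving the blocks into the destination module.
std::vector<std::string> stealBody(ToyFunction& fn) {
    std::vector<std::string> moved;
    moved.swap(fn.blocks);  // fn.blocks is now empty
    return moved;
}

// Returns true if the sequence from the analysis ends in the error.
bool reproduceFailure() {
    ToyFunction repro;
    materialize(repro, {"entry", "label"});  // repro is read first
    stealBody(repro);                        // its blocks are moved away
    // Reading fun now encounters blockaddress(@repro, %label):
    if (repro.empty()) {  // misread as "repro not yet materialized"
        try {
            materialize(repro, {"entry", "label"});  // second attempt
        } catch (const std::runtime_error&) {
            return true;
        }
    }
    return false;
}
```

The two checks disagree because empty() observes the current contents of the function, while the materializable flag records whether the body was ever read.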
This looks like a fairly old bug in lld. Fixing it looks tricky - it seems that BitcodeReader requires that materialized Functions aren't modified while there are still functions that will be materialized later, and lld violates that assumption. Maintaining this invariant seems like it might conflict with the goal of lazy reading. Maybe one idea is to maintain some reverse dependency information in bitcode files. Then materializing a function F could trigger immediate materialization of all functions with a blockaddress pointing into F.
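As a rough sketch of the reverse-dependency idea, assume the bitcode file carried a map from each function to the functions that hold a blockaddress into it. No such table exists in the format described here; ToyModule and blockAddressUsers are hypothetical names used only to show the intended materialization order.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ToyModule {
    // Hypothetical reverse index: target function -> functions whose
    // bodies contain a blockaddress pointing into that target.
    std::map<std::string, std::vector<std::string>> blockAddressUsers;
    std::set<std::string> materialized;
    std::vector<std::string> order;  // order in which bodies were read

    // Materialize fn, then immediately materialize every function that
    // takes the address of one of fn's blocks, before anyone can modify fn.
    void materialize(const std::string& fn) {
        if (!materialized.insert(fn).second) return;  // already read
        order.push_back(fn);
        auto it = blockAddressUsers.find(fn);
        if (it == blockAddressUsers.end()) return;
        for (const auto& user : it->second) materialize(user);
    }
};
```

With blockAddressUsers["repro"] = {"fun"}, calling materialize("repro") in this model reads repro and then fun, so fun would see repro's blocks before the linker moves them. The cost is that some functions are read earlier than strictly needed, which is the tension with lazy reading noted above.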