Description
I noticed that if a function is marked #[inline] in an upstream crate, then even if upstream has already generated an instance of the function, downstream crates will not reuse it. Instead, each one generates its own instance, which puts duplicated symbols in the binary file and bloats its size.
Note that this happens when a function is marked #[inline] but is not actually inlined, so the function has a symbol entry of its own.
To reproduce, you can make a project with the following dependency graph:
       +----------+
       | upstream |
       +----------+
         /  |  \
        /   |   \
+------+    |    +-------+
| left |    |    | right |
+------+    |    +-------+
        \   |   /
         \  |  /
      +------------+
      | downstream |
      +------------+
where a fairly complex inline function (complex enough to prevent it from being inlined) is defined in the upstream crate, and the upstream, left and right crates all call it. The downstream crate exists to collect the symbols generated in all the crates. I wrote a shell script that generates such a project locally (usage: <SCRIPT> <PROJECT PATH>):
#!/bin/sh -ex
mkdir -p "$1"
cd "$1"
# Upstream.
cargo new --lib upstream
echo '#[inline]
pub fn inline_function(x: u32) {
    if x != 42 {
        if x % 2 == 0 {
            inline_function(x / 2)
        } else {
            inline_function(x * 3 + 1)
        }
    }
    std::hint::black_box(x);
}
pub fn instance(n: u32) {
    inline_function(n)
}' > 'upstream/src/lib.rs'
# Left and right.
for x in left right; do
    cargo new --lib "$x"
    cargo add --manifest-path "$x/Cargo.toml" --path upstream
    echo 'pub fn instance(x: u32) {
    upstream::inline_function(x)
}' > "$x/src/lib.rs"
done
# Downstream.
cargo init
cargo add --path 'upstream'
cargo add --path 'left'
cargo add --path 'right'
echo '#[no_mangle]
extern "C" fn entry(x: u32) {
    upstream::instance(x);
    left::instance(x);
    right::instance(x);
}' > "src/lib.rs"
# Build.
cargo rustc --lib --crate-type cdylib --release
You can inspect the resulting binary using llvm-nm and llvm-objdump. In my case, llvm-nm gives me the following output:
0000000000003ee0 t __ZN3top15inline_function17hcd434eca693902bbE
0000000000003f20 t __ZN3top15inline_function17hcd434eca693902bbE
0000000000003f60 t __ZN3top15inline_function17hcd434eca693902bbE
0000000000003f90 t __ZN3top8instance17hc175d5b93e3a6803E
0000000000003f50 t __ZN4left8instance17h5cc7aa4e08943d83E
0000000000003f10 t __ZN5right8instance17hcefb8e0167b5830aE
0000000000003eb0 T _entry
Note the three duplicated __ZN3top15inline_function17hcd434eca693902bbE symbols. Using llvm-objdump, I got three identical copies of the assembly code:
...
0000000000003ee0 <__ZN3top15inline_function17hcd434eca693902bbE>:
3ee0: 55 pushq %rbp
3ee1: 48 89 e5 movq %rsp, %rbp
3ee4: 53 pushq %rbx
3ee5: 50 pushq %rax
3ee6: 89 fb movl %edi, %ebx
3ee8: 83 ff 2a cmpl $42, %edi
3eeb: 74 13 je 0x3f00 <__ZN3top15inline_function17hcd434eca693902bbE+0x20>
...
0000000000003f20 <__ZN3top15inline_function17hcd434eca693902bbE>:
3f20: 55 pushq %rbp
3f21: 48 89 e5 movq %rsp, %rbp
3f24: 53 pushq %rbx
3f25: 50 pushq %rax
3f26: 89 fb movl %edi, %ebx
3f28: 83 ff 2a cmpl $42, %edi
3f2b: 74 13 je 0x3f40 <__ZN3top15inline_function17hcd434eca693902bbE+0x20>
...
0000000000003f60 <__ZN3top15inline_function17hcd434eca693902bbE>:
3f60: 55 pushq %rbp
3f61: 48 89 e5 movq %rsp, %rbp
3f64: 53 pushq %rbx
3f65: 50 pushq %rax
3f66: 89 fb movl %edi, %ebx
3f68: 83 ff 2a cmpl $42, %edi
3f6b: 74 13 je 0x3f80 <__ZN3top15inline_function17hcd434eca693902bbE+0x20>
...
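For larger binaries, spotting repeated names by eye in the llvm-nm listing gets tedious. The following is a minimal sketch of a helper that counts them, assuming llvm-nm's default three-column output (address, type, name). The function name duplicate_symbols is hypothetical and is not part of any existing tool.

```rust
use std::collections::BTreeMap;

/// Returns every symbol name that appears more than once in `nm_output`,
/// paired with its number of occurrences, sorted by name.
/// Assumes the default `llvm-nm` layout: "<address> <type> <name>".
fn duplicate_symbols(nm_output: &str) -> Vec<(String, usize)> {
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for line in nm_output.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        // Defined symbols have three fields: address, type, name.
        // Undefined symbols ("U <name>") have no address and are skipped.
        if let [_addr, _kind, name] = fields.as_slice() {
            *counts.entry(*name).or_insert(0) += 1;
        }
    }
    counts
        .into_iter()
        .filter(|&(_, n)| n > 1)
        .map(|(name, n)| (name.to_string(), n))
        .collect()
}
```

Fed the llvm-nm listing shown earlier, a helper like this would report the inline_function symbol with a count of three and nothing else, since every other name in that listing appears once.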
If I remove the #[inline] attribute in the upstream crate, there are no duplicated symbols.
"fat" LTO seems to be able to merge the duplicated symbols, but not all projects can enable this option, so is it possible to fix this problem even when "fat" LTO is not used? Also, codegen-units=1 does not seem to help.