I'm seeing llvm-objdump --disassemble-symbols behave differently on an object file compiled directly from C than on one assembled from compiler-generated assembly:
$ cat << EOF > test.c
int testfn(int n) {
    for (int i = 0; i < 5; i++) {
        n += i;
    }
    return n;
}
EOF
$
$ # First generate directly from C file
$
$ clang -c test.c
$ llvm-nm -S test.o
00000000 00000048 T testfn
$
$ llvm-objdump -t test.o | grep testfn
00000000 g F .text 00000048 testfn
$
$ llvm-readobj --elf-output-style=GNU -s test.o | grep testfn
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
$
$ llvm-objdump --disassemble-symbols=testfn test.o
test.o: file format elf32-littleriscv
Disassembly of section .text:
00000000 <testfn>:
0: 1141 addi sp, sp, -0x10
2: c606 sw ra, 0xc(sp)
4: c422 sw s0, 0x8(sp)
6: 0800 addi s0, sp, 0x10
8: fea42a23 sw a0, -0xc(s0)
c: 4501 li a0, 0x0
e: fea42823 sw a0, -0x10(s0)
12: a001 j 0x12 <testfn+0x12>
14: ff042583 lw a1, -0x10(s0)
18: 4511 li a0, 0x4
1a: 00b54063 blt a0, a1, 0x1a <testfn+0x1a>
1e: a001 j 0x1e <testfn+0x1e>
20: ff042583 lw a1, -0x10(s0)
24: ff442503 lw a0, -0xc(s0)
28: 952e add a0, a0, a1
2a: fea42a23 sw a0, -0xc(s0)
2e: a001 j 0x2e <testfn+0x2e>
30: ff042503 lw a0, -0x10(s0)
34: 0505 addi a0, a0, 0x1
36: fea42823 sw a0, -0x10(s0)
3a: a001 j 0x3a <testfn+0x3a>
3c: ff442503 lw a0, -0xc(s0)
40: 40b2 lw ra, 0xc(sp)
42: 4422 lw s0, 0x8(sp)
44: 0141 addi sp, sp, 0x10
46: 8082 ret
$
$ # Now generate from assembly file
$
$ clang -S test.c
$ clang -c test.s
$ llvm-nm -S test.o | grep testfn
00000000 00000048 T testfn
$
$ llvm-objdump -t test.o | grep testfn
00000000 g F .text 00000048 testfn
$
$ llvm-readobj --elf-output-style=GNU -s test.o | grep testfn
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
$
$ llvm-objdump --disassemble-symbols=testfn test.o
test.o: file format elf32-littleriscv
Disassembly of section .text:
00000000 <testfn>:
0: 1141 addi sp, sp, -0x10
2: c606 sw ra, 0xc(sp)
4: c422 sw s0, 0x8(sp)
6: 0800 addi s0, sp, 0x10
8: fea42a23 sw a0, -0xc(s0)
c: 4501 li a0, 0x0
e: fea42823 sw a0, -0x10(s0)
12: a001 j 0x12 <testfn+0x12>
You can see that the disassembly output ends prematurely when going from C => assembly => object file compared to C => object file. The size of the testfn() function is 72 (0x48) bytes in both cases. If I use the -d option instead of --disassemble-symbols, the entire file is disassembled properly for the assembly version.
The local labels are different for C => object file compared to C => assembly => object file:
# When compiled from C file
$ llvm-readobj --elf-output-style=GNU -s test.o
Symbol table '.symtab' contains 10 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FILE LOCAL DEFAULT ABS test.c
2: 00000000 0 NOTYPE LOCAL DEFAULT 2 $x
3: 00000014 0 NOTYPE LOCAL DEFAULT 2 .L0
4: 0000003c 0 NOTYPE LOCAL DEFAULT 2 .L0
5: 00000020 0 NOTYPE LOCAL DEFAULT 2 .L0
6: 00000030 0 NOTYPE LOCAL DEFAULT 2 .L0
7: 00000000 0 NOTYPE LOCAL DEFAULT 4 $d
8: 00000000 0 NOTYPE LOCAL DEFAULT 6 $d
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
$
# When compiled from assembly file
$ llvm-readobj --elf-output-style=GNU -s test.o
Symbol table '.symtab' contains 10 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FILE LOCAL DEFAULT ABS test.c
2: 00000000 0 NOTYPE LOCAL DEFAULT 2 $x
3: 00000014 0 NOTYPE LOCAL DEFAULT 2 .LBB0_1
4: 0000003c 0 NOTYPE LOCAL DEFAULT 2 .LBB0_4
5: 00000020 0 NOTYPE LOCAL DEFAULT 2 .LBB0_2
6: 00000030 0 NOTYPE LOCAL DEFAULT 2 .LBB0_3
7: 00000000 0 NOTYPE LOCAL DEFAULT 4 $d
8: 00000000 0 NOTYPE LOCAL DEFAULT 6 $d
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
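To make the comparison of the two symbol tables easier to read, here is a minimal Python sketch that lists the local labels falling strictly inside the address range of testfn. The rows are copied from the llvm-readobj output above; the helper names (parse_symbols, interior_labels) are made up for illustration and are not part of any LLVM tool.

```python
# Symbol rows copied from the llvm-readobj output above (C-built object).
SYMTAB_FROM_C = """\
2: 00000000 0 NOTYPE LOCAL DEFAULT 2 $x
3: 00000014 0 NOTYPE LOCAL DEFAULT 2 .L0
4: 0000003c 0 NOTYPE LOCAL DEFAULT 2 .L0
5: 00000020 0 NOTYPE LOCAL DEFAULT 2 .L0
6: 00000030 0 NOTYPE LOCAL DEFAULT 2 .L0
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
"""

# Symbol rows copied from the llvm-readobj output above (assembly-built object).
SYMTAB_FROM_ASM = """\
2: 00000000 0 NOTYPE LOCAL DEFAULT 2 $x
3: 00000014 0 NOTYPE LOCAL DEFAULT 2 .LBB0_1
4: 0000003c 0 NOTYPE LOCAL DEFAULT 2 .LBB0_4
5: 00000020 0 NOTYPE LOCAL DEFAULT 2 .LBB0_2
6: 00000030 0 NOTYPE LOCAL DEFAULT 2 .LBB0_3
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
"""


def parse_symbols(text):
    """Parse GNU-style symbol rows: Num, Value (hex), Size (decimal), Type, Bind, Vis, Ndx, Name."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers and anything that does not start with "<number>:".
        if len(fields) < 7 or not fields[0].rstrip(":").isdigit():
            continue
        symbols.append({
            "value": int(fields[1], 16),
            "size": int(fields[2]),
            "type": fields[3],
            "ndx": fields[6],
            "name": fields[7] if len(fields) > 7 else "",
        })
    return symbols


def interior_labels(symbols, func_name):
    """Return (address, name) pairs lying strictly inside the function's byte range."""
    func = next(s for s in symbols if s["name"] == func_name and s["type"] == "FUNC")
    start, end = func["value"], func["value"] + func["size"]
    return sorted(
        (s["value"], s["name"])
        for s in symbols
        if s["ndx"] == func["ndx"] and start < s["value"] < end
    )


c_labels = interior_labels(parse_symbols(SYMTAB_FROM_C), "testfn")
asm_labels = interior_labels(parse_symbols(SYMTAB_FROM_ASM), "testfn")

for (addr, c_name), (_, asm_name) in zip(c_labels, asm_labels):
    print(f"{addr:#04x}  C: {c_name:<4}  asm: {asm_name}")
```

Both objects have labels at the same four addresses (0x14, 0x20, 0x30, 0x3c), and only the names differ. The lowest one, 0x14, is the address immediately after the last instruction printed in the truncated disassembly of the assembly-built object.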
I noticed that if I edit the assembly file and retarget the jumps that reference .LBB0_1 to another local label (.LBB0_3), the disassembly output advances and stops at the next local label (.LBB0_2):
$ sed -Ei 's/(j\s+\.LBB0_)1/\13/' test.s
$ clang -c test.s
$ llvm-objdump --disassemble-symbols=testfn test.o
test.o: file format elf32-littleriscv
Disassembly of section .text:
00000000 <testfn>:
0: 1141 addi sp, sp, -0x10
2: c606 sw ra, 0xc(sp)
4: c422 sw s0, 0x8(sp)
6: 0800 addi s0, sp, 0x10
8: fea42a23 sw a0, -0xc(s0)
c: 4501 li a0, 0x0
e: fea42823 sw a0, -0x10(s0)
12: a001 j 0x12 <testfn+0x12>
14: ff042583 lw a1, -0x10(s0)
18: 4511 li a0, 0x4
1a: 00b54063 blt a0, a1, 0x1a <testfn+0x1a>
1e: a001 j 0x1e <testfn+0x1e>
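As a cross-check of where each truncated listing stops, this small Python sketch computes the address just past the last printed instruction and looks it up among the label addresses. It is only an illustration: end_address is a made-up helper, the label map is taken from the assembly-built symbol table shown earlier, and it assumes those addresses are unchanged after the sed edit (the post-edit symbol table is not shown here).

```python
import re

# Matches llvm-objdump instruction lines such as "12: a001 j 0x12 <testfn+0x12>".
INSN = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]+)\s")

# Label addresses from the assembly-built symbol table shown earlier.
# Assumption: these are still valid after the sed edit.
LABELS = {0x14: ".LBB0_1", 0x20: ".LBB0_2", 0x30: ".LBB0_3", 0x3C: ".LBB0_4"}

# Last two lines of the truncated listing before the sed edit.
TAIL_BEFORE_EDIT = """\
e: fea42823 sw a0, -0x10(s0)
12: a001 j 0x12 <testfn+0x12>
"""

# Last two lines of the truncated listing after the sed edit.
TAIL_AFTER_EDIT = """\
1a: 00b54063 blt a0, a1, 0x1a <testfn+0x1a>
1e: a001 j 0x1e <testfn+0x1e>
"""


def end_address(listing):
    """Return the address just past the last instruction in the listing."""
    end = None
    for line in listing.splitlines():
        match = INSN.match(line)
        if match:
            addr = int(match.group(1), 16)
            size = len(match.group(2)) // 2  # two hex digits per encoded byte
            end = addr + size
    return end


for title, tail in (("before edit", TAIL_BEFORE_EDIT), ("after edit", TAIL_AFTER_EDIT)):
    stop = end_address(tail)
    print(f"{title}: stops at {stop:#x} ({LABELS.get(stop, 'no label')})")
# before edit: stops at 0x14 (.LBB0_1)
# after edit: stops at 0x20 (.LBB0_2)
```

In both runs the output ends exactly at a local label address, well short of the 0x48 end of testfn.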
Tool versions:
$ clang -v
clang version 21.0.0git (https://github.com/llvm/llvm-project.git 30ff508614c90311509adc0890e32e7f86ec4fb8)
Target: riscv32-unknown-unknown-elf
$
$ llvm-objdump -v
LLVM (http://llvm.org/):
LLVM version 21.0.0git
Optimized build.
Registered Targets:
riscv32 - 32-bit RISC-V
riscv64 - 64-bit RISC-V