Skip to content

instruction-referencing creates DW_OP_deref_size with invalid/too large size #64093

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dwblaikie opened this issue Jul 25, 2023 · 6 comments · Fixed by #123967
Closed

instruction-referencing creates DW_OP_deref_size with invalid/too large size #64093

dwblaikie opened this issue Jul 25, 2023 · 6 comments · Fixed by #123967

Comments

@dwblaikie
Copy link
Collaborator

From a discussion/debugging here: https://reviews.llvm.org/D154907

void b(double);
void c();
void e(double e) {
  c();
  b(e);
}
$ clang-tot x.ii -g -c -o - -O3 | llvm-dwarfdump-tot - | grep DW_OP_deref_size
                     [0x0000000000000006, 0x0000000000000016): DW_OP_breg7 RSP+0, DW_OP_deref_size 0x10, DW_OP_stack_value)

This seems to be created here:

OffsetOps.push_back(dwarf::DW_OP_deref_size);
OffsetOps.push_back(DerefSizeInBytes);
- where the size is 16 (0x10).

@llvmbot
Copy link
Member

llvmbot commented Jul 25, 2023

@llvm/issue-subscribers-debuginfo

@jmorse
Copy link
Member

jmorse commented Jul 27, 2023

/me squints -- IIRC, I've been using DW_OP_deref_size wherever the spill size has been expanded to be larger than the declared variable size. That isn't a problem for little-endian systems where the lower order bits end up in the same place, but I believe it's a problem on big-endian systems where that moves the variables position in the stack slot. There's at least one big-endian system that wanted DW_OP_deref_size, see https://reviews.llvm.org/D59687. Possibly I've been misunderstanding / misapplying that reasoning?

Obviously this isn't an issue for x86. Plus, digging through our internal bug tracker, apparently DW_OP_deref_size isn't supposed to load a size larger than the "generic" type, aka address size, so this case is wrong in all circumstances.

Simplest fix is to turn this off for little-endian targets (I'm not aware of anyone else using instr-ref), I'll cough up a patch tomorrow.

@markuslavin (via the Phab link above), is my reasoning about big-endian systems wanting the size of the target slot via DW_OP_deref_size correct?

@dwblaikie
Copy link
Collaborator Author

/me squints -- IIRC, I've been using DW_OP_deref_size wherever the spill size has been expanded to be larger than the declared variable size. That isn't a problem for little-endian systems where the lower order bits end up in the same place, but I believe it's a problem on big-endian systems where that moves the variables position in the stack slot. There's at least one big-endian system that wanted DW_OP_deref_size, see https://reviews.llvm.org/D59687. Possibly I've been misunderstanding / misapplying that reasoning?

Not sure I'm really following here, but kind of - yeah, you get a widened integer, for instance, and in LE, well the low bytes are close to the pointer, the high bytes are still zero (because it was widened - assuming this is just an integer, not a doubel or anything more complicated) then, yeah, you can just deref the existing/normal size. But BE the bytes close to the pointer are now the widened high bytes and you have to go further to get to the low bytes - so you need to increment the pointer and deref... but maybe not deref_size, though? In either case you're still trying to read, say, 4 bytes out of an 8 byte integer or whatever?

In this case it's a deref_size of 16 bytes - what happens on a LE architecture in that case? I don't think an unsized deref would work there either - because it'd only deref the address size of 8 bytes or whatever, not 16...

Obviously this isn't an issue for x86. Plus, digging through our internal bug tracker, apparently DW_OP_deref_size isn't supposed to load a size larger than the "generic" type, aka address size, so this case is wrong in all circumstances.

Right - that's the crux of this bug. deref_size isn't supposed to load a size larger than the generic type/address size. So you'd need to deref twice and stitch the values together, for instance, I guess.

@jmorse
Copy link
Member

jmorse commented Jul 28, 2023

Patch addressing main issue of an over-sized deref_size here: https://reviews.llvm.org/D156545

I think it's safe (and more efficient) to produce DW_OP_deref only on x86 at least. I'd like to implement that next and switch the relevant test for deref_size (llvm/test/DebugInfo/MIR/InstrRef/deref-spills-with-size.mir) to only apply to big endian things... however I'm out of the office for several weeks starting today alas, so this first patch is a stop gap.

@philipc
Copy link

philipc commented Aug 1, 2024

DW_OP_breg7 RSP+0, DW_OP_deref_size 0x10, DW_OP_stack_value

Why does this expression use DW_OP_deref_size 0x10, DW_OP_stack_value to make an implicit location, instead of simply being the memory location DW_OP_breg7 RSP+0? I'm asking because we've seen a similar problem for DW_OP_deref, where the value that this expression is meant to be for is larger than an address (gimli-rs/gimli#738).

Using a memory location instead of an implicit location would also avoid all the LE/BE issues.

@jmorse
Copy link
Member

jmorse commented Dec 10, 2024

I've been slowly digging into this for a while; the shallow explanation is that when we refer to (assuming x86) a single-or-double floating point number in an mmx register, we write something like:

DBG_VALUE, $xmm0, !123, !DIExpression()

indicating that the value is in the bottom lane of xmm0. This is all fine and well, because debug-info consumers are smart enough to infer that for double-precision floating point variables they should read the lower 64 bits of xmm0, and for single-precision they'll read 32 bits instead.

Where this becomes complicated though is when values become spilled to memory. Machines like x86 will widen spills-and-restores to their natural word size because the instruction encoding is smaller, thus a spill of a 32 bit value might become a spill+restore of a 64 bit register. Which LiveDebugValues is happy to track, and emit variable locations on the stack. It gets it wrong when describing those locations though, and in the xmm0 register example it emits a 16-byte DW_OP_deref_size, because it considers that a 16 byte register regardless of how much of the register the consumer will actually read.

The simple solution (which I'll do first) is to describe a memory location and allow the debugger to work out how much memory to read. However this can't be done when the DIExpression does computations. I started working out how to fix this, and then realised that it can become super-complicated due to there being multiple different sizes of data being considered:

  • The size of the spilled register data,
  • The size of the LLVM Value that's "actually" being referred to, within the spilled register data,
  • The size of data that the DIExpression has been constructed to refer to,
  • The size of data for the source-variable (potentially a piece/fragment) being described.

Into this mix we can throw LLVM adding-and-removing casts at various points, some of which instruction-referencing stores in an MachineFunction side-table. And then we can consider the possibility that for variadic variable locations where Values have been spilled, we might need to completely interpret the DIExpression to work out what size of data is supposed to be loaded.

(While this is all a pain, at least we're wrestling with the complicated cases nowadays rather than the simple ones).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants