Description
Generating certain instruction attributes from the td files is sometimes not possible, because those attributes are simply not defined there.
Often those attributes are encoded in single flag bits, and the authors of the td file treated those bits as part of the opcode, which makes them inaccessible.
E.g., instead of defining a single instruction with a variable for a flag bit F, they defined two instructions: one with the flag bit hard-coded to F = 1 and one with it hard-coded to F = 0.
This is a problem because fixing the td files for our use case can be enormously time-consuming and would need many changes to the disassembler and the asm-writer, neither of which is likely to be accepted upstream.
As a result, we have to put a lot of effort into retrieving this information some other way. Currently this is mostly done by scanning the asm-string, where possible, or by patching the generated files. Neither approach is nice.
Examples:
- AArch64 post-index detection (string searching).
- ARM vector data type (the mnemonic post-fixes .u32, i16 etc. of certain vector instructions) (patching ARMGenAsmWriter.inc).
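To illustrate why the string-searching workaround is fragile, here is a minimal sketch of what such a check could look like. It is not the code actually used; the function name `example_looks_post_indexed` is hypothetical, and the heuristic only relies on the AArch64 assembly syntax, where a post-indexed access closes the bracket before the offset (e.g. `ldr x0, [x1], #8`).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Heuristic: a post-indexed AArch64 memory operand closes the bracket
 * before the offset, so the first ']' is followed by a comma:
 *   post-index: "ldr x0, [x1], #8"
 *   pre-index:  "ldr x0, [x1, #8]!"
 *   no index:   "ldr x0, [x1]"
 */
static bool example_looks_post_indexed(const char *asm_str)
{
    const char *p = strchr(asm_str, ']');
    if (p == NULL)
        return false;          /* no memory operand at all */

    p++;                       /* step past ']' */
    while (*p == ' ')
        p++;                   /* tolerate optional spaces */

    return *p == ',';          /* an operand follows the bracket */
}
```

A check like this depends entirely on the printed syntax, so any change to the asm-string format silently breaks it.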
A possible way to avoid this is by generating the encoding format of the instruction.
Luckily, the encoding format is either already defined (e.g. for PPC or Hexagon) or roughly matches the base class of an instruction.
In LLVM, instructions follow an inheritance hierarchy. Usually the class at the top represents all instructions with a certain encoding format.
Knowing those encoding formats tells us the positions of the relevant bits.
This way, if a certain attribute is not defined in the td files, we can simply test the bits of the instruction bytes.
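The bit test described above could be sketched as follows. Everything here is an assumption for illustration: the format identifiers, the table layout, and the bit position 11 are made up, and a real table would be generated from the td files rather than written by hand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding format IDs; real ones would be generated. */
typedef enum {
    EXAMPLE_FMT_LDST,   /* made-up format that carries a flag bit F */
    EXAMPLE_FMT_OTHER,  /* made-up format without that flag */
} example_format_t;

/* Per-format description of where the flag bit lives (assumed layout). */
typedef struct {
    bool has_flag;      /* does this format encode the flag at all? */
    unsigned flag_bit;  /* bit position of F within the 32-bit word */
} example_format_info_t;

static const example_format_info_t example_format_table[] = {
    [EXAMPLE_FMT_LDST]  = { .has_flag = true,  .flag_bit = 11 },
    [EXAMPLE_FMT_OTHER] = { .has_flag = false, .flag_bit = 0 },
};

/* Returns true if the flag bit F is set in the raw instruction word. */
static bool example_flag_is_set(example_format_t fmt, uint32_t insn_word)
{
    const example_format_info_t *info = &example_format_table[fmt];
    if (!info->has_flag)
        return false;
    return ((insn_word >> info->flag_bit) & 1u) != 0;
}
```

With such a table, the two instructions that differ only in the hard-coded F bit no longer need separate handling: both map to the same format, and the attribute is read directly from the instruction bytes.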