This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Commit fc8b156

Merge pull request #18127 from CarolEidt/StructAbiDoc
Design doc for struct passing
2 parents 208ea16 + 2950f8f commit fc8b156

1 file changed: 147 additions and 0 deletions.
@@ -0,0 +1,147 @@
Passing and Returning Structs
=============================
Problem Statement
-----------------
The current implementation of ABI (Application Binary Interface, aka calling
convention) support in RyuJIT is problematic in a number of areas, especially
when it comes to the handling of structs (aka value types).

- RyuJIT currently supports 4 target architectures: x86, x64 (aka x86-64), ARM
  and ARM64, with two different ABIs for x64 (Windows and Linux).
  These each have unique requirements, yet these requirements are expressed in
  the code programmatically, with `#ifdef`s, and even where the requirements
  are shared, they are often handled in different code paths.

- When passing or returning structs, the code generator sometimes requires
  that the struct be copied to or from memory. The morpher (`fgMorphArgs()`)
  attempts to discern these cases and create copies when necessary, but sometimes it
  makes copies when they aren't needed.

- Even in cases where the code generator currently requires the struct to be
  in memory, it could be enhanced to handle the in-register case:
  - Currently, when we have a register-passed struct that fits in a register,
    but that doesn't have a single field of a matching type,
    `fgMorphArgs()` generates a `GT_LCL_FLD` of the appropriate scalar type
    to reference the value. This forces the struct to be marked `lvDoNotEnregister`.
    However, the backend has support for performing the necessary move in
    some cases (e.g. when a struct with a single field of `TYP_DOUBLE` is passed
    in an integer register as `TYP_LONG`), by generating a `GT_BITCAST` to move
    the value to the appropriate register.
  - In other cases (e.g. a struct with two `TYP_INT` fields in registers), the
    backend should be able to generate the necessary code to place the fields
    in the necessary register(s).

- Even when the requirements are similar, the IR representation, as well as the
  transformations performed by `fgMorphArgs()`, are not the same.

- Much of the information about each argument is contained in the `fgArgInfo`
  on the `GT_CALL` node. It in turn contains an `argTable` with an entry for
  each argument. However, this information is not complete, especially on
  x64/Linux where repeated calls are made to the VM to obtain the struct
  descriptor.

- The functionality of `fgMorphArgs()` combines the determination of the ABI
  requirements, which sets up the `fgArgInfo` and `argTable`, with the IR
  transformations required to ensure that the arguments of the `GT_CALL` are
  in the appropriate form.

- When `fgCanFastTailCall()` is called, it doesn't yet have the `fgArgInfo`,
  so it must duplicate some of the analysis that is done in `fgMorphArgs()`.

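As an illustration of the two-`TYP_INT` case above, the sketch below contrasts assembling the register value through memory with assembling it using a shift and an OR, which is the kind of sequence the backend could emit without forcing the struct into memory. The `TwoInts` type and both helper names are hypothetical, and the layout assumes a little-endian target on which the 8-byte struct is passed in a single 64-bit integer register.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical struct with two TYP_INT-like fields.
struct TwoInts
{
    int32_t a;
    int32_t b;
};
static_assert(sizeof(TwoInts) == 8, "sketch assumes no padding");

// Memory path: the struct is stored, then reloaded as one 64-bit value.
uint64_t packViaMemory(const TwoInts& s)
{
    uint64_t reg = 0;
    std::memcpy(&reg, &s, sizeof(s));
    return reg;
}

// Register path: the fields stay in registers and are combined directly.
// Assumes little-endian layout (field `a` occupies the low 32 bits).
uint64_t packInRegisters(int32_t a, int32_t b)
{
    uint64_t lo = static_cast<uint32_t>(a);
    uint64_t hi = static_cast<uint32_t>(b);
    return lo | (hi << 32);
}
```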
High-Level Proposed Design
--------------------------
This is a preliminary design, and is likely to change as the implementation proceeds:

First, the `fgArgInfo` is extended to contain all the information needed to determine
how an argument is passed. Ideally, most of the `#ifdef`s relating to ABI differences
can be eliminated by querying the `fgArgInfo`. Most of the information will be queried
via properties, such that when a target doesn't support a particular struct passing
mechanism (e.g. passing structs by reference), the property will unconditionally return false, and the associated code paths will be eliminated.

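The property-based approach can be sketched as follows. This is a minimal illustration, not the RyuJIT implementation: `DemoArgInfo`, `demoOutgoingStackSlots` and the `DEMO_TARGET_SPLITS_ARGS` macro are hypothetical names. The target check lives in one place, and the shared code queries the property without any `#ifdef` of its own; when the property is a compile-time `false`, the compiler can drop the guarded branch.

```cpp
#include <cassert>

// Hypothetical stand-in for a real target macro; defined only on targets
// that can split an argument between registers and the stack.
#if defined(DEMO_TARGET_SPLITS_ARGS)
constexpr bool kTargetSplitsArgs = true;
#else
constexpr bool kTargetSplitsArgs = false;
#endif

// Hypothetical, heavily simplified per-argument record.
struct DemoArgInfo
{
    int  numRegs  = 0;     // registers used by this argument
    int  numSlots = 0;     // stack slots used by this argument
    bool splitBit = false; // only meaningful on targets that can split

    // Unconditionally false on targets without split arguments.
    bool isSplit() const { return kTargetSplitsArgs && splitBit; }
};

// Shared code: no #ifdef here, the property hides the target difference.
int demoOutgoingStackSlots(const DemoArgInfo& arg)
{
    if (arg.isSplit())
    {
        return arg.numSlots; // stack portion of a split argument
    }
    return (arg.numRegs > 0) ? 0 : arg.numSlots;
}
```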
The initial determination of the number of arguments and how they
are passed is extracted from `fgMorphArgs()` into a separate method: `fgInitArgInfo()`. It is idempotent - that is, it can be re-invoked and will simply return if it
has already been called. It can be called by `fgCanFastTailCall()` so that it can query
the `argTable` to get the information it requires.

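A minimal sketch of the idempotent pattern described above, assuming a simple "already initialized" flag on the call. All names (`DemoCall`, `demoInitArgInfo`, and so on) are hypothetical, and both the ABI rule and the tail-call criterion are placeholders chosen only to make the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DemoArgEntry
{
    bool onStack = false;
};

struct DemoCall
{
    std::vector<int>          argSizes;            // size in bytes of each argument
    std::vector<DemoArgEntry> argTable;            // filled in once
    bool                      argInfoReady = false;
    int                       abiAnalysisRuns = 0; // counts real work, for the demo
};

// Idempotent: the ABI analysis runs on the first call only.
void demoInitArgInfo(DemoCall& call)
{
    if (call.argInfoReady)
    {
        return;
    }
    call.argTable.resize(call.argSizes.size());
    for (std::size_t i = 0; i < call.argSizes.size(); i++)
    {
        // Placeholder rule: the first 4 args of at most 8 bytes use registers.
        call.argTable[i].onStack = (i >= 4) || (call.argSizes[i] > 8);
    }
    call.argInfoReady = true;
    call.abiAnalysisRuns++;
}

// Placeholder criterion: only calls with no stack arguments qualify.
bool demoCanFastTailCall(DemoCall& call)
{
    demoInitArgInfo(call);
    for (const DemoArgEntry& entry : call.argTable)
    {
        if (entry.onStack)
        {
            return false;
        }
    }
    return true;
}

void demoMorphArgs(DemoCall& call)
{
    demoInitArgInfo(call); // returns immediately if the table already exists
    // IR transformations would follow here.
}
```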
This method is responsible for the first part of what is currently `fgMorphArgs()`, plus setting up the `argTable`:
- Count the number of args.
- Create any non-standard args (e.g. indirection cells or cookie parameters) that
  are needed, but don't yet create copies.
- Create the `argTable` for the given number of args.
- Initialize the `fgArgInfo` for each arg, with all the information about how
  the arg is passed, and whether it requires a temp, but don't yet create any
  temps.
  - On x64/Linux, this is the only method that should need to consult the struct
    descriptor for outgoing arguments.
  - The `isProcessed` flag remains false until `fgMorphArgs()` has handled
    the arg.
  - The `fgArgInfo` contains an array of register numbers (sized according to the
    maximum number of registers used for a single argument). If the first register
    is `REG_STK`, the argument is passed entirely on the stack. For most targets,
    if the first register is a register, the argument is passed entirely in
    registers. When arguments can be split (`_TARGET_ARM_`), this will be indicated
    with an `isSplit` property of `true`.
    - Note that the `isSplit` property would evaluate to false on targets where
      it is not supported, reducing the need for `#ifdef`s (we can rely on the compiler
      to eliminate those dead paths).
- Validate that each struct argument is either a `GT_LCL_VAR`, a `GT_OBJ`,
  or a `GT_MKREFANY`.

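The register-array encoding in the list above can be sketched like this. The enum values, the array size and the type name `DemoRegArgEntry` are assumptions made for illustration; only the convention itself (a first entry equal to the stack marker means "entirely on the stack") comes from the text.

```cpp
#include <array>
#include <cassert>

// Hypothetical register numbering; DEMO_REG_STK plays the role of REG_STK.
enum DemoReg
{
    DEMO_REG_STK = -1,
    DEMO_R0,
    DEMO_R1,
    DEMO_R2,
    DEMO_R3
};

constexpr int kDemoMaxRegsPerArg = 4; // assumed maximum for one argument

struct DemoRegArgEntry
{
    std::array<DemoReg, kDemoMaxRegsPerArg> regNums{
        {DEMO_REG_STK, DEMO_REG_STK, DEMO_REG_STK, DEMO_REG_STK}};
    bool split       = false; // part in registers, part on the stack
    bool needsTmp    = false; // decided up front, temp created later
    bool isProcessed = false; // set once the morph step has handled the arg

    bool isPassedOnStack() const { return regNums[0] == DEMO_REG_STK; }
    bool isSplit() const { return split; }
    bool isPassedInRegs() const { return !isPassedOnStack() && !isSplit(); }

    // Number of leading entries that name a real register.
    int numRegs() const
    {
        int count = 0;
        while (count < kDemoMaxRegsPerArg && regNums[count] != DEMO_REG_STK)
        {
            count++;
        }
        return count;
    }
};
```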
During the initial `fgMorph` phase, `fgMorphArgs()` does the following:

- Calls `fgInitArgInfo()` to ensure that the `argTable` is set up properly.

- Creates a copy of each argument as necessary.
  - This should only be done if one or more of the following conditions hold:
    - A copy is required to preserve possible ordering dependencies, in which
      case the `needsTmp` field of the `fgArgInfo` was set to true by
      `fgInitArgInfo()`.
    - A struct arg has been promoted, it is passed in register(s) (or split),
      and has not yet been marked `lvDoNotEnregister`.

- Sets up the actual argument for any non-standard args.

- Transforms struct arg nodes from `GT_LCL_VAR`, `GT_OBJ` or `GT_MKREFANY` into:
  - `GT_FIELD_LIST` (i.e. a list of fields) if the lclVar is promoted and
    either 1) passed on the stack, or 2) each register used to pass the struct
    corresponds to exactly one field of the struct. The type of the register
    in which a field is passed need not match the type of the field.
    - The case of a single `GT_FIELD_LIST` node subsumes the current
      `GT_LCL_FLD` representation for a matching single-field struct,
      and does not require a lclVar to be marked `lvDoNotEnregister`.
      Any register type mismatch (e.g. a float field passed in an integer
      register) will be handled by `Lowering` (see below).
    - In future, this should include *any* case of a promoted struct, and the
      backend (`Lowering` and/or `CodeGen`) should be enhanced to correctly
      perform the needed re-assembling of fields into registers.
  - `GT_LCL_VAR` if the argument is a non-promoted struct that is either
    marked `lvDoNotEnregister` or fully enregistered, such as a SIMD type lclVar
    or (in future) a struct that fits entirely into a register.
  - `GT_OBJ` otherwise. In this case, if it is a partial reference to a lclVar, it must be
    marked `lvDoNotEnregister`. (If it is a full reference to a lclVar, it falls into
    the `GT_LCL_VAR` case above.) This representation will be used even for structs
    that are passed as a primitive type (i.e. that currently use the `GT_LCL_FLD`
    representation).

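The choice among the three representations above can be read as a small decision function. The sketch below is one interpretation under simplifying assumptions: the `DemoStructArg` flags and the `demoChooseForm` name are hypothetical, and copies, `GT_MKREFANY` handling and non-standard args are left out.

```cpp
#include <cassert>

// The three forms a struct argument can take after the morph step.
enum class DemoArgForm
{
    FieldList, // stands for GT_FIELD_LIST
    LclVar,    // stands for GT_LCL_VAR
    Obj        // stands for GT_OBJ
};

// Hypothetical summary of the facts the decision depends on.
struct DemoStructArg
{
    bool isPromoted          = false; // lclVar has been promoted to fields
    bool passedOnStack       = false;
    bool oneFieldPerRegister = false; // each register holds exactly one field
    bool isFullLclVarRef     = false; // references the whole lclVar
    bool doNotEnregister     = false; // lvDoNotEnregister is set
    bool fullyEnregistered   = false; // e.g. a SIMD-typed lclVar
};

DemoArgForm demoChooseForm(const DemoStructArg& arg)
{
    if (arg.isPromoted && (arg.passedOnStack || arg.oneFieldPerRegister))
    {
        return DemoArgForm::FieldList;
    }
    if (!arg.isPromoted && arg.isFullLclVarRef &&
        (arg.doNotEnregister || arg.fullyEnregistered))
    {
        return DemoArgForm::LclVar;
    }
    return DemoArgForm::Obj; // everything else, including partial references
}
```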
During `Lowering`, any mismatches between the type of an actual register argument (i.e. the
`GT_OBJ` or the `GT_FIELD_LIST` element) and the type of the register, will cause a
`GT_BITCAST` node to be inserted. The purpose of this node is simply to instruct the
register allocator to move the value between the register files, without requiring the
value to necessarily be spilled to memory.

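At the source level, the effect of such a bitcast corresponds to reinterpreting the bits of a value without converting it. The helpers below are a hypothetical illustration for the `TYP_DOUBLE` in an integer register case, assuming IEEE-754 doubles; compilers typically turn the `memcpy` into a single move between register files.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(int64_t), "sketch assumes 64-bit double");

// Same 64 bits, now viewed as an integer (no numeric conversion).
int64_t demoBitcastDoubleToLong(double value)
{
    int64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// The inverse move, from the integer view back to floating point.
double demoBitcastLongToDouble(int64_t bits)
{
    double value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```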
Future
------
There are additional improvements for struct parameters for future consideration:

- Support passing promoted structs in registers (as suggested above), where `Lowering`
  would insert the necessary IR to assemble the fields into registers.
- Instead of generating `GT_FIELD_LIST`, we should consider modeling the passing of a
  promoted struct as separate arguments. This would probably be best implemented by
  modifying the `argTable` during `fgMorphArgs()` such that it reflects the "as-if"
  signature with the exploded struct fields.
  - How this would impact the handling of fields that must be packed into a single
    register remains to be determined (i.e. does `fgMorphArgs()` generate the IR
    to assemble the fields into a single register-sized value, or is that somehow
    deferred?)
- Support vector calling conventions. This should be somewhat simplified by the
  extraction of the ABI code.
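
The "as-if" signature idea can be pictured as flattening the argument table. The sketch below is only one possible reading: `DemoArg`, `DemoField` and `demoExplodeArgs` are hypothetical, and the open question of fields that must share a register is deliberately not addressed.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct DemoField
{
    std::string name;
    int         size; // bytes
};

struct DemoArg
{
    std::string            name;
    int                    size;           // bytes
    std::vector<DemoField> promotedFields; // empty if not a promoted struct
};

// Builds the "as-if" argument list: each promoted struct is replaced by
// one entry per field, all other arguments are kept unchanged.
std::vector<DemoArg> demoExplodeArgs(const std::vector<DemoArg>& args)
{
    std::vector<DemoArg> result;
    for (const DemoArg& arg : args)
    {
        if (arg.promotedFields.empty())
        {
            result.push_back(arg);
            continue;
        }
        for (const DemoField& field : arg.promotedFields)
        {
            result.push_back(DemoArg{arg.name + "." + field.name, field.size, {}});
        }
    }
    return result;
}
```

With this shape, a call taking an `int` and a promoted two-field struct would be described by three table entries instead of two.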
