Skip to content

Commit 78c431a

Browse files
authored
Merge pull request #64308 from gottesmm/pr-f1b021e0dd1be892eb0dbfba3f403d9823e0c5f7
[move-only] Update the address checker description comment given changes to the algorithm.
2 parents 4f74cbb + 156523e commit 78c431a

File tree

1 file changed

+133
-22
lines changed

1 file changed

+133
-22
lines changed

lib/SILOptimizer/Mandatory/MoveOnlyAddressCheckerUtils.cpp

Lines changed: 133 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -17,21 +17,120 @@
1717
/// compiler to perform move checking of address only lets, vars, inout args,
1818
/// and mutating self.
1919
///
20-
/// Algorithm At a High Level
21-
/// -------------------------
20+
/// Move Address Checking in Swift
21+
/// ------------------------------
2222
///
23-
/// At a high level, this algorithm can be conceptualized as attempting to
24-
/// completely classify the memory behavior of the recursive uses of a move only
25-
/// marked address and then seek to transform those uses such that they are in
26-
/// "simple move only address" form. We define "simple move only address" to
27-
/// mean that along any path from an init of an address to a consume, all uses
28-
/// are guaranteed to be semantically "borrow uses". If we find that any such
29-
/// owned uses can not be turned into a "borrow" use, then we emit an error
30-
/// since we would need to insert a copy there.
23+
/// In order to not have to rewrite all of SILGen to avoid copies, Swift has
24+
/// taken an approach where SILGen marks moveonly addresses with a special
25+
/// marker instruction and emits copies when it attempts to access move only
26+
/// addresses. Then this algorithm fixed up SILGen's output by analyzing the
27+
/// memory uses of a marked memory root location recursively using AccessPath
28+
/// based analyses and then attempting to transform those uses based off of the
29+
/// marked kind into one of a few variants of "simple move only address form"
30+
/// (see below for more information). If the pass is unable to reason that it
31+
/// can safely transform the uses into said form, we emit a diagnostic stating
32+
/// the error to the user. If we emit said diagnostic, we then bail early. If we
33+
/// do not emit a diagnostic, we then transform the IR so that the move only
34+
/// address uses are in said form. This then guarantees that despite SILGen
35+
/// emitting move only types with copies, in the end, our move only types are
36+
/// never copied. As an additional check, once the pass has run we emit an extra
37+
/// diagnostic if we find any copies of move only types so that the user can be
38+
/// sure that any accepted program does not copy move only types.
39+
///
40+
/// Simple Move Only Address Form
41+
/// -----------------------------
42+
///
43+
/// We define a memory location to be in "simple move only address" form (SMOA
44+
/// form for ease of typing) to mean that along any path from an init of the
45+
/// address to a consume of the address, all uses are guaranteed to be semantic
46+
/// "borrow uses" instead of semantic "copy uses". Additionally, SMOA does not
47+
/// consider destroy_addr to be a true consuming use since it will rewrite
48+
/// destroy_addr as necessary so the consuming uses are defined by consuming
49+
/// uses modulo destroy_addr.
50+
///
51+
/// An example of a memory location in "simple move only address form" is the
52+
/// following:
53+
///
54+
/// ```
55+
/// // Memory is defined
56+
/// %0 = alloc_stack $Type
57+
///
58+
/// // Initial initialization.
59+
/// store %input to [init] %0 : $Type
60+
///
61+
/// // Sequence of borrow uses.
62+
/// %1 = load_borrow %0 : $Type
63+
/// apply %f(%1) : $@convention(thin) (@guaranteed Type) -> ()
64+
/// end_borrow %1
65+
/// apply %f2(%0) : $@convention(thin) (@in_guaranteed Type) -> ()
66+
///
67+
/// // Assign is ok since we are just consuming the value.
68+
/// store %input2 to [assign] %0 : $*Type
69+
///
70+
/// // More borrow uses.
71+
/// %3 = load_borrow %0 : $*Type
72+
/// apply %f(%3) : $@convention(thin) (@guaranteed Type) -> ()
73+
/// end_borrow %1
74+
/// apply %f2(%0) : $@convention(thin) (@in_guaranteed Type) -> ()
75+
///
76+
/// // Final destroy
77+
/// destroy_addr %0 : $Type
78+
/// ```
79+
///
80+
/// An example of an instruction not in SMOA form is:
81+
///
82+
/// ```
83+
/// // Memory is defined
84+
/// %0 = alloc_stack $Type
85+
///
86+
/// // Initial initialization.
87+
/// store %input to [init] %0 : $*Type
88+
///
89+
/// // Perform a load + copy of %0 to pass as an argument to %f.
90+
/// %1 = load [copy] %0 : $*Type
91+
/// apply %f(%1) : $@convention(thin) (@guaranteed Type) -> ()
92+
/// destroy_value %1 : $Type
93+
///
94+
/// // Initialize other variable.
95+
/// %otherVar = alloc_stack $Type
96+
/// copy_addr %0 to [initialization] %otherVar : $*Type
97+
/// ...
98+
///
99+
/// // Final destroy that is not part of the use set.
100+
/// destroy_addr %0 : $*Type
101+
/// ```
102+
///
103+
/// The variants of SMOA form can be classified by the specific mark_must_check
104+
/// kind put on the the checker mark instruction and are as follows:
105+
///
106+
/// 1. no_consume_or_assign. This means that the address can only be consumed by
107+
/// destroy_addr and otherwise is only read from. This simulates guaranteed
108+
/// semantics.
109+
///
110+
/// 2. consumable_and_assignable. This means that the address can be consumed
111+
/// (e.x.: take/pass to a +1 function) or assigned to. Additionally, the value
112+
/// is supposed to have its lifetime end along all program paths locally in the
113+
/// function. This simulates a local var's semantics.
114+
///
115+
/// 3. assignable_but_not_consumable. This means that the address can be
116+
/// assigned over, but cannot be taken from. It additionally must have a valid
117+
/// value in it and the end of its lifetime. This simulates accesses to class
118+
/// fields, globals, and escaping mutable captures where we want the user to be
119+
/// able to update the value, but allowing for escapes of the value would break
120+
/// memory safety. In all cases where this is used, the mark_must_check is used
121+
/// as the initial def of the value lifetime. Example:
122+
///
123+
/// 4. initable_but_not_consumable. This means that the address can only be
124+
/// initialized once but cannot be taken from or assigned over. It is assumed
125+
/// that the initial def will always be the mark_must_check and that the value
126+
/// will be uninitialized at that point. Example:
127+
///
128+
/// Algorithm Stages In Detail
129+
/// --------------------------
31130
///
32131
/// To implement this, our algorithm works in 4 stages: a use classification
33132
/// stage, a dataflow stage, and then depending on success/failure one of two
34-
/// transform stagesWe describe them below.
133+
/// transform stages.
35134
///
36135
/// Use Classification Stage
37136
/// ~~~~~~~~~~~~~~~~~~~~~~~~
@@ -75,23 +174,22 @@
75174
/// Dataflow Stage
76175
/// ~~~~~~~~~~~~~~
77176
///
78-
/// To perform our dataflow, we do the following:
79-
///
80-
/// 1. We walk each block from top to bottom performing the single block version
81-
/// of the algorithm and preparing field sensitive pruned liveness.
82-
///
83-
/// 2. If we need to, we then use field sensitive pruned liveness to perform
84-
/// global dataflow to determine if any of our takeOrCopies are within the
85-
/// boundary lifetime implying a violation.
177+
/// To perform our dataflow, we take our classified uses and initialize field
178+
/// sensitive pruned liveness with the data. We then use field sensitive pruned
179+
/// liveness and our check kinds to determine if all of our copy uses that could
180+
/// not be changed into borrows are on the liveness boundary of the memory. If
181+
/// they are within the liveness boundary, then we know a copy is needed and we
182+
/// emit an error to the user. Otherwise, we know that we can change them
183+
/// semantically into a take.
86184
///
87185
/// Success Transformation
88186
/// ~~~~~~~~~~~~~~~~~~~~~~
89187
///
90-
/// Upon success Now that we know that we can change our address into "simple
91-
/// move only address form", we transform the IR in the following way:
188+
/// Now that we know that we can change our address into "simple move only
189+
/// address form", we transform the IR in the following way:
92190
///
93191
/// 1. Any load [copy] that are classified as borrows are changed to
94-
/// load_borrow.
192+
/// load_borrow.
95193
/// 2. Any load [copy] that are classified as takes are changed to load [take].
96194
/// 3. Any copy_addr [init] temporary allocation are eliminated with their
97195
/// destroy_addr. All uses are placed on the source address.
@@ -107,6 +205,19 @@
107205
/// are going to fail anyways at this point, so it is ok to do so since we will
108206
/// fail before attempting to codegen into LLVM IR.
109207
///
208+
/// Final Black Box Checks on Success
209+
/// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
210+
///
211+
/// Finally since we want to be able to guarantee to users 100% that the
212+
/// compiler will reject programs even if the checker gives a false success for
213+
/// some reason due to human compiler writer error, we do a last pass over the
214+
/// IR and emit an error diagnostic on any copies of move only types that we
215+
/// see. The error states to the user that this is a compiler bug and to file a
216+
/// bug report. Since it is a completely separate, simple implementation, this
217+
/// gives the user of our implementation the confidence to know that the
218+
/// compiler even in the face of complexity in the checker will emit correct
219+
/// code.
220+
///
110221
//===----------------------------------------------------------------------===//
111222

112223
#define DEBUG_TYPE "sil-move-only-checker"

0 commit comments

Comments
 (0)