|
17 | 17 | /// compiler to perform move checking of address only lets, vars, inout args,
|
18 | 18 | /// and mutating self.
|
19 | 19 | ///
|
20 |
| -/// Algorithm At a High Level |
21 |
| -/// ------------------------- |
| 20 | +/// Move Address Checking in Swift |
| 21 | +/// ------------------------------ |
22 | 22 | ///
|
23 |
| -/// At a high level, this algorithm can be conceptualized as attempting to |
24 |
| -/// completely classify the memory behavior of the recursive uses of a move only |
25 |
| -/// marked address and then seek to transform those uses such that they are in |
26 |
| -/// "simple move only address" form. We define "simple move only address" to |
27 |
| -/// mean that along any path from an init of an address to a consume, all uses |
28 |
| -/// are guaranteed to be semantically "borrow uses". If we find that any such |
29 |
| -/// owned uses can not be turned into a "borrow" use, then we emit an error |
30 |
| -/// since we would need to insert a copy there. |
| 23 | +/// In order to not have to rewrite all of SILGen to avoid copies, Swift has |
| 24 | +/// taken an approach where SILGen marks moveonly addresses with a special |
| 25 | +/// marker instruction and emits copies when it attempts to access move only |
| 26 | +/// addresses. Then this algorithm fixed up SILGen's output by analyzing the |
| 27 | +/// memory uses of a marked memory root location recursively using AccessPath |
| 28 | +/// based analyses and then attempting to transform those uses based off of the |
| 29 | +/// marked kind into one of a few variants of "simple move only address form" |
| 30 | +/// (see below for more information). If the pass is unable to reason that it |
| 31 | +/// can safely transform the uses into said form, we emit a diagnostic stating |
| 32 | +/// the error to the user. If we emit said diagnostic, we then bail early. If we |
| 33 | +/// do not emit a diagnostic, we then transform the IR so that the move only |
| 34 | +/// address uses are in said form. This then guarantees that despite SILGen |
| 35 | +/// emitting move only types with copies, in the end, our move only types are |
| 36 | +/// never copied. As an additional check, once the pass has run we emit an extra |
| 37 | +/// diagnostic if we find any copies of move only types so that the user can be |
| 38 | +/// sure that any accepted program does not copy move only types. |
| 39 | +/// |
| 40 | +/// Simple Move Only Address Form |
| 41 | +/// ----------------------------- |
| 42 | +/// |
| 43 | +/// We define a memory location to be in "simple move only address" form (SMOA |
| 44 | +/// form for ease of typing) to mean that along any path from an init of the |
| 45 | +/// address to a consume of the address, all uses are guaranteed to be semantic |
| 46 | +/// "borrow uses" instead of semantic "copy uses". Additionally, SMOA does not |
| 47 | +/// consider destroy_addr to be a true consuming use since it will rewrite |
| 48 | +/// destroy_addr as necessary so the consuming uses are defined by consuming |
| 49 | +/// uses modulo destroy_addr. |
| 50 | +/// |
| 51 | +/// An example of a memory location in "simple move only address form" is the |
| 52 | +/// following: |
| 53 | +/// |
| 54 | +/// ``` |
| 55 | +/// // Memory is defined |
| 56 | +/// %0 = alloc_stack $Type |
| 57 | +/// |
| 58 | +/// // Initial initialization. |
| 59 | +/// store %input to [init] %0 : $Type |
| 60 | +/// |
| 61 | +/// // Sequence of borrow uses. |
| 62 | +/// %1 = load_borrow %0 : $Type |
| 63 | +/// apply %f(%1) : $@convention(thin) (@guaranteed Type) -> () |
| 64 | +/// end_borrow %1 |
| 65 | +/// apply %f2(%0) : $@convention(thin) (@in_guaranteed Type) -> () |
| 66 | +/// |
| 67 | +/// // Assign is ok since we are just consuming the value. |
| 68 | +/// store %input2 to [assign] %0 : $*Type |
| 69 | +/// |
| 70 | +/// // More borrow uses. |
| 71 | +/// %3 = load_borrow %0 : $*Type |
| 72 | +/// apply %f(%3) : $@convention(thin) (@guaranteed Type) -> () |
| 73 | +/// end_borrow %1 |
| 74 | +/// apply %f2(%0) : $@convention(thin) (@in_guaranteed Type) -> () |
| 75 | +/// |
| 76 | +/// // Final destroy |
| 77 | +/// destroy_addr %0 : $Type |
| 78 | +/// ``` |
| 79 | +/// |
| 80 | +/// An example of an instruction not in SMOA form is: |
| 81 | +/// |
| 82 | +/// ``` |
| 83 | +/// // Memory is defined |
| 84 | +/// %0 = alloc_stack $Type |
| 85 | +/// |
| 86 | +/// // Initial initialization. |
| 87 | +/// store %input to [init] %0 : $*Type |
| 88 | +/// |
| 89 | +/// // Perform a load + copy of %0 to pass as an argument to %f. |
| 90 | +/// %1 = load [copy] %0 : $*Type |
| 91 | +/// apply %f(%1) : $@convention(thin) (@guaranteed Type) -> () |
| 92 | +/// destroy_value %1 : $Type |
| 93 | +/// |
| 94 | +/// // Initialize other variable. |
| 95 | +/// %otherVar = alloc_stack $Type |
| 96 | +/// copy_addr %0 to [initialization] %otherVar : $*Type |
| 97 | +/// ... |
| 98 | +/// |
| 99 | +/// // Final destroy that is not part of the use set. |
| 100 | +/// destroy_addr %0 : $*Type |
| 101 | +/// ``` |
| 102 | +/// |
| 103 | +/// The variants of SMOA form can be classified by the specific mark_must_check |
| 104 | +/// kind put on the the checker mark instruction and are as follows: |
| 105 | +/// |
| 106 | +/// 1. no_consume_or_assign. This means that the address can only be consumed by |
| 107 | +/// destroy_addr and otherwise is only read from. This simulates guaranteed |
| 108 | +/// semantics. |
| 109 | +/// |
| 110 | +/// 2. consumable_and_assignable. This means that the address can be consumed |
| 111 | +/// (e.x.: take/pass to a +1 function) or assigned to. Additionally, the value |
| 112 | +/// is supposed to have its lifetime end along all program paths locally in the |
| 113 | +/// function. This simulates a local var's semantics. |
| 114 | +/// |
| 115 | +/// 3. assignable_but_not_consumable. This means that the address can be |
| 116 | +/// assigned over, but cannot be taken from. It additionally must have a valid |
| 117 | +/// value in it and the end of its lifetime. This simulates accesses to class |
| 118 | +/// fields, globals, and escaping mutable captures where we want the user to be |
| 119 | +/// able to update the value, but allowing for escapes of the value would break |
| 120 | +/// memory safety. In all cases where this is used, the mark_must_check is used |
| 121 | +/// as the initial def of the value lifetime. Example: |
| 122 | +/// |
| 123 | +/// 4. initable_but_not_consumable. This means that the address can only be |
| 124 | +/// initialized once but cannot be taken from or assigned over. It is assumed |
| 125 | +/// that the initial def will always be the mark_must_check and that the value |
| 126 | +/// will be uninitialized at that point. Example: |
| 127 | +/// |
| 128 | +/// Algorithm Stages In Detail |
| 129 | +/// -------------------------- |
31 | 130 | ///
|
32 | 131 | /// To implement this, our algorithm works in 4 stages: a use classification
|
33 | 132 | /// stage, a dataflow stage, and then depending on success/failure one of two
|
34 |
| -/// transform stagesWe describe them below. |
| 133 | +/// transform stages. |
35 | 134 | ///
|
36 | 135 | /// Use Classification Stage
|
37 | 136 | /// ~~~~~~~~~~~~~~~~~~~~~~~~
|
|
75 | 174 | /// Dataflow Stage
|
76 | 175 | /// ~~~~~~~~~~~~~~
|
77 | 176 | ///
|
78 |
| -/// To perform our dataflow, we do the following: |
79 |
| -/// |
80 |
| -/// 1. We walk each block from top to bottom performing the single block version |
81 |
| -/// of the algorithm and preparing field sensitive pruned liveness. |
82 |
| -/// |
83 |
| -/// 2. If we need to, we then use field sensitive pruned liveness to perform |
84 |
| -/// global dataflow to determine if any of our takeOrCopies are within the |
85 |
| -/// boundary lifetime implying a violation. |
| 177 | +/// To perform our dataflow, we take our classified uses and initialize field |
| 178 | +/// sensitive pruned liveness with the data. We then use field sensitive pruned |
| 179 | +/// liveness and our check kinds to determine if all of our copy uses that could |
| 180 | +/// not be changed into borrows are on the liveness boundary of the memory. If |
| 181 | +/// they are within the liveness boundary, then we know a copy is needed and we |
| 182 | +/// emit an error to the user. Otherwise, we know that we can change them |
| 183 | +/// semantically into a take. |
86 | 184 | ///
|
87 | 185 | /// Success Transformation
|
88 | 186 | /// ~~~~~~~~~~~~~~~~~~~~~~
|
89 | 187 | ///
|
90 |
| -/// Upon success Now that we know that we can change our address into "simple |
91 |
| -/// move only address form", we transform the IR in the following way: |
| 188 | +/// Now that we know that we can change our address into "simple move only |
| 189 | +/// address form", we transform the IR in the following way: |
92 | 190 | ///
|
93 | 191 | /// 1. Any load [copy] that are classified as borrows are changed to
|
94 |
| -/// load_borrow. |
| 192 | +/// load_borrow. |
95 | 193 | /// 2. Any load [copy] that are classified as takes are changed to load [take].
|
96 | 194 | /// 3. Any copy_addr [init] temporary allocation are eliminated with their
|
97 | 195 | /// destroy_addr. All uses are placed on the source address.
|
|
107 | 205 | /// are going to fail anyways at this point, so it is ok to do so since we will
|
108 | 206 | /// fail before attempting to codegen into LLVM IR.
|
109 | 207 | ///
|
| 208 | +/// Final Black Box Checks on Success |
| 209 | +/// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 210 | +/// |
| 211 | +/// Finally since we want to be able to guarantee to users 100% that the |
| 212 | +/// compiler will reject programs even if the checker gives a false success for |
| 213 | +/// some reason due to human compiler writer error, we do a last pass over the |
| 214 | +/// IR and emit an error diagnostic on any copies of move only types that we |
| 215 | +/// see. The error states to the user that this is a compiler bug and to file a |
| 216 | +/// bug report. Since it is a completely separate, simple implementation, this |
| 217 | +/// gives the user of our implementation the confidence to know that the |
| 218 | +/// compiler even in the face of complexity in the checker will emit correct |
| 219 | +/// code. |
| 220 | +/// |
110 | 221 | //===----------------------------------------------------------------------===//
|
111 | 222 |
|
112 | 223 | #define DEBUG_TYPE "sil-move-only-checker"
|
|
0 commit comments