Commit aad71d3

cmd/compile: reorganise and improve ssa/README.md
Since the initial version was written, I've gotten help writing cmd/compile/README.md and I've also learned some more on my own, so it's time to organise this document better and expand it.

First, split up the document in sections, starting from the simplest ideas that can be explained on their own. From there, build all the way up into SSA functions and how they are compiled.

Each of the sections also gets more detail now; most ideas that were a paragraph are now a section with several paragraphs. No new major sections have been added in this CL.

While at it, add a copyright notice and make better use of markdown, just like in the other README.md.

Also fix a file path in value.go, which I noticed to be stale while reading godocs to write the document.

Finally, leave a few TODO comments for areas that would benefit from extra input from people familiar with the SSA package. They will be taken care of in future CLs.

Change-Id: I85e7a69a0b3260e72139991a625d926099624f71
Reviewed-on: https://go-review.googlesource.com/110067
Reviewed-by: Keith Randall <[email protected]>
1 parent 2ee6bfb commit aad71d3

2 files changed

+192
-42
lines changed

Lines changed: 191 additions & 41 deletions
@@ -1,59 +1,209 @@
-This package contains the compiler's Static Single Assignment form
-component. If you're not familiar with SSA, Wikipedia is a good starting
-point:
+<!---
+// Copyright 2018 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+-->
 
-https://en.wikipedia.org/wiki/Static_single_assignment_form
+## Introduction to the Go compiler's SSA backend
 
-SSA is useful to perform transformations and optimizations, which can be
-found in this package in the form of compiler passes and rewrite rules.
-The former can be found in the "passes" array in compile.go, while the
-latter are generated from gen/*.rules.
+This package contains the compiler's Static Single Assignment form component. If
+you're not familiar with SSA, its [Wikipedia
+article](https://en.wikipedia.org/wiki/Static_single_assignment_form) is a good
+starting point.
 
-Like most other SSA forms, funcs consist of blocks and values. Values
-perform an operation, which is encoded in the form of an operator and a
-number of arguments. The semantics of each Op can be found in
-gen/*Ops.go.
+It is recommended that you first read [cmd/compile/README.md](../../README.md)
+if you are not familiar with the Go compiler already. That document gives an
+overview of the compiler, and explains what is SSA's part and purpose in it.
 
-gen/* is used to generate code in the ssa package. This includes
-opGen.go from gen/*Ops.go, and the rewrite*.go files from gen/*.rules.
-To regenerate these files, see gen/README.
+### Key concepts
 
-Blocks can have multiple forms. For example, BlockPlain will always hand
-the control flow to another block, and BlockIf will flow to one of two
-blocks depending on a value. See block.go for more details.
+The names described below may be loosely related to their Go counterparts, but
+note that they are not equivalent. For example, a Go block statement has a
+variable scope, yet SSA has no notion of variables nor variable scopes.
 
-Values also have types. For example, a constant boolean value will have
-a Bool type, and a variable definition value will have a memory type.
+It may also be surprising that values and blocks are named after their unique
+sequential IDs. They rarely correspond to named entities in the original code,
+such as variables or function parameters. The sequential IDs also allow the
+compiler to avoid maps, and it is always possible to track back the values to Go
+code using debug and position information.
 
-The memory type is special - it represents the global memory state. For
-example, an Op that takes a memory argument depends on that memory
-state, and an Op which has the memory type impacts the state of memory.
-This is important so that memory operations are kept in the right order.
+#### Values
 
-For example, take this program:
+Values are the basic building blocks of SSA. Per SSA's very definition, a
+value is defined exactly once, but it may be used any number of times. A value
+mainly consists of a unique identifier, an operator, a type, and some arguments.
 
-    func f(a, b *int) {
-    *a = 3
-    *b = *a
-    }
+An operator or `Op` describes the operation that computes the value. The
+semantics of each operator can be found in `gen/*Ops.go`. For example, `OpAdd8`
+takes two value arguments holding 8-bit integers and results in their addition.
+Here is a possible SSA representation of the addition of two `uint8` values:
 
-The two generated stores may show up as follows:
+    // var c uint8 = a + b
+    v4 = Add8 <uint8> v2 v3
 
-    v10 (4) = Store <mem> {int} v6 v8 v1
-    v14 (5) = Store <mem> {int} v7 v8 v10
+A value's type will usually be a Go type. For example, the value in the example
+above has a `uint8` type, and a constant boolean value will have a `bool` type.
+However, certain types don't come from Go and are special; below we will cover
+`memory`, the most common of them.
 
-Since the second store has a memory argument v10, it cannot be reordered
-before the first store, which sets that global memory state. And the
-logic translates to the code; reordering the two assignments would
-result in a different program.
+See [value.go](value.go) for more information.
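
The description of a value (a unique identifier, an operator, a type, and some arguments) can be sketched in a few lines of Go. This is only an illustration: the `Value` struct below is a hypothetical, simplified stand-in and not the real type from value.go, and the `Arg` operator used for `v2` and `v3` is an assumption, since the README example does not show how they are defined.

```go
package main

import (
	"fmt"
	"strings"
)

// Value is a hypothetical, simplified stand-in for an SSA value.
// It is not the compiler's real Value type.
type Value struct {
	ID   int      // unique sequential identifier, printed as vN
	Op   string   // operator name, e.g. "Add8"
	Type string   // type name, e.g. "uint8"
	Args []*Value // values this value is computed from
}

// String renders the value in the textual form used in the README.
func (v *Value) String() string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "v%d = %s <%s>", v.ID, v.Op, v.Type)
	for _, a := range v.Args {
		fmt.Fprintf(&sb, " v%d", a.ID)
	}
	return sb.String()
}

func main() {
	// Assumption: v2 and v3 are function arguments.
	a := &Value{ID: 2, Op: "Arg", Type: "uint8"}
	b := &Value{ID: 3, Op: "Arg", Type: "uint8"}
	sum := &Value{ID: 4, Op: "Add8", Type: "uint8", Args: []*Value{a, b}}
	fmt.Println(sum) // v4 = Add8 <uint8> v2 v3
}
```

Arguments are stored as pointers to other values rather than as names, which mirrors the point that SSA has no notion of variables.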
+
+#### Memory types
+
+`memory` represents the global memory state. An `Op` that takes a memory
+argument depends on that memory state, and an `Op` which has the memory type
+impacts the state of memory. This ensures that memory operations are kept in the
+right order. For example:
+
+    // *a = 3
+    // *b = *a
+    v10 = Store <mem> {int} v6 v8 v1
+    v14 = Store <mem> {int} v7 v8 v10
+
+Here, `Store` stores its second argument (of type `int`) into the first argument
+(of type `*int`). The last argument is the memory state; since the second store
+depends on the memory value defined by the first store, the two stores cannot be
+reordered.
+
+See [cmd/compile/internal/types/type.go](../types/type.go) for more information.
+
+#### Blocks
+
+A block represents a basic block in the control flow graph of a function. It is,
+essentially, a list of values that define the operation of this block. Besides
+the list of values, blocks mainly consist of a unique identifier, a kind, and a
+list of successor blocks.
+
+The simplest kind is a `plain` block; it simply hands the control flow to
+another block, thus its successors list contains one block.
+
+Another common block kind is the `exit` block. These have a final value, called
+control value, which must return a memory state. This is necessary for functions
+to return some values, for example - the caller needs some memory state to
+depend on, to ensure that it receives those return values correctly.
+
+The last important block kind we will mention is the `if` block. Its control
+value must be a boolean value, and it has exactly two successor blocks. The
+control flow is handed to the first successor if the bool is true, and to the
+second otherwise.
+
+Here is a sample if-else control flow represented with basic blocks:
+
+    // func(b bool) int {
+    // if b {
+    // return 2
+    // }
+    // return 3
+    // }
+    b1:
+    v1 = InitMem <mem>
+    v2 = SP <uintptr>
+    v5 = Addr <*int> {~r1} v2
+    v6 = Arg <bool> {b}
+    v8 = Const64 <int> [2]
+    v12 = Const64 <int> [3]
+    If v6 -> b2 b3
+    b2: <- b1
+    v10 = VarDef <mem> {~r1} v1
+    v11 = Store <mem> {int} v5 v8 v10
+    Ret v11
+    b3: <- b1
+    v14 = VarDef <mem> {~r1} v1
+    v15 = Store <mem> {int} v5 v12 v14
+    Ret v15
+
+<!---
+TODO: can we come up with a shorter example that still shows the control flow?
+-->
+
+See [block.go](block.go) for more information.
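
The successor rules for `plain`, `if` and `exit` blocks can be sketched as follows. The `Block` struct and the `next` helper are hypothetical simplifications made up for this illustration; the real block type in block.go carries more information, such as the list of values and the control value itself.

```go
package main

import "fmt"

// Block is a hypothetical, simplified basic block.
type Block struct {
	ID    int
	Kind  string   // "plain", "if" or "exit"
	Succs []*Block // successor blocks
}

// next returns the block that receives the control flow after b.
// control is the boolean control value and only matters for "if"
// blocks. Exit blocks have no successor, so next returns nil.
func next(b *Block, control bool) *Block {
	switch b.Kind {
	case "plain":
		return b.Succs[0]
	case "if":
		if control {
			return b.Succs[0]
		}
		return b.Succs[1]
	}
	return nil
}

func main() {
	// Shape of the if-else sample: b1 branches to b2 or b3.
	b2 := &Block{ID: 2, Kind: "exit"}
	b3 := &Block{ID: 3, Kind: "exit"}
	b1 := &Block{ID: 1, Kind: "if", Succs: []*Block{b2, b3}}
	fmt.Println(next(b1, true).ID, next(b1, false).ID) // 2 3
}
```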
+
+#### Functions
+
+A function represents a function declaration along with its body. It mainly
+consists of a name, a type (its signature), a list of blocks that form its body,
+and the entry block within said list.
+
+When a function is called, the control flow is handed to its entry block. If the
+function terminates, the control flow will eventually reach an exit block, thus
+ending the function call.
+
+Note that a function may have zero or multiple exit blocks, just like a Go
+function can have any number of return points, but it must have exactly one
+entry point block.
+
+Also note that some SSA functions are autogenerated, such as the hash functions
+for each type used as a map key.
+
+For example, this is what an empty function can look like in SSA, with a single
+exit block that returns an uninteresting memory state:
+
+    foo func()
+    b1:
+    v1 = InitMem <mem>
+    Ret v1
+
+See [func.go](func.go) for more information.
+
+### Compiler passes
+
+Having a program in SSA form is not very useful on its own. Its advantage lies
+in how easy it is to write optimizations that modify the program to make it
+better. The way the Go compiler accomplishes this is via a list of passes.
+
+Each pass transforms a SSA function in some way. For example, a dead code
+elimination pass will remove blocks and values that it can prove will never be
+executed, and a nil check elimination pass will remove nil checks which it can
+prove to be redundant.
+
+Compiler passes work on one function at a time, and by default run sequentially
+and exactly once.
+
+The `lower` pass is special; it converts the SSA representation from being
+machine-independent to being machine-dependent. That is, some abstract operators
+are replaced with their non-generic counterparts, potentially reducing or
+increasing the final number of values.
+
+<!---
+TODO: Probably explain here why the ordering of the passes matters, and why some
+passes like deadstore have multiple variants at different stages.
+-->
+
+See the `passes` list defined in [compile.go](compile.go) for more information.
+
+### Playing with SSA
 
 A good way to see and get used to the compiler's SSA in action is via
-GOSSAFUNC. For example, to see func Foo's initial SSA form and final
+`GOSSAFUNC`. For example, to see func `Foo`'s initial SSA form and final
 generated assembly, one can run:
 
     GOSSAFUNC=Foo go build
 
-The generated ssa.html file will also contain the SSA func at each of
-the compile passes, making it easy to see what each pass does to a
-particular program. You can also click on values and blocks to highlight
-them, to help follow the control flow and values.
+The generated `ssa.html` file will also contain the SSA func at each of the
+compile passes, making it easy to see what each pass does to a particular
+program. You can also click on values and blocks to highlight them, to help
+follow the control flow and values.
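
To try the `GOSSAFUNC=Foo go build` command on something small, a program along these lines could be used. It is only a suggestion: the file layout is an assumption, and `Foo` simply reuses the if-else function from the Blocks example so that the output can be compared with the SSA shown there.

```go
package main

import "fmt"

// Foo has the same shape as the if-else sample in the Blocks
// section: one branch and two return points.
func Foo(b bool) int {
	if b {
		return 2
	}
	return 3
}

func main() {
	fmt.Println(Foo(true), Foo(false)) // 2 3
}
```

The exact values and blocks in the generated `ssa.html` will likely differ from the README sample, depending on the compiler version.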
+
+<!---
+TODO: need more ideas for this section
+-->
+
+### Hacking on SSA
+
+While most compiler passes are implemented directly in Go code, some others are
+code generated. This is currently done via rewrite rules, which have their own
+syntax and are maintained in `gen/*.rules`. Simpler optimizations can be written
+easily and quickly this way, but rewrite rules are not suitable for more complex
+optimizations.
+
+To read more on rewrite rules, have a look at the top comments in
+[gen/generic.rules](gen/generic.rules) and [gen/rulegen.go](gen/rulegen.go).
+
+Similarly, the code to manage operators is also code generated from
+`gen/*Ops.go`, as it is easier to maintain a few tables than a lot of code.
+After changing the rules or operators, see [gen/README](gen/README) for
+instructions on how to generate the Go code again.
+
+<!---
+TODO: more tips and info could likely go here
+-->

src/cmd/compile/internal/ssa/value.go

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ type Value struct {
 	Op Op
 
 	// The type of this value. Normally this will be a Go type, but there
-	// are a few other pseudo-types, see type.go.
+	// are a few other pseudo-types, see ../types/type.go.
 	Type *types.Type
 
 	// Auxiliary info for this value. The type of this information depends on the opcode and type.
