This document describes the idea and implementation details for parallel type-checking of independent files in the F# compiler.
Performance of F# compilation and code analysis is one of the concerns for big codebases.
Despite several recent improvements in this area, there is still appetite and potential for change.
One way to speed it up was originally described in https://github.com/dotnet/fsharp/discussions/11634 by @kerams.
That is going to be the main topic of this page.
But before we dive into the details, let's first discuss how things work at the moment.
## Context and the current state of the compiler
Type-checking information is accumulated in a single state (`TcState`) instance; this instance is incrementally built on as more and more files have been processed.
### Recent addition - "Parallel type checking for impl files with backing sig files"
A recent [change](https://github.com/dotnet/fsharp/pull/13737) to the compiler added a level of parallelism in type-checking (behind an experimental feature flag).
It allows for parallel type-checking of implementation files backed by signature files.
Such files by definition cannot be depended upon by any other files w.r.t. type-checking, since all the necessary information is exposed by the corresponding `.fsi` files.
The new feature, when enabled, allows partial parallelisation of type-checking as follows:
1. All `.fsi` files and `.fs` files without backing `.fsi` files are type-checked in sequence, as before.
2. All `.fs` files that are backed by `.fsi` files are then type-checked in parallel.

The feature is opt-in and can be enabled in the compiler via a CLI arg & MSBuild property.
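For illustration, here is a minimal (hypothetical) signature/implementation pair - other files can only ever see what `A.fsi` exposes, so `A.fs` can safely be type-checked in parallel with the rest of the project:

```fsharp
// A.fsi - the signature file; this is all that other files can see of module A
module A

val add: int -> int -> int
```

```fsharp
// A.fs - the implementation file backed by A.fsi
module A

let add x y = x + y
```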
## The importance of using Server GC for parallel work
By default, .NET processes use Workstation GC, which is single-threaded. This means it can become a bottleneck for highly parallel operations, due to increased GC pressure and the cost of GC pauses being multiplied by the number of threads waiting.
That is why, when increasing parallelisation of the compiler and the compiler service, it is important to note the GC mode being used and to consider enabling Server GC.
This is no different for parallel type-checking - any performance tests of the feature should be done using Server GC.
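As a sanity check before benchmarking, a small F# snippet (a sketch using the standard .NET `GCSettings` APIs, not compiler-specific code) can confirm which GC mode the process is actually running with:

```fsharp
open System.Runtime

// Server GC is typically enabled via the <ServerGarbageCollection>true</ServerGarbageCollection>
// MSBuild property or the DOTNET_gcServer=1 environment variable.
printfn "Server GC enabled: %b" GCSettings.IsServerGC
printfn "GC latency mode:   %A" GCSettings.LatencyMode
```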
Below is an example showing the difference it can make for a parallel workflow.
For more details see https://github.com/dotnet/fsharp/pull/13521
The main idea is quite simple:
- process files in a graph order instead of sequential order
- quickly reduce the dependency graph used for type-checking, increasing the parallelism possible
- implement delta-based type-checking that allows building a 'fresh' `TcState` copy from a list of delta-based results.

Below is some quasi-theoretical background on type-checking in general.
### Background
Files in an F# project are ordered and processed from the top (first) file to the bottom (last) file.

Consider an example project with four files. By default, they are type-checked in the order of appearance: `[A.fs, B.fs, C.fs, D.fs]`.
Let's define `allowed dependency` as follows:
> If the contents of 'B.fs' _can_, based on its position in the project hierarchy, influence the type-checking process of 'A.fs', then 'A.fs' -> 'B.fs' is an _allowed dependency_
The _allowed dependencies graph_ looks as follows:
```
A.fs -> []
B.fs -> [A.fs]
C.fs -> [B.fs; A.fs]
D.fs -> [C.fs; B.fs; A.fs]
```
Sequential type-checking of files in the appearance order guarantees that when processing a given file, any files it _might_ need w.r.t. type-checking have already been type-checked and their type information is available.
### Necessary dependencies
Let's define `necessary dependency` as follows:
> If the contents of 'B.fs' _do_ influence the type-checking of 'A.fs', then 'A.fs' -> 'B.fs' is a _necessary dependency_

And finally, let's define a `dependency graph` as follows:
> A _dependency graph_ is any graph that is a subset of the `allowed dependencies` graph and a superset of the `necessary dependencies` graph
A few slightly imprecise/vague statements about all the graphs:
1. Any dependency graph is a directed, acyclic graph (DAG).
2. The _necessary dependencies_ graph is a subgraph of the _allowed dependencies_ graph.
3. If there is no path between 'B.fs' and 'C.fs' in the _necessary dependencies_ graph, they can be type-checked in parallel (as long as there is a way to maintain more than one instance of type-checking information).
4. Type-checking _must_ process files in an order that is compatible with the topological order in the _necessary dependencies_ graph.
5. If using a dependency graph as an ordering mechanism for (parallel) type-checking, the closer it is to the _necessary dependencies_ graph, the more parallelism is possible.
6. Type-checking files in appearance order is equivalent to using the `allowed dependencies` graph for ordering.

Let's look at point `6.` in more detail.
### The impact of reducing the dependency graph on type-checking parallelisation and wall-clock time.
Let us make a few definitions and simplifications:
1. Time it takes to type-check file f = `T(f)`
2. Time it takes to type-check files f1...fn in parallel = `T(f1+...+fn)`
3. Time it takes to type-check a file f and all its dependencies = `D(f)`
4. Time it takes to type-check the whole graph G = `D(G)`
5. Type-checking is performed on a machine with an infinite number of parallel processors.
6. There are no slowdowns due to parallel processing, i.e. `T(f1+...+fn) = max(T(f1),...,T(fn))`

With the above it can be observed that:
```
D(G) = max(D(f)), over all files 'f'

and

D(f) = max(D(n)) + T(f), for n = any necessary dependency of 'f'
```
In other words, the wall-clock time for type-checking using a given dependency graph is equal to the "longest" path in the graph, where the length of a path is the sum of the type-checking times of the files on it.
For the _allowed dependencies graph_ the following holds:
```
D(f) = T(f) + sum(T(g)), for all files 'g' above file 'f'
```
In other words, with the _allowed dependencies_ graph the length of the longest path equals the sum of the times to type-check all files, so there is no parallelism to exploit.
Therefore the change that parallel type-checking brings is the replacement of the _allowed dependencies_ graph as currently used with a reduced graph that is:
- much more similar to the _necessary dependencies_ graph,
- providing a smaller value of `D(G)`.
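As a hypothetical illustration, suppose that in the four-file example above the necessary dependencies are only `B.fs -> [A.fs]`, `C.fs -> [A.fs]` and `D.fs -> [B.fs; C.fs]`, and that every file takes 1 unit of time to type-check:

```
Allowed dependencies graph:   D(G) = T(A) + T(B) + T(C) + T(D)       = 4
Reduced graph:                D(G) = T(A) + max(T(B), T(C)) + T(D)   = 3
```

Here `B.fs` and `C.fs` can be type-checked in parallel once `A.fs` is done; on wide project graphs the reduction in `D(G)` is correspondingly larger.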
## A way to reduce the dependency graph used
Determining the exact _necessary dependencies_ graph up front is not really feasible, as it would require information produced by type-checking itself. However, there exist cheaper solutions that reduce the initial graph significantly.
As noted in https://github.com/dotnet/fsharp/discussions/11634, scanning the ASTs can provide a lot of information that helps narrow down the set of types, modules/namespaces and files that a given file _might_ depend on.
This is the approach used in this solution.
The dependency detection algorithm can be summarised as follows:
1. For each parsed file in parallel, use its parsed AST to extract the following:
   1. Top-level definitions (modules and namespaces). Consider `AutoOpens`.
   2. Partial module references, found by traversing the AST once (`open` statements, qualified identifiers, module abbreviations etc.).
2. Build a single [Trie](https://en.wikipedia.org/wiki/Trie) by adding all top-level items extracted in 1.1. Edges in the Trie represent module/namespace _segments_ (eg. `FSharp`, `Compiler`, `Service` in `FSharp.Compiler.Service`). For each node, keep track of any files that define the prefix it represents.
3. For each file, in parallel:
   1. Process all partial module references found in this file in 1.2. one-by-one, in order of appearance in the AST.
   2. For each reference, identify what nodes in the Trie can be located using it.
   3. Collect all nodes reached this way and all files with any items located in those nodes.
   4. Return those files as dependencies.
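A minimal F# sketch of such a Trie (simplified, hypothetical types and names - the actual compiler implementation differs in detail):

```fsharp
// One node per module/namespace segment; Files records which files declared
// something at the prefix this node represents.
type TrieNode =
    { Children: Map<string, TrieNode>
      Files: Set<string> }

let emptyNode = { Children = Map.empty; Files = Set.empty }

/// Add a file's top-level module/namespace path, eg. ["FSharp"; "Compiler"; "Service"].
let rec add (file: string) (path: string list) (node: TrieNode) : TrieNode =
    match path with
    | [] -> { node with Files = Set.add file node.Files }
    | segment :: rest ->
        let child =
            node.Children
            |> Map.tryFind segment
            |> Option.defaultValue emptyNode
        { node with Children = Map.add segment (add file rest child) node.Children }

/// Walk the Trie along a partial module reference (eg. an `open`, split into segments)
/// and collect the files attached to every node on the matched path.
let rec query (path: string list) (node: TrieNode) : Set<string> =
    match path with
    | [] -> node.Files
    | segment :: rest ->
        match Map.tryFind segment node.Children with
        | Some child -> Set.union node.Files (query rest child)
        | None -> node.Files
```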
### Edge-case 1. - `[<AutoOpen>]`
Modules with `[<AutoOpen>]` are in a way 'transparent', meaning that all the types/nested modules inside them are surfaced as if they were on a level above.
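For illustration, a minimal sketch of that transparency (hypothetical module and value names):

```fsharp
module Library

[<AutoOpen>]
module Helpers =
    let defaultTimeout = 30

// In another file, `open Library` alone is enough to make `defaultTimeout` visible,
// as if it were declared directly in `Library`.
```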
The main problem with that is that `AutoOpenAttribute` could be aliased and hide behind a different name.
Therefore it's not easy to see whether the attribute is being used based only on the AST.
There are ways to evaluate this, which involve scanning all type aliases in the project and in any referenced dlls.
However, currently the algorithm uses a shortcut: it checks whether the attribute type name is on a hardcoded list of "suspicious" names. This is not fully reliable, as an arbitrary type alias, eg. `type X = Microsoft.FSharp.Core.AutoOpenAttribute`, will not be recognised correctly.
The alternatives are:
- Consider any module with any attribute to be an `AutoOpen` module
- Identify any type aliases in any referenced code to be able to fully reliably determine whether a given attribute is in fact the `AutoOpenAttribute`
### Edge-case 2. - module abbreviations
Module abbreviations do not require any special handling in the current algorithm.
Consider the following example:
```
// F1.fs
module A
module B = let x = 1

// F2.fs
module C
open A
module D = B
```
Here, the line `module D = B` generates the `F2.fs -> F1.fs` link, so no special code is needed to add it.
### Performance
There are two main factors w.r.t. performance of the graph-based type-checking:
1. The level of parallelisation allowed by the resolved dependency graph.
2. The overhead of creating the dependency graph and of the graph-based processing itself.
At minimum, to make this feature useful, the overhead cost (2.) should in the vast majority of use cases be significantly lower than the speedup generated by (1.).
Initial timings showed that the graph-based type-checking was significantly faster than sequential type-checking and faster than the two-phase type-checking feature.
Projects that were tested included:
- `FSharp.Compiler.Service`
- `Fantomas.Core`
- `FSharp.Compiler.ComponentTests`

TODO: Provide detailed timings
## The problem of maintaining multiple instances of type-checking information
The parallel type-checking idea generates a problem that needs to be solved.
Instead of one instance of the type-checking information, we now have to maintain multiple instances - one for each node in the graph.
We solve it in the following way:
1. Each file's type-checking results in a 'delta' function `'State -> 'State` which adds information to the state.
2. When type-checking a new file, its input state is built from scratch by evaluating delta functions of all its dependencies.
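A minimal sketch of this delta-based approach (hypothetical types - in the compiler the state is `TcState` and each delta comes from type-checking one file):

```fsharp
// The accumulated type-checking information, heavily simplified.
type State = { Declared: Set<string> }

/// The result of type-checking a single file: a function that adds
/// that file's information to any given state.
type Delta = State -> State

let emptyState = { Declared = Set.empty }

/// Build the input state for a file from scratch by replaying the deltas
/// of all its (transitive) dependencies, in dependency order.
let buildInputState (dependencyDeltas: Delta list) : State =
    (emptyState, dependencyDeltas) ||> List.fold (fun state delta -> delta state)
```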
### Ordering of diagnostics/errors
Any changes in scheduling of work that can produce diagnostics can change the order in which diagnostics appear to the end user. To retain existing ordering of diagnostics, we use a mechanism where each work item first uses a dedicated logger, and at the end individual loggers are sequentially replayed into the single logger, in the desired order. This mechanism is used in a few places in the compiler already.
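A sketch of that capture-and-replay pattern (hypothetical types - the compiler uses its own diagnostics logger infrastructure):

```fsharp
type Diagnostic = { Message: string }

/// Run the work items in parallel, each collecting its own diagnostics,
/// then replay the collected diagnostics sequentially in the original order.
let runWithOrderedDiagnostics (emit: Diagnostic -> unit) (workItems: (unit -> Diagnostic list) list) =
    workItems
    |> List.map (fun work -> async { return work () })
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.iter (List.iter emit)
```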
0 commit comments