Commit 0fc8ba9

Merge pull request #82 from matthewhammer/master
Update profile queries documentation
2 parents 0d2b8ca + b0bfd21 commit 0fc8ba9

File tree

4 files changed: +139862 -75497 lines changed

profile-queries.md

Lines changed: 154 additions & 58 deletions
@@ -11,8 +11,13 @@ how we profile its performance. We intend this profiling effort to address
1111

1212
## Quick Start
1313

14+
### 0. Enable debug assertions
15+
```
16+
./configure --enable-debug-assertions
17+
```
18+
1419
### 1. Compile `rustc`
15-
Compile the compiler, as usual:
20+
Compile the compiler, up to at least stage 1:
1621

1722
```
1823
python x.py --stage 1
@@ -22,31 +27,40 @@ python x.py --stage 1
2227
Run the compiler on a source file, supplying two additional debugging flags with `-Z`:
2328

2429
```
25-
rustc -Z profile-queries -Z dump-dep-graph foo.rs
30+
rustc -Z profile-queries -Z incremental=cache foo.rs
2631
```
2732

2833
Regarding the two additional parameters:
2934

3035
- `-Z profile-queries` tells the compiler to run a separate thread
31-
that profiles the queries made by the main compiler thread(s).
32-
- `-Z dump-dep-graph` tells the compiler to "dump" various files that
33-
describe the compilation dependencies.
36+
that profiles the queries made by the main compiler thread(s).
37+
- `-Z incremental=cache` tells the compiler to "cache" various files
38+
that describe the compilation dependencies, in the subdirectory
39+
`cache`.
3440

3541
This command will generate the following files:
3642

3743
- `profile_queries.html` consists of an HTML-based representation of
3844
the [trace of queries](#trace-of-queries).
3945
- `profile_queries.counts.txt` consists of a histogram, where each histogram "bucket" is a query provider.
4046

41-
### 3. Inspect the output
42-
3(a). Open the HTML file (`profile_queries.html`) with a browser. See [this section](#interpret-the-html-output) for an explanation of this file.
4347

44-
3(b). Open the data file (`profile_queries.counts.txt`) with a text editor, or spreadsheet. See [this section](#interpret-the-data-output) for an explanation of this file.
48+
### 3. Run `rustc`, with `-Z time-passes`:
49+
50+
- This additional flag will add all timed passes to the output files
51+
mentioned above, in step 2. As described below, these passes appear
52+
visually distinct from the queries in the HTML output (they
53+
currently appear as green boxes, via CSS).
4554

46-
3(c). Older stuff, also generated as output (you can _ignore these files_; we won't discuss them further here):
55+
### 4. Inspect the output
4756

48-
- `dep_graph.dot` consists of old stuff: a representation of dependencies that are _outside_ the newer query model.
49-
- `dep_graph.txt` consists of old stuff: a representation of dependencies that are _outside_ the newer query model.
57+
- 4(a). Open the HTML file (`profile_queries.html`) with a browser.
58+
See [this section](#interpret-the-html-output) for an explanation of
59+
this file.
60+
- 4(b). Open the data file (`profile_queries.counts.txt`) with a text
61+
editor, or spreadsheet. See [this
62+
section](#interpret-the-data-output) for an explanation of this
63+
file.
5064

5165

5266
## Interpret the HTML Output
@@ -69,27 +83,28 @@ The trace of the queries has a formal structure; see
6983

7084
We style this formal structure as follows:
7185

72-
- Blue dots represent query hits. They consist of leaves in the
73-
trace's tree. (CSS class: `hit`).
74-
- Red boxes represent query misses. They consist of internal nodes in
75-
the trace's tree. (CSS class: `miss`).
76-
- Many red boxes contain _nested boxes and dots_. This nesting structure
77-
reflects that some providers _depend on_ results from other
78-
providers, which consist of their nested children.
79-
- For example, the red box labeled as `typeck_tables_of` depends
80-
on the one labeled `adt_dtorck_constraint`, which itself
81-
depends on one labeled `coherent_trait`.
82-
- Some red boxes are _labeled_ with text, and have highlighted borders
83-
(light red, and bolded). (See [heuristics](#heuristics) for
84-
details). Where they are present, the labels give the following
85-
information:
86+
- **Timed passes:** Green boxes, when present (via `-Z time-passes`), represent _timed
87+
passes_ in the compiler. In future versions, these passes may be
88+
replaced by queries, explained below.
89+
- **Labels:** Some green and red boxes are labeled with text. Where they are
90+
present, the labels give the following information:
8691
- The [query's _provider_](#queries), sans its _key_ and its _result_,
8792
which are often too long to include in these labels.
8893
- The _duration_ of the provider, as a fraction of the total time
8994
(for the entire trace). This fraction includes the query's
9095
entire extent (that is, the sum total of all of its
9196
sub-queries).
92-
97+
- **Query hits:** Blue dots represent query hits. They consist of leaves in the
98+
trace's tree. (CSS class: `hit`).
99+
- **Query misses:** Red boxes represent query misses. They consist of internal nodes in
100+
the trace's tree. (CSS class: `miss`).
101+
- **Nesting structure:** Many red boxes contain _nested boxes and
102+
dots_. This nesting structure reflects that some providers _depend
103+
on_ results from other providers, which consist of their nested
104+
children.
105+
- Some red boxes are _labeled_ with text, and have highlighted borders
106+
(light red, and bolded). (See [heuristics](#heuristics) for
107+
details).
93108

94109
## Heuristics
95110

@@ -101,12 +116,17 @@ Heuristics-based CSS Classes:
101116
but easy to modify). Important nodes are styled with textual
102117
labels, and highlighted borders (light red, and bolded).
103118

119+
- `frac-50`, `-40`, ... -- Trace nodes whose total duration (self and
120+
children) takes a large fraction of the total duration, at or above
121+
50%, 40%, and so on. We style these nodes with larger font and
122+
padding.
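
To illustrate the threshold idea behind the `frac-*` classes, here is a minimal Python sketch. It is not taken from `rustc`: the function name `frac_class` and its parameters are hypothetical, and the default thresholds cover only the two values named above (50% and 40%). Any further thresholds would need to be passed in explicitly.

```python
from typing import Optional, Sequence


def frac_class(
    node_duration: float,
    total_duration: float,
    thresholds: Sequence[int] = (50, 40),  # only the values named in the text
) -> Optional[str]:
    """Return a hypothetical CSS class name such as "frac-50", or None.

    A node qualifies for the largest threshold (in percent) that its
    duration, including children, meets or exceeds.
    """
    if total_duration <= 0:
        return None
    percent = 100.0 * node_duration / total_duration
    # Check the largest threshold first so a 60% node maps to frac-50.
    for threshold in sorted(thresholds, reverse=True):
        if percent >= threshold:
            return f"frac-{threshold}"
    return None


if __name__ == "__main__":
    print(frac_class(0.6, 1.0))   # node takes 60% of the trace
    print(frac_class(0.45, 1.0))  # node takes 45% of the trace
    print(frac_class(0.1, 1.0))   # below every threshold
```

Checking the largest threshold first means each node gets at most one class under this sketch; whether the real styling assigns one class or several is not stated here.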
123+
104124
## Interpret the Data Output
105125

106126
The file `profile_queries.counts.txt` contains a table of information
107127
about the queries, organized around their providers.
108128

109-
For each provider, we produce:
129+
For each provider (or timed pass, when `-Z time-passes` is present), we produce:
110130

111131
- A total **count** --- the total number of times this provider was
112132
queried
@@ -125,45 +145,113 @@ The following example `profile_queries.counts.txt` file results from
125145
running on a hello world program (a single main function that uses
126146
`println` to print `"hello world"`).
127147

128-
As explained above, the columns consist of `provider`, `count`, `duration`:
148+
As explained above, the columns consist of `provider/pass`, `count`, `duration`:
129149

130150
```
131-
symbol_name,2441,0.362
132-
def_symbol_name,2414,0.129
133-
item_attrs,5300,0.060
134-
type_of,4841,0.059
135-
generics_of,7216,0.049
136-
impl_trait_ref,2898,0.037
137-
def_span,20381,0.030
138-
adt_def,1142,0.028
139-
is_foreign_item,2425,0.021
140-
adt_dtorck_constraint,2,0.016
141-
typeck_tables_of,33,0.014
142-
typeck_item_bodies,1,0.010
143-
coherent_trait,7,0.008
144-
adt_destructor,10,0.008
145-
borrowck,4,0.008
146-
mir_validated,4,0.007
147-
impl_parent,306,0.003
148-
trait_def,216,0.001
149-
mir_const,2,0.001
150-
optimized_mir,6,0.000
151-
adt_sized_constraint,9,0.000
152-
predicates_of,82,0.000
153-
privacy_access_levels,5,0.000
151+
translation,1,0.891
152+
symbol_name,2658,0.733
153+
def_symbol_name,2556,0.268
154+
item_attrs,5566,0.162
155+
type_of,6922,0.117
156+
generics_of,8020,0.084
157+
serialize dep graph,1,0.079
158+
relevant_trait_impls_for,50,0.063
159+
def_span,24875,0.061
160+
expansion,1,0.059
161+
const checking,1,0.055
162+
adt_def,1141,0.048
163+
trait_impls_of,32,0.045
164+
is_copy_raw,47,0.045
165+
is_foreign_item,2638,0.042
166+
fn_sig,2172,0.033
167+
adt_dtorck_constraint,2,0.023
168+
impl_trait_ref,2434,0.023
169+
typeck_tables_of,29,0.022
170+
item-bodies checking,1,0.017
171+
typeck_item_bodies,1,0.017
172+
is_default_impl,2320,0.017
173+
borrow checking,1,0.014
174+
borrowck,4,0.014
175+
mir_validated,4,0.013
176+
adt_destructor,10,0.012
177+
layout_raw,258,0.010
178+
load_dep_graph,1,0.007
179+
item-types checking,1,0.005
180+
mir_const,2,0.005
181+
name resolution,1,0.004
182+
is_object_safe,35,0.003
183+
is_sized_raw,89,0.003
184+
parsing,1,0.003
185+
is_freeze_raw,11,0.001
186+
privacy checking,1,0.001
187+
privacy_access_levels,5,0.001
188+
resolving dependency formats,1,0.001
189+
adt_sized_constraint,9,0.001
190+
wf checking,1,0.001
191+
liveness checking,1,0.001
192+
compute_incremental_hashes_map,1,0.001
193+
match checking,1,0.001
194+
type collecting,1,0.001
195+
param_env,31,0.000
196+
effect checking,1,0.000
197+
trait_def,140,0.000
198+
lowering ast -> hir,1,0.000
199+
predicates_of,70,0.000
200+
extern_crate,319,0.000
201+
lifetime resolution,1,0.000
202+
is_const_fn,6,0.000
203+
intrinsic checking,1,0.000
204+
translation item collection,1,0.000
154205
impl_polarity,15,0.000
155-
trait_of_item,7,0.000
156-
region_maps,11,0.000
206+
creating allocators,1,0.000
207+
language item collection,1,0.000
208+
crate injection,1,0.000
209+
early lint checks,1,0.000
210+
indexing hir,1,0.000
211+
maybe creating a macro crate,1,0.000
212+
coherence checking,1,0.000
213+
optimized_mir,6,0.000
214+
is_panic_runtime,33,0.000
157215
associated_item_def_ids,7,0.000
216+
needs_drop_raw,10,0.000
217+
lint checking,1,0.000
218+
complete gated feature checking,1,0.000
219+
stability index,1,0.000
220+
region_maps,11,0.000
158221
super_predicates_of,8,0.000
159-
variances_of,12,0.000
222+
coherent_trait,2,0.000
223+
AST validation,1,0.000
224+
loop checking,1,0.000
225+
static item recursion checking,1,0.000
226+
variances_of,11,0.000
227+
associated_item,5,0.000
228+
plugin loading,1,0.000
229+
looking for plugin registrar,1,0.000
230+
stability checking,1,0.000
231+
describe_def,15,0.000
232+
variance testing,1,0.000
233+
codegen unit partitioning,1,0.000
234+
looking for entry point,1,0.000
235+
checking for inline asm in case the target doesn't support it,1,0.000
236+
inherent_impls,1,0.000
160237
crate_inherent_impls,1,0.000
161-
is_exported_symbol,2,0.000
162-
associated_item,3,0.000
238+
trait_of_item,7,0.000
163239
crate_inherent_impls_overlap_check,1,0.000
240+
attribute checking,1,0.000
241+
internalize symbols,1,0.000
242+
impl wf inference,1,0.000
243+
death checking,1,0.000
244+
reachability checking,1,0.000
164245
reachable_set,1,0.000
165-
is_mir_available,1,0.000
166-
inherent_impls,1,0.000
246+
is_exported_symbol,3,0.000
247+
is_mir_available,2,0.000
248+
unused lib feature checking,1,0.000
249+
maybe building test harness,1,0.000
250+
recursion limit,1,0.000
251+
write allocator module,1,0.000
252+
assert dep graph,1,0.000
253+
plugin registration,1,0.000
254+
write metadata,1,0.000
167255
```
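
Since the counts file is a plain comma-separated table, it can be sorted or filtered with a short script. The following Python sketch assumes each row has exactly the three columns shown above (`provider/pass`, `count`, `duration`) and that only the name column may contain spaces. The helper names `parse_counts` and `top_by_duration` are made up for this example and are not part of `rustc`.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    name: str        # provider or timed-pass name
    count: int       # number of times it was queried or run
    duration: float  # duration column, as written in the file


def parse_counts(text: str) -> List[Row]:
    """Parse the contents of a profile_queries.counts.txt-style table."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so names with spaces (or stray commas)
        # stay intact in the first field.
        name, count, duration = line.rsplit(",", 2)
        rows.append(Row(name, int(count), float(duration)))
    return rows


def top_by_duration(rows: List[Row], n: int = 5) -> List[Row]:
    """Return the n rows with the largest duration, largest first."""
    return sorted(rows, key=lambda r: r.duration, reverse=True)[:n]


# A few rows copied from the example output above.
SAMPLE = """\
translation,1,0.891
symbol_name,2658,0.733
def_span,24875,0.061
lowering ast -> hir,1,0.000
"""

if __name__ == "__main__":
    for row in top_by_duration(parse_counts(SAMPLE), n=3):
        print(f"{row.name:<24} {row.count:>6} {row.duration:.3f}")
```

To run it against a real file, read the text with `open("profile_queries.counts.txt").read()` and pass it to `parse_counts`.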
168256

169257
# Background
@@ -250,5 +338,13 @@ too).
250338

251339
## Links
252340

341+
Related design ideas, and tracking issues:
342+
343+
- Design document: [On-demand Rustc incremental design doc](https://github.com/nikomatsakis/rustc-on-demand-incremental-design-doc/blob/master/0000-rustc-on-demand-and-incremental.md)
344+
- Tracking Issue: ["Red/Green" dependency tracking in compiler](https://github.com/rust-lang/rust/issues/42293)
345+
346+
More discussion and issues:
347+
348+
- https://github.com/rust-lang/rust/issues/42633
253349
- https://internals.rust-lang.org/t/incremental-compilation-beta/4721
254350
- https://blog.rust-lang.org/2016/09/08/incremental.html

profile-queries/example0.png

47.8 KB