Commit 122ec0c

Tool for exploring performance by varying JIT behavior (#381)
Initial version of a tool that can run BenchmarkDotNet (BDN) over a set of benchmarks in a feedback loop. The tool can vary JIT behavior, observe the impact of this variation on jitted code or benchmark perf, and then plan and try out further variations in pursuit of some goal (say higher perf, or smaller code, etc.).

Requires access to InstructionsRetiredExplorer as a helper tool, for parsing the ETW that BDN produces. Also requires a local enlistment of the performance repo. You will need to modify file paths within the source to adapt all this to your local setup. Must be run with admin privileges so that BDN can collect ETW.

The only supported variation right now is modification of which CSEs we allow the JIT to perform for the hottest Tier-1 method in each benchmark. If a benchmark does not have a sufficiently hot Tier-1 method, it is effectively left out of the experiment. The experiments on each benchmark are prioritized to explore variations in performance for subsets of the currently performed CSEs. For methods with many CSEs we can realistically afford to explore only a small fraction of all possibilities, so we try to bias the exploration toward CSEs that have higher performance impact.

Results are locally cached so that rerunning the tool will not rerun experiments. Experiments are summarized in a CSV file whose schema lists benchmark name, number of CSEs, code size, perf score, and perf.
1 parent 7e6e1dc commit 122ec0c

File tree

10 files changed: +1108 -3 lines changed

README.md

Lines changed: 3 additions & 1 deletion
@@ -11,14 +11,16 @@ Current tools include:
 2. [CI jobs information](doc/cijobs.md): cijobs.
 3. [JIT source code formatting](doc/formatting.md): jit-format.
 4. [General tools](doc/tools.md): pmi
+5. [Experimental tools](src/performance-explorer/README.md): performance-explorer
+
 ## Getting started

 1. Clone the jitutils repo:
 ```
 git clone https://github.com/dotnet/jitutils
 ```

-2. Install the 2.1 .NET Core SDK (including the `dotnet` command-line interface, or CLI) from [here](https://dot.net).
+2. Install a recent .NET Core SDK (including the `dotnet` command-line interface, or CLI) from [here](https://dot.net).

 3. Build the tools:
 ```

build.cmd

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ REM Do as many builds as possible; don't stop on first failure (if any).
 set __ExitCode=0

 REM Declare the list of projects
-set projects=jit-diff jit-dasm jit-analyze jit-format pmi jit-dasm-pmi jit-decisions-analyze
+set projects=jit-diff jit-dasm jit-analyze jit-format pmi jit-dasm-pmi jit-decisions-analyze performance-explorer

 REM Build each project
 for %%p in (%projects%) do (

build.sh

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -48,7 +48,7 @@ while getopts "hpb:" opt; do
4848
done
4949

5050
# declare the array of projects
51-
declare -a projects=(jit-dasm jit-diff jit-analyze jit-format pmi jit-dasm-pmi jit-decisions-analyze)
51+
declare -a projects=(jit-dasm jit-diff jit-analyze jit-format pmi jit-dasm-pmi jit-decisions-analyze performance-explorer)
5252

5353
# for each project either build or publish
5454
for proj in "${projects[@]}"

src/performance-explorer/README.md

Lines changed: 79 additions & 0 deletions
### Performance Explorer

Performance Explorer is a tool to examine the impact of changing JIT behavior on key methods in a benchmark.
It is currently specialized to explore varying the CSEs in the most dominant Tier-1 method of a benchmark.
### Setup

This tool currently only works on Windows.

To run Performance Explorer, you must have local enlistments of:
* [the runtime repo](https://github.com/dotnet/runtime)
* [the performance repo](https://github.com/dotnet/performance)
* [instructions retired explorer](https://github.com/AndyAyersMS/InstructionsRetiredExplorer)

You will need to do both release and checked builds of the runtime repo, and create the associated
test directories (aka Core_Roots).

You will need to build the instructions retired explorer.

You will need to modify file paths in the performance explorer code to refer to the locations
of the above repos and builds, and to specify a results directory.

Finally, you will likely want to customize the list of benchmarks to explore; the names of these
are the names used in the performance repo. Note that the names often contain quotes or other special
characters, so you will likely need to read up on how to handle these when they appear in C# string literals.

Once you have made these modifications, you can build the performance explorer.

The tool must be run as admin in order to perform the necessary profiling.

### How It Works

For each benchmark in the list, performance explorer will:
* run the benchmark from the perf directory, with `-p ETW` so that profile data is collected
* parse the profile data using instructions retired explorer to find the hot methods
* also parse the BenchmarkDotNet json to determine the performance of the benchmark
* determine if there's a hot method that would be a good candidate for exploration. Currently we look for a Tier-1 method that accounts for at least 20% of the benchmark time.
* if there is a suitable hot method:
  * run an SPMI collection for that benchmark
  * use that SPMI collection to get an assembly listing for the hot method
  * determine from that listing how many CSEs were performed (the "default set" of N CSEs)
  * if there were any CSEs, start the experimentation process:
    * run the benchmark with all CSEs disabled (0 CSEs), measure perf, and add the result to the exploration queue
    * then, repeatedly, until we have run out of experiments to try or hit some predetermined limit:
      * pick the best performing experiment from the queue
      * determine which CSEs in the default set were not done in that experiment; say there are M (<= N) of these
      * run M more experiments, each adding one of the missing CSEs
Each benchmark's data is stored in a subfolder of the results directory; we also create disassembly for all the
experiments tried, and copies of all the intermediate files.

There is also a master results.csv that has data from all experiments in all benchmarks, suitable for use
in Excel or as input to a machine learning algorithm.

If you re-run the tool with the same benchmark list and results directory, it will use the cached copies of
data and won't re-run the experiments.

If anything goes wrong along the way, an "error.txt" file is added to the results subdirectory for
that benchmark, and future runs will skip that benchmark.
So say there are 2 CSEs by default. The explorer will run:
* one experiment with 0 CSEs
* two experiments, each with 1 CSE
* one experiment with 2 CSEs

and then stop, as all possibilities have been explored.

For larger values of N the number of possible experiments (2^N) grows rapidly, and we cannot hope to explore
the full space. The exploration process is intended to prioritize the experiments that likely have
the largest impact on performance.
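To make the growth concrete, a quick count of the experiment space (illustrative snippet; the N values shown are arbitrary):

```csharp
using System;

class ExperimentSpace
{
    static void Main()
    {
        // Each of the N default CSEs is independently on or off,
        // so there are 2^N candidate subsets.
        foreach (int n in new[] { 2, 10, 20 })
        {
            long subsets = 1L << n;
            Console.WriteLine($"N = {n,2}: {subsets} possible experiments");
        }
        // N =  2: 4 possible experiments
        // N = 10: 1024 possible experiments
        // N = 20: 1048576 possible experiments
    }
}
```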
### Future Enhancements

* add option to offload benchmark runs to the perf lab
* capture more details about CSEs so we can use the data to develop better CSE heuristics
* generalize the experiment processing to allow other kinds of experiments
* parameterize the config settings so we don't need to modify the sources
* add options to characterize the noise level of benchmarks and (perhaps) do more runs if noisy
* leverage SPMI instead of perf runs, if we can trust perf scores
Lines changed: 47 additions & 0 deletions
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using System.IO;
using System;

public class BenchmarkInfo
{
    public string Name { get; init; }
    public double Ratio { get; set; }

    // Name truncated to roughly 100 chars and sanitized for use as a file name.
    public string CleanName
    {
        get
        {
            string cleanName = Name;
            if (cleanName.Length > 100)
            {
                int parensIndex = cleanName.IndexOf('(');
                static string Last(string s, int num) => s.Length < num ? s : s[^num..];
                if (parensIndex == -1)
                {
                    cleanName = Last(cleanName, 100);
                }
                else
                {
                    string benchmarkName = cleanName[..parensIndex];
                    string paramsStr = cleanName[(parensIndex + 1)..^1];
                    cleanName = Last(benchmarkName, Math.Max(50, 100 - paramsStr.Length)) + "(" + Last(paramsStr, Math.Max(50, 100 - benchmarkName.Length)) + ")";
                }
            }

            foreach (char illegalChar in Path.GetInvalidFileNameChars())
            {
                cleanName = cleanName.Replace(illegalChar, '_');
            }

            cleanName = cleanName.Replace(' ', '_');

            return cleanName;
        }
    }

    public string CsvName => CleanName.Replace(',', '_');
}
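A usage sketch for the class above (assumes `BenchmarkInfo` is in scope; the benchmark name is illustrative, not a name the tool necessarily produces):

```csharp
using System;

class CleanNameDemo
{
    static void Main()
    {
        // Benchmark names from the performance repo often contain spaces,
        // parentheses, and characters that are illegal in file names.
        var info = new BenchmarkInfo { Name = "System.Tests.Perf_Int32.ToString(value: 12345)" };

        // Spaces and invalid file-name characters (e.g. ':' on Windows)
        // become underscores, so the result is safe to use as a folder name.
        // Which characters are replaced depends on the platform's
        // Path.GetInvalidFileNameChars().
        Console.WriteLine(info.CleanName);
    }
}
```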
Lines changed: 177 additions & 0 deletions
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

// Classes for deserializing BenchmarkDotNet .json result files

using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

public class ChronometerFrequency
{
    public int Hertz { get; set; }
}

public class HostEnvironmentInfo
{
    public string BenchmarkDotNetCaption { get; set; }
    public string BenchmarkDotNetVersion { get; set; }
    public string OsVersion { get; set; }
    public string ProcessorName { get; set; }
    public int? PhysicalProcessorCount { get; set; }
    public int? PhysicalCoreCount { get; set; }
    public int? LogicalCoreCount { get; set; }
    public string RuntimeVersion { get; set; }
    public string Architecture { get; set; }
    public bool? HasAttachedDebugger { get; set; }
    public bool? HasRyuJit { get; set; }
    public string Configuration { get; set; }
    public string JitModules { get; set; }
    public string DotNetCliVersion { get; set; }
    public ChronometerFrequency ChronometerFrequency { get; set; }
    public string HardwareTimerKind { get; set; }
}

public class ConfidenceInterval
{
    public int N { get; set; }
    public double Mean { get; set; }
    public double StandardError { get; set; }
    public int Level { get; set; }
    public double Margin { get; set; }
    public double Lower { get; set; }
    public double Upper { get; set; }
}

public class Percentiles
{
    public double P0 { get; set; }
    public double P25 { get; set; }
    public double P50 { get; set; }
    public double P67 { get; set; }
    public double P80 { get; set; }
    public double P85 { get; set; }
    public double P90 { get; set; }
    public double P95 { get; set; }
    public double P100 { get; set; }
}

public class Statistics
{
    public double[] OriginalValues { get; set; }
    public int N { get; set; }
    public double Min { get; set; }
    public double LowerFence { get; set; }
    public double Q1 { get; set; }
    public double Median { get; set; }
    public double Mean { get; set; }
    public double Q3 { get; set; }
    public double UpperFence { get; set; }
    public double Max { get; set; }
    public double InterquartileRange { get; set; }
    public List<double> LowerOutliers { get; set; }
    public List<double> UpperOutliers { get; set; }
    public List<double> AllOutliers { get; set; }
    public double StandardError { get; set; }
    public double Variance { get; set; }
    public double StandardDeviation { get; set; }
    public double? Skewness { get; set; }
    public double? Kurtosis { get; set; }
    public ConfidenceInterval ConfidenceInterval { get; set; }
    public Percentiles Percentiles { get; set; }
}

public class Memory
{
    public int Gen0Collections { get; set; }
    public int Gen1Collections { get; set; }
    public int Gen2Collections { get; set; }
    public long TotalOperations { get; set; }
    public long BytesAllocatedPerOperation { get; set; }
}

public class Measurement
{
    public string IterationStage { get; set; }
    public int LaunchIndex { get; set; }
    public int IterationIndex { get; set; }
    public long Operations { get; set; }
    public double Nanoseconds { get; set; }
}

public class Metric
{
    public double Value { get; set; }
    public MetricDescriptor Descriptor { get; set; }
}

public class MetricDescriptor
{
    public string Id { get; set; }
    public string DisplayName { get; set; }
    public string Legend { get; set; }
    public string NumberFormat { get; set; }
    public int UnitType { get; set; }
    public string Unit { get; set; }
    public bool TheGreaterTheBetter { get; set; }
    public int PriorityInCategory { get; set; }
}

public class Benchmark
{
    public string DisplayInfo { get; set; }
    public string Namespace { get; set; }
    public string Type { get; set; }
    public string Method { get; set; }
    public string MethodTitle { get; set; }
    public string Parameters { get; set; }
    public string FullName { get; set; }
    public Statistics Statistics { get; set; }
    public Memory Memory { get; set; }
    public List<Measurement> Measurements { get; set; }
    public List<Metric> Metrics { get; set; }
}

public class BdnResult
{
    public string Title { get; set; }
    public HostEnvironmentInfo HostEnvironmentInfo { get; set; }
    public List<Benchmark> Benchmarks { get; set; }
}

public class BdnParser
{
    // Return performance of this benchmark (in microseconds)
    public static double GetPerf(string bdnJsonFile)
    {
        double perf = 0;
        string bdnJsonLines = File.ReadAllText(bdnJsonFile);
        BdnResult bdnResult = JsonSerializer.Deserialize<BdnResult>(bdnJsonLines)!;

        // Assume all runs are for the same benchmark
        // Handle possibility of multiple runs (via --LaunchCount)
        //
        foreach (Benchmark b in bdnResult.Benchmarks)
        {
            double sum = 0;
            long ops = 0;

            foreach (Measurement m in b.Measurements)
            {
                if (!m.IterationStage.Equals("Result"))
                {
                    continue;
                }

                sum += m.Nanoseconds;
                ops += m.Operations;
            }

            perf = (sum / ops) / 1000;
        }

        return perf;
    }
}
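A round-trip sketch of `GetPerf` on a hand-built result (assumes the classes above are in scope; the measurement values are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

class GetPerfDemo
{
    static void Main()
    {
        var result = new BdnResult
        {
            Title = "demo",
            Benchmarks = new List<Benchmark>
            {
                new Benchmark
                {
                    FullName = "Demo.Bench",
                    Measurements = new List<Measurement>
                    {
                        // Non-"Result" iteration stages are skipped by GetPerf.
                        new Measurement { IterationStage = "Warmup", Operations = 1000, Nanoseconds = 9e6 },
                        new Measurement { IterationStage = "Result", Operations = 1000, Nanoseconds = 2e6 },
                    }
                }
            }
        };

        // Serialize with the same classes, so property names match on read-back.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, JsonSerializer.Serialize(result));

        // (2e6 ns / 1000 ops) / 1000 = 2.0 microseconds per operation
        Console.WriteLine(BdnParser.GetPerf(path)); // prints 2
        File.Delete(path);
    }
}
```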
Lines changed: 48 additions & 0 deletions
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using PerformanceExplorer;
using System.Globalization;
using System.IO;
using System.Text.Json;
using System;
using System.Xml.Serialization;

public class CseExperiment
{
    public BenchmarkInfo Benchmark { get; set; }
    public CseExperiment Baseline { get; set; }
    public HotFunction Method { get; set; }
    public uint Mask { get; set; }
    public uint NumCse { get; set; }
    public uint CodeSize { get; set; }
    public double PerfScore { get; set; }
    public double Perf { get; set; }
    public bool Explored { get; set; }

    public string Hash { get; set; }

    public uint Index { get; set; }

    public bool IsImprovement { get { return Perf < Baseline.Perf; } }

    public static string Schema
    {
        get
        {
            return "Benchmark,Index,Mask,NumCse,CodeSize,PerfScore,PerfScoreRatio,Perf,PerfRatio";
        }
    }

    public string Info
    {
        get
        {
            double perfRatio = (Baseline == null) ? 1.0 : Perf / Baseline.Perf;
            double perfScoreRatio = (Baseline == null) ? 1.0 : PerfScore / Baseline.PerfScore;
            return $"{Benchmark.CsvName},{Index},{Mask:x8},{NumCse},{CodeSize},{PerfScore:F2},{perfScoreRatio:F3},{Perf:F4},{perfRatio:F3}";
        }
    }
}
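Each experiment then contributes one CSV row matching `Schema`. For instance (assumes `CseExperiment` and `BenchmarkInfo` from this commit are in scope; all numeric values are illustrative):

```csharp
using System;

class CsvRowDemo
{
    static void Main()
    {
        var benchmark = new BenchmarkInfo { Name = "Demo.Bench" };
        var baseline = new CseExperiment { Benchmark = benchmark, NumCse = 2, PerfScore = 100.0, Perf = 4.0 };
        var experiment = new CseExperiment
        {
            Benchmark = benchmark,
            Baseline = baseline,
            Index = 1,
            Mask = 0x1,      // only the first CSE enabled
            NumCse = 1,
            CodeSize = 180,
            PerfScore = 95.0,
            Perf = 3.8,
        };

        Console.WriteLine(CseExperiment.Schema);
        Console.WriteLine(experiment.Info);
        // e.g. Demo.Bench,1,00000001,1,180,95.00,0.950,3.8000,0.950
        // (number formatting follows the current culture)
        Console.WriteLine(experiment.IsImprovement); // True: 3.8 < 4.0
    }
}
```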
