Tool for exploring performance by varying JIT behavior (#381)
Initial version of a tool that can run BenchmarkDotNet (BDN) over a set
of benchmarks in a feedback loop. The tool can vary JIT behavior,
observe the impact of each variation on jitted code or benchmark perf,
and then plan and try out further variations in pursuit of some goal
(say higher perf, or smaller code, etc).
Requires access to InstructionsRetiredExplorer as a helper tool for
parsing the ETW that BDN produces. Also requires a local enlistment of
the performance repo. You will need to modify file paths within the
source to adapt all this to your local setup. Must be run with admin
privileges so that BDN can collect ETW.
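
As a rough illustration, a check along the following lines can verify the tool is running elevated before any BDN runs are started (Windows-only, like ETW itself); the exact check the tool uses may differ:

```csharp
using System.Security.Principal;

static class Elevation
{
    // Sketch: verify the process is elevated before starting any BDN runs,
    // since ETW collection will otherwise fail.
    public static bool IsRunningAsAdmin()
    {
        using WindowsIdentity identity = WindowsIdentity.GetCurrent();
        return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator);
    }
}
```
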
The only supported variation right now is modification of which CSEs we
allow the JIT to perform for the hottest Tier-1 method in each
benchmark. If a benchmark does not have a sufficiently hot Tier-1
method, then it is effectively left out of the experiment.
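
How a CSE subset is conveyed to the JIT depends on the build; a checked JIT exposes debug-only config knobs for this. The sketch below assumes knob names along the lines of `DOTNET_JitCSEHash` / `DOTNET_JitCSEMask` (select the method, then bit-mask the allowed CSEs); check `jitconfigvalues.h` in the runtime repo for the exact switches your build supports.

```csharp
using System.Diagnostics;

static class BenchmarkLauncher
{
    // Sketch: pass a CSE subset to a checked JIT through environment variables on
    // the BenchmarkDotNet child process. The knob names below are assumptions;
    // see jitconfigvalues.h for what your JIT build actually exposes.
    public static ProcessStartInfo BuildBenchmarkRun(string benchmarkFilter, string methodHash, uint cseMask)
    {
        var psi = new ProcessStartInfo("dotnet")
        {
            // -p ETW asks BDN to collect the ETW trace that the tool parses later.
            Arguments = $"run -c Release -- --filter \"{benchmarkFilter}\" -p ETW",
            WorkingDirectory = @"C:\repos\performance\src\benchmarks\micro", // example path
            UseShellExecute = false
        };

        psi.Environment["DOTNET_JitCSEHash"] = methodHash;            // assumed knob: restrict the effect to the hot method
        psi.Environment["DOTNET_JitCSEMask"] = cseMask.ToString("X"); // assumed knob: bitmask of CSEs to allow (0 = none)
        return psi;
    }
}
```
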
The experiments on each benchmark are prioritized to explore variations
in performance for subsets of the currently performed CSEs. For methods with
many CSEs we can realistically afford to explore only a small fraction
of all possibilities, so we try to bias the exploration towards CSEs
that have higher performance impact.
Results are locally cached so that rerunning the tool will not rerun
experiments.
Experiments are summarized in a CSV file whose schema includes the
benchmark name, number of CSEs, code size, perf score, and measured perf.
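
The column names below are illustrative rather than the tool's exact headers, but the shape of a summary row is roughly:

```csharp
// Illustrative shape of one summary row; the exact column names and order
// come from the tool's output, not from this sketch.
record ExperimentSummary(
    string BenchmarkName,
    int NumCses,       // number of CSEs enabled in this experiment
    int CodeSize,      // jitted code size of the hot method, in bytes
    double PerfScore,  // the JIT's static perf-score estimate for the method
    double Perf);      // measured benchmark time
```
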
You will need to do both release and checked builds of the runtime repo, and create the associated test directories (aka Core_Roots).

You will need to build the instructions retired explorer.
You will need to modify file paths in the performance explorer code to refer to the locations of the above repos and builds, and to specify a results directory.
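
For instance, the kind of paths involved might look like the following; the names and values here are hypothetical examples, and the real fields live in the performance explorer sources:

```csharp
// Hypothetical names and example values; the real fields in the
// performance explorer sources will differ.
static class ExplorerPaths
{
    public const string CheckedCoreRoot  = @"C:\repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root";
    public const string ReleaseCoreRoot  = @"C:\repos\runtime\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root";
    public const string PerformanceRepo  = @"C:\repos\performance";
    public const string InstructionsRetiredExplorer = @"C:\repos\InstructionsRetiredExplorer\src\bin\Release\net8.0\InstructionsRetiredExplorer.dll";
    public const string ResultsDirectory = @"C:\perf-explorer-results";
}
```
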
Finally, you will likely want to customize the list of benchmarks to explore; the names of these are the names used in the performance repo. Note the names often contain quotes or other special characters, so you will likely need to read up on how to handle these when they appear in C# literal strings.
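
For example, names with embedded quotes are easiest to express as C# verbatim strings, where each embedded quote is doubled; the names below are illustrative, not taken from the tool's actual list:

```csharp
// Illustrative names only; take the real ones from the performance repo.
string[] benchmarks =
{
    // Typical name: angle brackets and parentheses are fine in a verbatim string.
    @"System.Collections.ContainsTrue<Int32>.Span(Size: 512)",
    // Name containing quotes: inside an @"..." verbatim string each " is written as "".
    @"System.Text.RegularExpressions.Tests.Perf_Regex_Common.Match(Pattern: ""ab+c"", Options: Compiled)",
};
```
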
Once you have made these modifications, you can then build the performance explorer.

The tool must be run as admin, in order to perform the necessary profiling.
### How It Works
For each benchmark in the list, performance explorer will:

* run the benchmark from the perf directory, with `-p ETW` so that profile data is collected
* parse the profile data using instructions retired explorer to find the hot methods
* also parse the BenchmarkDotNet json to determine the performance of the benchmark
* determine if there's a hot method that would be a good candidate for exploration. Currently we look for a Tier-1 method that accounts for at least 20% of the benchmark time.
* if there is a suitable hot method:
  * run an SPMI collection for that benchmark
  * use that SPMI collection to get an assembly listing for the hot method
  * determine from that listing how many CSEs were performed (the "default set" of N CSEs)
  * if there were any CSEs, start the experimentation process (sketched in code after this list):
    * run the benchmark with all CSEs disabled (0 CSEs), measure perf, and add the result to the exploration queue
    * then, repeatedly, until we have run out of experiments to try, or hit some predetermined limit:
      * pick the best performing experiment from the queue
      * determine which CSEs in the default set were not done in that experiment; say there are M (<= N) of these
      * run M more experiments, each adding one of the missing CSEs
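
A rough sketch of that loop in C# (the `Experiment` record, the `runExperiment` callback, and the limit handling are invented for illustration; the real tool's code differs in detail):

```csharp
using System;
using System.Collections.Generic;

// Illustrative stand-in for the tool's experiment data: the CSE subset is a
// bitmask over the N candidates in the default set (this sketch assumes N <= 32).
record Experiment(uint CseMask, double Perf);

static class ExplorationLoop
{
    // runExperiment runs the benchmark with the given CSE mask and returns measured perf.
    public static List<Experiment> Explore(int n, Func<uint, double> runExperiment, int experimentLimit)
    {
        var results = new List<Experiment>();
        var tried = new HashSet<uint>();
        // Lower measured time is better, so it dequeues first.
        var queue = new PriorityQueue<Experiment, double>();

        // Baseline: all CSEs disabled.
        var baseline = new Experiment(0, runExperiment(0));
        results.Add(baseline);
        tried.Add(0);
        queue.Enqueue(baseline, baseline.Perf);

        while (queue.Count > 0 && results.Count < experimentLimit)
        {
            Experiment best = queue.Dequeue();

            // For each CSE in the default set that `best` did not perform,
            // run one new experiment that adds just that CSE.
            for (int i = 0; i < n && results.Count < experimentLimit; i++)
            {
                uint mask = best.CseMask | (1u << i);
                if (!tried.Add(mask))
                    continue; // CSE i already enabled in best, or subset already tried

                var next = new Experiment(mask, runExperiment(mask));
                results.Add(next);
                queue.Enqueue(next, next.Perf);
            }
        }

        return results;
    }
}
```
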
Each benchmark's data is stored in a subfolder of the results directory; we also create disassembly for all the experiments tried, and copies of all the intermediate files.

There is also a master results.csv that has data from all experiments in all benchmarks, suitable for use in Excel or as input to a machine learning algorithm.
If you re-run the tool with the same benchmark list and results directory, it will use the cached copies of the data and won't re-run the experiments.

If anything goes wrong along the way, an "error.txt" file is added to the results subdirectory for that benchmark, and future runs will skip that benchmark.
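
In code terms, the per-benchmark gate amounts to something like the sketch below; the helper names are invented, and only "error.txt" and the one-subfolder-per-benchmark layout come from the tool's description:

```csharp
using System.IO;
using System.Linq;

static class ResultsCache
{
    // Sketch: a prior failure (error.txt) causes the benchmark to be skipped on
    // later runs; otherwise its results subfolder is created (or reused, which
    // is where the cached data lives).
    public static bool ShouldRunBenchmark(string resultsDirectory, string benchmarkName, out string benchmarkDir)
    {
        benchmarkDir = Path.Combine(resultsDirectory, SanitizeForPath(benchmarkName));

        if (File.Exists(Path.Combine(benchmarkDir, "error.txt")))
            return false; // a previous run failed on this benchmark; skip it

        Directory.CreateDirectory(benchmarkDir);
        return true;
    }

    // Benchmark names contain characters that are not legal in directory names.
    private static string SanitizeForPath(string name) =>
        new string(name.Select(c => Path.GetInvalidFileNameChars().Contains(c) ? '_' : c).ToArray());
}
```
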
So say there are 2 CSEs by default. The explorer will run:

* one experiment with 0 CSEs
* two experiments, each with 1 CSE
* one experiment with 2 CSEs

and then stop, as all possibilities have been explored.
For larger values of N the number of possible experiments (2^N) grows rapidly, and we cannot hope to explore the full space. The exploration process is intended to prioritize the experiments that are likely to have the largest impact on performance.
### Future Enhancements
* add option to offload benchmark runs to the perf lab
* capture more details about CSEs so we can use the data to develop better CSE heuristics
* generalize the experiment processing to allow other kinds of experiments
* parameterize the config settings so we don't need to modify the sources
* add options to characterize the noise level of benchmarks and (perhaps) do more runs if noisy
* leverage SPMI instead of perf runs, if we can trust perf scores