Commit 444e624

[CI][Benchmarks] Benches OS setup guide
1 parent 46b04db commit 444e624

File tree

2 files changed

+147
-0
lines changed

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
# System Performance Tuning Guide

This guide provides recommendations for optimizing system performance when running SYCL and Unified Runtime benchmarks.
For framework-specific information, see [README.md](README.md) and [CONTRIB.md](CONTRIB.md).

## Table of Contents

- [Overview](#overview)
- [System Configuration](#system-configuration)
- [CPU Tuning](#cpu-tuning)
- [GPU Configuration](#gpu-configuration)
- [Driver Version](#driver-version)
- [Environment Variables](#environment-variables)

## Overview

Performance benchmarking requires a stable and optimized system environment to produce reliable and reproducible results. This guide covers essential system tuning steps for reducing run-to-run variance in benchmark results.

## System Configuration

### Kernel Parameters

Add the following options to `GRUB_CMDLINE_LINUX` in `/etc/default/grub`:
```
# Disable the intel_pstate driver so that the generic cpufreq driver
# (typically acpi-cpufreq) and its governors are used instead
# intel_pstate=disable

# Isolate CPUs for benchmark workloads (example: reserve cores 2-7), preventing other processes
# from using them.
# isolcpus=2-7

GRUB_CMDLINE_LINUX="intel_pstate=disable isolcpus=2-7 <other_options>"
```

Update GRUB and reboot:
```bash
sudo update-grub
sudo reboot
```

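After the reboot, it can help to confirm that the kernel actually picked up the new options. The following is a minimal sketch, not part of the original guide: the helper name `has_kernel_param` is hypothetical, and it only inspects `/proc/cmdline` and `/sys/devices/system/cpu/isolated`.

```bash
# Report whether a kernel parameter is present on a kernel command line.
# $1: parameter name (e.g. isolcpus); $2: file to read (defaults to /proc/cmdline).
# has_kernel_param is a hypothetical helper name used for illustration.
has_kernel_param() {
    local name="$1" cmdline_file="${2:-/proc/cmdline}"
    # Split on spaces and match either a bare flag or a name=value pair.
    tr ' ' '\n' < "$cmdline_file" | grep -Eq "^${name}(=.*)?$"
}

for param in intel_pstate isolcpus; do
    if has_kernel_param "$param"; then
        echo "$param: present"
    else
        echo "$param: missing"
    fi
done

# CPUs the kernel actually isolated (empty when isolcpus is not in effect).
cat /sys/devices/system/cpu/isolated 2>/dev/null || true
```

If `isolcpus` is reported as missing, the GRUB configuration was most likely not regenerated or a different boot entry was selected.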
## CPU Tuning

### CPU Frequency Scaling

The `performance` governor requests the highest frequency allowed by the current policy limits, which avoids frequency ramp-up delays during measurements.
```bash
# Set performance governor for all CPUs
sudo cpupower frequency-set --governor performance
# Reload kernel settings from the sysctl configuration files
# (the governor change itself takes effect immediately)
sudo sysctl --system

# Check current governor
sudo cpupower frequency-info
```

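`cpupower frequency-info` reports on one CPU at a time by default, so a quick summary over all CPUs can be useful. A minimal sketch, assuming the standard cpufreq sysfs layout; the function name `governor_summary` is hypothetical.

```bash
# Count how many CPUs use each scaling governor.
# $1: CPU sysfs root (defaults to /sys/devices/system/cpu).
# governor_summary is a hypothetical helper name used for illustration.
governor_summary() {
    local root="${1:-/sys/devices/system/cpu}"
    cat "$root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null | sort | uniq -c
}

governor_summary
```

On a correctly tuned system the output should contain a single line naming `performance`, with a count equal to the number of CPUs.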
To preserve these settings after reboot, create a systemd service that runs the above commands at startup:
```bash
# Create a systemd service file
sudo vim /etc/systemd/system/cpupower_governor.service
```
Add the following content:
```
[Unit]
Description=Set CPU governor to Performance
After=multi-user.target

[Service]
Type=oneshot
# systemd does not interpret shell operators such as &&, so run the commands through a shell
ExecStart=/bin/sh -c '/usr/bin/cpupower frequency-set --governor performance && sysctl --system'

[Install]
WantedBy=multi-user.target
```
Reload systemd so that it picks up the new unit file, then enable and start the service:
```bash
sudo systemctl daemon-reload
sudo systemctl enable cpupower_governor.service
sudo systemctl start cpupower_governor.service
```

### CPU Affinity

Bind benchmark processes to specific CPU cores to reduce context switching and improve cache locality.
Make sure that isolated CPUs are located on the same NUMA node as the GPU being used.
```bash
# Run benchmark on specific CPU cores
taskset -c 2-7 ./main.py ~/benchmarks_workdir/ --sycl ~/llvm/build/
```

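To check which CPUs share a NUMA node with the GPU, the PCI device's `numa_node` attribute in sysfs can be compared with the per-node CPU lists. The sketch below is an illustration rather than part of the original guide: `gpu_local_cpus` is a hypothetical helper, and `card1` is only an example card name.

```bash
# Print the CPU list of the NUMA node that a DRM card is attached to.
# $1: card name (e.g. card1); $2: sysfs root (defaults to /sys).
# gpu_local_cpus is a hypothetical helper name used for illustration.
gpu_local_cpus() {
    local card="$1" sysfs="${2:-/sys}" node
    node=$(cat "$sysfs/class/drm/$card/device/numa_node" 2>/dev/null) || return 1
    if [ "$node" -lt 0 ]; then
        # -1 means no NUMA affinity is reported; fall back to all online CPUs.
        cat "$sysfs/devices/system/cpu/online"
    else
        cat "$sysfs/devices/system/node/node$node/cpulist"
    fi
}

gpu_local_cpus card1 || echo "card1: NUMA information not available" >&2
```

The range passed to `isolcpus` and `taskset -c` should then be a subset of the printed CPU list.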
## GPU Configuration

### GPU Frequency Control
Setting the GPU to run at maximum frequency can significantly improve benchmark performance and stability.

First, find which card corresponds to the GPU you want to tune (e.g., card1). A list of known device IDs for
Intel GPUs can be found at https://dgpu-docs.intel.com/devices/hardware-table.html#gpus-with-supported-drivers.
```bash
# Print card1 Device ID
cat /sys/class/drm/card1/device/vendor # Should be 0x8086 for Intel
cat /sys/class/drm/card1/device/device # Device ID
```

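On machines with several GPUs it may be quicker to print the IDs of every card at once. A minimal sketch, assuming the usual `/sys/class/drm` layout; `list_drm_cards` is a hypothetical helper name.

```bash
# List vendor and device IDs for every DRM card.
# $1: DRM class directory (defaults to /sys/class/drm).
# list_drm_cards is a hypothetical helper name used for illustration.
list_drm_cards() {
    local drm="${1:-/sys/class/drm}" dev
    for dev in "$drm"/card[0-9]*/device; do
        # Skip entries without a readable vendor file (connectors, unmatched globs).
        [ -r "$dev/vendor" ] || continue
        printf '%s vendor=%s device=%s\n' \
            "$(basename "$(dirname "$dev")")" \
            "$(cat "$dev/vendor")" "$(cat "$dev/device")"
    done
}

list_drm_cards
```

Entries with `vendor=0x8086` are Intel devices; the device ID can then be looked up in the hardware table linked above.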
Verify that the maximum frequency limit is set to the hardware maximum. For Arc B580, the maximum frequency is 2850 MHz. To read the current value, run `cat /sys/class/drm/card1/device/tile0/gt0/freq0/max_freq`. If it is not equal to the hardware maximum, set it:
```bash
# Arc B580 (Battlemage); writing to sysfs requires root
echo 2850 | sudo tee /sys/class/drm/card1/device/tile0/gt0/freq0/max_freq

# Set the min frequency to the max frequency, so it is fixed
echo 2850 | sudo tee /sys/class/drm/card1/device/tile0/gt0/freq0/min_freq
```

For GPU Max 1100 (Ponte Vecchio), the frequency limits are exposed through different sysfs files:
```bash
# Check GPU frequencies for GPU Max 1100 (Ponte Vecchio)
cat /sys/class/drm/card1/gt_max_freq_mhz
cat /sys/class/drm/card1/gt_min_freq_mhz

# Pin the minimum GPU frequency to the maximum, so it is fixed
max_freq=$(cat /sys/class/drm/card1/gt_max_freq_mhz)
echo "$max_freq" | sudo tee /sys/class/drm/card1/gt_min_freq_mhz
```

The result can be verified with tools such as oneprof or unitrace by tracking the GPU frequency over time while a benchmark runs (many iterations of a small problem size are recommended). The frequency should remain fixed as long as thermal throttling does not occur.

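As a rough cross-check without a profiler, the frequency can also be sampled from sysfs while a benchmark is running. This is a sketch under stated assumptions: `sample_freq` is a hypothetical helper, and the `act_freq` file name next to `min_freq` and `max_freq` is an assumption about the driver's sysfs layout that may not hold for every driver or kernel version.

```bash
# Sample a frequency file repeatedly and count how often each value was seen.
# $1: sysfs file to read; $2: number of samples; $3: delay between samples in seconds.
# sample_freq is a hypothetical helper name used for illustration.
sample_freq() {
    local file="$1" count="${2:-10}" delay="${3:-0.1}" i=0
    while [ "$i" -lt "$count" ]; do
        cat "$file" || break
        sleep "$delay"
        i=$((i + 1))
    done | sort -n | uniq -c
}

# Assumed path: act_freq next to the min_freq/max_freq files used above.
sample_freq /sys/class/drm/card1/device/tile0/gt0/freq0/act_freq 20 0.5
```

With a pinned frequency and the GPU under load, the output should be dominated by a single value; an idle GPU may legitimately report a lower or zero frequency.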
## Driver version
Make sure you are using the latest driver. On Ubuntu:
```bash
sudo apt update && sudo apt upgrade
```

## Environment Variables

### Level Zero Environment Variables
Use `ZE_AFFINITY_MASK` to expose only a specific GPU to the benchmarks (device 0 in the example below). Use CPUs from the same NUMA node as the GPU to reduce latency.
```bash
export ZE_AFFINITY_MASK=0
```

### SYCL Runtime Variables
For consistency, limit the available devices to a specific GPU runtime. For Level Zero, it is recommended to use the v2 version of the runtime library.
```bash
export ONEAPI_DEVICE_SELECTOR="level_zero:gpu"
export SYCL_UR_USE_LEVEL_ZERO_V2=1
```
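
Because these variables are easy to forget in a fresh shell, a small pre-flight check before launching the benchmarks may be worthwhile. A minimal sketch; `check_benchmark_env` is a hypothetical helper that only reports whether the three variables from this guide are set.

```bash
# Print each required variable with its value; fail if any is unset or empty.
# check_benchmark_env is a hypothetical helper name used for illustration.
check_benchmark_env() {
    local var val missing=0
    for var in ZE_AFFINITY_MASK ONEAPI_DEVICE_SELECTOR SYCL_UR_USE_LEVEL_ZERO_V2; do
        # Look up the variable by name (the names are fixed, so eval is safe here).
        eval "val=\${$var:-}"
        if [ -n "$val" ]; then
            echo "$var=$val"
        else
            echo "$var is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_benchmark_env || echo "Set the variables above before running the benchmarks" >&2
```

The function returns a non-zero status when a variable is missing, so it can gate the benchmark command, for example `check_benchmark_env && taskset -c 2-7 ./main.py ~/benchmarks_workdir/ --sycl ~/llvm/build/`.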

devops/scripts/benchmarks/README.md

Lines changed: 5 additions & 0 deletions
@@ -143,6 +143,11 @@ IGC (Ubuntu):
`$ sudo apt-get install flex bison libz-dev cmake libc6 libstdc++6 python3-pip`

## Performance Tuning

For stable benchmark results and system configuration recommendations, see the
[Performance Tuning Guide](PERFORMANCE_TUNING.md).

## Contribution

The requirements and instructions above are for building the project from source
