Use a higher tier-up threshold for JIT code

Our current tier-up threshold is 16, which was chosen a while ago because:
- in theory, it gives some of our 16-bit branch counters time to stabilize
- it seemed to work fine in practice

It turns out that we're leaving significant performance and memory improvements on the table by not using higher thresholds. Here are the results of some experiments I ran:

|  warmup                                                                                                                          | speedup | memory | traces created | traces executed | uops executed |
| -------------------------------------------------------------------------------------------------------------------------------- | ------- | ------ | -------------- | --------------- | ------------- |
|    [64](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241108-3.14.0a1%2B-48ade84-JIT/README.md)   |   +0.3% |  -1.2% |          -8.0% |           -0.1% |         +0.2% |
|   [256](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241110-3.14.0a1%2B-29895e9-JIT/README.md)   |   +1.0% |  -2.6% |         -22.0% |           -0.7% |         -1.3% |
|  [1024](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241111-3.14.0a1%2B-aaa9ae0-JIT/README.md)   |   +1.2% |  -3.2% |         -38.6% |           -3.0% |         -1.5% |
|  [2048](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241112-3.14.0a1%2B-f863657-JIT/README.md)   |   +1.1% |  -3.3% |         -44.9% |          -12.4% |         -3.8% |
|  [4096](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241111-3.14.0a1%2B-a2be6fd-JIT/README.md)   |   +2.1% |  -3.6% |         -52.2% |          -11.2% |         -3.1% | **
|  [8192](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241112-3.14.0a1%2B-1236a9d-JIT/README.md)\* |   +2.0% |  -3.4% |         -59.2% |          -12.8% |         -3.1% |
| [16384](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241111-3.14.0a1%2B-1723e00-JIT/README.md)\* |   +2.0% |  -3.6% |         -65.2% |          -14.5% |         -4.7% |
| [32768](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241112-3.14.0a1%2B-c561277-JIT/README.md)\* |   +1.8% |  -3.8% |         -73.1% |          -18.3% |         -7.1% |
| [65536](https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20241112-3.14.0a1%2B-c17f578-JIT/README.md)\* |   +1.4% |  -3.9% |         -79.7% |          -21.9% |         -9.2% |

\* For warmups above 4096, exponential backoff is disabled.

Based on these numbers, I think 4096 as a new threshold makes sense (2% faster and 3% less memory without significant hits to the amount of work we actually do in JIT code). I'll open a PR.

My next steps will be conducting similar experiments with higher side-exit warmup values, and then lastly with different `JIT_CLEANUP_THRESHOLD` values.


### Linked PRs
* gh-126816
* gh-127155

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Use a higher tier-up threshold for JIT code #126795

Linked PRs

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

warmup	speedup	memory	traces created	traces executed	uops executed
64	+0.3%	-1.2%	-8.0%	-0.1%	+0.2%
256	+1.0%	-2.6%	-22.0%	-0.7%	-1.3%
1024	+1.2%	-3.2%	-38.6%	-3.0%	-1.5%
2048	+1.1%	-3.3%	-44.9%	-12.4%	-3.8%
4096	+2.1%	-3.6%	-52.2%	-11.2%	-3.1%
8192*	+2.0%	-3.4%	-59.2%	-12.8%	-3.1%
16384*	+2.0%	-3.6%	-65.2%	-14.5%	-4.7%
32768*	+1.8%	-3.8%	-73.1%	-18.3%	-7.1%
65536*	+1.4%	-3.9%	-79.7%	-21.9%	-9.2%

Uh oh!

Use a higher tier-up threshold for JIT code #126795

Description

Linked PRs

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions