Collect specialization statistics from running benchmarks #140
Conversation
vstinner
left a comment
If running benchmarks on a CPython built with the ``--enable-pystats`` flag, pyperf will automatically collect statistics about the bytecode specializer during the run of benchmark code.
I don't see any code to store statistics, so I don't get the purpose of this change. pyperf spawns many subprocesses; how are the statistics collected? Do you compute the average? I don't get it.
pyperf/__init__.py
Outdated
# Reset the stats collection if running a --enable-pystats build
try:
    sys._stats_off()
    sys._stats_clear()
Please don't execute code on "import pyperf". It should be a deliberate action to clear such a cache.
I can probably get away with putting this in the Runner constructor -- it's possible, I suppose, to have more than one of those, but in practice that doesn't happen in the pyperformance benchmarks.
And to be clear, this doesn't affect anything on disk -- it is only the in-memory statistics collected so far. (This is basically to remove the effect of the statistics collected during pyperf's startup and book-keeping).
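A minimal sketch of that idea, assuming a heavily simplified Runner (the real constructor does much more; the hasattr guard here stands in for the try/except in the hunk above):

import sys

class Runner:
    def __init__(self):
        # The sys._stats_* hooks only exist on CPython builds configured
        # with --enable-pystats, so guard the calls.
        if hasattr(sys, "_stats_clear"):
            # Drop the in-memory stats accumulated during interpreter
            # startup and pyperf's own setup so they don't pollute the
            # numbers collected for the benchmark itself.
            sys._stats_off()
            sys._stats_clear()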
pyperf/_runner.py
Outdated
if not self._check_worker_task():
    return None

func = stats_wrapper(func)
On a Python build that provides sys._stats_on(), this wrapper adds a fixed overhead. If you want to execute code before/after the benchmark, it should be done before/after the code measuring time.
See task_func() below.
In this case, the benchmarking timings are useless anyway (due to the stats overhead), but it would have a (minor) impact on the statistics themselves. I'll see if I can refactor this, though. It probably requires a copy-and-paste of task_func to really minimize the overhead -- I'm not sure it's worth it, to be honest.
Having tried this, I'm not sure how to remove the overhead without having duplicate versions of task_func, and also its async equivalent below, which could be hard to keep up-to-date down the road. I think it's better as-is, since the performance impact really doesn't matter in the stats-collecting case (and there is no performance impact in the existing non-stats-collecting case).
If we are gathering stats, the timings are meaningless.
To maximize the accuracy of the stats, we want to turn them on and off as close to the function call as possible.
This looks about as good as we can get.
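For context, roughly what such a wrapper can look like (a sketch, not necessarily the exact code in this PR; sys._stats_on() and sys._stats_off() exist only on --enable-pystats builds):

import functools
import sys

def stats_wrapper(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Turn stats collection on and off as close to the benchmarked
        # call as possible, so pyperf's own bookkeeping is not counted.
        sys._stats_on()
        try:
            return func(*args, **kwargs)
        finally:
            sys._stats_off()
    return wrapper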
The statistics are collected in the specializing optimizer in the interpreter loop in CPython itself when built with ``--enable-pystats``.
vstinner
left a comment
If we are gathering stats, the timings are meaningless. To maximize the accuracy of the stats, we want to turn them on and off as close to the function call as possible. This looks about as good as we can get.
Hum, ok. I'm not against this change in this case, but I still have some concerns about the documentation of this change.
doc/run_benchmark.rst
Outdated
Specializer statistics (`pystats`)
==================================

If running benchmarks on a CPython built with the ``--enable-pystats`` flag, pyperf will automatically collect statistics about the bytecode specializer during the run of benchmark code.
I dislike "will automatically collect statistics". pyperf usually puts everything in the JSON file, but here all it does is call sys._stats_on() before and sys._stats_off() after. It doesn't clear existing statistics in /tmp/py_stats, it doesn't store them in the JSON file, and it doesn't compute an average or anything.
Would you mind mentioning the sys._stats_on() and sys._stats_off() functions instead?
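In other words, the behaviour amounts to something like the context manager below (an illustrative sketch only; as noted above, the counters are dumped by CPython itself under /tmp/py_stats, not by pyperf):

import contextlib
import sys

@contextlib.contextmanager
def collect_pystats():
    # Bracket the measured code with the pystats hooks; nothing is written
    # to pyperf's JSON file, and existing dumps in /tmp/py_stats are left
    # untouched.
    sys._stats_on()
    try:
        yield
    finally:
        sys._stats_off()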
Can you try to add a link to https://docs.python.org/dev/using/configure.html#cmdoption-enable-pystats, which is the current official documentation for pystats?
I've updated the docs to hopefully reflect this better.
doc/run_benchmark.rst
Outdated
If running benchmarks on a CPython built with the ``--enable-pystats`` flag, pyperf will automatically collect statistics about the bytecode specializer during the run of benchmark code.

Due to the overhead of collecting the statistics, it is unlikely the timing results will be useful.
Well, Mark's word is stronger: meaningless. What do you think of saying meaningless instead?
Sure, that's fine.
Anything else you'd like to see on this, @vstinner?
I'm converting this to a draft -- on full A/A testing, it turns out the results from this are quite non-deterministic. I think the noise is coming from the warmup / calibration stages, which don't always run the same number of times. I think by excluding these runs, we'll get more stable results, but I'm checking that over here first.
This is ready for review. With this change, we get consistent results running the pyperformance benchmark suite multiple times. See this prototype.
==================================

``pyperf`` has built-in support for `specializer statistics (``pystats``) <https://docs.python.org/dev/using/configure.html#cmdoption-enable-pystats>`_.
If running benchmarks on a CPython built with the ``--enable-pystats`` flag, pyperf will automatically collect ``pystats`` on the benchmark code by calling ``sys._stats_on`` immediately before the benchmark and calling ``sys._stats_off`` immediately after.
Since you go into details, you should also mention that they start by calling _stats_clear(). Moreover, you should mention that stats are only collected "if we aren't warming up or calibrating".
I added "Stats are not collected when running pyperf's own code or when warming up or calibrating the benchmarks." I don't think mentioning _stats_clear() is necessary -- the point of that is to not include pyperf's own code.
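A hedged sketch of that gating (the function name and arguments below are illustrative, not pyperf's actual API):

import sys

def run_once(func, loops, warmup=False, calibrating=False):
    # Only toggle pystats around real measurement runs; warmup and
    # calibration runs execute without collecting stats.
    collect = hasattr(sys, "_stats_on") and not (warmup or calibrating)
    if collect:
        sys._stats_on()
    try:
        return func(loops)
    finally:
        if collect:
            sys._stats_off()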
pyperf/_collect_metadata.py
Outdated
# pystats enabled?
if hasattr(sys, "_stats_clear"):
    metadata['python_pystats'] = 'enabled'
IMO "python_" prefix is redundant with "py" prefix:
- metadata['python_pystats'] = 'enabled'
+ metadata['pystats'] = 'enabled'
Makes sense.
vstinner
left a comment
LGTM. Thanks for the multiple updates ;-)
This will collect statistics on CPython builds with ``--enable-pystats``, and only during the run of the benchmarks themselves. Cc @markshannon