
Conversation

@mdboom
Collaborator

@mdboom mdboom commented Oct 13, 2022

This will collect statistics on CPython builds with --enable-pystats, and only during the run of the benchmarks themselves.

Cc @markshannon

Member

@vstinner vstinner left a comment


If running benchmarks on a CPython built with the --enable-pystats flag, pyperf will automatically collect statistics about the bytecode specializer during the run of benchmark code.

I don't see any code to store statistics, so I don't get the purpose of this change. pyperf spawns many subprocesses; how are the statistics collected? Do you compute the average? I don't get it.

# Reset the stats collection if running a --enable-pystats build
try:
    sys._stats_off()
    sys._stats_clear()
Member


Please don't execute code on "import pyperf". Clearing such a cache should be a deliberate action.

Collaborator Author


I can probably get away with putting this in the Runner constructor -- it's possible, I suppose, to have more than one of those, but in practice that doesn't happen in the pyperformance benchmarks.

And to be clear, this doesn't affect anything on disk -- it is only the in-memory statistics collected so far. (This is basically to remove the effect of the statistics collected during pyperf's startup and book-keeping).
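
(For context, here is a minimal sketch of what such a reset could look like inside a Runner constructor. The placement and the try/except AttributeError guard are assumptions for illustration, since sys._stats_off() and sys._stats_clear() only exist on --enable-pystats builds; this is not pyperf's actual code.)

import sys

class Runner:
    def __init__(self, *args, **kwargs):
        # On an --enable-pystats build, throw away the in-memory stats
        # gathered while pyperf itself was importing and starting up, so
        # that only the benchmark code is counted. On a regular build the
        # private hooks don't exist, hence the AttributeError guard.
        try:
            sys._stats_off()
            sys._stats_clear()
        except AttributeError:
            pass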

if not self._check_worker_task():
    return None

func = stats_wrapper(func)
Member


On a Python build that has sys._stats_on(), this wrapper adds a fixed overhead. If you want to execute code before/after the benchmark, it should be done outside of the code measuring time.

See task_func() below.
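
(For context, a stats_wrapper along these lines would produce the fixed overhead being discussed. This is only a sketch assuming the wrapper toggles the interpreter's stats hooks around the wrapped call; it is not pyperf's exact implementation.)

import functools
import sys

def stats_wrapper(func):
    # Without --enable-pystats the private hooks don't exist; leave the
    # function untouched so a normal build pays no cost at all.
    if not hasattr(sys, "_stats_on"):
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        sys._stats_on()
        try:
            return func(*args, **kwargs)
        finally:
            sys._stats_off()

    return wrapper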

Collaborator Author


In this case, the benchmarking timings are useless anyway (due to the stats overhead), but it would have a (minor) impact on the statistics themselves. I'll see if I can refactor this, though. It probably requires a copy-and-paste of task_func to really minimize the overhead -- I'm not sure it's worth it, to be honest.

Collaborator Author


Having tried this, I'm not sure how to remove the overhead without having duplicate versions of task_func, and also its async equivalent below, which could be hard to keep up-to-date down the road. I think it's better as-is, since the performance impact really doesn't matter in the stats-collecting case (and there is no performance impact in the existing non-stats-collecting case).

@markshannon

If we are gathering stats, the timings are meaningless.
To maximize the accuracy of the stats, we want to turn them on and off as close to the function call as possible.
This looks about as good as we can get.

@mdboom
Collaborator Author

mdboom commented Oct 13, 2022

The statistics are collected by the specializing optimizer in CPython's own interpreter loop when it is built with --enable-pystats. The statistics are dumped to separate files in /tmp/py_stats at process shutdown (which is how this works with multiple processes). You can see some example output of the summarize_stats.py script here: https://github.com/faster-cpython/ideas/blob/main/stats.md
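
(As a rough illustration of combining the per-process dumps, assuming the plain "key : value" line format that summarize_stats.py parses; the aggregate_pystats helper below is hypothetical.)

import collections
from pathlib import Path

def aggregate_pystats(stats_dir="/tmp/py_stats"):
    # Each benchmark subprocess dumps its own stats file at shutdown;
    # summing the counters gives the totals across all worker processes.
    totals = collections.Counter()
    root = Path(stats_dir)
    if not root.is_dir():
        return totals
    for path in root.iterdir():
        if not path.is_file():
            continue
        for line in path.read_text().splitlines():
            key, sep, value = line.partition(":")
            if not sep:
                continue
            try:
                totals[key.strip()] += int(value)
            except ValueError:
                pass  # skip anything that isn't a plain counter line
    return totals

Running summarize_stats.py over the same directory produces the human-readable report linked above.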

@mdboom mdboom requested a review from vstinner October 13, 2022 16:54
Member

@vstinner vstinner left a comment


@markshannon:

If we are gathering stats, the timings are meaningless. To maximize the accuracy of the stats, we want to turn them on and off as close to the function call as possible. This looks about as good as we can get.

Hum, ok. I'm not against this change in this case, but I still have some concerns about the documentation of this change.

Specializer statistics (`pystats`)
==================================

If running benchmarks on a CPython built with the ``--enable-pystats`` flag, pyperf will automatically collect statistics about the bytecode specializer during the run of benchmark code.
Member


I dislike "will automatically collect statistics". pyperf usually puts everything in the JSON file, but here all it does is call sys._stats_on() before and sys._stats_off() after. It doesn't clear existing statistics in /tmp/py_stats, it doesn't store them in the JSON file, it doesn't compute an average or anything.

Would you mind mentioning the sys._stats_on() and sys._stats_off() functions instead?

Member


Can you try to add a link to https://docs.python.org/dev/using/configure.html#cmdoption-enable-pystats which is the current official documentation for pystats?

Collaborator Author


I've updated the docs to hopefully reflect this better.


If running benchmarks on a CPython built with the ``--enable-pystats`` flag, pyperf will automatically collect statistics about the bytecode specializer during the run of benchmark code.

Due to the overhead of collecting the statistics, it is unlikely the timing results will be useful.
Member


Well, Mark's word is stronger: meaningless. What do you think of saying meaningless instead?

Collaborator Author


Sure, that's fine.

@mdboom mdboom requested a review from vstinner October 20, 2022 15:03
@mdboom
Collaborator Author

mdboom commented Oct 26, 2022

Anything else you'd like to see on this, @vstinner?

@mdboom mdboom marked this pull request as draft October 27, 2022 15:43
@mdboom
Collaborator Author

mdboom commented Oct 27, 2022

I'm converting this to a draft -- on full A/A testing, it turns out the results from this are quite non-deterministic. I think the noise is coming from the warmup / calibration stages, which don't always run the same number of times. I think by excluding these runs we'll get more stable results, but I'm checking that over here first.

@mdboom mdboom marked this pull request as ready for review October 28, 2022 16:15
@mdboom
Collaborator Author

mdboom commented Oct 28, 2022

This is ready for review. With this change, we get consistent results running the pyperformance benchmark suite multiple times. See this prototype.

@mdboom mdboom requested review from markshannon and vstinner and removed request for markshannon and vstinner October 28, 2022 16:16
Specializer statistics (``pystats``)
====================================

``pyperf`` has built-in support for `specializer statistics (pystats) <https://docs.python.org/dev/using/configure.html#cmdoption-enable-pystats>`_.
If running benchmarks on a CPython built with the ``--enable-pystats`` flag, pyperf will automatically collect ``pystats`` on the benchmark code by calling ``sys._stats_on`` immediately before the benchmark and calling ``sys._stats_off`` immediately after.
Member


Since you go into details, you should also mention that it starts by calling _stats_clear(). Moreover, you should mention that stats are only collected if we aren't warming up or calibrating.

Collaborator Author


I added "Stats are not collected when running pyperf's own code or when warming up or calibrating the benchmarks." I don't think mentioning _stats_clear() is necessary -- the point of that is to not include pyperf's own code.
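
(A sketch of what that exclusion could look like in a worker; the run_measured name and the calibrate flag are made up for illustration and are not pyperf's actual code.)

import sys

def run_measured(func, calibrate=False):
    # Collect pystats only on an --enable-pystats build, and only for the
    # real measurement runs -- not during warmup or calibration.
    collect = hasattr(sys, "_stats_on") and not calibrate
    if collect:
        sys._stats_on()
    try:
        return func()
    finally:
        if collect:
            sys._stats_off()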


# pystats enabled?
if hasattr(sys, "_stats_clear"):
    metadata['python_pystats'] = 'enabled'
Member


IMO "python_" prefix is redundant with "py" prefix:

Suggested change:
-    metadata['python_pystats'] = 'enabled'
+    metadata['pystats'] = 'enabled'

Collaborator Author


Makes sense.

@mdboom mdboom requested a review from vstinner October 31, 2022 17:20
Member

@vstinner vstinner left a comment


LGTM. Thanks for the multiple updates ;-)

@vstinner vstinner merged commit 93cac74 into psf:main Nov 3, 2022