Add additional gc benchmark with pickletools (#437) #438
These tests (and the PR) use N = 1'000'000. The downside is that running the benchmark (with Python 3.14) takes almost 10 minutes. I could reduce the instance size to lower the overall running time, but the garbage collector bug doesn't seem to "kick in" until we reach a certain size. With N = 100'000, the slowdown is not as noticeable:
|
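One rough way to see how the running time depends on N is to time a single `pickletools.optimize()` call at several sizes. This is a minimal sketch using only the standard library; `make_payload` and `time_optimize` are hypothetical helper names, not part of the PR, and a single wall-clock measurement is much noisier than a pyperf run.

```python
import pickle
import pickletools
import time


def make_payload(n: int) -> bytes:
    """Pickle a dict of n int -> str entries (same shape as the benchmark data)."""
    data = {i: f"ii{i:>07}" for i in range(n)}
    return pickle.dumps(data, protocol=4)


def time_optimize(n: int) -> float:
    """Return wall-clock seconds for one pickletools.optimize() call."""
    payload = make_payload(n)
    start = time.perf_counter()
    pickletools.optimize(payload)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Sizes are illustrative; raise them to look for the point where
    # the slowdown becomes visible on a given interpreter.
    for n in (1_000, 10_000, 100_000):
        print(f"N = {n:>9,}: {time_optimize(n):.3f} s")
```

Running the same script under two interpreter versions gives a quick, informal comparison before committing to a full benchmark run.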
@sergey-miryanov Thanks for the review. I have fixed all issues you pointed out. |
|
@sergey-miryanov Something strange happens here. Even though I use the context manager ( I am not able to reproduce this behavior when not running with It sounds like a bug, but I can't tell where. |
sergey-miryanov
left a comment
Code looks good to me.
|
@pgdr Thanks! It is up to |
Taking 10 minutes would be too long. However, it only takes about 6 seconds for me to run this on Python 3.14.0, on my hardware. Perhaps the 10 minutes is for when N = 10e6? The regression I see from 3.13 to 3.14 with N = 1e6 seems large enough (1.5 seconds vs 6 seconds, roughly). Nice work on this benchmark. I think it's good because |
|
Small suggestion: it would be simpler to use You could use |
|
@nascheme Thanks a lot, that saved a whole bunch of complexity. Running some tests and then I'll fix it. Something like this:

import pickle
import pickletools
import pyperf

def setup(N: int) -> bytes:
    x = {i: f"ii{i:>07}" for i in range(N)}
    return pickle.dumps(x, protocol=4)

def run(p: bytes) -> None:
    pickletools.optimize(p)

if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.metadata["description"] = "Pickletools optimize"
    N = 100_000
    payload = setup(N)
    runner.bench_func("pickle_opt", run, payload)
|
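Since the benchmark discards the return value of `pickletools.optimize()`, a quick sanity check that the optimized pickle still decodes to the same object can be useful while iterating on the script. A minimal sketch, assuming the same payload shape as above; `optimize_roundtrips` is a hypothetical helper name and is not part of the PR.

```python
import pickle
import pickletools


def optimize_roundtrips(payload: bytes) -> bool:
    """Check that the optimized pickle loads to the same object as the original."""
    optimized = pickletools.optimize(payload)
    return pickle.loads(optimized) == pickle.loads(payload)


if __name__ == "__main__":
    # Small N keeps the check fast; the data mirrors the benchmark's dict.
    data = {i: f"ii{i:>07}" for i in range(1_000)}
    payload = pickle.dumps(data, protocol=4)
    print("round-trips:", optimize_roundtrips(payload))
```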
Adds a benchmark reproducing the Python 3.14 garbage collector regression described in cpython/#140175.
This real-world case uses pickletools to demonstrate the performance issue.
Fixes #437.
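To get a rough indication of whether the cyclic garbage collector accounts for a slowdown, one option is to time the same call with the collector enabled and then disabled. This is a sketch under stated assumptions, not the method used in the PR: `timed_optimize` is a hypothetical helper, and a single timing per setting is only indicative.

```python
import gc
import pickle
import pickletools
import time


def timed_optimize(payload: bytes, *, gc_enabled: bool) -> tuple[float, bytes]:
    """Time one pickletools.optimize() call with the cyclic GC on or off.

    The previous GC state is restored afterwards.
    """
    was_enabled = gc.isenabled()
    if gc_enabled:
        gc.enable()
    else:
        gc.disable()
    try:
        start = time.perf_counter()
        result = pickletools.optimize(payload)
        elapsed = time.perf_counter() - start
    finally:
        # Put the collector back the way we found it.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()
    return elapsed, result


if __name__ == "__main__":
    data = {i: f"ii{i:>07}" for i in range(100_000)}
    payload = pickle.dumps(data, protocol=4)
    t_on, _ = timed_optimize(payload, gc_enabled=True)
    t_off, _ = timed_optimize(payload, gc_enabled=False)
    print(f"gc enabled:  {t_on:.3f} s")
    print(f"gc disabled: {t_off:.3f} s")
```

A large gap between the two timings on one interpreter version, and a small gap on another, would point toward the collector rather than `pickletools` itself.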