Skip to content

Conversation

yoney
Copy link
Contributor

@yoney yoney commented Aug 28, 2025

This change uses critical sections to make cProfile (_lsprof) thread-safe when the GIL is disabled. It does this by applying the @critical_section clinic directive to ProfilerObject.

Protecting ProfilerObject prevents crashes, and the thread sanitizer does not report errors for the new FT test. However, the stats still show incorrect recursive call information.

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
10000/538    0.085    0.000    0.186    0.000 {method 'insert' of 'list' objects}

This happens because of limitations in the cProfile module, and these issues are not directly related to the FT Python build. By adjusting sys.setswitchinterval(), you can see similar incorrect recursive call reports even in a non-FT Python build.

This is a quick fix to prevent crashes. It is also a request for comments on whether we should make bigger changes to the cProfile module to handle multithreading better, such as collecting profile data per thread.

cc: @mpage, @colesbury, @Yhg1s

@mpage
Copy link
Contributor

mpage commented Sep 3, 2025

@yoney - LGTM. I think it's worth doing the work to make cProfile work well in a multi-threaded environment in the free-threaded builds. Do you have a sense of the scope of the changes that will be required to make cProfile work correctly in a multi-threaded setting? Unless they're similar in scope to these changes I think it's probably worth merging this PR (and requesting a backport to 3.14) since it shouldn't introduce too much additional churn.

@yoney
Copy link
Contributor Author

yoney commented Sep 3, 2025

@yoney - LGTM. I think it's worth doing the work to make cProfile work well in a multi-threaded environment in the free-threaded builds. Do you have a sense of the scope of the changes that will be required to make cProfile work correctly in a multi-threaded setting? Unless they're similar in scope to these changes I think it's probably worth merging this PR (and requesting a backport to 3.14) since it shouldn't introduce too much additional churn.

@mpage Thanks for the review! I think collecting profiling data per thread and merging it lazily before reporting could work. This kind of refactoring would likely benefit subinterpreters too. I’m happy to continue working on those changes as a follow-up, but I agree that the scope would be different from this PR.

For now, this PR addresses the immediate issue that causes crashes. While the recursive-count might still be incorrect, the other counts like call-count are accurate, and this fix resolves the crash. I think it makes sense to land this now and consider the follow-up separately.

@kumaraditya303 kumaraditya303 added the needs backport to 3.14 bugs and security fixes label Sep 6, 2025
@kumaraditya303 kumaraditya303 merged commit 8554c09 into python:main Sep 6, 2025
47 checks passed
@miss-islington-app
Copy link

Thanks @yoney for the PR, and @kumaraditya303 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 6, 2025
(cherry picked from commit 8554c0917e25a7abe12b3000f1589b6566c91a25)

Co-authored-by: Alper <[email protected]>
Co-authored-by: Kumar Aditya <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Sep 6, 2025

GH-138575 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.14 bugs and security fixes label Sep 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants