gh-128013: fix data race in PyUnicode_AsUTF8AndSize on free-threading #128021

Merged: 4 commits merged on Dec 19, 2024

kumaraditya303
Contributor

@kumaraditya303 kumaraditya303 commented Dec 17, 2024

@kumaraditya303 kumaraditya303 changed the title from "gh128013: fix data race in PyUnicode_AsUTF8AndSize on free-threading" to "gh-128013: fix data race in PyUnicode_AsUTF8AndSize on free-threading" Dec 17, 2024
@kumaraditya303 kumaraditya303 force-pushed the utf8 branch 2 times, most recently from 87bdaea to 0f692ce December 17, 2024 12:35
@colesbury colesbury self-requested a review December 17, 2024 15:30
Contributor

@colesbury colesbury left a comment

I think there's still a bug where PyUnicode_UTF8() is checked outside the lock, but the condition may change once the lock is acquired (because some other thread filled in the utf8 field).

I think we should refactor out the check into something like unicode_ensure_utf8 that does the double-checked locking:

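static int
unicode_ensure_utf8(PyObject *unicode)
{
    int err = 0;
    // Fast path: skip the lock if the UTF-8 cache is already filled in.
    if (PyUnicode_UTF8(unicode) == NULL) {
        Py_BEGIN_CRITICAL_SECTION(unicode);
        // Re-check under the lock: another thread may have filled the
        // cache between the first check and acquiring the lock.
        if (PyUnicode_UTF8(unicode) == NULL) {
            err = unicode_fill_utf8(unicode);
        }
        Py_END_CRITICAL_SECTION();
    }
    return err;
}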

unicode_fill_utf8 should assert that the critical section is held.
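
For readers unfamiliar with the pattern, here is a minimal standalone sketch of double-checked locking outside CPython. It assumes a pthread mutex in place of the per-object critical section and a C11 atomic pointer in place of the utf8 field. The names lazy_str, lazy_str_init, lazy_str_fill, and lazy_str_ensure are hypothetical and are not CPython APIs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object with a lazily built cache. */
typedef struct {
    pthread_mutex_t lock;     /* plays the role of the critical section */
    _Atomic(char *) cache;    /* NULL until some thread fills it in */
    const char *source;       /* data the cache is derived from */
    int fill_count;           /* times the cache was built; guarded by lock */
} lazy_str;

static void
lazy_str_init(lazy_str *s, const char *source)
{
    pthread_mutex_init(&s->lock, NULL);
    atomic_init(&s->cache, NULL);
    s->source = source;
    s->fill_count = 0;
}

/* Caller must hold s->lock. Returns 0 on success, -1 on allocation failure. */
static int
lazy_str_fill(lazy_str *s)
{
    size_t n = strlen(s->source) + 1;
    char *buf = malloc(n);
    if (buf == NULL) {
        return -1;
    }
    memcpy(buf, s->source, n);
    s->fill_count++;
    /* Release store: the buffer contents are visible to any thread that
       later observes the non-NULL pointer with an acquire load. */
    atomic_store_explicit(&s->cache, buf, memory_order_release);
    return 0;
}

static int
lazy_str_ensure(lazy_str *s)
{
    int err = 0;
    /* First check, without the lock (fast path once the cache exists). */
    if (atomic_load_explicit(&s->cache, memory_order_acquire) == NULL) {
        pthread_mutex_lock(&s->lock);
        /* Second check, under the lock: another thread may have won the race. */
        if (atomic_load_explicit(&s->cache, memory_order_relaxed) == NULL) {
            err = lazy_str_fill(s);
        }
        pthread_mutex_unlock(&s->lock);
    }
    return err;
}
```

Calling lazy_str_ensure repeatedly, from one thread or many, should build the cache at most once; the second check is what prevents a thread that lost the race from filling it again.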

@vstinner
Member

I wrote PR gh-128061 "Convert unicodeobject.c macros to functions" to prepare the code for this change.

@kumaraditya303 kumaraditya303 marked this pull request as ready for review December 18, 2024 15:39
@kumaraditya303
Contributor Author

I have updated the PR to use the new static inline functions, and it now uses acquire/release semantics for the utf8 member. I have tested the reproducer from the issue and there are no longer any data races, AFAICS.
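
As a rough illustration of what acquire/release semantics buy for a cached pointer, the sketch below publishes a length and a pointer together using C11 atomics. It is not the code from this PR: utf8_cache, cache_init, cache_publish, and cache_get are hypothetical names, and the sketch assumes writers are already serialized by a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache: a length and a pointer that readers must see together. */
typedef struct {
    size_t length;               /* plain field, written before publication */
    _Atomic(const char *) data;  /* NULL means "not yet published" */
} utf8_cache;

static void
cache_init(utf8_cache *c)
{
    c->length = 0;
    atomic_init(&c->data, NULL);
}

/* Writer (assumed to run under a lock): set the length first, then publish
   the pointer with a release store so the length write cannot be reordered
   after it. */
static void
cache_publish(utf8_cache *c, const char *buf, size_t len)
{
    c->length = len;
    atomic_store_explicit(&c->data, buf, memory_order_release);
}

/* Reader (lock-free): an acquire load of the pointer pairs with the release
   store above, so a non-NULL pointer guarantees the length is visible too.
   Returns NULL and leaves *len_out untouched if nothing is published yet. */
static const char *
cache_get(utf8_cache *c, size_t *len_out)
{
    const char *buf = atomic_load_explicit(&c->data, memory_order_acquire);
    if (buf != NULL) {
        *len_out = c->length;
    }
    return buf;
}
```

The ordering matters: if the pointer were published before the length, or with a relaxed store, a reader could see a non-NULL pointer paired with a stale length.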

Contributor

@colesbury colesbury left a comment

Overall looks good. I think there's just one issue in _PyUnicode_CheckConsistency.

Member

@vstinner vstinner left a comment

LGTM

@kumaraditya303 kumaraditya303 merged commit 3c168f7 into python:main Dec 19, 2024
39 checks passed
@kumaraditya303 kumaraditya303 deleted the utf8 branch December 19, 2024 11:38
@bedevere-bot

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 CentOS9 NoGIL Refleaks 3.x has failed when building commit 3c168f7.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1610/builds/545) and take a look at the build logs.
  4. Check if the failure is related to this commit (3c168f7) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1610/builds/545

Failed tests:

  • test_free_threading

Summary of the results of the build (if available):

==

Traceback logs:
remote: Enumerating objects: 15, done.
remote: Counting objects: 100% (15/15), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 8 (delta 7), reused 1 (delta 1), pack-reused 0 (from 0)
From https://github.com/python/cpython
 * branch                    main       -> FETCH_HEAD
Note: switching to '3c168f7f79d1da2323d35dcf88c2d3c8730e5df6'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 3c168f7f79d gh-128013: fix data race in `PyUnicode_AsUTF8AndSize` on free-threading (#128021)
Switched to and reset branch 'main'

configure: WARNING: no system libmpdecimal found; falling back to bundled libmpdecimal (deprecated and scheduled for removal in Python 3.15)

make: *** [Makefile:2321: buildbottest] Error 2

@vstinner
Member

The buildbot AMD64 CentOS9 NoGIL Refleaks 3.x has failed when building commit 3c168f7.

The failure looks unrelated:

0:12:32 load avg: 8.03 [466/481/1] test_free_threading worker non-zero exit code (Exit code -6 (SIGABRT)) -- running (3): (...)

(...)

Races assigning to __dict__ should be thread safe ...

python: Objects/obmalloc.c:1219: process_queue: Assertion `buf->rd_idx == buf->wr_idx' failed.
Fatal Python error: Aborted

Thread 0x00007fbb557fa640 (most recent call first):
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 162 in writer_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 996 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1054 in _bootstrap_inner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1016 in _bootstrap

Current thread 0x00007fbb55ffb640 (most recent call first):
  Garbage-collecting
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 163 in writer_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 996 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1054 in _bootstrap_inner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1016 in _bootstrap

Thread 0x00007fbb567fc640 (most recent call first):
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 162 in writer_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 996 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1054 in _bootstrap_inner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1016 in _bootstrap

Thread 0x00007fbb56ffd640 (most recent call first):
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 158 in writer_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 996 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1054 in _bootstrap_inner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1016 in _bootstrap

Thread 0x00007fbb577fe640 (most recent call first):
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 158 in writer_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 996 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1054 in _bootstrap_inner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1016 in _bootstrap

Thread 0x00007fbb57fff640 (most recent call first):
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 162 in writer_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 996 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1054 in _bootstrap_inner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1016 in _bootstrap

Thread 0x00007fbb16ffd640 (most recent call first):
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 158 in writer_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 996 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1054 in _bootstrap_inner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1016 in _bootstrap

Thread 0x00007fbb15ffb640 (most recent call first):
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 158 in writer_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 996 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1054 in _bootstrap_inner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1016 in _bootstrap

Thread 0x00007fbb5d092740 (most recent call first):
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/threading.py", line 1105 in join
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/test_free_threading/test_dict.py", line 188 in test_racing_set_object_dict
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/case.py", line 606 in _callTestMethod
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/case.py", line 660 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/case.py", line 716 in __call__
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/suite.py", line 122 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/suite.py", line 84 in __call__
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/suite.py", line 122 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/suite.py", line 84 in __call__
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/suite.py", line 122 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/suite.py", line 84 in __call__
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/suite.py", line 122 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/suite.py", line 84 in __call__
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/unittest/runner.py", line 259 in run
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/single.py", line 58 in _run_suite
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/single.py", line 38 in run_unittest
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/single.py", line 136 in test_func
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/refleak.py", line 132 in runtest_refleak
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/single.py", line 88 in regrtest_runner
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/single.py", line 139 in _load_run_test
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/single.py", line 184 in _runtest_env_changed_exc
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/single.py", line 284 in _runtest
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/single.py", line 313 in run_single_test
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/worker.py", line 83 in worker_process
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/worker.py", line 118 in main
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/test/libregrtest/worker.py", line 122 in <module>
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/runpy.py", line 88 in _run_code
  File "/home/buildbot/buildarea/3.x.itamaro-centos-aws.refleak.nogil/build/Lib/runpy.py", line 198 in _run_module_as_main

Extension modules: _testcapi (total: 1)

srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Dec 23, 2024
kumaraditya303 added a commit to kumaraditya303/cpython that referenced this pull request Jan 2, 2025
@bedevere-app

bedevere-app bot commented Jan 2, 2025

GH-128417 is a backport of this pull request to the 3.13 branch.

srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Jan 8, 2025