Skip to content

nogil set clear causes concurrent __str__ to print as empty dict #129967

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
luisggpina opened this issue Feb 10, 2025 · 2 comments
Closed

nogil set clear causes concurrent __str__ to print as empty dict #129967

luisggpina opened this issue Feb 10, 2025 · 2 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@luisggpina
Copy link

luisggpina commented Feb 10, 2025

Bug report

Bug description:

Hi,

We're a research group focused on testing concurrent runtimes. Our work-in-progress prototype found that the current nogil build __str__ can return "{}" (empty dict) instead of the expected "set()" when there is a concurrent clear() operation. The program below shows the wrong behavior:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    s.clear()

def t1(b1,s,res):
    b1.wait()
    res.append(s.__str__())

def Test():
  s =  {17, 18, 'a', 'b', 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] == "{}":
      print("found bug: " + res[0])

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Sample output:

test begin...
0
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}
found bug: {}

This behavior can be observed quite readily. We tested it on a number of x86_64 and one ARM machine.

@flypoodles and @overlorde are part of the team, adding them so they get notified about further discussion.

CPython versions tested on:

3.14, CPython main branch

Operating systems tested on:

Linux

Linked PRs

@luisggpina luisggpina added the type-bug An unexpected behavior, bug, or error label Feb 10, 2025
@picnixz picnixz added interpreter-core (Objects, Python, Grammar, and Parser dirs) infra CI, GitHub Actions, buildbots, Dependabot, etc. topic-free-threading and removed infra CI, GitHub Actions, buildbots, Dependabot, etc. labels Feb 10, 2025
@picnixz picnixz marked this as a duplicate of #129968 Feb 10, 2025
@colesbury
Copy link
Contributor

Thanks - this is a bit of a weird one:

  • We lock the set before calling str() or repr()
  • However, when printing the set we call PySequence_List, which internally tries to lock both the newly created list and the set a second time. This leads to the unlocking + relocking behavior, which allows another thread to clear the set in the meantime. @Yhg1s's recursive critical section optimization doesn't apply here because it's a two-object critical section, not a simple one object critical section.

This can happen in the GIL-enabled build too, at least in Python 3.11 and earlier, where PySequence_List() can trigger arbitrary code by the GC running.

I think we should do two things:

  1. We should make set_repr_lock_held() more robust and check the size of the list that results from PySequence_List.
  2. We should attempt to avoid the slow relocking behavior in set_repr_lock_held.

colesbury added a commit to colesbury/cpython that referenced this issue Feb 10, 2025
The call to `PySequence_List()` could temporarily unlock and relock the
set, allowing the items to be cleared and return the incorrect
notation `{}` for a empty set (it should be `set()`).
colesbury added a commit to colesbury/cpython that referenced this issue Feb 11, 2025
colesbury added a commit that referenced this issue Feb 11, 2025
The call to `PySequence_List()` could temporarily unlock and relock the
set, allowing the items to be cleared and return the incorrect
notation `{}` for a empty set (it should be `set()`).

Co-authored-by: T. Wouters <[email protected]>
colesbury added a commit to colesbury/cpython that referenced this issue Feb 11, 2025
…29978)

The call to `PySequence_List()` could temporarily unlock and relock the
set, allowing the items to be cleared and return the incorrect
notation `{}` for a empty set (it should be `set()`).
(cherry picked from commit a7427f2)

Co-authored-by: Sam Gross <[email protected]>
Co-authored-by: T. Wouters <[email protected]>
colesbury added a commit that referenced this issue Feb 11, 2025
…30020)

The call to `PySequence_List()` could temporarily unlock and relock the
set, allowing the items to be cleared and return the incorrect
notation `{}` for a empty set (it should be `set()`).

(cherry picked from commit a7427f2)

Co-authored-by: T. Wouters <[email protected]>
@colesbury
Copy link
Contributor

Thanks @luisggpina - this is fixed now in the main and 3.13 branches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants