Skip to content

bpo-34410: itertools.tee not thread-safe; can segfault interpreter wh… #9075

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from

Conversation

hongweipeng
Copy link
Contributor

@hongweipeng hongweipeng commented Sep 6, 2018

@tirkarthi
Copy link
Member

Can the tests attached at https://bugs.python.org/msg323817 by @zhangyangyu be added as part of test suite? I checked out the branch locally and the script segfaults or prints Fatal Python error.

➜  cpython git:(fix-issue-34410) ./python.exe
Python 3.8.0a0 (heads/fix-issue-34410:74ed2cc061, Sep  6 2018, 11:38:49)
[Clang 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
➜  cpython git:(fix-issue-34410) ./python.exe ../backups/bpo34410.py
[1]    18986 segmentation fault  ./python.exe ../backups/bpo34410.py
➜  cpython git:(fix-issue-34410) ./python.exe ../backups/bpo34410.py
Fatal Python error: ./Modules/itertoolsmodule.c:500 object at 0x7fdfaa459540 has negative ref count -2604246222170760230

Thread 0x0000000104a10000 (most recent call first):
  File "../backups/bpo34410.py", line 7 in __next__
  File "../backups/bpo34410.py", line 11 in test
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py", line 865 in run
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py", line 917 in _bootstrap_inner
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py", line 885 in _bootstrap

Current thread 0x00007fff76c77300 (most recent call first):
  File "../backups/bpo34410.py", line 15 in <module>
[1]    19004 abort      ./python.exe ../backups/bpo34410.py

bpo34410.py

import threading
import itertools

class C:
    def __iter__(self):
        return self
    def __next__(self):
        return 1

def test(i):
    print(list(i))

i1, i2 = itertools.tee(C())
threading.Thread(target=test, args=(i1,)).start()
print(list(i2))

Thanks

@hongweipeng
Copy link
Contributor Author

@tirkarthi thanks for testing. I only lock when it runs PyIter_Next, but it seems other thead has already set tdo->values[i] . So I expand the critical section.

@tirkarthi
Copy link
Member

tirkarthi commented Sep 6, 2018

Thanks. I checked out your latest commit. It seems there are some more cases to handle. It now hangs instead of segfault that I need to do Ctrl + C. I have limited knowledge about C and locking so I don't know what's happening. Attaching the traceback so that it can be helpful.

➜  cpython git:(fix-issue-34410) ./python.exe -VV
Python 3.8.0a0 (heads/fix-issue-34410:10590a53a2, Sep  6 2018, 12:00:35)
[Clang 7.0.2 (clang-700.1.81)]
➜  cpython git:(fix-issue-34410) ./python.exe ../backups/bpo34410.py
^CTraceback (most recent call last):
  File "../backups/bpo34410.py", line 15, in <module>
    print(list(i2))
  File "../backups/bpo34410.py", line 7, in __next__
    def __next__(self):
KeyboardInterrupt
^CException ignored in: <module 'threading' from '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py'>
Traceback (most recent call last):
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py", line 1273, in _shutdown
    t.join()
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py", line 1032, in join
    self._wait_for_tstate_lock()
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py", line 1048, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt

Also I think this will also help in a NEWS entry since it's a segfault also affecting other versions but I will let it for the reviewers to decide though.

Thanks

@hongweipeng
Copy link
Contributor Author

You need Ctrl + C twice because the child thread runs in daemon=False ,in this state, mian thread will wait for the child thread exits before it exits. The effect you want should be threading.Thread(target=test, args=(i1,), daemon=True).start().

In addition, what should I do in NEWS. This is my first rp in this repo.

@tirkarthi
Copy link
Member

Thanks. You can find more about the NEWS entry here : https://devguide.python.org/committing/#what-s-new-and-news-entries . It's a short piece explaining the changes made. You can use the blurb tool generate one. I don't know if it's a requirement and I will resort to the reviewers but adding one will always be helpful.

There seems to be some test failure related test_generators in CI that hangs but I cannot reproduce them locally.

➜  cpython git:(fix-issue-34410) ./python.exe -m unittest -v test.test_generators
test_except_gen_except (test.test_generators.ExceptionTest) ... ok
test_except_next (test.test_generators.ExceptionTest) ... ok
test_except_throw (test.test_generators.ExceptionTest) ... ok
test_except_throw_exception_context (test.test_generators.ExceptionTest) ... ok
test_return_stopiteration (test.test_generators.ExceptionTest) ... ok
test_return_tuple (test.test_generators.ExceptionTest) ... ok
test_stopiteration_error (test.test_generators.ExceptionTest) ... ok
test_tutorial_stopiteration (test.test_generators.ExceptionTest) ... ok
test_frame_resurrect (test.test_generators.FinalizationTest) ... ok
test_lambda_generator (test.test_generators.FinalizationTest) ... ok
test_refcycle (test.test_generators.FinalizationTest) ... ok
test_copy (test.test_generators.GeneratorTest) ... ok
test_name (test.test_generators.GeneratorTest) ... ok
test_pickle (test.test_generators.GeneratorTest) ... ok
test_raise_and_yield_from (test.test_generators.SignalAndYieldFromTest) ... ok
test_generator_gi_yieldfrom (test.test_generators.YieldFromTests) ... ok

----------------------------------------------------------------------
Ran 16 tests in 0.109s

OK

Thanks

@hongweipeng
Copy link
Contributor Author

Thanks for reminding. test_generators work in gcc and clang in my pc.I cannot reproduce them either.I need to take some time to study it.

@hongweipeng
Copy link
Contributor Author

After several revisions, I am finally satisfied. Locking is only done when PyIter_Next. Anyone willing to review it?

@zhangyangyu
Copy link
Member

I am not sure lock is a good resolution here(of course it's a resolution) because itertools is a highly optimized module. That's why I don't propose a patch using lock at first. Since @serhiy-storchaka say he takes on this, I'd like to wait and see if he has any other approaches to solve this problem.

@hongweipeng
Copy link
Contributor Author

Thanks for reply.I take some optimizations to improve single-threaded performance. In a single-threaded case, the lock is only locked once during the entire iteration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants