Skip to content

Fix segfault when reloading interpreter with external modules #1092

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 11, 2018

Conversation

jagerman
Copy link
Member

When embedding the interpreter and loading external modules in that
embedded interpreter, the external module correctly shares its
internals_ptr with the one in the embedded interpreter. When the
interpreter is shut down, however, only the internals_ptr local to
the embedded code is actually reset to nullptr: the external module
remains set.

The result is that loading an external pybind11 module, letting the
interpreter go through a finalize/initialize, then attempting to use
something in the external module fails because this external module is
still trying to use the old (destroyed) internals. This causes
undefined behaviour (typically a segfault).

This commit fixes it by adding a level of indirection in the internals
path, converting the local internals variable to internals ** instead
of internals *. With this change, we can detect a stale internals
pointer and reload the internals pointer (either from a capsule or by
creating a new internals instance).

(No issue number: this was reported on gitter by @henryiii and @aoloe; thanks to both for the issue and reproducible test case).

When embedding the interpreter and loading external modules in that
embedded interpreter, the external module correctly shares its
internals_ptr with the one in the embedded interpreter.  When the
interpreter is shut down, however, only the `internals_ptr` local to
the embedded code is actually reset to nullptr: the external module
remains set.

The result is that loading an external pybind11 module, letting the
interpreter go through a finalize/initialize, then attempting to use
something in the external module fails because this external module is
still trying to use the old (destroyed) internals.  This causes
undefined behaviour (typically a segfault).

This commit fixes it by adding a level of indirection in the internals
path, converting the local internals variable to `internals **` instead
of `internals *`.  With this change, we can detect a stale internals
pointer and reload the internals pointer (either from a capsule or by
creating a new internals instance).

(No issue number: this was reported on gitter by @henryiii and @aoloe).
`cmake -E env` was added in 3.2.  This drops the built external module
in the source directory instead.  (This is a bit ugly, but it's the
same ugliness we already do for the `pytest` tests).
@aoloe
Copy link

aoloe commented Sep 17, 2017

bot the test code and the original code that raised the issue work correctly with this pull request.

thanks!

@jagerman
Copy link
Member Author

Marking this for v2.2.2; this doesn't change what gets stored in the builtins, and so should be perfectly compatible with v2.2.[01] modules.

@jagerman jagerman added this to the v2.2.2 milestone Sep 19, 2017
@jagerman
Copy link
Member Author

Cc @dean0x7d for comments. What is here works, but maybe you have a nicer idea than the double pointer.

@wjakob
Copy link
Member

wjakob commented Nov 16, 2017

I feel like I'm missing important to understand this PR. When the interpreter shuts down, how does the internals pointer get marked as stale? I just see two leaking memory allocations (one for internals, one for the pointer to it) but no changes to shutdown/GC-related code.

@jagerman
Copy link
Member Author

The critical change for that is in finalize_interpreter: the *internals_ptr_ptr that gets deleted and set to nullptr is, with this PR, shared across all modules and the embedded interpreter with the same internals version. Previously each module/interpret had its own pointer, so the delete *internals_ptr_ptr destroyed it, but the *internals_ptr_ptr = nullptr; was only setting the local version to nullptr: outside modules still had their internals pointer set.

Basically, there is now only one pointer to the internals data, where previously there were n pointers to it: each module/interpreter only stores a local pointer to that pointer so that when something deletes it (and sets the one-true-pointer to nullptr) they all see it.

@wjakob
Copy link
Member

wjakob commented Nov 16, 2017

Aha, understood now. Thank you for the clarification.

@jagerman jagerman merged commit 326deef into pybind:master Jan 11, 2018
jagerman added a commit to jagerman/pybind11 that referenced this pull request Jan 12, 2018
…#1092)

* Fix segfault when reloading interpreter with external modules

When embedding the interpreter and loading external modules in that
embedded interpreter, the external module correctly shares its
internals_ptr with the one in the embedded interpreter.  When the
interpreter is shut down, however, only the `internals_ptr` local to
the embedded code is actually reset to nullptr: the external module
remains set.

The result is that loading an external pybind11 module, letting the
interpreter go through a finalize/initialize, then attempting to use
something in the external module fails because this external module is
still trying to use the old (destroyed) internals.  This causes
undefined behaviour (typically a segfault).

This commit fixes it by adding a level of indirection in the internals
path, converting the local internals variable to `internals **` instead
of `internals *`.  With this change, we can detect a stale internals
pointer and reload the internals pointer (either from a capsule or by
creating a new internals instance).

(No issue number: this was reported on gitter by @henryiii and @aoloe).
wjakob pushed a commit that referenced this pull request Feb 7, 2018
* Fix segfault when reloading interpreter with external modules

When embedding the interpreter and loading external modules in that
embedded interpreter, the external module correctly shares its
internals_ptr with the one in the embedded interpreter.  When the
interpreter is shut down, however, only the `internals_ptr` local to
the embedded code is actually reset to nullptr: the external module
remains set.

The result is that loading an external pybind11 module, letting the
interpreter go through a finalize/initialize, then attempting to use
something in the external module fails because this external module is
still trying to use the old (destroyed) internals.  This causes
undefined behaviour (typically a segfault).

This commit fixes it by adding a level of indirection in the internals
path, converting the local internals variable to `internals **` instead
of `internals *`.  With this change, we can detect a stale internals
pointer and reload the internals pointer (either from a capsule or by
creating a new internals instance).

(No issue number: this was reported on gitter by @henryiii and @aoloe).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants