Skip to content

common: Prevent module re-import duplication in Python3 #1558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

EricCousineau-TRI
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI commented Oct 7, 2018

Resolves #1559

With my system's version of Python3 (Ubuntu 16.04, Python 3.5.2), if I import a given module while that module is being constructed, it will duplicate the module object.

This captures this behavior in a test, and provides a potential workaround.

@wjakob
Copy link
Member

wjakob commented Nov 9, 2018

I can't seem to match up your description to the contents of the PR. In particular, PYBIND11_CHECK_REIMPORT only has a non-empty definition when compiling for Python 2.x. In any case, it would be important to fix the issue both 2.x and 3.x versions of Python (is one not affected for some reason?).

@EricCousineau-TRI
Copy link
Collaborator Author

EricCousineau-TRI commented Nov 11, 2018

Ah, oops! Sorry, the comment at the #else was misleading; I've revised it to make it a bit clearer.

And it seems that this is specific to Python3's method of constructing C extension modules.
In Python2, I believe PYBIND11_PLUGIN_IMPL is handed a module that is already registered in sys.modules, thus importing the module in itself will not cause a duplication error.
However, In Python3, since it constructs the module before its registered, Python3 offers no natural mechanism to prevent self-importing.

EDIT: Nevermind, not sure exactly why, but something along these lines seem to be happening, as the first commit in this PR is no problem for Python2.
EDIT 2: Seems like PEP 489 indicates that this is suffering from single-phase initialization pitfalls:
https://www.python.org/dev/peps/pep-0489/#motivation
and could benefit from multi-phase initialization. If that's the case, then perhaps this patch is a small workaround, but maybe the better thing is to support multi-phase at some point in the future?

All that being said, it seems that my patch is causing the embedded interpreter tests to fail, specifically when checking cleanup (using static py::module was a, uh, bad idea).
I'll try to think of some workarounds that are hopefully not too... crappy.

Fixed, per below comment.

@EricCousineau-TRI
Copy link
Collaborator Author

EricCousineau-TRI commented Nov 11, 2018

Updated to use get_or_create_shared_data to handle interpret restart, leveraging a static variable to kind-of get a unique key for a plugin.

EDIT: Hm... Just one failure for PYTHON=3.5 CPP=14 GCC=6 DEBUG=1...
EDIT 2: D'oh, just forgot an inc_ref().

@EricCousineau-TRI
Copy link
Collaborator Author

Any chance this might be able to move forward? Got bit by this in our lib again.
(Sorry to be a pest!)

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @henryiii (slack channel) that this would be good to merge.

Comment on lines +177 to +181
auto& modules = pybind11::get_or_create_shared_data< \
std::unordered_map<int*, pybind11::handle>>("__pybind11_modules"); \
pybind11::handle& original = modules[&module_key]; \
if (original) return original.inc_ref().ptr(); \
original = m; \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uncertain myself, but asking anyway: could it make sense to move this code to an inline function in the pybind11::detail namespace?
It would also be nice to add a terse comment (here or in the function) with a hint why this is needed, so that people don't have to dig up this PR to find out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can delegate most of this code to a function, and could add a comment, pending other discussions - thanks!

@henryiii henryiii added this to the v2.6.0 milestone Oct 3, 2020
@YannickJadoul
Copy link
Collaborator

This does feel like a rather heavy fix (always keeping track of all modules in a shared variable) for a small window of time where things can go wrong (during import of the module). Just curious if there's no other way; this is not a pybind11-specific problem, so how do other frameworks (or even the raw C API) approach this?

Secondary part of this concern: storing py::handles means that at some point they could start referring to garbage-collected data. Are we safe, here?

@wjakob
Copy link
Member

wjakob commented Oct 5, 2020

I agree with Yannick, this strikes me as to heavy of a change given the benefit that it provides. It's not just the runtime cost of creating yet another hash table (negligible), but the fact that we have to generate and include code for another hash table variant with template arguments int*, pybind11::handle.

Can't we do this with a simple global flag to prevent the kind of re-entrant behavior you mention?

@EricCousineau-TRI
Copy link
Collaborator Author

EricCousineau-TRI commented Oct 6, 2020

Agreed with y'all's points on heavy-handedness and non-reentrancy of the current solution.

Rather than trying to address those in this current incantation, perhaps the right thing is ensuring that, when available, Python 3 modules use multi-phase initialization by default, as Yannick pointed out in #2548. I would vote for either re-catering this PR towards that issue, or closing this out and opening a new PR addressing that more specifically?

(I think this would slip on the release schedule, but we're fine in Drake for doing this manually, and not sure if other people have encountered this as a pain point?)

@EricCousineau-TRI
Copy link
Collaborator Author

Can't we do this with a simple global flag to prevent the kind of re-entrant behavior you mention?

Not sure if I understand how this current solution would be solvable via a global flag. @wjakob If you can, can you speak to this?

@YannickJadoul
Copy link
Collaborator

YannickJadoul commented Oct 6, 2020

perhaps the right thing is ensuring that, when available, Python 3 modules use multi-phase initialization by default, as Yannick pointed out in #2548. I would vote for either re-catering this PR towards that issue, or closing this out and opening a new PR addressing that more specifically?

👍 Sounds good to me. But maybe better to start a new PR whenever someone has a bit of time, rather than repurpose this? That PR should probably also follow/coordinate with #2552 & #2251?

(I think this would slip on the release schedule, but we're fine in Drake for doing this manually, and not sure if other people have encountered this as a pain point?)

If you would need an easy workaround included in pybind11, I'd be fine with that, but so far, this is the only time I heard about problems in this case. So if you don't mind keeping your own workaround, it might be better to solve this with the multi-phase initialization in 2.7.0 or so?

@EricCousineau-TRI
Copy link
Collaborator Author

So if you don't mind keeping your own workaround, it might be better to solve this with the multi-phase initialization in 2.7.0 or so?

Yup, sounds great! I'm totally fine with us keeping our workaround!

@EricCousineau-TRI
Copy link
Collaborator Author

That PR should probably also follow/coordinate with #2552 & #2251?

Er, did you mean #2551?

@YannickJadoul
Copy link
Collaborator

Er, did you mean #2551?

Yes, sorry! (#2251's title is appropriately "Typo" :-D )

@henryiii henryiii removed this from the v2.6.0 milestone Oct 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Python3: Modules can be duplicated if they are imported while being constructed
5 participants