Skip to content

Fast attribute access for module subclasses #103951

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ntessore opened this issue Apr 27, 2023 · 14 comments
Closed

Fast attribute access for module subclasses #103951

ntessore opened this issue Apr 27, 2023 · 14 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage type-feature A feature request or enhancement

Comments

@ntessore
Copy link
Contributor

ntessore commented Apr 27, 2023

Feature or enhancement

There is a specialisation that allows for fast attribute access on the module type, and only the module type, since it is guarded by PyModule_CheckExact(). This specialisation could be enabled for module subclasses.

Pitch

There are occasionally reasons to subclass ModuleType to create module types which are more "class-like", e.g. to equip them with magic methods. These subclasses suffer from a performance penalty on attribute access because the specialisation is guarded against subclasses. I do not see the reason for this; in fact, the specialisation can be applied to all module subclasses with no ill effects (that I can observe) using a tiny patch:

Patch
diff --git a/Python/bytecodes.c b/Python/bytecodes.c
index 9de0d92..ab948f5 100644
--- a/Python/bytecodes.c
+++ b/Python/bytecodes.c
@@ -1676,7 +1676,7 @@ dummy_func(
         }
 
         inst(LOAD_ATTR_MODULE, (unused/1, type_version/2, index/1, unused/5, owner -- res2 if (oparg & 1), res)) {
-            DEOPT_IF(!PyModule_CheckExact(owner), LOAD_ATTR);
+            DEOPT_IF(!PyModule_Check(owner), LOAD_ATTR);
             PyDictObject *dict = (PyDictObject *)((PyModuleObject *)owner)->md_dict;
             assert(dict != NULL);
             DEOPT_IF(dict->ma_keys->dk_version != type_version, LOAD_ATTR);
diff --git a/Python/generated_cases.c.h b/Python/generated_cases.c.h
index 864a4f7..96c764f 100644
--- a/Python/generated_cases.c.h
+++ b/Python/generated_cases.c.h
@@ -2338,7 +2338,7 @@
             uint32_t type_version = read_u32(&next_instr[1].cache);
             uint16_t index = read_u16(&next_instr[3].cache);
             #line 1679 "Python/bytecodes.c"
-            DEOPT_IF(!PyModule_CheckExact(owner), LOAD_ATTR);
+            DEOPT_IF(!PyModule_Check(owner), LOAD_ATTR);
             PyDictObject *dict = (PyDictObject *)((PyModuleObject *)owner)->md_dict;
             assert(dict != NULL);
             DEOPT_IF(dict->ma_keys->dk_version != type_version, LOAD_ATTR);
diff --git a/Python/specialize.c b/Python/specialize.c
index 33a3c45..527b454 100644
--- a/Python/specialize.c
+++ b/Python/specialize.c
@@ -752,7 +752,7 @@ _Py_Specialize_LoadAttr(PyObject *owner, _Py_CODEUNIT *instr, PyObject *name)
         SPECIALIZATION_FAIL(LOAD_ATTR, SPEC_FAIL_OTHER);
         goto fail;
     }
-    if (PyModule_CheckExact(owner)) {
+    if (PyModule_Check(owner)) {
         if (specialize_module_load_attr(owner, instr, name))
         {
             goto fail;
@@ -925,7 +925,7 @@ _Py_Specialize_StoreAttr(PyObject *owner, _Py_CODEUNIT *instr, PyObject *name)
         SPECIALIZATION_FAIL(STORE_ATTR, SPEC_FAIL_OTHER);
         goto fail;
     }
-    if (PyModule_CheckExact(owner)) {
+    if (PyModule_Check(owner)) {
         SPECIALIZATION_FAIL(STORE_ATTR, SPEC_FAIL_OVERRIDDEN);
         goto fail;
     }

Previous discussion

There are two ongoing discussions at the time of writing to make modules more class-like, here and here. While both of these proposals would benefit from this change, this change is generally useful. In fact, if implemented, it might be an argument for making the linked proposals external packages, and not language features.

Linked PRs

@ntessore ntessore added the type-feature A feature request or enhancement label Apr 27, 2023
@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Apr 27, 2023

Hi, thanks for taking a look at the specialised instruction for module attribute loading.

The general idea behind specialised instructions is that their guards must be cheap. PyModule_CheckExact is a single pointer comparison. PyModule_Check allows subyptes, so it will walk the entire MRO of the type just to check if it's a subtype of a module. This is generally significantly slower and the guard will thus become slower. This will especially negatively impact the guard failure case.

As such, I'm -1 on this change. The increase in specialisation coverage is not worth the significant slowdown in the fast path.

@ntessore
Copy link
Contributor Author

Hi, I had considered this. Perhaps one could check just direct subclasses, Py_IS_TYPE((op), &PyModule_Type) || Py_IS_TYPE((op)->ob_type->tp_base, &PyModule_Type). This would allow the extra degree of freedom to achieve class-like modules.

@ntessore
Copy link
Contributor Author

An alternative check could be more empirical and apply the specialisation for anything that uses the module tp_getattro, which should be a pretty exclusive club.

@skirpichev
Copy link
Member

The general idea behind specialised instructions is that their guards must be cheap. PyModule_CheckExact is a single pointer comparison. PyModule_Check allows subyptes, so it will walk the entire MRO of the type just to check if it's a subtype of a module.

On another hand, _Py_Specialize_LoadAttr() uses PyType_Check() exactly for a similar purpose...

@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Apr 28, 2023

The general idea behind specialised instructions is that their guards must be cheap. PyModule_CheckExact is a single pointer comparison. PyModule_Check allows subyptes, so it will walk the entire MRO of the type just to check if it's a subtype of a module.

On another hand, _Py_Specialize_LoadAttr() uses PyType_Check() exactly for a similar purpose...

I assume you mean LOAD_ATTR specialisations, because it's fine for _Py_Specialize_LoadAttr to be slow, since the specialisation function is only called once every so often, while specialised instructions are the ones that are actually executed frequently.

PyType_Check is a simple bitwise & comparison with Py_TPFLAGS_TYPE_SUBCLASS. It's a fast constant time operation and does not require looking up the MRO. All subtypes of type set a special flag indicating they are so. Whereas module subtypes don't do that, so require traversing the MRO during PyModule_Check. We don't just add flags for everything because there's a limited number of bits reserved for these flags. So they're precious and only used when we really need to.

@ntessore
Copy link
Contributor Author

Checking for the module type's tp_getattro isn't such a bad solution: anything with module's __getattr__ semantics is a candidate for the module specialisation. If you can stomach it.

Patch
diff --git a/Include/moduleobject.h b/Include/moduleobject.h
index 555564e..6cfdd30 100644
--- a/Include/moduleobject.h
+++ b/Include/moduleobject.h
@@ -101,6 +101,9 @@ struct PyModuleDef {
 // Internal C API
 #ifdef Py_BUILD_CORE
 extern int _PyModule_IsExtension(PyObject *obj);
+
+// Check if op uses the module attribute getter, used in specialization
+#define _PyModule_CheckGetAttr(op) (Py_TYPE(op)->tp_getattro == PyModule_Type.tp_getattro)
 #endif
 
 #ifdef __cplusplus
diff --git a/Python/bytecodes.c b/Python/bytecodes.c
index 9de0d92..5acae62 100644
--- a/Python/bytecodes.c
+++ b/Python/bytecodes.c
@@ -1676,7 +1676,7 @@ dummy_func(
         }
 
         inst(LOAD_ATTR_MODULE, (unused/1, type_version/2, index/1, unused/5, owner -- res2 if (oparg & 1), res)) {
-            DEOPT_IF(!PyModule_CheckExact(owner), LOAD_ATTR);
+            DEOPT_IF(!_PyModule_CheckGetAttr(owner), LOAD_ATTR);
             PyDictObject *dict = (PyDictObject *)((PyModuleObject *)owner)->md_dict;
             assert(dict != NULL);
             DEOPT_IF(dict->ma_keys->dk_version != type_version, LOAD_ATTR);
diff --git a/Python/generated_cases.c.h b/Python/generated_cases.c.h
index 864a4f7..67030e8 100644
--- a/Python/generated_cases.c.h
+++ b/Python/generated_cases.c.h
@@ -2338,7 +2338,7 @@
             uint32_t type_version = read_u32(&next_instr[1].cache);
             uint16_t index = read_u16(&next_instr[3].cache);
             #line 1679 "Python/bytecodes.c"
-            DEOPT_IF(!PyModule_CheckExact(owner), LOAD_ATTR);
+            DEOPT_IF(!_PyModule_CheckGetAttr(owner), LOAD_ATTR);
             PyDictObject *dict = (PyDictObject *)((PyModuleObject *)owner)->md_dict;
             assert(dict != NULL);
             DEOPT_IF(dict->ma_keys->dk_version != type_version, LOAD_ATTR);
diff --git a/Python/specialize.c b/Python/specialize.c
index 33a3c45..8c5d2d8 100644
--- a/Python/specialize.c
+++ b/Python/specialize.c
@@ -752,7 +752,7 @@ _Py_Specialize_LoadAttr(PyObject *owner, _Py_CODEUNIT *instr, PyObject *name)
         SPECIALIZATION_FAIL(LOAD_ATTR, SPEC_FAIL_OTHER);
         goto fail;
     }
-    if (PyModule_CheckExact(owner)) {
+    if (_PyModule_CheckGetAttr(owner)) {
         if (specialize_module_load_attr(owner, instr, name))
         {
             goto fail;
@@ -925,7 +925,7 @@ _Py_Specialize_StoreAttr(PyObject *owner, _Py_CODEUNIT *instr, PyObject *name)
         SPECIALIZATION_FAIL(STORE_ATTR, SPEC_FAIL_OTHER);
         goto fail;
     }
-    if (PyModule_CheckExact(owner)) {
+    if (_PyModule_CheckGetAttr(owner)) {
         SPECIALIZATION_FAIL(STORE_ATTR, SPEC_FAIL_OVERRIDDEN);
         goto fail;
     }

@skirpichev
Copy link
Member

@Fidget-Spinner, what do think on alternatives, presented by @ntessore?

@joaoe
Copy link

joaoe commented Mar 7, 2024

PEP 726 was rejected so this patch became much more relevant.
Thank you for your work.

Context:
https://discuss.python.org/t/pep-726-module-setattr-and-delattr/32640
https://peps.python.org/pep-0726/
Without the module __setattr__, when the only way to support that feature is to wrap the module with another object, hence this patch being relevant not to worsen performance.

@skirpichev skirpichev added the performance Performance or resource usage label Oct 30, 2024
@skirpichev skirpichev self-assigned this Oct 30, 2024
skirpichev added a commit to skirpichev/cpython that referenced this issue Nov 1, 2024
…ule subclasses

This patch relax specialization requirements from
``PyModule_CheckExact(op)`` to
``Py_TYPE(op)->tp_getattro != PyModule_Type.tp_getattro``.

Benchmarks:

```py
import pyperf
import b
import c
runner = pyperf.Runner()
runner.timeit('b.x', 'b.x', globals=globals())
runner.timeit('c.x', 'c.x', globals=globals())
```
```py
x = 1
```
```py
import sys, types
x = 1
class _Foo(types.ModuleType): pass
sys.modules[__name__].__class__ = _Foo
```

On the main:
```
$ python a.py -q
b.x: Mean +- std dev: 50.2 ns +- 2.7 ns
c.x: Mean +- std dev: 132 ns +- 7 ns
```

With the patch:
```
$ python a.py -q
b.x: Mean +- std dev: 52.9 ns +- 3.6 ns
c.x: Mean +- std dev: 52.6 ns +- 2.7 ns
```

Co-authored-by: Nicolas Tessore <[email protected]>
@skirpichev
Copy link
Member

Ok, I took liberty to propose a pr: #126264

@ntessore, I hope you are ok with this.

@skirpichev skirpichev removed their assignment Nov 1, 2024
Fidget-Spinner pushed a commit that referenced this issue Nov 15, 2024
@Fidget-Spinner
Copy link
Member

Thanks Sergey.

@skirpichev
Copy link
Member

Thanks to Nicolas;)

@ntessore
Copy link
Contributor Author

Thanks @skirpichev for doing the actual work.

I have also been wondering what a good idiom would be for module classes specified in the module itself. I thought __class__ wasn't a bad choice but I haven't thought about it deeply.

# ModuleClass does sys.module patching
class __class__(ModuleClass):
    ...
# moduleclass does sys.module patching
@moduleclass
class __class__:
    ...

@skirpichev
Copy link
Member

skirpichev commented Nov 15, 2024

# ModuleClass does sys.module patching
class class(ModuleClass):

Are you sure? Doesn't seems to be working c.f. suggested in the docs way (see VerboseModule example): https://docs.python.org/3/reference/datamodel.html#customizing-module-attribute-access

@ntessore
Copy link
Contributor Author

Sorry, I should have been more clear. It would be something to the effect of

class ModuleClass(ModuleType):
    def __init_subclass__(cls):
        sys.modules[cls.__module__].__class__ = cls

picnixz pushed a commit to picnixz/cpython that referenced this issue Dec 8, 2024
ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

5 participants