Skip to content

bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag #25721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 30, 2021
Merged

bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag #25721

merged 1 commit into from
Apr 30, 2021

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Apr 29, 2021

Add a new Py_TPFLAGS_DISALLOW_INSTANTIATION type flag to disallow
creating type instances: set tp_new to NULL and don't create the
"new" key in the type dictionary.

The flag is set automatically on static types if tp_base is NULL or
&PyBaseObject_Type and tp_new is NULL.

Use the flag on the following types:

  • _curses.ncurses_version type
  • _curses_panel.panel
  • _tkinter.Tcl_Obj
  • _tkinter.tkapp
  • _tkinter.tktimertoken
  • _xxsubinterpretersmodule.ChannelID
  • sys.flags type
  • sys.getwindowsversion() type
  • sys.version_info type

Update MyStr example in the C API documentation to use
Py_TPFLAGS_DISALLOW_INSTANTIATION.

Add _PyStructSequence_InitType() function to create a structseq type
with the Py_TPFLAGS_DISALLOW_INSTANTIATION flag set.

https://bugs.python.org/issue43916

@vstinner
Copy link
Member Author

This flag can be used outside CPython code base. For example, the pyrsistent project sets tp_new to NULL:

static PyObject* pyrsistent_pvectorc_moduleinit(void)
{
    ...
  // Only allow creation/initialization through factory method pvec
  PVectorType.tp_init = NULL;
  PVectorType.tp_new = NULL;

  if (PyType_Ready(&PVectorType) < 0) {
    return NULL;
  }
  ...
  Py_INCREF(&PVectorType);
  PyModule_AddObject(m, "PVector", (PyObject *)&PVectorType);
  return m;
}

@vstinner
Copy link
Member Author

@tiran
Copy link
Member

tiran commented Apr 29, 2021

Nice! Do you want to update _hashopenssl.c or should I update it later?

@vstinner
Copy link
Member Author

Nice! Do you want to update _hashopenssl.c or should I update it later?

I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.

@erlend-aasland
Copy link
Contributor

Nice! Do you want to update _hashopenssl.c or should I update it later?

I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.

I can adapt #25653 to use this flag.

@tiran
Copy link
Member

tiran commented Apr 29, 2021

@morzen CPython uses an external issue tracker. You can find more information in the top level README, https://github.com/python/cpython#issue-tracker-and-mailing-list

And please don't post pictures. They are neither searchable nor accessible.

@tiran
Copy link
Member

tiran commented Apr 29, 2021

Nice! Do you want to update _hashopenssl.c or should I update it later?

I prefer to keep the PR simple to make the review easier, and write a second PR to use the flag on heap types that should not be instanciated directly: rewrite PR #25653 with this flag (or let @erlend-aasland do it) if my PR is accepted.

Sounds good to me! I have created PR draft #25722.

@vstinner
Copy link
Member Author

About the flag name, another name could be Py_TPFLAGS_DISALLOW_NEW. Or a different name.

@vstinner
Copy link
Member Author

About the complexity of tp_new inheritance, here is an extract of tp_new_wrapper():

    /* Check that the use doesn't do something silly and unsafe like
       object.__new__(dict).  To do this, we check that the
       most derived base that's not a heap type is this type. */
    staticbase = subtype;
    while (staticbase && (staticbase->tp_new == slot_tp_new))
        staticbase = staticbase->tp_base;
    /* If staticbase is NULL now, it is a really weird type.
       In the spirit of backwards compatibility (?), just shut up. */
    if (staticbase && staticbase->tp_new != type->tp_new) {
        PyErr_Format(PyExc_TypeError,
                     "%s.__new__(%s) is not safe, use %s.__new__()",
                     type->tp_name,
                     subtype->tp_name,
                     staticbase->tp_name);
        return NULL;
    }

@tiran
Copy link
Member

tiran commented Apr 29, 2021

Naming is hard. Both Py_TPFLAGS_DISABLE_NEW and Py_TPFLAGS_DISALLOW_NEW reflect how Python prevents creation instances, not what the flag actually accomplishes. How about something like Py_TPFLAGS_DISALLOW_INSTANTIATION?

@erlend-aasland
Copy link
Contributor

I've adapted #25653 to use this flag. Regarding naming, I like @tiran's suggestion. I'll do a force push as soon as the name has landed :)

@vstinner vstinner changed the title bpo-43916: Add Py_TPFLAGS_DISABLE_NEW type flag bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag Apr 29, 2021
@vstinner
Copy link
Member Author

I renamed the flag to Py_TPFLAGS_DISALLOW_INSTANTIATION. I also updated _xxsubinterpretersmodule.ChannelID to use the new flag.

"The creation of an instance is called instantiation." says Wikipedia: https://en.wikipedia.org/wiki/Instance_(computer_science)

I compared the usage of "disallow" versus "deny" in Include/ header files and in Doc/ documentation: "disallow" is more popular.

@python python deleted a comment from morzen Apr 29, 2021
@vstinner
Copy link
Member Author

@morzen CPython uses an external issue tracker. You can find more information in the top level README, https://github.com/python/cpython#issue-tracker-and-mailing-list

I removed the comment.

@vstinner
Copy link
Member Author

@tiran @erlend-aasland: my PR uses the flag in 9 types (and 1 documentation example). @erlend-aasland's PR disallow instanciation in many types. Do you think that these usages are common enough to justify to add a type flag?

@vstinner
Copy link
Member Author

This design doesn't prevent calling directly tp_new in C. I clarified the flag documentation:

/* Disallow creating instances of the type in Python. It remains possible to
 * call directly the tp_new function in C. */

Copy link
Member

@pablogsal pablogsal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a much more elegant solution, the only thing I am slightly wary of is to expose this publicly.

@serhiy-storchaka
Copy link
Member

There is already a standard way to prevent creating a new instance: setting tp_new to NULL. No new type flag is needed. BTW, tp_new == NULL is one of checks for pickleability of the type.

@vstinner
Copy link
Member Author

@serhiy-storchaka:

There is already a standard way to prevent creating a new instance: setting tp_new to NULL. No new type flag is needed. BTW, tp_new == NULL is one of checks for pickleability of the type.

If tp_new is set NULL after the type is created, creating an instance in Python fails with a weird error message. Example in Python 3.9:

$ python3.9
Python 3.9.4 (default, Apr  6 2021, 00:00:00) 
>>> import _curses_panel

# Good
>>> _curses_panel.panel()
TypeError: cannot create '_curses_panel.panel' instances

# Weird error
>>> _curses_panel.panel.__new__(_curses_panel.panel)
TypeError: object.__new__(_curses_panel.panel) is not safe, use _curses_panel.panel.__new__()

>>> _curses_panel.panel.__new__ is object.__new__
True

The problem is that the the type __dict__['__new__'] exists (wrapper to object.new). I would prefer to request in the type spec that instantiation is disallowed. Copy of my previous comment, extract of tp_new_wrapper():

    /* Check that the use doesn't do something silly and unsafe like
       object.__new__(dict).  To do this, we check that the
       most derived base that's not a heap type is this type. */
    staticbase = subtype;
    while (staticbase && (staticbase->tp_new == slot_tp_new))
        staticbase = staticbase->tp_base;
    /* If staticbase is NULL now, it is a really weird type.
       In the spirit of backwards compatibility (?), just shut up. */
    if (staticbase && staticbase->tp_new != type->tp_new) {
        PyErr_Format(PyExc_TypeError,
                     "%s.__new__(%s) is not safe, use %s.__new__()",
                     type->tp_name,
                     subtype->tp_name,
                     staticbase->tp_name);
        return NULL;
    }

@vstinner
Copy link
Member Author

Extract of inherit_special():

        /* The condition below could use some explanation.
           It appears that tp_new is not inherited for static types
           whose base class is 'object'; this seems to be a precaution
           so that old extension types don't suddenly become
           callable (object.__new__ wouldn't insure the invariants
           that the extension type's own factory function ensures).
           Heap types, of course, are under our control, so they do
           inherit tp_new; static extension types that specify some
           other built-in type as the default also
           inherit object.__new__. */
        if (base != &PyBaseObject_Type ||
            (type->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
            if (type->tp_new == NULL)
                type->tp_new = base->tp_new;
        }

PyType_FromSpec() creates a type and then calls PyType_Ready() on it. PyType_Ready() calls inherit_special() which inherits tp_new (since it's a heap type), and then creates tp_dict["__new__"] since tp_new is not NULL anymore.

@vstinner
Copy link
Member Author

@serhiy-storchaka: I updated the PR. The flag now explicitly sets tp_new to NULL. So code checking if tp_new is NULL (like pickle) is not affected.

Advantages of using the new flag compared to setting tp_new to NULL:

  • PyType_FromSpec() doesn't support tp_new=NULL and it would be tricky to support that.
  • It ensures that __new__ is not created in the type dictionary. In Python, my_type.__new__ attribute doesn't exist.
  • The same flag can be used on static types, heap types and with _PyStructSequence_InitType().
  • The flag is known when the type is created and so PyType_Ready() can handle corner cases. For example, it doesn't create __new__ key in the type dictionary.

@vstinner vstinner marked this pull request as draft April 29, 2021 21:42
@vstinner
Copy link
Member Author

I mark this PR as a draft. I'm trying to modify PyType_FromSpec() to support {Py_tp_new, NULL}, slot, so we can compare the three options (Erlend's PR, this PR, and the future PyType_FromSpec PR).

@vstinner
Copy link
Member Author

Ok, there are now 3 API options to solve https://bugs.python.org/issue43916:

There are 3 main ways to create types

  • Static types

    • A uses tp_new = _PyType_DisabledNew
    • B sets Py_TPFLAGS_DISALLOW_INSTANTIATION flag
    • C uses regular tp_new = NULL
  • Heap types: PyType_Spec struct + PyFrom_FromSpec()

    • A uses {Py_tp_new, _PyType_DisabledNew} slot
    • B sets Py_TPFLAGS_DISALLOW_INSTANTIATION flag
    • C uses a new {Py_tp_new, NULL} slot
  • structseq type: PyStructSequence_InitType2() or a new _PyStructSequence_InitType() function

    • A doesn't address this case (keep the current code which sets tp_new and remove the __new__ key afterwards)
    • B sets Py_TPFLAGS_DISALLOW_INSTANTIATION flag
    • C uses a new internal function to pass disallow_instantiation=1

A drawbacks:

  • _PyType_DisabledNew() is only exposed in the internal C API and so needs to build more C extensions with the internal C API.
  • Don't slove the structseq types case.

B drawbacks:

  • Add a new public type flag.

C drawbacks:

  • {Py_tp_new, NULL} is less explicit than _PyType_DisabledNew() and Py_TPFLAGS_DISALLOW_INSTANTIATION.
  • New internal API are more specific to this problem than B internal functions.

@vstinner vstinner marked this pull request as ready for review April 29, 2021 22:06
@pablogsal
Copy link
Member

My preference is B, even if we can manually set tp_new to NULL it has a bit of a "hacky" feeling. I find the flag the most simple and elegant of all, but is true that still feels just a bit weird that a flag will disallow instantiation.

My preference order is B-> C -> A

@erlend-aasland
Copy link
Contributor

My preference order is B-> C -> A

Mine too, FWIW.

@vstinner
Copy link
Member Author

@pablogsal:

My preference is B, even if we can manually set tp_new to NULL it has a bit of a "hacky" feeling. I find the flag the most simple and elegant of all, but is true that still feels just a bit weird that a flag will disallow instantiation.

In Python, there is a similar "hack" to mark a type as non hashable:

$ python3
Python 3.9.4 (default, Apr  6 2021, 00:00:00) 
>>> class A:
...  pass

>>> hash(A())
8731944244974

>>> class B:
...  __hash__ = None

>>> hash(B())
TypeError: unhashable type: 'B'

@vstinner
Copy link
Member Author

In Python, setting __new__ to None prevents instantiation, but with a surprising error message ;-)

$ python3 
Python 3.9.4 (default, Apr  6 2021, 00:00:00) 
>>> class A:
...  __new__ = None

>>> A()
TypeError: 'NoneType' object is not callable

@pablogsal
Copy link
Member

In Python, setting __new__ to None prevents instantiation, but with a surprising error message ;-)

I wouldn't call that surprising TBH :)

@vstinner
Copy link
Member Author

Another slice of type inheritance (tp_new) complexity. Extract of update_one_slot() which is called by type_new() after PyType_Ready():

        else if (Py_IS_TYPE(descr, &PyCFunction_Type) &&
                 PyCFunction_GET_FUNCTION(descr) ==
                 (PyCFunction)(void(*)(void))tp_new_wrapper &&
                 ptr == (void**)&type->tp_new)
        {
            /* The __new__ wrapper is not a wrapper descriptor,
               so must be special-cased differently.
               If we don't do this, creating an instance will
               always use slot_tp_new which will look up
               __new__ in the MRO which will call tp_new_wrapper
               which will look through the base classes looking
               for a static base and call its tp_new (usually
               PyType_GenericNew), after performing various
               sanity checks and constructing a new argument
               list.  Cut all that nonsense short -- this speeds
               up instance creation tremendously. */
            specific = (void *)type->tp_new;
            /* XXX I'm not 100% sure that there isn't a hole
               in this reasoning that requires additional
               sanity checks.  I'll buy the first person to
               point out a bug in this reasoning a beer. */
        }

@erlend-aasland
Copy link
Contributor

I've converted #25653 to apply Py_TPFLAGS_DISALLOW_INSTANTIATION to the rest of the affected types: erlend-aasland@82d7a4b

Let me know if you want me to create a PR for it, @pablogsal.

Add a new Py_TPFLAGS_DISALLOW_INSTANTIATION type flag to disallow
creating type instances: set tp_new to NULL and don't create the
"__new__" key in the type dictionary.

The flag is set automatically on static types if tp_base is NULL or
&PyBaseObject_Type and tp_new is NULL.

Use the flag on the following types:

* _curses.ncurses_version type
* _curses_panel.panel
* _tkinter.Tcl_Obj
* _tkinter.tkapp
* _tkinter.tktimertoken
* _xxsubinterpretersmodule.ChannelID
* sys.flags type
* sys.getwindowsversion() type
* sys.version_info type

Update MyStr example in the C API documentation to use
Py_TPFLAGS_DISALLOW_INSTANTIATION.

Add _PyStructSequence_InitType() function to create a structseq type
with the Py_TPFLAGS_DISALLOW_INSTANTIATION flag set.

type_new() calls _PyType_CheckConsistency() at exit.
@vstinner
Copy link
Member Author

PR rebased to fix a conflict.

@vstinner vstinner merged commit 3bb0994 into python:master Apr 30, 2021
@vstinner vstinner deleted the disable_new_flag branch April 30, 2021 10:46
@vstinner
Copy link
Member Author

I took @pablogsal and @serhiy-storchaka concerns in account. Pablo would prefer to avoid adding a public type, Serhiy would prefer to modify PyType_FromSpec() to accept tp_new=NULL. I implemented Serhiy's idea to compare the API with this Py_TPFLAGS_DISALLOW_INSTANTIATION PR. Most people prefer the explicit Py_TPFLAGS_DISALLOW_INSTANTIATION flag which is less surprising than tp_new=NULL. I also listed other advantages of the Py_TPFLAGS_DISALLOW_INSTANTIATION flag in my previous comments.

Usually, I would prefer to get a clear consensus, but we are out of time: Python 3.10 beta1 (feature freeze) is next Wednesday, and I will be in holiday for 1 week starting tonight. If there is still a disagreement, we can still revisit the API or the implementation before Python 3.10 final release ("3.10.0 final: Monday, 2021-10-04" says PEP 619).

I merged my PR to unblock https://bugs.python.org/issue43916 release blocker issue (and there are other open release blocker issues to be solved before Wednesday).

Anyway, thanks everyone for this constructive and interesting discussion ;-)

The final merged change also takes in account your remarks. For example, the flag now sets tp_new to NULL, to take Serhiy's comment in account.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants