Disallow creation of incomplete/inconsistent objects #56

Open
@vstinner

The C API has many ways to create incomplete and/or inconsistent Python objects. We should disallow that.

One big example is the PyTuple_New() API, which creates an uninitialized Python tuple object and immediately tracks it in the garbage collector (GC). This issue was discussed from 2012 to 2021. In the end, because this API is too widely used, it was decided not to fix it: python/cpython#59313 was closed as "not a bug" (!).

Python objects must only be tracked by the GC once they are fully initialized, or at least once calling their "traverse" function can no longer crash. I added the _PyType_AllocNoTrack() function and used it in a few types to fix this issue in those types: the PyObject_GC_Track() call is delayed until the object is initialized.
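Whether a given object is currently tracked can be observed from Python with gc.is_tracked(). The following is only a minimal sketch, assuming CPython; the variable names are illustrative and not part of any API. It shows the tracking state of finished objects and cannot show the unsafe window during C-level initialization.

```python
import gc

# Atomic objects cannot hold references to other objects,
# so the collector has no reason to track them.
int_tracked = gc.is_tracked(0)
str_tracked = gc.is_tracked("a")

# Containers that may take part in reference cycles are tracked.
list_tracked = gc.is_tracked([])

# A tuple holding a tracked object (here a list) is tracked as well.
tuple_tracked = gc.is_tracked(([],))

print(int_tracked, str_tracked, list_tracked, tuple_tracked)
```

Once an object is tracked, the collector may call its "traverse" function at any time, which is why tracking has to wait until every field is valid.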

PyUnicode_New() is another example: it creates a Python str object with uninitialized characters.

These APIs are written for performance: create a "scratch" object, populate it, and then expose it in Python. Since data is written directly into the object, there is no memory copy or conversion. The problem is that Python introspection gives access to these incomplete/inconsistent objects. It's common for exploring gc.get_objects() to lead to crashes. In 2016, an optimization for function calls in property_descr_get() led to a crash: issue #70998. There were a bunch of similar bugs, which motivated me to write the FASTCALL calling convention to avoid the need to create a tuple object to call a C function: issue #71001.
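To illustrate why introspection is a problem, here is a small sketch, assuming CPython, of how any Python code can reach every GC-tracked tuple through gc.get_objects(). The helper name tracked_tuples() is hypothetical. A tuple that C code tracked before filling in its items would show up in the same list.

```python
import gc


def tracked_tuples():
    """Return every tuple the garbage collector currently tracks."""
    return [obj for obj in gc.get_objects() if type(obj) is tuple]


# A tuple that holds a tracked object (a list) is itself tracked.
marker = []
pair = (marker, marker)

# The tuple created above is reachable purely through GC introspection.
found = any(obj is pair for obj in tracked_tuples())
print(found)
```

Every tuple here is fully initialized, so the walk is safe. The crashes described above happen when such a walk reaches an object whose items have not been set yet.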

Over the years, I also added many CheckConsistency() functions, used in assertions, to make sure that objects are consistent. Examples: _PyObject_CheckConsistency(), _PyUnicode_CheckConsistency(), _PyType_CheckConsistency(), _PyDict_CheckConsistency().


For example, the unsafe PyTuple_New() can be replaced with PyTuple_Pack(): the created tuple is only tracked by the GC once it's fully initialized and consistent. PyTuple_Pack() is a good example to follow.
