Skip to content

C API: Add a replacement for PySys_GetObject #108512

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
serhiy-storchaka opened this issue Aug 26, 2023 · 12 comments
Open

C API: Add a replacement for PySys_GetObject #108512

serhiy-storchaka opened this issue Aug 26, 2023 · 12 comments
Labels
topic-C-API type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented Aug 26, 2023

Feature or enhancement

Proposal:

The PySys_GetObject() function has two flaws:

  • It clears any error raised inside the function, including important and critical errors.
  • It returns a borrowed reference.

We need to add a replacement free from these flaws. Any ideas about the API and the name?

Links to previous discussion of this feature:

Linked PRs

@vstinner
Copy link
Member

I propose the name PySys_GetAttr(). By the way, we already have a private _PySys_GetAttr() function.

The existing name PySys_GetObject() doesn't give any hint on the purpose of the function. It gets an object from where? Which object?

By the way, the special Py_TRACE_REFS debug build has a sys.getobjects() function which is unrelated: it gives the list of all Python objects :-)

@serhiy-storchaka
Copy link
Member Author

The name is only the part of the question. Other questions are:

  • Should it return a borrowed or new reference?
  • Should it raise an exception if sys has no such attribute and what type of exception?

Currently PySys_GetObject() returns a borrowed reference and NULL if sys has no such attribute. But the code that calls PySys_GetObject() has different reaction on the returned NULL:

  1. Some code raises some exception.
  2. Some code returns error (and the caller of that code may expect an exception be set, but it is not in this case).
  3. Some code expects the attribute be optional and does not expect any exception. It would need to silence it if it was raised.

Case 2 looks like a bug. We should raise explicit exceptions in these cases. But what exceptions?

@encukou
Copy link
Member

encukou commented Sep 18, 2023

IMO this is similar to the new PyDict_GetItemRef and PyObject_GetOptionalAttr and should have the same API. That is, int PySys_GetObjectStringRef(char *name, PyObject **result), returning

  • -1: internal error or API misuse; exception raised
  • 0: lookup succeeded; no item was found
  • 1: lookup succeeded; result set

It gets an object from where? Which object?

Well, from the sys module (see PySys_), and the one associated with the given key :)

@vstinner
Copy link
Member

@encukou:

Well, from the sys module (see PySys_), and the one associated with the given key :)

I was a rhetorical question. I was just argument in favor of my proposed PySys_GetAttr() name :-)

@vstinner
Copy link
Member

vstinner commented Sep 18, 2023

@serhiy-storchaka: I'm interested to work on such API, but if you prefer to write it, just tell me.

For many years, PySys_GetObject() annoyed me, but I was not motivated enough to change it. For example, I dislike the fact that it returns a borrowed reference. I also dislikes the fact that it returns NULL if there is no attribute and if an error occurs. Also, it never raises any exception, it's bad in many ways :-(

@serhiy-storchaka
Copy link
Member Author

@serhiy-storchaka: I'm interested to work on such API, but if you prefer to write it, just tell me.

I have not started to write a code yet.

There is a similar function _PyEval_GetBuiltin() which is much more used. It is similar to PyObject_GetAttr() -- returns a new reference or NULL, and always sets error if returns NULL. This interface is more convenient when passed to Py_BuildValue("N").

I think that convenient functions for getting an attribute of sys and builtins should be in the public API, and they should have the same interface. One of differences is that PySys_GetObject() takes name as a C string, and it is never used in performance sensitive code, while _PyEval_GetBuiltin() takes name as a Python string, so we perhaps need two variants of each functions.

@serhiy-storchaka
Copy link
Member Author

I started to work on preliminary patch and found that PySys_GetObject() is used so little because more sites use private _PySys_GetAttr() which take a name as PyObject*. So now I have much more cases to consider.

@vstinner
Copy link
Member

vstinner commented Sep 21, 2023

We should add PySys_GetAttr() with PyObject name and PySys_GetAttrString() with char* name. It would be consistent with other APIs, no?

@serhiy-storchaka
Copy link
Member Author

There are at least 53 sites in CPython code that can use new functions (currently they use PySys_GetObject(), private functions _PySys_GetObject() and _PySys_GetAttr(), direct access to sysdict, _PyImport_GetModuleAttrString(), etc). Approximately 23 of them treat non-existing object as error, and 29 tolerates the absence. It is approximately because in some sites any errors are silenced, so any variant can be used. 20 sites pass the names as a C string, and 20 sites pass it as a Python string (_Py_ID()). Again, the difference is not strict, because every site that use literal C string can be reritten to use _Py_ID() and vice verse. _Py_ID() is faster, but in most cases it does not matter.

More detailed, per function:

11 PySys_GetAttr()
12 PySys_GetAttrString()
21 PySys_GetOptionalAttr()
8 PySys_GetOptionalAttrString()

All 4 variants are used in almost equal proportions. 4 new functions to replace PySys_GetObject() may be too much. We can get rid of Python string variants in favor of C string variants at some performance cost (but in most cases the difference is negligible). It is difficult to chose between "optional" and "required" variants, because when we left only one we will need to add either PyErr_ExceptionMatches(PyExc_RuntimeError)+PyErr_Clear() or PyErr_SetString(PyExc_RuntimeError, "...") in half of cases.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Oct 18, 2023
Add functions PySys_GetAttr(), PySys_GetAttrString(),
PySys_GetOptionalAttr() and PySys_GetOptionalAttrString().
@vstinner
Copy link
Member

Oh, I forgot about this old issue! I just created #129367 to propose adding PySys_GetAttr() function.

@vstinner
Copy link
Member

All 4 variants are used in almost equal proportions. 4 new functions to replace PySys_GetObject() may be too much.

IMO PySys_GetOptionalAttrString functions are too much (adding 2 functions is already "a lot") and not worth it.

If you want to support optional attribute, you should expect AttributeError and ignore it (PyErr_Clear()).

@vstinner
Copy link
Member

IMO PySys_GetOptionalAttrString functions are too much (adding 2 functions is already "a lot") and not worth it.

I read the PR #111035 which replaces PySys_GetObject() with new functions and I changed my mind. There are many functions which do nothing if a sys attribute is missing: having to call PyErr_ExceptionMatches(PyExc_AttributeError) + PyErr_Clear() sounds annoying in this case.

@vstinner vstinner marked this as a duplicate of #129367 Jan 28, 2025
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Feb 6, 2025
Add functions PySys_GetAttr() and PySys_GetAttrString().
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Feb 24, 2025
… free-threaded build

The use of PySys_GetObject() and _PySys_GetAttr(), which return a borrowed
reference, has been replaced by using one of the following functions, which
return a strong reference and distinguish a missing attribute from an error:
_PySys_GetOptionalAttr(), _PySys_GetOptionalAttrString(),
_PySys_GetRequiredAttr(), and _PySys_GetRequiredAttrString().
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Feb 24, 2025
… free-threaded build

The use of PySys_GetObject() and _PySys_GetAttr(), which return a borrowed
reference, has been replaced by using one of the following functions, which
return a strong reference and distinguish a missing attribute from an error:
_PySys_GetOptionalAttr(), _PySys_GetOptionalAttrString(),
_PySys_GetRequiredAttr(), and _PySys_GetRequiredAttrString().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-C-API type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants