Increase PyStructSequence compatibility with collections.namedtuple #108647

Closed
FFY00 opened this issue Aug 29, 2023 · 7 comments
Labels
type-feature A feature request or enhancement

Comments

@FFY00
Member

FFY00 commented Aug 29, 2023

Feature or enhancement

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

collections.namedtuple provides a couple of useful helper functions that are not present in PyStructSequence, its C API counterpart. Even though the PyStructSequence documentation does not promise collections.namedtuple compatibility, it does refer to the type as a "named tuple":

The structseq helper is considered an internal CPython implementation
detail. Docs for modules using structseqs should call them
"named tuples" (be sure to include a space between the two
words and add a link back to the term in Docs/glossary.rst).

IMO, it'd be worth increasing compatibility between these two named tuple implementations and making the extra functionality available in PyStructSequence.

I have needed some of these helpers many times already, only to find them unavailable. An example use case would be converting sys.version_info or sys.flags to a dictionary.


Here's a list of the collections.namedtuple helpers that are not available in PyStructSequence:

  • _fields
    • Tuple of strings listing the field names. Useful for introspection and for creating new named tuple types from existing named tuples.

  • _asdict
    • Return a new dict which maps field names to their corresponding values.

  • _field_defaults (would return an empty dict, if implemented)
    • Dictionary mapping field names to default values.

  • _replace
    • Return a new instance of the named tuple replacing specified fields with new values.

  • _make
    • Class method that makes a new instance from an existing sequence or iterable.
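
As a quick reference, the helpers listed above behave as follows on a regular collections.namedtuple. Point is a made-up type used only for this demonstration.

```python
from collections import namedtuple

# Point is a hypothetical type; y gets a default so _field_defaults is non-empty.
Point = namedtuple("Point", ["x", "y"], defaults=[0])

p = Point(1, 2)
fields = Point._fields            # ('x', 'y')
as_dict = p._asdict()             # {'x': 1, 'y': 2}
defaults = Point._field_defaults  # {'y': 0}
moved = p._replace(x=10)          # Point(x=10, y=2)
made = Point._make([3, 4])        # Point(x=3, y=4)
```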

Proposal:

>>> import sys
>>> sys.version_info._fields
('major', 'minor', 'micro', 'releaselevel', 'serial')
>>> sys.version_info._asdict()
{'major': 3, 'minor': 13, 'micro': 0, 'releaselevel': 'alpha', 'serial': 0}
>>> sys.flags._fields
('debug', 'inspect', 'interactive', 'optimize', 'dont_write_bytecode', 'no_user_site', 'no_site', 'ignore_environment', 'verbose', 'bytes_warning', 'quiet', 'hash_randomization', 'isolated', 'dev_mode', 'utf8_mode', 'warn_default_encoding', 'safe_path', 'int_max_str_digits')
>>> sys.flags._asdict()
{'debug': 0, 'inspect': 0, 'interactive': 0, 'optimize': 0, 'dont_write_bytecode': 0, 'no_user_site': 0, 'no_site': 0, 'ignore_environment': 0, 'verbose': 0, 'bytes_warning': 0, 'quiet': 0, 'hash_randomization': 1, 'isolated': 0, 'dev_mode': False, 'utf8_mode': 0, 'warn_default_encoding': 0, 'safe_path': False, 'int_max_str_digits': 4300}

@ericvsmith
Member

How do these proposed changes interact with the PyStructSequence unnamed fields? That's a major difference from namedtuple that I'm not sure can (or should) be hidden.

@FFY00
Member Author

FFY00 commented Aug 29, 2023

My proposal for that would be to either not consider unnamed fields in these helpers, or not provide the helpers for PyStructSequences with unnamed fields. Honestly, I am okay with either one. The vast majority of cases where the helpers would be useful to me are ones where PyStructSequence is simply used to implement a "fully" named tuple.

We could also consider trying to fit the unnamed fields into the API, but I don't think that's a good idea.

@FFY00
Member Author

FFY00 commented Aug 30, 2023

I have updated the PR to not provide the helpers when there are unnamed fields. That's probably the safest bet right now, and we can change it in the future if we decide to.

@serhiy-storchaka
Member

Is it a duplicate of #46145?

@ericvsmith
Member

I have updated the PR to not provide the helpers when there are unnamed fields. That's probably the safest bet right now, and we can change it in the future if we decide.

I'm not sure I recall all of the specifics correctly, but are you allowed to add unnamed fields to an existing type? Specifically, I'm thinking about backward compatibility. My concern is that if you added fields and suddenly the helpers vanished, this wouldn't be a very friendly change. But maybe I'm thinking about it backwards: things like sys.getwindowsversion() have more named than unnamed fields. Is there a case where adding named fields would cause the helpers to vanish?

What's an example of a stdlib type with unnamed fields?

@FFY00
Member Author

FFY00 commented Sep 1, 2023

Is it a duplicate of #46145?

Yes, sorry! I missed that in my initial search.

I'm not sure I recall all of the specifics correctly, but are you allowed to add unnamed fields to an existing type? Specifically, I'm thinking about backward compatibility. My concern is that if you added fields and suddenly the helpers vanished, this wouldn't be a very friendly change. But maybe I'm thinking about it backwards: things like sys.getwindowsversion() have more named than unnamed fields. Is there a case where adding named fields would cause the helpers to vanish?

What's an example of a stdlib type with unnamed fields?

Well, I think this is kind of a grey area, and it heavily depends on the use case. Adding a field, either named or unnamed, will change the tuple format, so that can often be a backwards-incompatible change on its own. I understand and share the concern about preventing people from adding unnamed fields to fully named PyStructSequences in the future, but unnamed fields seem to be a very unpopular feature, and I think the benefits of having the helpers outweigh the concern here.

AFAICT the only example in the stdlib with unnamed fields is os.stat in the POSIX implementation.
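
One way to check this on a given interpreter is to read the field counts that struct sequence types expose as class attributes. The sketch below inspects only three stdlib types, chosen for illustration, so it is not an exhaustive survey.

```python
import os
import sys
import time

# Map each type to its number of unnamed fields (0 means "fully named").
counts = {
    tp: tp.n_unnamed_fields
    for tp in (os.stat_result, time.struct_time, type(sys.version_info))
}

for tp, unnamed in counts.items():
    print(tp.__name__, "sequence fields:", tp.n_sequence_fields,
          "total fields:", tp.n_fields, "unnamed:", unnamed)
```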

@AA-Turner
Member

Closing as duplicate

@AA-Turner closed this as not planned Sep 1, 2023
FFY00 added a commit to FFY00/cpython that referenced this issue Nov 17, 2024