Define behavior of descriptor-typed fields on dataclasses #91330
Recent discussions about PEP-681 (dataclass_transform) have focused on support for descriptor-typed fields. See the email thread here: https://mail.python.org/archives/list/[email protected]/thread/BW6CB6URC4BCN54QSG2STINU2M7V4TQQ/

Initially we were thinking that dataclass_transform needed a new parameter to switch between two modes. In one mode, it would use the default behavior of dataclass. In the other mode, it would be smarter about how descriptor-typed fields are handled. For example, __init__ would pass the value for a descriptor-typed field to the descriptor's __set__ method. However, Carl Meyer found that dataclass already has the desired behavior at runtime! We missed this because mypy and Pyright do not correctly mirror this runtime behavior.

Although this is the current behavior of dataclass, I haven't found it documented anywhere and the behavior is not covered by unit tests. Since dataclass_transform wants to rely on this behavior and the behavior seems desirable for dataclass as well, I'm proposing that we add additional dataclass unit tests to ensure that this behavior does not change in the future.

Specifically, we would like to document (and add unit tests for) the following behavior given a field whose default value is a descriptor:
Here's an example:

from dataclasses import dataclass
from typing import Any, Generic, TypeVar

T = TypeVar("T")

class Descriptor(Generic[T]):
    def __get__(self, __obj: object | None, __owner: Any) -> T:
        return getattr(__obj, "_x")

    def __set__(self, __obj: object | None, __value: T) -> None:
        setattr(__obj, "_x", __value)

@dataclass
class InventoryItem:
    quantity_on_hand: Descriptor[int] = Descriptor[int]()

i = InventoryItem(13)  # calls __set__ with 13
print(i.quantity_on_hand)  # 13 -- obtained via call to __get__
i.quantity_on_hand = 29  # calls __set__ with 29
print(i.quantity_on_hand)  # 29 -- obtained via call to __get__

I took a first stab at unit tests here: debonte@c583e7c

We are aware of two other descriptor-related behaviors that may also be worth documenting:

First, if a field is annotated with a descriptor type but is *not* assigned a descriptor object as its default value, it acts like a non-descriptor field. Here's an example:

@dataclass
class InventoryItem:
    quantity_on_hand: Descriptor[int]  # No default value

i = InventoryItem(13)  # Sets quantity_on_hand to 13 -- No call to Descriptor.__set__
print(i.quantity_on_hand)  # 13 -- No call to Descriptor.__get__

And second, when a field with a descriptor object as its default value is initialized (when the code for the dataclass is initially executed), __get__ is called with a None instance and the return value is used as the field's default value. See the example below.

Note that if __get__ doesn't handle this None instance case (for example, in the initial definition of Descriptor above), a call to InventoryItem() fails with "TypeError: InventoryItem.__init__() missing 1 required positional argument: 'quantity_on_hand'".

I'm less sure about documenting this second behavior, since I'm not sure what causes it to work, and therefore I'm not sure how intentional it is.

class Descriptor(Generic[T]):
    def __init__(self, *, default: T):
        self._default = default

    def __get__(self, __obj: object | None, __owner: Any) -> T:
        if __obj is None:
            return self._default
        # Instance access: return the value stored by __set__.
        return getattr(__obj, "_x")

    def __set__(self, __obj: object | None, __value: T) -> None:
        if __obj is not None:
            setattr(__obj, "_x", __value)

# When this code is executed, __get__ is called with __obj=None and the
# returned value is used as the default value of quantity_on_hand.
@dataclass
class InventoryItem:
    quantity_on_hand: Descriptor[int] = Descriptor[int](default=100)

i = InventoryItem()  # calls __set__ with 100
print(i.quantity_on_hand)  # 100 -- obtained via call to __get__
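The behaviors described above can be checked with a small self-contained script. This is only a sketch of the kind of test being proposed, not the tests from the linked commit; `RecordingDescriptor`, `WithDefault`, and `AnnotationOnly` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Any, Generic, List, Optional, TypeVar

T = TypeVar("T")


class RecordingDescriptor(Generic[T]):
    """Hypothetical descriptor that logs every value passed to __set__."""

    def __init__(self) -> None:
        self.set_calls: List[Any] = []

    def __get__(self, obj: Optional[object], owner: Any) -> T:
        return getattr(obj, "_x")

    def __set__(self, obj: object, value: T) -> None:
        self.set_calls.append(value)
        setattr(obj, "_x", value)


@dataclass
class WithDefault:
    # A descriptor object is assigned as the default value.
    quantity_on_hand: RecordingDescriptor[int] = RecordingDescriptor[int]()


@dataclass
class AnnotationOnly:
    # Annotated with the descriptor type, but no descriptor object assigned.
    quantity_on_hand: RecordingDescriptor[int]


descriptor = vars(WithDefault)["quantity_on_hand"]

item = WithDefault(13)
assert descriptor.set_calls == [13]          # __init__ went through __set__
assert item.quantity_on_hand == 13           # read back through __get__
item.quantity_on_hand = 29
assert descriptor.set_calls == [13, 29]
assert "quantity_on_hand" not in vars(item)  # the value lives in item._x

plain = AnnotationOnly(13)
assert vars(plain) == {"quantity_on_hand": 13}  # stored directly on the instance
assert descriptor.set_calls == [13, 29]         # no descriptor call was made
```

Recording the calls on the descriptor object makes it possible to assert that `__init__` really delegated to `__set__`, which a plain equality check on the attribute would not distinguish.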
I think it's fine to document and test for the first two items.
For: I think that's the correct behavior. To do otherwise would require dataclasses to look at the annotation to figure out what type of descriptor to create, which I think is outside its scope.

For: I don't really understand what's going on. I'll look at it further.
Behavior 4 happens because in

I actually think this behavior is quite nice in practice for dataclasses, and if we are going to document these behaviors, IMO it makes sense to document it too. It is nice because it gives the

I think the possible downside is that it is somewhat common today for descriptors to
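One way to see the mechanism is to repeat a class-level attribute lookup by hand. The sketch below assumes that dataclass reads each field's default with something equivalent to `getattr(cls, name, MISSING)`, which matches the CPython implementation as far as I know but is not stated in this thread. `DefaultingDescriptor`, `RaisingDescriptor`, and the `_Stored` helper are hypothetical names.

```python
from dataclasses import MISSING, dataclass, fields


class _Stored:
    """Hypothetical helper: keeps the value on the instance under a private name."""

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __set__(self, obj, value):
        setattr(obj, self._name, value)


class DefaultingDescriptor(_Stored):
    def __init__(self, *, default):
        self._default = default

    def __get__(self, obj, owner=None):
        if obj is None:
            return self._default  # class-level access yields the default
        return getattr(obj, self._name)


class RaisingDescriptor(_Stored):
    def __get__(self, obj, owner=None):
        # With obj=None this raises AttributeError, like the first Descriptor.
        return getattr(obj, self._name)


@dataclass
class Item:
    required: int = RaisingDescriptor()
    optional: int = DefaultingDescriptor(default=100)


# The same class-level lookup, done by hand.
assert getattr(Item, "optional", MISSING) == 100
assert getattr(Item, "required", MISSING) is MISSING  # AttributeError is swallowed

# The resulting field defaults agree with the manual lookup.
defaults = {f.name: f.default for f in fields(Item)}
assert defaults["optional"] == 100
assert defaults["required"] is MISSING
```

Under that assumption, a `__get__` that raises `AttributeError` for a `None` instance simply looks like a field with no default, which would explain the "missing 1 required positional argument" error reported earlier.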
@carljm, thanks for explaining why this works. I agree that it's worth documenting.
Are you saying that this is a downside to documenting it? Given that dataclass already behaves this way and no one has expressed a desire to change that behavior, is there a downside?

I think your point is that if a user relies on the default value of a descriptor-typed field and the descriptor's

So perhaps type checkers and language servers will choose not to support this because they don't want to lead users into that sort of problem. But documenting the behavior at least makes it common knowledge and can steer descriptor authors in the right direction in the future. And maybe tools can help there, for example maybe by warning if a

Do you disagree?
SQLAlchemy's descriptor (the one in question) does not return "self" when "obj" is None, it returns a SQL expression construct. This doesn't actually affect SQLAlchemy with dataclasses in any case, as we don't make any dataclasses with said descriptor interpreted in a dataclasses context at runtime.

As far as descriptors generally returning "self" when obj is None, I'm not sure what else you think would be returned by the average "normal" descriptor. Returning "self" is usually kind of important for tools like Sphinx autodoc and such to work, and also the primary document on descriptors pretty much uses "return self" in all the examples.
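To make the interaction concrete, here is a sketch of what happens when a descriptor that follows the common "return self" convention is assigned as a dataclass field default. `ConventionalDescriptor` is a hypothetical class written for this illustration; it is not SQLAlchemy's descriptor.

```python
from dataclasses import dataclass, fields


class ConventionalDescriptor:
    """Hypothetical descriptor using the common 'return self' convention."""

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self  # class-level access returns the descriptor itself
        return getattr(obj, self._name)

    def __set__(self, obj, value):
        setattr(obj, self._name, value)


@dataclass
class Item:
    quantity_on_hand: int = ConventionalDescriptor()


descriptor = vars(Item)["quantity_on_hand"]

# The field's default is the descriptor object, not a usable value.
assert fields(Item)[0].default is descriptor

# An explicit argument still goes through __set__ and __get__ as usual.
assert Item(5).quantity_on_hand == 5

# Relying on the default stores the descriptor object on the instance.
assert Item().quantity_on_hand is descriptor
```

So a "return self" descriptor still works as a dataclass field when a value is always passed to `__init__`; only the implicit default is affected.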
@zzzeek, are you arguing for not documenting this behavior at all or just pointing out that this is only relevant to descriptors that are used as dataclass field types rather than all descriptors?
I am sort of pointing out the latter, but also wondering what the suggestion is for garden variety descriptors to do when their
I think that may be intentionally left open-ended. PEP 252 says:
No, I don't really think there is a downside to documenting it. I guess I meant "downside" broadly as in "this behavior kind of conflicts with a typical way that descriptors are written," but I don't think that's sufficient reason either to change dataclasses' behavior or avoid documenting it.

I'm not sure how common use of descriptors as dataclass fields will ever be, but I think if they are used, this way of setting the default value makes as much sense as anything, and has some strong advantages (not least that it is the status quo).

It may also be worth documenting the corollary that

@zzzeek I wasn't trying to question
…fields (pythonGH-94424) Co-authored-by: Łukasz Langa <[email protected]> (cherry picked from commit 5f31930) Co-authored-by: Erik De Bonte <[email protected]>
@ericvsmith, my PR got merged without your review. Are you OK with the test changes in there?
) (GH-94576) Co-authored-by: Erik De Bonte <[email protected]> Co-authored-by: Łukasz Langa <[email protected]> (cherry picked from commit 5f31930)
…GH-94424) (GH-94577) Co-authored-by: Erik De Bonte <[email protected]> Co-authored-by: Łukasz Langa <[email protected]> (cherry picked from commit 5f31930)
Yes, this is now landed in 3.10 - 3.12. Thanks, Erik! I'll leave the issue open for a while in case Eric's got any test improvement suggestions. We can easily make those improvements forward.
Thanks, @debonte. I thought I'd reviewed this but I guess I never committed it. It looks good to me, so I'm closing this.