Second opinions on a couple specific issues #41
Thanks for working on this @mikeshardmind - good to know it's coming along! Here are my thoughts on your points:
If the
I've noticed this comment - isn't there? Can't the type checker just work through elements of the intersection until it finds one that specifies the requested method?
class A:
def hex(self):
pass
class Test(A, float): # Base classes for class "Test" define method "hex" in incompatible way
pass # Return type mismatch: base method returns type "str", override returns type "None" I think |
Made this quick runtime illustration of how I imagine the type checker for intersection might work:
from typing import Any, Optional
def check_compatibility(intersection_tup: tuple[Any,...]) -> Optional[str]:
# Checks for any errors in combination of items
# If errors returns str, else returns None
# This can reuse the logic type checkers already use when mixing classes
return None
def intersection_getattr(intersection_tup: tuple[Any,...], attr: str) -> Any:
error = check_compatibility(intersection_tup)
if error is not None:
raise TypeError(f"Intersection {intersection_tup} incompatible, reason: {error}")
else:
for item in intersection_tup:
if hasattr(item, attr):
return getattr(item, attr)
raise TypeError(f"Attribute {attr} not found in intersection {intersection_tup}") Then if we apply this to case 1, it takes the intersection_getattr((T,U), "__init__") ...and if we're in case 2, check_compatibility((HasDotHexMethod,float)) ...would then produce an error Edit: The logic of |
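To make that sketch concrete, here's a self-contained runnable version (the `Left`/`Right` classes are invented purely for illustration) showing that lookup resolves to the first operand that defines the requested attribute:

```python
from typing import Any, Optional


def check_compatibility(intersection_tup: tuple[Any, ...]) -> Optional[str]:
    # Placeholder for the compatibility logic described above; a real
    # implementation would reuse type checkers' existing mixing rules.
    return None


def intersection_getattr(intersection_tup: tuple[Any, ...], attr: str) -> Any:
    error = check_compatibility(intersection_tup)
    if error is not None:
        raise TypeError(f"Intersection {intersection_tup} incompatible, reason: {error}")
    for item in intersection_tup:
        if hasattr(item, attr):
            return getattr(item, attr)
    raise TypeError(f"Attribute {attr} not found in intersection {intersection_tup}")


# Illustrative classes, not from the discussion above:
class Left:
    def shared(self) -> str:
        return "left"


class Right:
    def shared(self) -> str:
        return "right"

    def only_right(self) -> str:
        return "right-only"


pair = (Left(), Right())
assert intersection_getattr(pair, "shared")() == "left"  # leftmost operand wins
assert intersection_getattr(pair, "only_right")() == "right-only"
```

The first assertion is the "ordered" part: both operands define `shared`, and the leftmost one is the one returned.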
Ah, see that's not actually what I was getting at @mark-todd import random
class T:
def __init__(self, req_arg: int):
...
class U:
def __init__(self, req_arg: int):
...
class LSPViolation(T, U):
def __init__(self, req_arg: int, extra_required_arg: int): # type checkers allow this
self._x = req_arg
type Intersected = OrderedIntersection[T, U]
def f(typ: type[Intersected]) -> Intersected:
return typ(random.randint(1, 100))
f(LSPViolation) You can check a simpler version of this to see that it's an existing problem without intersections, but the implications here are worse: intersections are explicitly a means of compositing types, so not being able to safely construct them afterward feels wrong. Now this is fixable with intersections, where it isn't currently fixable: import random
from typing import Callable
class T:
def __init__(self, req_arg: int):
...
class U:
def __init__(self, req_arg: int):
...
class LSPViolation(T, U):
def __init__(self, req_arg: int, extra_required_arg: int): # type checkers allow this
self._x = req_arg
type Intersected = OrderedIntersection[T, U]
type ConstructableAsIntersected[T, U] = Callable[[int], Intersected]
type IntersectedType = OrderedIntersection[ConstructableAsIntersected, type[Intersected]]
def f(typ: IntersectedType) -> Intersected:
return typ(random.randint(1, 100))
f(LSPViolation) # this would actually be detected now, this isn't a ConstructableAsIntersected But you'll notice there's no way to just automatically do this. We can't extract the expected init signature automatically and convert those type aliases to type ConstructableAsIntersected[*Ts] = Callable[??? , OrderedIntersection[*Ts]]
type IntersectedType[*Ts] = OrderedIntersection[ConstructableAsIntersected[*Ts], type[OrderedIntersection[*Ts]]] The triple question marks are stand-ins for the missing puzzle piece that would allow this to be type safe without a ton of per-type boilerplate, but the real question is why users should need this boilerplate at all when this is a clear and obvious use. Basically, we're running into an issue with a prior decision, and there's no way to square this with that decision without adding an exception to the existing exception, removing the exception, or pairing this with some additional construct we haven't described here yet. |
I think you also missed my intent with the numeric tower, which is on me for assuming people were aware that the numeric tower in Python is currently not type safe. The question I had is very similar to the first one: is it enough that the type safety issue with the numeric tower is now expressible and avoidable, do we need to do more, or can this be fixed separately from this proposal without us needing to consider anything about it now? Code sample in pyright playground
def f(x: float, /):
hexrep = x.hex()
... # but actually use it as a numeric value here too, so protocol isn't "simply" enough
f(1) This is allowed, again for "pragmatic" reasons, and there is a way to express what you want with intersections now: class HasDotHex(Protocol):
def hex(self) -> str:
...
def f(x: OrderedIntersection[HasDotHex, float]):
...
f(1) # this fails the protocol now Again, this is a situation where in theory we don't need to do anything, and we gave people tools to fix an existing type hole, but it feels like we're just kicking an existing problem further down the road. So I want to see if anyone has a reason for anything more that's actually our problem to consider, to make sure that in not addressing this one we don't make it harder for it to be properly addressed in the future. "You already have this way of doing it, this isn't a priority" mixed with people doing the easiest thing... well, the correct thing not being ergonomic is going to lead to the correct thing not being done. |
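For anyone who hasn't run into this hole directly, here's a runnable sketch of it. It uses typing.runtime_checkable, which only checks for method presence, so it's an approximation of the structural check rather than the full type-checker logic:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasDotHex(Protocol):
    def hex(self) -> str: ...


# float defines .hex(); int does not. The implicit int -> float
# promotion the type system performs is therefore unsound for this method.
assert isinstance(1.0, HasDotHex)
assert not isinstance(1, HasDotHex)

# A checker accepts f(1) for f(x: float, /), yet this fails at runtime:
try:
    (1).hex()  # type: ignore[attr-defined]
except AttributeError:
    pass
else:
    raise AssertionError("expected AttributeError")
```

This is the hole that `OrderedIntersection[HasDotHex, float]` would let callers opt out of, by rejecting `f(1)` while still accepting `f(1.0)`.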
@mikeshardmind I think I see - that's pretty spicy! So I'm pretty sure there are "spin-off" problems you could also generate from this, given that some method combinations don't violate LSP I believe. I propose a rule - if you override an existing method from let's say import random
import typing
class T:
def __init__(self, req_arg: int):
...
class U:
def __init__(self, req_arg: int):
...
class NotFine(T, U):
def __init__(self, req_arg: int, extra_required_arg: int): # type checkers allow this
self._x = req_arg
class Fine(T,U):
def __init__(self, req_arg: int):
...
def f(typ: type[OrderedIntersection[T, U]]) -> OrderedIntersection[T, U]:
return typ(random.randint(1, 100))
f(Fine) # No issues
f(NotFine) # Not fine - method `__init__` clashes with T definition of `__init__` To consult my earlier runtime algorithm, for this to work you would get the type of intersection_getattr((T,U),'__init__') And find that this type matches with |
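A rough runtime approximation of the rule being proposed, using inspect.signature to compare the override against T's __init__. Note this compares signatures for exact equality rather than doing real subtyping-aware compatibility, so it's only an illustration (init_matches is a made-up helper name):

```python
import inspect


class T:
    def __init__(self, req_arg: int): ...


class U:
    def __init__(self, req_arg: int): ...


class Fine(T, U):
    def __init__(self, req_arg: int): ...


class NotFine(T, U):
    def __init__(self, req_arg: int, extra_required_arg: int):
        self._x = req_arg


def init_matches(cls: type, base: type) -> bool:
    # Naive stand-in for the checker's consistency test: does the
    # subclass __init__ have the same signature the base declares?
    return inspect.signature(cls.__init__) == inspect.signature(base.__init__)


assert init_matches(Fine, T)        # compatible override
assert not init_matches(NotFine, T)  # extra required arg -> mismatch
```

A real checker would of course allow wider (not just identical) signatures; the point is only that the mismatch is mechanically detectable from the first operand's `__init__`.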
This would be the "Add an exception to the existing exception" option... type checkers don't consider the |
Right, I think I see, because e.g. with the function I shared earlier |
@mikeshardmind |
It breaks congruency with what's allowed currently to fix this by going the exception route if we had that, then this would happen: import random
class T:
def __init__(self, req_arg: int):
...
class U:
def __init__(self, req_arg: int):
...
class LSPViolation(T, U):
def __init__(self, req_arg: int, extra_required_arg: int): # type checkers allow this
self._x = req_arg
type Intersected = OrderedIntersection[T, U]
def f1(typ: type[Intersected]) -> Intersected:
return typ(1)
def f2(typ: type[T]) -> T:
return typ(1)
f1(LSPViolation) # the exception to the exception prevents this from being allowed
f2(LSPViolation) # allowed, the existing exception allows this, errors at runtime Maybe we can't do anything about it from that end without being too disruptive. Other than modifying the original exception to fix it, I think we would need to figure out how to remove the question marks from the earlier aliases, to at least allow this to be a way for people to patch over the type hole. |
You can with the way the rules were described, and not allowing this would break a few of our class decorator cases. It's not a big deal, we aren't assuming there is a concrete implementation at the point of the intersection, we're placing in an annotation context the expectation that there will be a value which satisfies the protocol and non protocol constraints simultaneously. There's not the same ambiguity with subclassing a protocol going on here. |
@mikeshardmind Yeah I see what you're getting at here - it's a bit strange but I think it's fine. I would expect the intersection to be more restrictive than saying must inherit from T. |
@mikeshardmind Ooh interesting. I would argue Protocols and Non-Protocol shouldn't share an intersection, but I do see also why it might be useful, as you can tag on behaviour to an existing class... - any reason why you couldn't use Abstract classes instead of Protocols for that use case? I think it's important that we follow the tenet of what is acceptable by inheritance - mixing them currently isn't so that's why I'm thinking that neither should it be to Intersection. At the same time though, pragmatically a non-useful Intersection helps nobody so it's difficult. |
Oh similarly I would expect |
That the protocol is the correct tool. Abstract classes (as in from
I haven't gotten to TypedDicts yet, but I'm not sure I agree with that. I need to explore the rules we have for potential issues with not restricting it like that (FWIW I don't think there are any issues with allowing it) as I'm already aware of reasons people would want to not do this, such as with annotating sql wrapping objects that can behave like a typeddict for accessing values, but may have additional methods for getting metadata about the action that produced the object. |
@mikeshardmind Right, thanks, I think I follow why Protocol mixtures would be useful now. Also I've thought some more re-implementation and mixing Protocols and Non-Protocols should be fine as it's just the types that get merged. On which note, returning to your example: class HasDotHex(Protocol):
def hex(self) -> str:
...
def f(x: OrderedIntersection[HasDotHex, float]):
...
f(1) See, I would say OrderedIntersection[Protocol, Protocol] = Protocol The point is that I think it has to be one or the other, and the result being a non-protocol is definitely out - I'm not sure we can leave it as a "mixed" type if that makes sense.
Let's circle back to |
That would defeat the purpose of OrderedIntersection. The entire point is the ability to specify multiple things about something.
None of the above. As I described it, it's a typing special form and does not create a new type, but describes a set of types. Union does not turn a union of protocols and non protocols into only being protocols, it's a type system tool (special form) for describing which things are being considered. The simple behavior: |
Ah gotcha, sorry I was thinking it was being used as a float. I guess my point was that f(1.0) should succeed. Ok, I think the problem I'm wrestling with is a different one again - consider the following: class A:
def foo(self) -> int:
return 1
class B(Protocol):
def bar(self) -> int:
...
class C:
def third(self) -> int:
return 2
class D(A,C):
pass
class E(A):
def bar(self) -> int:
return 3
d: OrderedIntersection[A,C] = D() # Must inherit from A & C in that order
e: OrderedIntersection[A,B] = E() # Must inherit from A and have the method `bar`
f: OrderedIntersection[A, B, C] = ?? # Now what? To make f, we could say, it must inherit from A & C in that order and have method bar, but that would require us to treat Protocols and Non-Protocols separately. My solution was to just say it becomes "Protocol-Like" - in other words we drop the inheritance condition. Edit: I'm now thinking looking at your descriptions that maybe the inheritance condition isn't there even for non-protocol classes - is that correct? |
@mark-todd So, Maybe this is a good example of a reason to use a term that isn't OrderedIntersection, as the ordered part is only on one side of the interaction. All of your examples there are fine and solvable with the current semantics. The ordering doesn't come into play at class definition. You have: d: OrderedIntersection[A,C] = D() # Must inherit from A & C in that order
e: OrderedIntersection[A,B] = E() # Must inherit from A and have the method `bar`
f: OrderedIntersection[A, B, C] = ?? # Now what? But the comments aren't quite right. line by line "Must be an instance of A and C or their subtypes, attribute/method lookup at type time will be done by looking at A then C" The reason this isn't a type-hole is that the limitations on what classes can be defined prevent any case with incompatible overlap from ever being created to then have an instance to pass in. The ordering being only at the end serves 2 purposes.
Some quick examples Intentionally using the ordering as a mask for exported capability to rely on vs internal types (previously not possible without a lie in a type stub or a cast) __all__ = ["X",]
class _X:
def foo(self, _private_cache: dict[str, int] = {}) -> int:
"""
The private cache being implemented this way prevents the issues of lru_cache on methods...
"""
...
class ProtoFoo:
def foo(self) -> int:
...
# We've exported this without `_private_cache` being an allowed arg to pass in X().foo
# We can continue to export this as is, even if `_X` is later changed to use some other means of handling caching as we've declared that the definition for the method `foo` should be looked for by type checkers in ProtoFoo before _X
X: type[OrderedIntersection[ProtoFoo, _X]] = _X Examples of behavior with Any: class X:
def foo(self) -> int:
...
def foox(x1: OrderedIntersection[X, Any], x2: OrderedIntersection[Any, X]):
x1.foo() # int # This is the case that was more strict than *possible* with `Any` in option 5 from the prior set
x1.bar() # Any
x2.foo() # Any # This is the case I didn't like with option 4 from the prior set
x2.bar() # Any going back to your example class A:
def foo(self) -> int:
return 1
class B(Protocol):
def bar(self) -> int:
...
class C:
def third(self) -> int:
return 2
class D(A,C):
pass
class E(A):
def bar(self) -> int:
return 3
class F(E, C):
pass
d: OrderedIntersection[A,C] = D()
e: OrderedIntersection[A,B] = E()
f: OrderedIntersection[A, B, C] = F() # As there's no overlapping methods/attributes to ever be looked up, the ordered nature won't ever come into play |
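The final example is ordinary Python today, since no overlapping member ever forces the ordering to break a tie; a quick runtime check of the claim:

```python
from typing import Protocol


class A:
    def foo(self) -> int:
        return 1


class B(Protocol):
    def bar(self) -> int:
        ...


class C:
    def third(self) -> int:
        return 2


class E(A):
    def bar(self) -> int:
        return 3


class F(E, C):
    pass


# F satisfies all three operands at once: nominal A and C, structural B.
f = F()
assert (f.foo(), f.bar(), f.third()) == (1, 3, 2)
```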
@mikeshardmind What you have there makes sense to me. If you haven't already, you probably need to explain how the rules prevent an issue in the following case: # library.py
__all__ = ["X",]
class _X:
def foo(self, _private_cache: dict[str, int] = {}) -> int:
"""
The private cache being implemented this way prevents the issues of lru_cache on methods...
"""
...
class ProtoFoo:
def foo(self) -> int:
...
# We've exported this without `_private_cache` being an allowed arg to pass in X().foo
# We can continue to export this as is, even if `_X` is later changed to use some other means of handling caching as we've declared that the definition for the method `foo` should be looked for by type checkers in ProtoFoo before _X
X: type[OrderedIntersection[ProtoFoo, _X]] = _X # user.py
from library import X
class Y(X):
def foo(self) -> int:
... I think this can properly error for LSP issues with the definitions you chose. |
@DiscordLiz Thanks for spelling that out as an example after the quick chat we had earlier. I'm going to be taking a lot of these examples and creating test cases as part of the proposal, both as samples of what is and isn't expected, as well as things that can be used in the conformance suite. I gave it a bit more thought since we talked about it, and I do have a note for it here after confirming it isn't a problematic case. The current working definition requires this be detected as invalid subtyping (LSP conflict). It just requires extra clarity about the contexts in which type checkers must consider consistency with all operands and when they only need to get the first definition. Namely, consistency with all operands must be considered when determining if there is a valid type relationship between an intersection and another type¹. In all other cases, it is sufficient to use the first found definition. However, this occurs in multiple contexts.
It may be worth getting a list of contexts where this relationship must be handled into the typing specification as part of this clarification effort, as clarity around intersections would be improved by having such a reference. ¹ The word type and the idea of a type relationship are each somewhat overloaded and not fully specified in the specification. When considering annotations as specifications of allowed types, this is the level at which a type comparison must take place for this to apply. ² I believe the term "Type specification" is a good way to express the difference between types as a runtime concept and types which express expectations of the runtime types, and I will be using this as a defined term in what I am writing. Should another term be adopted in the specification instead, it would be easy to revise this to use it. |
Putting this here because I had a slightly more general comment from this, and this one: @mikeshardmind I think we really have three sensible choices:
A lot of our discussions come down to:
It occurs to me that these may actually be the same issue. @CarliJoy made some good points about why combining Personally I think the only option I am against is option 3, where we allow special subcases, because this makes the logic of intersection much more complex. Sure, we could say we want to allow There have been plenty of compelling discussions against option 2 - I think this option realistically won't succeed. That leaves us with option 1 - anything can be intersected, but what does that mean? Option 1 - Intersect anythingLet's take We know obviously that these cannot be combined in a class - but the question I pose is, does that stop us forming an intersection? We've discussed plenty of other cases where this is allowed. If we consider the methods of a ['capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find',...] And we consider the methods of ['as_integer_ratio', 'bit_count', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'is_integer',...] These could be combined into some combination, defined by another class: ['as_integer_ratio', 'bit_count', 'bit_length', 'capitalize', 'casefold', 'center', 'conjugate', 'count', ...] And these could be the methods of But wait! It must inherit from
|
@mark-todd The option you're missing is the current option as proposed by Here's how this plays out with two incompatible cases, 1 that is expressible by the type system, and one that's a runtime issue. from typing import OrderedIntersection as OI
def takes_bad_runtime_intersect(x: OI[int, str]):
...
class UserAttempt(int, str):
pass User gets a runtime error at the class definition, and is told they can't do that. Note that type checkers do not error at that subclass. from typing import OrderedIntersection as OI
class X:
foo: int
class Y:
foo: str
def takes_bad_intersect(x: OI[X, Y]):
...
class UserAttempt(X, Y): # error here
pass This time the user will get a type checker error no matter how they try to construct such a class. They never even need to reach the point where assignment to the value annotated with the intersection happens. |
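For reference, the int/str case really does fail at class-creation time in current CPython (the exact error text may vary by version):

```python
try:
    class UserAttempt(int, str):  # type: ignore[misc]
        pass
except TypeError as exc:
    # CPython reports an instance lay-out conflict for these bases,
    # because int and str have incompatible C-level layouts.
    assert "lay-out" in str(exc) or "layout" in str(exc)
else:
    raise AssertionError("expected TypeError at class creation")
```

So the intersection annotation itself can remain well-formed while the runtime independently guarantees no inhabitant of `OI[int, str]` can be built this way.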
@mark-todd I don't think what was proposed has any relaxed conditions or even special cases at all. The simplest way I can express this while remaining accurate, is that a value expression is consistent with an ordered intersection as a type specification* if it is consistent with all of the operands of the ordered intersection as type specifications.
There isn't actually an inheritance restriction, there's a subtyping consistency restriction. The examples given for how someone could think about it included inheritance for nominal types and structural compatibility for structural types, because those are the respective ways to check subtyping consistency for those kinds of type specifications * I'm re-using a definition from this comment #41 (comment) for "type specification", I don't know if I like the name of the term, but I don't care enough to bikeshed the name. |
@mikeshardmind Yeah sorry this is just what I meant by option 1, but the part I missed was a solution c). I see what you're getting at - just because you can make an intersection, doesn't mean that intersection can exist in practice. So we say effectively if you want to set your function interface as I think in regards
I sense this is important but I'm not sure I follow you here - I'm imagining at runtime
@DiscordLiz Ok, so if I'm understanding correctly, I think being a subtype means: for non-protocol classes, this inherits from it, and for protocol classes, this follows the specified structure? Apologies if I'm missing something here. If this is correct, that resolves the
I suppose one thing about this that perplexes me is we're effectively saying "Yes you can define an object to be of this type, but no that object can't actually exist, only via cast". I'm not sure this has ever been done before in python - typically if types are used incorrectly this is flagged, so it feels like we're breaking new ground here. That's not necessarily an argument against doing it, just an observation. ...unless |
It's a structural type similar to Protocols, but with different semantics to help with a few common cases involving heterogenous mappings. The bit about a cast shouldn't be needed (but there may be issues with typecheckers I haven't accounted for, just going off of what's specified here, if there's a type checker bug clouding this, we can raise that to their issue trackers) |
This may seem like it's needlessly semantic in nature, and it's collapsed as important, but tangential.Quick prefaceThere are other issues relating to this. types have different meaning in an annotation and as a value, and some type system constructs in This has come up in some other discussions, it's a known place where the type system's specifying language can be improved upon. Being able to differentiate between types as a runtime concept and types as a type system concept helps with making powerful generalizations that can remain consistent Some types aren't types that you can just use to get a valueThere are a lot of things in the type system like this where something is technically a type at runtime (due to all implementable runtime objects needing to be and python having runtime access to annotations and types), but the distinction of what it actually is to the type system differs from what it is at runtime, and matters for handling what rules apply to it and ensuring those rules compose appropriately. Most of these things are for expressing ways to compose simpler typing concepts.
typing.OrderedIntersection should it happen will be similar. It will have runtime representation in typing, but this representation isn't something you expect to deal with except in runtime typing code, it's a way for the type system to be introspectable and work at runtime. As a value, rather than as an annotation, these are typing special forms. Taking it further though, Even basic types usually express consistency, not a specific typeIf we take a look at the below: variable: annotation = value_expression
This is also consistent with simpler cases, but this notion of consistency is often substituted for the specific consistency check which applies to the current example. var: list[int] = UserList() while |
@mikeshardmind Yeah that makes sense to me
Just wanted to elaborate on what I meant about this, because I think it's important to what I was describing about the drawbacks of the approach. class A(TypedDict):
x: int
class B:
def foo(self):
pass
x: OI[A,B] What value can class C(A,B):
pass The above is banned, because you cannot inherit from class C(B):
def __getitem__(self, key:str):
if key == 'x':
return 1
else:
raise KeyError
x: OI[A,B] = C() Although technically correct, the above would also fail type checks I believe. Perhaps this is what you meant about the bug in type checkers? So that means we need to say: x: OI[A,B] = cast(OI[A,B], C()) |
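Here's a runnable version of the class being described (reusing A, B, and C from the example above); at runtime it behaves as intended, so the only open question is whether a checker accepts the assignment without a cast:

```python
from typing import TypedDict


class A(TypedDict):
    x: int


class B:
    def foo(self) -> None:
        pass


class C(B):
    # Structurally provides the key access A promises, without being a dict.
    def __getitem__(self, key: str) -> int:
        if key == "x":
            return 1
        raise KeyError(key)


c = C()
c.foo()                # B's behaviour is present
assert c["x"] == 1     # A's key is readable

try:
    c["missing"]
except KeyError:
    pass
else:
    raise AssertionError("expected KeyError")
```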
Answer to @mark-todd
Well that isn't accurate. This object exists with the exact signature that I will cast there. There is just no way at the moment to express this in Python typing. Note: I really never will create a discussion group in GitHub, it sucks so much that there aren't even threads here... Btw: why do we discuss typed dicts again here and not in #42? That is confusing... |
@CarliJoy I see what you're getting at, but I'm not sure this is strictly true - for it to have exactly the same signature it would have to also have methods for
I still think the
It's interesting, I hadn't really thought about it this way, but I take the point. I suppose this is the view that Intersection is a good stopgap for other missing language features (like the one I describe above) or perhaps adding some typing to poorly typed libraries like pandas.
Apologies, that's on me - I had a more general comment, that then developed into a specific discussion again - I find it quite difficult to separate these topics sometimes, but happy to comment in the other thread if that's easier. |
I'd personally find it easier if we didn't try to split every single line of discussion to another issue in the first place, there's not really a good way to do that with github being used for this, and a lot of discussion seems to be split off while being relevant to the initial topic and not something that needs to be brought back and forth.
TypedDict doesn't guarantee inheriting from dict as it exists in the specification, and you aren't allowed to use methods like The more we discuss this, the more I think we actually do need a section on TypedDict to point out that prior to intersections, TypedDict was assumed by developers to only ever actually be a dict, but that it is possible for other things to be consistent with the structure it provides. |
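A small runnable illustration of that point (the `Movie` name is invented for the example): a TypedDict value built the usual way is a plain dict at runtime, and the TypedDict class itself refuses isinstance checks, so nothing at runtime ties the structure to dict specifically:

```python
from typing import TypedDict


class Movie(TypedDict):
    title: str


m: Movie = {"title": "Alien"}
assert type(m) is dict  # no dedicated runtime class is created for values

try:
    isinstance(m, Movie)
except TypeError:
    pass  # TypedDict classes reject instance/class checks entirely
else:
    raise AssertionError("expected TypeError from isinstance")
```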
@mikeshardmind Oh interesting - ok in that case I think it's all fine. Are there any other structural types in python other than
Yeah I agree - this was not clear to me at all previously (partially from the name) |
I'm gonna close this, I think we're enough in consensus on the points here that any further clarity can be handled when I finish the draft (hopefully this weekend, time permitting). Going through all of this - ensuring we don't need special cases, while staying aware of anything that may seem strange and whether it causes problems for type safety, consistency of the system, or composability with future type system additions - is mostly done on my end now, and confirming TypedDicts shouldn't need special treatment resulted in much simpler PEP language. |
So.. while working on something formal for #38, I'm left in a situation where there are a few corner cases I really don't like, and I think we need answers for.
type[OrderedIntersection[T, U]]: The primary use case seems to imply that the class that satisfies this should be instantiable as T, but type is currently not type safe for __init__ (my displeasure at LSP exceptions for "pragmatic" reasons has already been expressed, and I don't think it's in scope to change that). So do users need to do OrderedIntersection[Callable[*expected_init_args, OrderedIntersection[T, U]], OrderedIntersection[T, U]] to actually have type safety here? This seems exceptionally verbose for a situation that shouldn't require this level of verbosity, especially since there's no way to just "copy" T.__init__'s signature. Is there something we can do to improve this without fixing the fact that LSP is being ignored for constructors?
What about intersections with members of the numeric tower?
OrderedIntersection[HasDotHexMethod, float] allows fixing (one of) the type holes with it. Do we need to do anything else here? I'm leaning no, and that after this is accepted we should look into fixing some things with the numeric tower, and into protocols built with intersections that express common things people want, to allow duck-typed math while remaining type safe. I think that doesn't require extra work here, only future work after; have I missed something that requires we consider it more ourselves?