Conversation
```diff
@@ -326,6 +335,16 @@ class ndarray(_ArrayOrScalarCommon, Iterable, Sized, Container):
     ) -> None: ...
     def dump(self, file: str) -> None: ...
     def dumps(self) -> bytes: ...
+    @overload
+    def astype(
```
Kind of the common theme here is adding overloads that distinguish a known dtype from a dtype-like, because in the former case we can make much stronger assertions about the output types.
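As a rough sketch of that pattern (not the exact code in this diff; `_DtypeLike` and the TypeVar names here are just placeholders):

```python
from typing import Any, Generic, Type, TypeVar, Union, overload

import numpy as np

_ScalarT = TypeVar("_ScalarT", bound=np.generic)
_OtherScalarT = TypeVar("_OtherScalarT", bound=np.generic)
# Placeholder for the stubs' "anything np.dtype() accepts" alias.
_DtypeLike = Union[np.dtype, type, str]


class ndarray(Generic[_ScalarT]):
    # A concrete scalar type pins down the element type of the result...
    @overload
    def astype(self, dtype: Type[_OtherScalarT]) -> "ndarray[_OtherScalarT]": ...
    # ...while an arbitrary dtype-like only lets us promise "some ndarray".
    @overload
    def astype(self, dtype: _DtypeLike) -> "ndarray[Any]": ...
    def astype(self, dtype): ...
```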
Eventually we could handle things like `dtype='float64'` too by adding an overload for `Literal['float64']`.
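A minimal sketch of what that suggestion could look like (hypothetical, not code from this PR):

```python
from typing import Any, Generic, Literal, TypeVar, overload

import numpy as np

_ScalarT = TypeVar("_ScalarT", bound=np.generic)


class ndarray(Generic[_ScalarT]):
    # Hypothetical extra overload: the string spelling "float64" maps to the
    # same result type as passing np.float64 itself.
    @overload
    def astype(self, dtype: Literal["float64"]) -> "ndarray[np.float64]": ...
    @overload
    def astype(self, dtype: Any) -> "ndarray[Any]": ...
    def astype(self, dtype): ...
```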
There's a long discussion on shape support in gh-5. Not sure how that relates to …
Oops sorry, I was a little too in my head when I wrote that description. Let me try to elaborate.
Add overloads for when a dtype is passed (versus just dtype-like) and handle the default float64 dtype for ones/zeros.
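As an illustration of that, a hedged sketch of how `zeros` overloads with a `float64` default might be declared (placeholder aliases; `ones` would follow the same pattern):

```python
from typing import Any, Sequence, Type, TypeVar, Union, overload

import numpy as np

_ScalarT = TypeVar("_ScalarT", bound=np.generic)
_ShapeLike = Union[int, Sequence[int]]  # placeholder shape-like alias

# The subscripted ndarray forms below are only meaningful against stubs that
# make ndarray generic; they are quoted so nothing is evaluated at runtime.

# No dtype argument: the result defaults to float64...
@overload
def zeros(shape: _ShapeLike) -> "np.ndarray[np.float64]": ...
# ...a concrete scalar type propagates into the result...
@overload
def zeros(shape: _ShapeLike, dtype: Type[_ScalarT]) -> "np.ndarray[_ScalarT]": ...
# ...and an arbitrary dtype-like falls back to Any.
@overload
def zeros(shape: _ShapeLike, dtype: Any) -> "np.ndarray[Any]": ...
def zeros(shape, dtype=float):
    return np.zeros(shape, dtype=dtype)
```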
Ok, got some overloads for … Taking this out of draft though; modulo the philosophical issues I think it's in an ok state.
I had missed this PR, Ralf just mentioned it in the community meeting. I would like to make sure I see how this relates to NEP 41/42; in other words, what is "dtype" here? What does this currently enforce for the …?

That is, from a typing (theoretical) point of view I do not think it is typically necessary to distinguish … Within NEP 41/42, I see these as instances of a class (and also type), making …

In any case, I am not sure how much this actually changes this at all. Typing and classes are somewhat different issues. It is maybe more of a heads-up: I think the scalar types (hierarchy) are good to use for this dtype, but it may not be quite enough on its own; also we want user …

About the shape, just curious, is it really not possible to make …?
Warning: I have only vaguely been following the recent dtypes discussion, so apologies in advance for any misunderstandings.
Right, so it's not actually a dtype; it's actually enforcing that it be a subclass of `np.generic`.
Oops, I had that wrong; fixed it in ca3fe78.
That would be amazing from a typing perspective!
If I am understanding everything correctly (I'm sure that I am not), NEP 41/42 would make it possible to type things in a much better way. For better or worse, typing and subclasses are tightly coupled in Python (in that subclass => subtype), so not having classes for dtypes makes it basically impossible (?) to type them. There are actually situations like that all over the place in NumPy, e.g. ufuncs are all instances of `np.ufunc`.
Oh, I had hoped that you can somehow type subtypes of classes without subclasses (say things like integers of a certain range; I guess parametric types, if that was the name for it). UFuncs are an issue with typing, I guess. Internally I was thinking to use something like …

As to using the scalar hierarchy, I think that is actually fine. It may have some holes in the long run, but I do think we should have a clean …

So yes, NEP 41/42 should make things better I guess, since you have classes for each type (ignoring things such as string lengths, byteorder), and for that you actually mainly need the first tiny step maybe... Although an alternative where all dtypes are classes (or close) may also be possible (I do not have the vision for that, but we need to finish a thought in that direction in the NEP 41 discussion). Note that we have not yet accepted the NEP; I expect this to happen, but it is not 100%.

EDIT: One example, I do not think it would be possible for example to type the Unit of a (physical) Unit DType, since that should typically be stored on the dtype instance.
I suspect that typing ufuncs will eventually require something like a mypy plugin. It allows you to hook into the type checking process and make modifications, so we could hook into the ufunc call method and augment it with the available loops. (All very theoretical though.)
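Very roughly, such a plugin could be wired up like the sketch below; the hook names are the actual mypy plugin API, but the ufunc-specific logic is left as a comment since it is purely hypothetical here.

```python
from typing import Callable, Optional

from mypy.plugin import MethodContext, Plugin
from mypy.types import Type


def _ufunc_call_hook(ctx: MethodContext) -> Type:
    # Sketch only: a real hook would inspect ctx.arg_types and pick a more
    # precise return type from the ufunc's available loops; here we just keep
    # whatever mypy inferred by default.
    return ctx.default_return_type


class UfuncPlugin(Plugin):
    def get_method_hook(
        self, fullname: str
    ) -> Optional[Callable[[MethodContext], Type]]:
        # Intercept calls that go through np.ufunc.__call__.
        if fullname == "numpy.ufunc.__call__":
            return _ufunc_call_hook
        return None


def plugin(version: str):
    # Standard mypy plugin entry point.
    return UfuncPlugin
```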
Yeah, I think that the approach we have to take with typing NumPy is "it's impossible, but we can probably still get something useful". As typing and NumPy evolve I imagine that the situation will improve (and hopefully the current typing experiments will help inform that process).
After NEP 41 goes in (assuming it does without major changes), we could put in the …
@seberg do you happen to have an (in progress I imagine) branch implementing NEP 41? It might be interesting to see what can be done typing-wise against that branch.
@person142 see numpy/numpy#15508, but I have not added anything to make …

The main point I have against moving a bit more into the scalar is that I am not sure it is practical to expect scalars to have information used by arrays attached to them. I.e. if you want to make a dtype for an arbitrary python object you might have to modify the python scalar. In the current NEP 41 design there is a mapping …
Ok, here's an experimental branch using NEP 41: https://github.com/person142/numpy-stubs/tree/nep-41 (The tests pass when built against @seberg's NumPy branch.) Some takeaways from that: …
Oh, there is no dynamic nature intended. It just seemed like a reasonable way to not overload the namespace with new names and have a uniform mapping …
About the metaclass, I do not think it should matter at all? You can just say …
Strictly speaking yes. But it would be nice if we could infer the type of `x` in:

```python
x = np.array([1], dtype=np.dtype(np.float64))
x = np.array([1], dtype=np.dtype[np.float64])
```

And it seems that to capture things like that we'll need to bake in some kind of understanding of the dtype metaclass.
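One way to express part of that in stub form would be to make `dtype` itself generic over the scalar type; this is a hypothetical sketch rather than what either branch actually does:

```python
from typing import Generic, Type, TypeVar

import numpy as np

_ScalarT = TypeVar("_ScalarT", bound=np.generic)


class dtype(Generic[_ScalarT]):
    # np.dtype(np.float64) would then come out as dtype[np.float64]; the
    # np.dtype[np.float64] spelling is the metaclass/__class_getitem__ side
    # that a checker (or a plugin) would still need to understand.
    def __init__(self, obj: Type[_ScalarT]) -> None: ...
```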
Something which might be relevant: mypy plugins support …

(From https://mypy.readthedocs.io/en/latest/extending_mypy.html#current-list-of-plugin-hooks.)
That looks useful; I would be surprised if mypy doesn't have something for the possible …
Yeah, I'm feeling pretty optimistic about what they could let us do. I opened #56 to try out a simple plugin for getting more precise ufunc signatures; if we move forward with that I'll see what we could do in the dtype case.
@person142 just a quick note. I expect/hope that the dtypemeta branch will go into master shortly after branching of 1.19. If it would help you quite a bit to push it into 1.19, that is not totally unrealistic. There is just not a big reason for me to push for that.
No rush here; I'm fine working from the branch until it's merged. Hopefully I'll have some time this weekend to experiment with writing a plugin using …
Closing this; will reopen a reworked version in NumPy focusing on the new dtype classes soon.
Closes https://github.com/numpy/numpy-stubs/issues/7.
Keeping this as a draft as some discussion will be needed. This makes `ndarray` parameterized over a type variable with a bound of `np.generic`; more concretely, that's things like `np.ndarray[np.int64]`. Note that since `ndarray` is not generic at runtime, you have to use the usual tricks to add annotations in practice.

A concrete issue is that it doesn't yet work correctly with subclasses; note that I commented out one of the subclass tests below. Edit: nevermind, was hitting a stale mypy cache.
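For reference, the "usual tricks" mostly amount to keeping the subscripted form away from runtime evaluation, e.g. with quoted annotations (hypothetical usage against these stubs):

```python
import numpy as np


def total(x: "np.ndarray[np.int64]") -> np.int64:
    # The quoted annotation is only seen by the type checker, so it never
    # trips over ndarray not being subscriptable at runtime.
    return np.int64(x.sum())
```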
On a more philosophical level, there's a question of "is this going to make supporting shapes in the future harder", because presumably that is going to require being generic over shape. One way to handle that would be to make `ndarray` generic over shape right now, but just do nothing with it. Then all annotations should be `ndarray[dtype, Any]` in preparation for whatever happens with shapes. Given that it seems pretty unclear what shape support will look like in the future, that might end up being off-base though.

Thoughts?