This repository was archived by the owner on Jun 10, 2020. It is now read-only.

ENH: get more specific about _ArrayLike, make it public #66

Merged
merged 1 commit into numpy:master on May 17, 2020

Conversation

person142
Member

Closes #37.

Add tests to check various examples.

@person142
Member Author

As something of a data point, mypy passes on SciPy when checked against this branch. The types in SciPy are still very rough, so take that with a grain of salt, but perhaps it means something.

@rgommers
Member

What about other array-like objects? Are things with both __array__ and __len__ passing because of Sequence, or is that not tested? Or is that what the Protocol discussion was about?
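For context, here's a small runnable illustration (not from the PR; the class name is made up) of the kind of object being asked about. When both protocols are present, NumPy uses __array__ rather than treating the object as a sequence:

```python
import numpy as np

# Hypothetical object exposing both __array__ and __len__
class ArrayAndLen:
    def __len__(self):
        return 3

    def __array__(self, dtype=None):
        return np.array([10.0, 20.0, 30.0], dtype=dtype)

# __array__ wins: we get a float array, not an object array of length 3
arr = np.asarray(ArrayAndLen())
print(arr, arr.dtype)
```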

@@ -87,7 +87,7 @@ _DtypeLike = Union[

_NdArraySubClass = TypeVar("_NdArraySubClass", bound=ndarray)

_ArrayLike = TypeVar("_ArrayLike")
ArrayLike = Union[int, float, complex, generic, ndarray, Sequence]
Member

@BvB93 BvB93 Apr 21, 2020

Might it be an idea to, respectively, replace int, float and complex with the SupportsInt, SupportsFloat and SupportsComplex protocols?

What about str, bytes, bool, dt.datetime, dt.date, and dt.timedelta (considering they all have their corresponding generic)?

Furthermore, I'd personally be in favor of replacing ndarray with a custom _SupportsArray protocol along the lines of:

class _SupportsArray(Protocol):
    def __array__(self, dtype: _DtypeLike = ...) -> ndarray: ...

Member Author

Might it be an idea to, respectively, replace int, float and complex with the SupportsInt, SupportsFloat and SupportsComplex protocols?

I think the protocols are a little too general; e.g.

>>> class A:
...     def __int__(self):
...         return 1
...
>>> int(A())
1
>>> np.array(A())
array(<__main__.A object at 0x10e61a290>, dtype=object)

What about str, bytes, bool, dt.datetime, dt.date, and dt.timedelta (considering they all have their corresponding generic)?

str and bytes are both covered by Sequence. I added bool (thanks for catching that!). The dt.* types, I think, also end up giving unexpected results:

>>> np.array(datetime.timedelta(days=1))
array(datetime.timedelta(days=1), dtype=object)

Furthermore, I'd presonally be in favor of replacing ndarray with a custom _SupportsArray protocol along the lines of:

Yes, that is an excellent idea, switched to that.

Member

I think the protocols are a little too general; e.g.

Good catch, I actually wasn't aware that, e.g., a SupportsInt member wouldn't produce an integer array.

str and bytes are both covered by Sequence

Ah right, that's true.

@person142
Member Author

@rgommers @BvB93 thanks for the feedback! I added a _SupportsArray protocol in 4f654e5; it is quite a nice unification.

@person142
Member Author

It seems that the principle we're roughly going for here is "don't allow stuff that will produce object arrays", though we still leave an escape hatch a la

4f654e5#diff-98e9e1660b68614cffb8585ea52a0bdcR31.
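To make the "don't produce object arrays" principle concrete, here is a hedged sketch (not from the PR; the class names are made up). Objects NumPy doesn't understand fall back to dtype=object, which is exactly what ArrayLike is meant to exclude:

```python
import numpy as np

class Opaque:
    """No __array__, no sequence protocol, not a supported scalar."""

class HasArray:
    def __array__(self, dtype=None):
        return np.array([1, 2, 3], dtype=dtype)

print(np.array(Opaque()).dtype)    # object: should not count as ArrayLike
print(np.array(HasArray()).dtype)  # a real integer dtype: should count
```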

@BvB93
Member

BvB93 commented Apr 21, 2020

@rgommers @BvB93 thanks for the feedback! I added a _SupportsArray protocol in 4f654e5; it is quite a nice unification.

Btw, doesn't __array__() also have the dtype argument?

@person142
Member Author

person142 commented Apr 22, 2020

Btw, doesn't __array__() also have the dtype argument?

Hm, it appears to be a little wonky:

>>> np.float64(1).__array__()
array(1.)
>>> np.float64(1).__array__(np.complex128)
array(1.+0.j)
>>> np.float64(1).__array__(dtype=np.complex128)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __array__() takes no keyword arguments
>>> class A:
...     def __array__(self, dtype):
...         return np.array([1, 2, 3], dtype=dtype)
...
>>> np.array(A())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __array__() missing 1 required positional argument: 'dtype'
>>> np.array(A(), dtype=np.float64)
array([1., 2., 3.])
>>> class B:
...     def __array__(self, dtype=None):
...         return np.array([1, 2, 3], dtype=dtype)
...
>>> np.array(B())
array([1, 2, 3])
>>> np.array(B(), dtype=np.float64)
array([1., 2., 3.])
>>> class C:
...     def __array__(self):
...         return np.array([1, 2, 3])
...
>>> np.array(C())
array([1, 2, 3])
>>> np.array(C(), dtype=np.complex128)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __array__() takes 1 positional argument but 2 were given

@person142
Member Author

That's unfortunate... it looks like the __array__ methods on scalars are inherently different from user-defined ones: the scalar methods accept dtype only positionally, whereas a user-defined __array__ whose dtype is a required positional argument doesn't work with np.array at all.

@person142
Member Author

Ok, I made the protocol

class _SupportsArray(Protocol):
    def __array__(
        self, dtype: Optional[_DtypeLike] = ...
    ) -> Union[Sequence, ndarray]: ...

This unfortunately means that ndarray and generic no longer satisfy it! To work around that I added them back to the big union. This is kind of gross, but I think it's the best we can do given the above examples. WDYT?

@BvB93
Member

BvB93 commented Apr 22, 2020

This unfortunately means that ndarray and generic no longer satisfy it! To work around that I added them back to the big union. This is kind of gross, but I think it's the best we can do given the above examples. WDYT?

I'd suggest adding another overload to the _SupportsArray protocol instead, such that it works in situations where dtype is positional-only or positional-or-keyword:

# first overload: dtype is optional and positional-only
# Second overload: dtype is optional and can be a positional or keyword argument
class _SupportsArray(Protocol):
    @overload
    def __array__(self, __dtype: _DtypeLike = ...) -> ndarray: ...
    @overload
    def __array__(self, dtype: _DtypeLike = ...) -> ndarray: ...


class A():
    def __array__(self, dtype: _DtypeLike = None) -> ndarray:
        return np.array([1, 2, 3], dtype=dtype)

class B():
    def __array__(self, __dtype: _DtypeLike = None) -> ndarray:
        return np.array([1, 2, 3], dtype=__dtype)

a = A()
a.__array__()
a.__array__(float)
a.__array__(dtype=float)

b = B()
b.__array__()
b.__array__(float)
b.__array__(dtype=float)  # E: Unexpected keyword argument
b.__array__(__dtype=float)  # E: Unexpected keyword argument

By the way, why is the returned type annotated as Union[Sequence, ndarray] in your example? Doesn't it always just return ndarray?

@person142
Member Author

person142 commented Apr 22, 2020

I'd suggest adding another overload to the _SupportsArray protocol instead, such that it works with in situations where dtype is positional-only or keyword-or-positional

The signatures are incompatible, so that’s going to violate Liskov I think. (But I’ll give it a try.)

By the way, why is the returned type annotated as Union[Sequence, ndarray] in your example?

I was going by the docs here:

https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

which say ndarray or nested sequence.

@BvB93
Member

BvB93 commented Apr 22, 2020

The signatures are incompatible, so that’s going to violate Liskov I think. (But I’ll give it a try.)

It's not quite clear to me where this incompatibility is located.
__array__() (ref) takes self and (optionally) dtype as arguments. The newly overloaded _SupportsArray can now deal with cases where dtype is either positional-only or positional-or-keyword.

I was going by the docs here:

I believe what they're talking about in the docs is an object which either:
a. Has an __array__() method that returns an ndarray.
b. Is a (nested) sequence.

So Union[_SupportsArray, Sequence[Any]] in other words, rather than __array__() itself (potentially) returning a Sequence.

@person142
Member Author

person142 commented Apr 24, 2020

It's not quite clear to me where this incompatibility is located.

I believe what they're talking about in the docs is an object which either:

Yup, right on both counts! Updated; hopefully we're good to go now.

Note that I had to make _DtypeLike public too so that someone could use it in the annotations for their __array__ method, but that's ok because it was desired anyway (#13).

@shoyer
Member

shoyer commented Apr 24, 2020

This looks really nice!

My one serious concern here is about adding public type aliases. These do seem quite useful, but what would this imply if/when we move type annotations into NumPy properly? If np.ArrayLike needs to be valid, does that imply that if we move annotations into NumPy we need to expose ArrayLike in the public API of NumPy?

@person142
Member Author

person142 commented Apr 24, 2020

If np.ArrayLike needs to be valid, does that imply that if we move annotations into NumPy we need to expose ArrayLike in the public API of NumPy?

Thankfully it doesn't; this is the same sort of fudging that happens with things like Queue[T] in the stdlib (where the real class isn't actually generic). But it does mean that people using annotations need to take one of these options:

Option 1: quote the annotation as a string:

import numpy as np

x: "np.ArrayLike"  # Use strings

Option 2: use postponed evaluation of annotations:

from __future__ import annotations

import numpy as np

x: np.ArrayLike  # Now it's treated as a string anyway

Option 3: guard the import with TYPE_CHECKING:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from numpy import ArrayLike
else:
    ArrayLike = None  # Or whatever

x: ArrayLike

@person142
Member Author

person142 commented Apr 24, 2020

Although maybe I am interpreting your question in the wrong way. I think a backdrop to my answer above is an assumption that even when we move the types into NumPy, they will remain stubs instead of being inlined into the code.

I suspect that this is the right course of action because we've seen that the types require a fair bit of "fudging"; i.e. we aren't trying to represent the full NumPy API but instead some typeable subset of it. I think that if the types are inlined then we lose the ability to do that fudging as well. (And lose the ability to do a gazillion overloads; doubt that is going to fly in the NumPy codebase proper.)

I'd contrast this to e.g. SciPy where we are inlining the types, mostly because there isn't much odd there (except for needing a bunch of stubs for extension modules).

@emmatyping
Contributor

FWIW, I've heard that at the language summit they decided that __future__.annotations will become the default in 3.10.

@shoyer
Member

shoyer commented Apr 24, 2020

I appreciate that we could need np.ArrayLike in strings or type annotations, but I suspect that it could still lead to some user confusion to not define them at runtime, too, e.g., to cover use cases like type aliases. The user experience is independent of whether we choose to use stubs or annotations inside NumPy — though I expect we’ll probably end up with some of both.

NumPy-stubs is certainly still experimental, so you definitely have my blessing to go ahead for now. But I do think it could be worth sounding out the broader NumPy community on the appetite for adding a selective handful of type protocols into NumPy proper. We might get some useful feedback. For example, should the protocol be called ArrayLike or array_like as currently appears in many NumPy docstrings?

@rgommers
Member

If it helps to make ArrayLike et al. public, I don't see a big issue with that; would probably prefer to write it as

from numpy.types import ArrayLike

x: ArrayLike

to keep it out of the main namespace. In my (limited) experience, it's helpful for things to exist at runtime.

@person142
Member Author

from numpy.types import ArrayLike

Keeping it out of the main namespace seems good, though maybe it should be numpy.typing instead of numpy.types to match the stdlib module?

@person142
Member Author

Re

But I do think it could be worth sounding out the broader NumPy community on the appetite for adding a selective handful of type protocols into NumPy proper.

I'll send out something to the mailing list.

@rgommers
Member

from numpy.types import ArrayLike

Keeping it out of the main namespace seems good, though maybe it should be numpy.typing instead of numpy.types to match the stdlib module?

No, I avoided that name on purpose; naming something the same as a stdlib module is usually a bad idea (e.g. scipy.io was a pretty bad choice, which is why it's usually imported as spio).

@BvB93
Member

BvB93 commented Apr 24, 2020

What about something along the lines of numpy.annotations?

@eric-wieser
Member

eric-wieser commented Apr 27, 2020

Keeping it out of the main namespace seems good, though maybe it should be numpy.typing instead of numpy.types to match the stdlib module?

No, I avoided that name on purpose, naming something the same as a stdlib module is usually a bad idea

As a counterargument, numpy.distutils clashes with distutils, but we used it anyway, presumably because the similarity was worth emphasizing.

Almost all the usage of type annotations I've seen in the wild has erred on the side of keeping the annotations as short as possible, as:

from typing import Tuple
from numpy.typing import ArrayLike

def get_arr() -> Tuple[ArrayLike, int]: ...

The similarity aids reading here, and the clash is irrelevant.

Alternatively, some users might want the full names anyway. The clash is again irrelevant:

def get_arr() -> typing.Tuple[np.typing.ArrayLike, int]: ...

Finally, if the user cares enough to import just the submodule, they probably want to do something similar with typing anyway (note: I've not actually seen anyone do this).

import typing as t
import numpy.typing as npt
def get_arr() -> t.Tuple[npt.ArrayLike, int]: ...

@person142
Member Author

Ok, discussion on the mailing list:

http://numpy-discussion.10968.n7.nabble.com/Feelings-about-type-aliases-in-NumPy-td48059.html

seems to be dying down. Takeaways so far:

  • No (open) objections to adding types to NumPy itself in the future
  • Most people preferred not doing it in the top-level namespace, though @shoyer was a notable objector
  • Little agreement on what to name the module

So, as is often the case, seems like there's rough consensus except on what to name the darned thing.

@overload
def __array__(self, __dtype: DtypeLike = ...) -> ndarray: ...
@overload
def __array__(self, dtype: Optional[DtypeLike] = ...) -> ndarray: ...
Member

@BvB93 BvB93 May 1, 2020

Is the Optional in Optional[DtypeLike] not redundant here?
Considering None is already included in the union defining DtypeLike.
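The redundancy can be checked directly. With a simplified stand-in for DtypeLike (the real union in the stubs is much longer, but it likewise contains None), the extra Optional flattens away:

```python
from typing import Optional, Union

import numpy as np

# Simplified stand-in for the stubs' DtypeLike; the key point is that
# it already includes None.
DtypeLike = Union[None, type, str, np.dtype]

# Optional[X] is just Union[X, None], and unions flatten and deduplicate,
# so wrapping Optional around a union that contains None is a no-op.
print(Optional[DtypeLike] == DtypeLike)
```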

@@ -217,6 +221,7 @@ class _ArrayOrScalarCommon(
def shape(self) -> _Shape: ...
@property
def strides(self) -> _Shape: ...
def __array__(self, __dtype: Optional[DtypeLike] = ...) -> ndarray: ...
Member

See comment above.

) -> _NdArraySubClass: ...
@overload
def view(self, *, type: Type[_NdArraySubClass]) -> _NdArraySubClass: ...
def getfield(self, dtype: Union[_DtypeLike, str], offset: int = ...) -> ndarray: ...
def getfield(self, dtype: Union[DtypeLike, str], offset: int = ...) -> ndarray: ...
Member

Can this be simplified?

Suggested change
def getfield(self, dtype: Union[DtypeLike, str], offset: int = ...) -> ndarray: ...
def getfield(self, dtype: DtypeLike, offset: int = ...) -> ndarray: ...

@TomAugspurger

cc @simonjayhawkins if you see any issues with how this ArrayLike will interact with pandas' typing.

@person142
Member Author

I don't want us to lose momentum here, so: I find @eric-wieser's comment

Almost all the usage of type annotations I've seen in the wild has erred on the side of keeping the annotations as short as possible

to match my own experience as well; typing in Python is fairly verbose, so short forms like

import typing as t

or

from typing import ...

seem to be the norm. For that reason I propose that we move ahead with putting things in numpy.typing. For now it will be in the stubs only, and when we merge the stubs into NumPy itself we can make it available at runtime.

Are people ok with that, or shall we continue to discuss?

@shoyer
Member

shoyer commented May 9, 2020 via email

@rgommers
Member

rgommers commented May 9, 2020

Sounds fine to me, thanks for keeping this moving.

Closes numpy#37.

Add tests to check various examples. Note that supporting __array__
also requires making _DtypeLike public too, so this does that as well.
@person142
Member Author

Ok, mailing list has been notified, PR has been rebased, review comments have been addressed (I think), the types have been moved into numpy.typing, and the tests have been updated accordingly.

@person142
Member Author

Any objections to moving forward? Since this conflicts with just about every other PR it would be nice to get it in to avoid more rebasing.

@BvB93
Member

BvB93 commented May 17, 2020

No complaints here from my side; feel free to continue.

@shoyer
Member

shoyer commented May 17, 2020

Looks good to me!

@person142 person142 merged commit fa6b9fd into numpy:master May 17, 2020
@person142 person142 deleted the array-like branch May 17, 2020 22:50
@person142
Member Author

In it goes then. Thanks for reviewing everyone!

Successfully merging this pull request may close these issues.

What to do with array_like variables