This repository was archived by the owner on Jun 10, 2020. It is now read-only.

Define special methods for ndarray and add more extensive tests. #10

Merged
merged 17 commits into from
Mar 13, 2018
Changes from 6 commits
151 changes: 144 additions & 7 deletions numpy/__init__.pyi
@@ -1,15 +1,41 @@
# very simple, just enough to start running tests
#
import builtins
from typing import Any, Mapping, List, Optional, Tuple, Union

from typing import (
    Any, Dict, Iterable, List, Optional, Mapping, Sized,
    SupportsInt, SupportsFloat, SupportsComplex, SupportsBytes, SupportsAbs,
    Tuple, Union,
)

import sys

from numpy.core._internal import _ctypes

_Shape = Tuple[int, ...]

# Anything that can be coerced into numpy.dtype. To avoid recursive
# definitions, any nested fields are required to be castable to a dtype object
Member

Perhaps do _DtypeLike_nested = Any # TODO: wait for support for recursive types, and then use it in place of Any below?

# are typed as Any.
# Reference: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
_ConvertibleToDtype = Union[
Member Author

@shoyer, Mar 6, 2018

Apparently (with mypy) I can actually include a forward reference to dtype in this top-level type alias. If that's a standard feature for pyi files, then I suppose we should make use of it rather than defining the separate _DtypeLike alias below.

Member Author

@shoyer, Mar 6, 2018

Never mind, I thought this didn't work, but it does work.

Member Author

Just pushed a commit implementing this.

type, # TODO: enumerate np.generic types and Python scalars
Member

Does Type[np.generic] work here?

Member Author

It would, but we haven't defined the scalar type hierarchy yet.
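
For illustration, a sketch of what `Type[...]` would express once a scalar hierarchy exists. The classes `generic`, `floating`, and `float64` below are hypothetical stand-ins defined locally, not the NumPy types, and `scalar_name` is an invented helper.

```python
from typing import Type


class generic:
    """Hypothetical stand-in for the root of the scalar type hierarchy."""


class floating(generic):
    pass


class float64(floating):
    pass


def scalar_name(scalar_type: Type[generic]) -> str:
    # Type[generic] accepts the class object itself (or any subclass),
    # not an instance of it.
    return scalar_type.__name__


name = scalar_name(float64)
# scalar_name(int) would be flagged by a type checker: int is not a
# subclass of generic. The bare `type` used in the diff accepts it.
```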

# TODO: add a protocol for anything with a dtype attribute
str,
Tuple[Any, int],
Tuple[Any, _Shape],
Member

Could combine these with a ShapeLike - which is consistent with how np.zeros(shape_like) and a bunch of other functions work.

Member

Note that (int, 2) is shorthand for (int, (2,)), so the int case covers more than you might expect.

Member Author

OK, in that case I'm going to use _ShapeLike = Union[int, Tuple[int, ...]] for clarity.
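
A small sketch of the `_ShapeLike` alias proposed here, together with a hypothetical helper (`normalize_shape` is not part of the PR) that makes the int shorthand explicit:

```python
from typing import Tuple, Union

_Shape = Tuple[int, ...]
_ShapeLike = Union[int, Tuple[int, ...]]


def normalize_shape(shape: _ShapeLike) -> _Shape:
    """Hypothetical helper: expand the int shorthand into a 1-tuple."""
    if isinstance(shape, int):
        return (shape,)
    return tuple(shape)


short = normalize_shape(2)       # the (int, 2) shorthand
full = normalize_shape((2, 2))   # an explicit shape tuple
```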

List[Union[Tuple[Union[str, Tuple[str, str]], Any],
Tuple[Union[str, Tuple[str, str]], Any, _Shape]]],
Dict[str, Any],
Contributor

Is this really Any? I think there could be:

Union[
    Dict[str, Sequence[Union[int, str, _DtypeLike]]],  # for {"names": ..., "formats": ...}
    Dict[str, Union[Tuple[str, int], Tuple[_DtypeLike, int]]],  # for {"field": ....}
]

Member

I think that should be


    # for {"names": ..., "formats": ...}
    Dict[str, Union[
        Sequence[str],  #names
        Sequence[_DtypeLike], # formats
        Sequence[int], # offsets
        int,  # itemsize
    ]],
    Dict[str, Tuple[_DtypeLike, int]] # for {"field": ....}


This is probably dangerous because dict is an invariant type (so, for example, Dict[str, int] wouldn't be compatible with this big union type). Can you use Mapping instead?
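
To make the invariance concern concrete, here is a minimal sketch with two hypothetical functions (`takes_dict` and `takes_mapping` are invented for illustration). Both behave identically at runtime; the difference is only in what a type checker such as mypy accepts.

```python
from typing import Dict, Mapping, Union


def takes_dict(d: Dict[str, Union[int, str]]) -> int:
    return len(d)


def takes_mapping(m: Mapping[str, Union[int, str]]) -> int:
    return len(m)


ints: Dict[str, int] = {"a": 1, "b": 2}

# takes_dict(ints) is rejected by a type checker: Dict is invariant in its
# value type, so Dict[str, int] is not a Dict[str, Union[int, str]].
n = takes_mapping(ints)  # accepted: Mapping is covariant in its value type
```

A dict literal passed directly, as in `takes_dict({"a": 1})`, is typically inferred against the expected type and accepted, so the problem mostly shows up with variables that already carry a narrower annotation.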

Member Author

We could use Mapping, but it would be inaccurate. NumPy does actually insist on dict:

In [13]: od = collections.OrderedDict(x=(float, 1), y=(int, 2))

In [14]: np.dtype(od)
Out[14]: dtype({'names':['x','y'], 'formats':['<f8','<i8'], 'offsets':[1,2], 'itemsize':10})

In [15]: np.dtype(ChainMap([od]))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-03baf5fab980> in <module>()
----> 1 np.dtype(ChainMap([od]))

TypeError: data type not understood
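
The session above is consistent with NumPy accepting dict instances (including subclasses such as `OrderedDict`) but not arbitrary mappings. A standard-library sketch of that distinction, assuming the check behaves like `isinstance(obj, dict)`; NumPy's actual implementation is not shown in this thread:

```python
import collections
from collections.abc import Mapping

od = collections.OrderedDict(x=(float, 1), y=(int, 2))
chained = collections.ChainMap(od)

# OrderedDict subclasses dict, so a dict-instance check lets it through.
od_is_dict = isinstance(od, dict)

# ChainMap implements the mapping interface without subclassing dict.
chained_is_dict = isinstance(chained, dict)
chained_is_mapping = isinstance(chained, Mapping)
```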

Contributor

We have the same problem with:

   List[Union[
       Tuple[Union[str, Tuple[str, str]], _DtypeLikeNested],
       Tuple[Union[str, Tuple[str, str]], _DtypeLikeNested, _ShapeLike]]],

which maybe should be Sequence but NumPy insists on taking a list (or at least, it doesn't take a tuple which is my go-to example of a non-list sequence).

Member Author

I added the more specific types for Dict, which should at least minimize the invariance issue.

Is it possible to make a dict that is covariant in its argument type? e.g.,

K = TypeVar('K', covariant=True)
V = TypeVar('V', covariant=True)
def f(x: Dict[K, V]): ...
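
As a possible answer to the question above: PEP 484 describes variance as a property of generic classes rather than of generic functions, so a `covariant=True` type variable is unlikely to help in a function signature. One alternative, sketched here as an assumption rather than what the PR adopted, is an ordinary type variable with an upper bound, which a checker solves separately for each call. `count_fields` is a hypothetical function.

```python
from typing import Dict, TypeVar, Union

# V may be bound to int, str, or the union itself on each call.
V = TypeVar("V", bound=Union[int, str])


def count_fields(spec: Dict[str, V]) -> int:
    return len(spec)


ints: Dict[str, int] = {"x": 1, "y": 2}
strs: Dict[str, str] = {"x": "f8"}

n_ints = count_fields(ints)  # V is solved as int
n_strs = count_fields(strs)  # V is solved as str
```

The trade-off is that this only works for function parameters; a plain type alias such as `_ConvertibleToDtype` cannot carry a per-call type variable in the same way.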

Tuple[Any, Any]]
Member

I think each of these could do with a comment giving an example of the case they aim to cover



class dtype:
    names: Optional[Tuple[str, ...]]

    def __init__(self,
                 obj: Union[dtype, _ConvertibleToDtype],
                 align: bool = ...,
                 copy: bool = ...) -> None: ...

    @property
    def alignment(self) -> int: ...

@@ -83,7 +109,8 @@ class dtype:
def type(self) -> builtins.type: ...


_dtype_class = dtype # for ndarray type
_DtypeLike = Union[dtype, _ConvertibleToDtype]
_Dtype = dtype # to avoid name conflicts with ndarray.dtype
Member

It would be great if we could have a "strict-typing" mode that narrowed the scope of what dtype-like or array-like means. I wonder how that could be achieved...

Member Author

Yes, I'd love to see this.



class _flagsobj:
@@ -144,8 +171,10 @@ class flatiter:
def __next__(self) -> Any: ...


class ndarray:
dtype: _dtype_class
class ndarray(Iterable, Sized, SupportsInt, SupportsFloat, SupportsComplex,
SupportsBytes, SupportsAbs[Any]):

dtype: _Dtype
imag: ndarray
real: ndarray
shape: _Shape
@@ -181,12 +210,120 @@ class ndarray:
@property
def ndim(self) -> int: ...

# Many of these special methods are irrelevant currently, since protocols
# aren't supported yet. That said, I'm adding them for completeness.
# https://docs.python.org/3/reference/datamodel.html
def __len__(self) -> int: ...
def __getitem__(self, key) -> Any: ...
def __setitem__(self, key, value): ...
def __iter__(self) -> Any: ...
def __contains__(self, key) -> bool: ...

def __int__(self) -> int: ...
def __float__(self) -> float: ...
def __complex__(self) -> complex: ...
if sys.version_info.major < 3:
def __oct__(self) -> str: ...
def __hex__(self) -> str: ...
def __nonzero__(self) -> bool: ...
Member

This is called __bool__ on py3
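
For reference, Python 3 looks up `__bool__` for truth testing where Python 2 used `__nonzero__`. A minimal sketch with a hypothetical `Flag` class (not part of the stubs) that supports both names:

```python
class Flag:
    """Hypothetical class used only to show the truth-testing hook."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __bool__(self) -> bool:
        # Called by bool(), `if`, `and`, `or`, and `not` on Python 3.
        return self.value != 0

    # Python 2 looked for this name instead; aliasing keeps both working.
    __nonzero__ = __bool__


truthy = bool(Flag(1))
falsy = bool(Flag(0))
```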

def __bytes__(self) -> bytes: ...
Member

I imagine this is python 3 only?

Also __unicode__, __str__, and __repr__ are missing

Member Author

done


def __index__(self) -> int: ...

def __copy__(self, order: str = ...) -> ndarray: ...
def __deepcopy__(self, memo: dict) -> ndarray: ...

# https://github.com/numpy/numpy/blob/v1.13.0/numpy/lib/mixins.py#L63-L181

# TODO(shoyer): add overloads (returning ndarray) for cases where other is
# known not to define __array_priority__ or __array_ufunc__, such as for
# numbers or other numpy arrays. Or even better, use protocols (once they
# work).

def __lt__(self, other): ...
def __le__(self, other): ...
def __eq__(self, other): ...
def __ne__(self, other): ...
def __gt__(self, other): ...
def __ge__(self, other): ...

def __add__(self, other): ...
def __radd__(self, other): ...
def __iadd__(self, other): ...

def __sub__(self, other): ...
def __rsub__(self, other): ...
def __isub__(self, other): ...

def __mul__(self, other): ...
def __rmul__(self, other): ...
def __imul__(self, other): ...

if sys.version_info.major < 3:
def __div__(self, other): ...
def __rdiv__(self, other): ...
def __idiv__(self, other): ...

def __truediv__(self, other): ...
def __rtruediv__(self, other): ...
def __itruediv__(self, other): ...

def __floordiv__(self, other): ...
def __rfloordiv__(self, other): ...
def __ifloordiv__(self, other): ...

def __mod__(self, other): ...
def __rmod__(self, other): ...
def __imod__(self, other): ...

def __divmod__(self, other): ...
def __rdivmod__(self, other): ...

# NumPy's __pow__ doesn't handle a third argument
def __pow__(self, other): ...
def __rpow__(self, other): ...
def __ipow__(self, other): ...

def __lshift__(self, other): ...
def __rlshift__(self, other): ...
def __ilshift__(self, other): ...

def __rshift__(self, other): ...
def __rrshift__(self, other): ...
def __irshift__(self, other): ...

def __and__(self, other): ...
def __rand__(self, other): ...
def __iand__(self, other): ...

def __xor__(self, other): ...
def __rxor__(self, other): ...
def __ixor__(self, other): ...

def __or__(self, other): ...
def __ror__(self, other): ...
def __ior__(self, other): ...

if sys.version_info >= (3, 5):
def __matmul__(self, other): ...
def __rmatmul__(self, other): ...

def __neg__(self) -> ndarray: ...
def __pos__(self) -> ndarray: ...
def __abs__(self) -> ndarray: ...
def __invert__(self) -> ndarray: ...

# TODO(shoyer): remove when all methods are defined
def __getattr__(self, name) -> Any: ...
Member

Missing __setattr__ for .dtype and .shape?

Member Author

dtype and shape are only typed as attributes, not properties, which means they can be set. But perhaps it would indeed be good to overload setters appropriately...
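
A sketch of the alternative mentioned here: typing `shape` as a property whose setter accepts a wider type than the getter returns. The `Array` class is a hypothetical stand-in, not `numpy.ndarray`, and whether a given type checker accepts a setter type that differs from the getter type depends on the checker and its version.

```python
from typing import Tuple, Union

_Shape = Tuple[int, ...]
_ShapeLike = Union[int, Tuple[int, ...]]


class Array:
    """Hypothetical stand-in for an array type with a settable shape."""

    def __init__(self, shape: _Shape) -> None:
        self._shape = shape

    @property
    def shape(self) -> _Shape:
        # Reading always yields a tuple.
        return self._shape

    @shape.setter
    def shape(self, value: _ShapeLike) -> None:
        # Writing also accepts the int shorthand.
        self._shape = (value,) if isinstance(value, int) else tuple(value)


a = Array((2, 3))
a.shape = 6
```

With a plain attribute annotation such as `shape: _Shape`, assignment is allowed but only with the getter's type, so `a.shape = 6` would be flagged.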



def array(
object: object,
dtype: dtype = ...,
dtype: _DtypeLike = ...,
copy: bool = ...,
subok: bool = ...,
ndmin: int = ...) -> ndarray: ...


# TODO(shoyer): remove when the full numpy namespace is defined
def __getattr__(name: str) -> Any: ...
133 changes: 131 additions & 2 deletions tests/test_simple.py
@@ -1,5 +1,134 @@
"""Simple expression that should pass with mypy."""
import operator

import numpy as np
from typing import Iterable

# Basic checks
array = np.array([1, 2])
def ndarray_func(x: np.ndarray) -> np.ndarray:
    return x
ndarray_func(np.array([1, 2]))
array == 1
array.dtype == float

# Dtype construction
np.dtype(float)
np.dtype(np.float64)
np.dtype('float64')
np.dtype(np.dtype(float))
np.dtype(('U', 10))
np.dtype((np.int32, (2, 2)))
np.dtype([('R', 'u1'), ('G', 'u1'), ('B', 'u1')])
np.dtype([('R', 'u1', (2, 2))])
np.dtype({'col1': ('U10', 0), 'col2': ('float32', 10)})
np.dtype((np.int32, {'real': (np.int16, 0), 'imag': (np.int16, 2)}))
np.dtype((np.int32, (np.int8, 4)))

# Iteration and indexing
def iterable_func(x: Iterable) -> Iterable:
    return x
iterable_func(array)
[element for element in array]
iter(array)
zip(array, array)
array[1]
array[:]
array[...]
array[:] = 0

array_2d = np.ones((3, 3))
array_2d[:2, :2]
array_2d[..., 0]
array_2d[:2, :2] = 0

# Other special methods
len(array)
str(array)
array_scalar = np.array(1)
int(array_scalar)
float(array_scalar)
# currently does not work due to https://github.com/python/typeshed/issues/1904
# complex(array_scalar)
bytes(array_scalar)
operator.index(array_scalar)
bool(array_scalar)

# comparisons
array < 1
array <= 1
array == 1
array != 1
array > 1
array >= 1
1 < array
1 <= array
1 == array
1 != array
1 > array
1 >= array

# binary arithmetic
array + 1
1 + array
array += 1

array - 1
1 - array
array -= 1

array * 1
1 * array
array *= 1

array / 1
1 / array
array /= 1

array // 1
1 // array
array //= 1

array % 1
1 % array
array %= 1

# Overloading divmod() is not yet supported in typeshed:
# https://github.com/python/typing/issues/541
# divmod(array, 1)
# divmod(1, array)

array ** 1
1 ** array
array **= 1

array << 1
1 << array
array <<= 1

array >> 1
1 >> array
array >>= 1

array & 1
1 & array
array &= 1

array ^ 1
1 ^ array
array ^= 1

array | 1
1 | array
array |= 1

array @ array

def foo(a: np.ndarray): pass
# unary arithmetic
-array
+array
abs(array)
~array

foo(np.array(1))
# Other methods
np.array([1, 2]).transpose()
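
The binary-arithmetic checks in this test file exercise three variants of each operator: forward (`array + 1`), reflected (`1 + array`), and in-place (`array += 1`). A minimal sketch of how Python dispatches them, using a hypothetical `Vec` class in place of `np.ndarray` so that it runs without NumPy:

```python
class Vec:
    """Hypothetical stand-in for an array type; not numpy.ndarray."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # Forward: Vec + scalar
        return Vec(x + other for x in self.data)

    def __radd__(self, other):
        # Reflected: scalar + Vec, tried after int.__add__ returns
        # NotImplemented
        return Vec(other + x for x in self.data)

    def __iadd__(self, other):
        # In-place: mutate and return self so aliases see the change
        self.data = [x + other for x in self.data]
        return self

    def __matmul__(self, other):
        # Vec @ Vec as a dot product
        return sum(a * b for a, b in zip(self.data, other.data))


v = Vec([1, 2])
forward = v + 1
reflected = 1 + v
alias = v
v += 1
dot = Vec([1, 2]) @ Vec([3, 4])
```

Each variant is a separate method, which is why the stub declares `__add__`, `__radd__`, and `__iadd__` individually and the tests cover all three spellings.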