Commit 5bc11a0

Merge pull request #7 from hameerabbasi/uarray-me
Summarise mailing list discussion
2 parents d2c5761 + a74b550 commit 5bc11a0

1 file changed

+122
-2
lines changed

doc/neps/nep-0031-uarray.rst

Lines changed: 122 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -114,7 +114,8 @@ Proposals
114114
~~~~~~~~~
115115

116116
The only change this NEP proposes at its acceptance is to make ``unumpy`` the
117-
officially recommended way to override NumPy. ``unumpy`` will remain a separate
117+
officially recommended way to override NumPy, along with making some submodules
118+
overridable by default via ``uarray``. ``unumpy`` will remain a separate
118119
repository/package (which we propose to vendor to avoid a hard dependency, and
119120
use the separate ``unumpy`` package only if it is installed, rather than depend
120121
on it for the time being). In concrete terms, ``numpy.overridable`` becomes an
@@ -130,6 +131,10 @@ GitHub workflow. There are a few reasons for this:
130131
rather than breakages happening when it is least expected.
131132
In simple terms, bugs in ``unumpy`` mean that ``numpy`` remains
132133
unaffected.
134+
* For ``numpy.fft``, ``numpy.linalg`` and ``numpy.random``, the functions in
135+
the main namespace will mirror those in the ``numpy.overridable`` namespace.
136+
The reason for this is that there may exist functions in these
137+
submodules that need backends, even for ``numpy.ndarray`` inputs.
133138

134139
Advantages of ``unumpy`` over other solutions
135140
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -156,7 +161,13 @@ allows one to override a large part of the NumPy API by defining only a small
156161
part of it. This is to ease the creation of new duck-arrays, by providing
157162
default implementations of many functions that can be easily expressed in
158163
terms of others, as well as a repository of utility functions that help in the
159-
implementation of duck-arrays that most duck-arrays would require.
164+
implementation of duck-arrays that most duck-arrays would require. This would
165+
allow us to avoid designing entire protocols, e.g., a protocol for stacking
166+
and concatenating would be replaced by simply implementing ``stack`` and/or
167+
``concatenate`` and then providing default implementations for everything else
168+
in that class. The same applies for transposing, and many other functions for
169+
which protocols haven't been proposed, such as ``isin`` in terms of ``in1d``,
170+
``setdiff1d`` in terms of ``unique``, and so on.
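
As a rough illustration of the idea, the sketch below derives ``isin`` and ``setdiff1d`` from two primitives that a backend would supply. It uses plain Python lists rather than ``unumpy``; the names ``isin_default`` and ``setdiff1d_default`` and the list-based ``in1d`` and ``unique`` are assumptions made for this example, not part of any existing API.

```python
# Primitives a (hypothetical) backend would implement itself.
def in1d(ar1, ar2):
    """Return a flat boolean mask: is each item of ar1 present in ar2?"""
    lookup = set(ar2)
    return [x in lookup for x in ar1]


def unique(ar):
    """Return the sorted unique items of a flat sequence."""
    return sorted(set(ar))


# Default implementations expressed only in terms of the primitives above.
def isin_default(rows, test_elements, in1d=in1d):
    """2-D ``isin``: flatten, delegate to ``in1d``, then restore the row shape."""
    flat = [x for row in rows for x in row]
    mask = iter(in1d(flat, test_elements))
    return [[next(mask) for _ in row] for row in rows]


def setdiff1d_default(ar1, ar2, unique=unique, in1d=in1d):
    """Unique items of ar1 that do not occur in ar2."""
    candidates = unique(ar1)
    hits = in1d(candidates, ar2)
    return [x for x, hit in zip(candidates, hits) if not hit]


isin_result = isin_default([[1, 2], [3, 4]], [2, 3])
setdiff_result = setdiff1d_default([3, 1, 2, 3], [2])
```

A backend that only supplies ``in1d`` and ``unique`` would get the derived functions without writing them, and could still override either one when a faster native version exists.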
160171

161172
It also allows one to override functions in a manner which
162173
``__array_function__`` simply cannot, such as overriding ``np.einsum`` with the
@@ -211,6 +222,101 @@ If the user wishes to obtain a NumPy array, there are two ways of doing it:
211222
2. Use ``numpy.overridable.asarray`` with the NumPy backend set and coercion
212223
enabled
213224

225+
Aliases outside of the ``numpy.overridable`` namespace
226+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
227+
228+
All functionality in ``numpy.random``, ``numpy.linalg`` and ``numpy.fft``
229+
will be aliased to their respective overridable versions inside
230+
``numpy.overridable``. The reason for this is that there are alternative
231+
implementations of RNGs (``mkl-random``), linear algebra routines (``eigen``,
232+
``blis``) and FFT routines (``mkl-fft``, ``pyFFTW``) that need to operate on
233+
``numpy.ndarray`` inputs, but still need the ability to switch behaviour.
234+
235+
This differs from monkeypatching in several ways:
236+
237+
* The caller-facing signature of the function is always the same,
238+
so there is at least the loose sense of an API contract. Monkeypatching
239+
does not provide this ability.
240+
* There is the ability of locally switching the backend.
241+
* It has been `suggested <http://numpy-discussion.10968.n7.nabble.com/NEP-31-Context-local-and-global-overrides-of-the-NumPy-API-tp47452p47472.html>`_
242+
that the reason that 1.17 hasn't landed in the Anaconda defaults channel is
243+
due to the incompatibility between monkeypatching and ``__array_function__``,
244+
as monkeypatching would bypass the protocol completely.
245+
* Statements of the form ``from numpy import x; x`` and ``np.x`` would have
246+
different results depending on whether the import was made before or
247+
after monkeypatching happened.
248+
249+
None of this is possible with ``__array_function__`` or
250+
``__array_ufunc__``.
251+
252+
It has been formally realised (at least in part) that a backend system is
253+
needed for this, in the `NumPy roadmap <https://numpy.org/neps/roadmap.html#other-functionality>`_.
254+
255+
For ``numpy.random``, it's still necessary to make the C-API fit the one
256+
proposed in `NEP-19 <https://numpy.org/neps/nep-0019-rng-policy.html>`_.
257+
This is impossible for ``mkl-random``, because then it would need to be
258+
rewritten to fit that framework. The guarantees on stream
259+
compatibility will be the same as before, but if there's a backend that affects
260+
``numpy.random`` set, we make no guarantees about stream compatibility, and it
261+
is up to the backend author to provide their own guarantees.
262+
263+
Providing a way for implicit dispatch
264+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
265+
266+
It has been suggested that the ability to dispatch methods which do not take
267+
a dispatchable is needed, while inferring the backend from another dispatchable.
268+
269+
As a concrete example, consider the following:
270+
271+
.. code:: python
272+
273+
    with unumpy.determine_backend(array_like, np.ndarray):
273+
        unumpy.arange(len(array_like))
275+
276+
While this does not exist yet in ``uarray``, it is trivial to add it. The need for
277+
this kind of code exists because one might want to have an alternative for the
278+
proposed ``*_like`` functions, or the ``like=`` keyword argument. The need for these
279+
exists because there are functions in the NumPy API that do not take a dispatchable
280+
argument, but there is still the need to select a backend based on a different
281+
dispatchable.
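
One way such a helper could behave is sketched below in plain Python. This is an assumption-laden toy, not ``uarray`` code: the backends, the ``handles`` attribute, the registry list and the single-argument ``determine_backend`` are all hypothetical, and the dispatch-type argument from the example above is omitted for brevity.

```python
import contextlib
import contextvars

_active_backend = contextvars.ContextVar("active_backend", default=None)


class ListBackend:
    """Hypothetical backend that produces Python lists."""

    handles = (list,)

    @staticmethod
    def arange(n):
        return list(range(n))


class TupleBackend:
    """Hypothetical backend that produces Python tuples."""

    handles = (tuple,)

    @staticmethod
    def arange(n):
        return tuple(range(n))


_registered = [ListBackend, TupleBackend]


@contextlib.contextmanager
def determine_backend(value):
    """Activate the first registered backend that handles ``type(value)``."""
    for backend in _registered:
        if isinstance(value, backend.handles):
            token = _active_backend.set(backend)
            try:
                yield backend
            finally:
                _active_backend.reset(token)
            return
    raise TypeError(f"no backend handles {type(value).__name__}")


def arange(n):
    """Takes no array argument, so it relies on the active backend."""
    backend = _active_backend.get()
    if backend is None:
        raise RuntimeError("no backend set")
    return backend.arange(n)


array_like = (10, 20, 30)
with determine_backend(array_like):
    result = arange(len(array_like))  # built by TupleBackend
```

The creation function never sees ``array_like``; the backend is chosen from it once and then applies to every call in the block.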
282+
283+
The need for an opt-in module
284+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
285+
286+
An opt-in module is needed for a few reasons:
287+
288+
* There are parts of the API (like ``numpy.asarray``) that simply cannot be
289+
overridden due to incompatibility concerns with C/Cython extensions; however,
290+
one may want to coerce to a duck-array using ``asarray`` with a backend set.
291+
* There are possible issues around an implicit option and monkeypatching, such
292+
as those mentioned above.
293+
294+
NEP 18 notes that this may require maintenance of two separate APIs. However,
295+
this burden may be lessened by, for example, parametrizing all tests over
296+
``numpy.overridable`` separately via a fixture. This also has the side-effect
297+
of thoroughly testing it, unlike ``__array_function__``. We also feel that it
298+
provides an opportunity to separate the NumPy API contract properly from the
299+
implementation.
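
The testing idea can be sketched without any test framework: run the same checks against each namespace. The two namespaces below are hypothetical stand-ins built from ``math``, not ``numpy`` or ``numpy.overridable``; with pytest, a parametrized fixture would typically replace the explicit loop.

```python
import math
import types

# Hypothetical stand-ins for the two namespaces under test.
reference_ns = types.SimpleNamespace(sqrt=math.sqrt, hypot=math.hypot)
overridable_ns = types.SimpleNamespace(
    sqrt=lambda x: math.sqrt(x),
    hypot=lambda x, y: math.hypot(x, y),
)


def check_sqrt(xp):
    assert xp.sqrt(9.0) == 3.0


def check_hypot(xp):
    assert xp.hypot(3.0, 4.0) == 5.0


def run_all(namespaces, checks):
    """Run every check against every namespace and record what passed."""
    passed = []
    for name, xp in namespaces.items():
        for check in checks:
            check(xp)
            passed.append((name, check.__name__))
    return passed


passed = run_all(
    {"reference": reference_ns, "overridable": overridable_ns},
    [check_sqrt, check_hypot],
)
```

Each check is written once against an ``xp`` argument, so adding a namespace does not require duplicating test bodies.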
300+
301+
Benefits to end-users and mixing backends
302+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
303+
304+
Mixing backends is easy in ``uarray``; one only has to do:
305+
306+
.. code:: python
307+
308+
    # Explicitly say which backends you want to mix
309+
    ua.register_backend(backend1)
310+
    ua.register_backend(backend2)
311+
    ua.register_backend(backend3)
312+
313+
    # Freely use code that mixes backends here.
314+
315+
The benefits to end-users extend beyond just writing new code. Old code
316+
(usually in the form of scripts) can be easily ported to different backends
317+
by a simple import switch and a line adding the preferred backend. This way,
318+
users may find it easier to port existing code to GPU or distributed computing.
319+
214320
Related Work
215321
------------
216322

@@ -245,6 +351,14 @@ Existing alternate dtype implementations
245351
* Datashape: https://datashape.readthedocs.io
246352
* Plum: https://plum-py.readthedocs.io/
247353

354+
Alternate implementations of parts of the NumPy API
355+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
356+
357+
* ``mkl_random``: https://github.com/IntelPython/mkl_random
358+
* ``mkl_fft``: https://github.com/IntelPython/mkl_fft
359+
* ``bottleneck``: https://github.com/pydata/bottleneck
360+
* ``opt_einsum``: https://github.com/dgasmith/opt_einsum
361+
248362
Implementation
249363
--------------
250364

@@ -420,6 +534,12 @@ also a possibility that can be considered by this NEP. However, the act of
420534
doing an extra ``pip install`` or ``conda install`` may discourage some users
421535
from adopting this method.
422536

537+
An alternative to requiring opt-in is to *not* override ``np.asarray``
538+
and ``np.array``, while making the rest of the NumPy API surface overridable,
539+
and to instead provide ``np.duckarray`` and ``np.asduckarray``
540+
as duck-array friendly alternatives that use the respective overrides. However,
541+
this has the downside of adding a minor overhead to NumPy calls.
542+
423543
Discussion
424544
----------
425545
