Firstly, thanks for the impressive package. We're considering using it in https://github.com/pydata/xarray to provide faster groupby operations.
It looks like some forms of stacking / unstacking are supported too, judging by "Form 4" in the readme. Is it currently possible to supply only a subset of the indices as part of that?
In [7]: import numpy_groupies as npg
In [1]: import numpy as np
In [30]: from numpy_groupies.aggregate_numpy import aggregate
In [26]: flat = np.arange(12).astype(float)
...: data = values = flat.reshape(3, -1)
In [4]: import itertools
In [5]: group_idx = np.array(list(itertools.product(*[range(x) for x in values.shape]))).T
...: group_idx
Out[5]:
array([[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]])
This works well:
In [32]: aggregate(group_idx, flat, "array", size=(3, 4))
Out[32]:
array([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
But this doesn't:
In [33]: aggregate(group_idx[:, :-1], flat[:-1].astype(float), "array", size=(3, 4))
/usr/local/lib/python3.8/site-packages/numpy/core/_asarray.py:136: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
return array(a, dtype, copy=False, order=order, subok=True)
Out[33]:
array([[array([0.]), array([1.]), array([2.]), array([3.])],
[array([4.]), array([5.]), array([6.]), array([7.])],
[array([8.]), array([9.]), array([10.]), 0]], dtype=object)
Notably, supplying "sum" does compute, though the result has 0 rather than nan:
In [35]: aggregate(group_idx[:, :-1], flat[:-1].astype(float), "sum", size=(3, 4))
Out[35]:
array([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 0.]])
Ideally the "array" case above would return:
array([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., np.nan]])
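To make the desired behaviour concrete, here is a minimal sketch in plain NumPy. It does not use numpy_groupies at all, and `unstack_with_nan` is a hypothetical helper name chosen for illustration. It assumes each index appears at most once; with duplicate indices, which value ends up in the cell is not something to rely on.

```python
import itertools

import numpy as np


def unstack_with_nan(group_idx, values, size):
    """Scatter `values` into an array of shape `size`.

    Positions that receive no value are left as NaN.
    Hypothetical helper, not part of numpy_groupies.
    """
    out = np.full(size, np.nan)
    # One index array per output dimension, used for advanced indexing.
    out[tuple(group_idx)] = values
    return out


flat = np.arange(12).astype(float)
group_idx = np.array(list(itertools.product(range(3), range(4)))).T

# Drop the last index/value pair, as in the failing "array" call above.
result = unstack_with_nan(group_idx[:, :-1], flat[:-1], size=(3, 4))
print(result)
```

The missing (2, 3) position comes back as nan rather than 0 or an object array, which is what I'd hope the "array" case could do.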
Of course, if this library is, as the name suggests, more focused on groupby than stacking, it's totally reasonable to close this as wontfix.
Thanks!