Bug in SparseArray.__array_ufunc__ for reduce #27080

Closed
TomAugspurger opened this issue Jun 27, 2019 · 2 comments · Fixed by #27890
Labels
Bug Sparse Sparse Data Type

@TomAugspurger
Contributor

Code Sample, a copy-pastable example if possible

In [2]: a = pd.SparseArray([0, 10, 1])

In [3]: np.maximum.reduce(a)
Out[3]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/Envs/pandas-dev/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/Envs/pandas-dev/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    400                         if cls is not object \
    401                                 and callable(cls.__dict__.get('__repr__')):
--> 402                             return _repr_pprint(obj, self, cycle)
    403
    404             return _default_pprint(obj, self, cycle)

~/Envs/pandas-dev/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    695     """A pprint that just redirects to the normal repr function."""
    696     # Find newlines and replace them with p.break_()
--> 697     output = repr(obj)
    698     for idx,output_line in enumerate(output.splitlines()):
    699         if idx:

~/sandbox/pandas/pandas/core/arrays/sparse.py in __repr__(self)
   1815     def __repr__(self):
   1816         return '{self}\nFill: {fill}\n{index}'.format(
-> 1817             self=printing.pprint_thing(self),
   1818             fill=printing.pprint_thing(self.fill_value),
   1819             index=printing.pprint_thing(self.sp_index))

~/sandbox/pandas/pandas/io/formats/printing.py in pprint_thing(thing, _nest_lvl, escape_chars, default_escapes, quote_strings, max_seq_items)
    215         result = _pprint_seq(thing, _nest_lvl, escape_chars=escape_chars,
    216                              quote_strings=quote_strings,
--> 217                              max_seq_items=max_seq_items)
    218     elif isinstance(thing, str) and quote_strings:
    219         result = "'{thing}'".format(thing=as_escaped_unicode(thing))

~/sandbox/pandas/pandas/io/formats/printing.py in _pprint_seq(seq, _nest_lvl, max_seq_items, **kwds)
    111     r = [pprint_thing(next(s),
    112                       _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)
--> 113          for i in range(min(nitems, len(seq)))]
    114     body = ", ".join(r)
    115

~/sandbox/pandas/pandas/io/formats/printing.py in <listcomp>(.0)
    111     r = [pprint_thing(next(s),
    112                       _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)
--> 113          for i in range(min(nitems, len(seq)))]
    114     body = ", ".join(r)
    115

~/sandbox/pandas/pandas/core/arrays/base.py in __iter__(self)
    283         # calls to ``__getitem__``, which may be slower than necessary.
    284         for i in range(len(self)):
--> 285             yield self[i]
    286
    287     # ------------------------------------------------------------------------

~/sandbox/pandas/pandas/core/arrays/sparse.py in __getitem__(self, key)
   1092
   1093         if is_integer(key):
-> 1094             return self._get_val_at(key)
   1095         elif isinstance(key, tuple):
   1096             data_slice = self.to_dense()[key]

~/sandbox/pandas/pandas/core/arrays/sparse.py in _get_val_at(self, loc)
   1135             return self.fill_value
   1136         else:
-> 1137             return libindex.get_value_at(self.sp_values, sp_loc)
   1138
   1139     def take(self, indices, allow_fill=False, fill_value=None):

TypeError: Argument 'arr' has incorrect type (expected numpy.ndarray, got numpy.int64)

In [4]: result = np.maximum.reduce(a)

In [5]: type(result)
Out[5]: pandas.core.arrays.sparse.SparseArray

np.maximum.reduce(a) should return the scalar 10, not a SparseArray.
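
As a rough illustration of the expected behavior (not the pandas implementation), here is a minimal sketch of an `__array_ufunc__` that hands back the scalar when a reduction collapses to 0-d. `ToySparse` is a hypothetical stand-in whose attribute names only mimic those seen in the traceback. It densifies its inputs for simplicity, which a real sparse array would presumably want to avoid.

```python
import numpy as np


class ToySparse:
    """Hypothetical stand-in for a sparse array: stores only non-fill values."""

    def __init__(self, data, fill_value=0):
        dense = np.asarray(data)
        self.fill_value = fill_value
        self.length = len(dense)
        mask = dense != fill_value
        self.sp_index = np.flatnonzero(mask)  # positions of stored values
        self.sp_values = dense[mask]          # the stored (non-fill) values

    def to_dense(self):
        out = np.full(self.length, self.fill_value, dtype=self.sp_values.dtype)
        out[self.sp_index] = self.sp_values
        return out

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Assumption: densify every ToySparse input, then defer to NumPy.
        dense_inputs = tuple(
            x.to_dense() if isinstance(x, ToySparse) else x for x in inputs
        )
        result = getattr(ufunc, method)(*dense_inputs, **kwargs)
        # A reduction over a 1-D array yields a 0-d scalar; return it as-is
        # instead of wrapping it in the array type again.
        if method == "reduce" and np.ndim(result) == 0:
            return result
        return ToySparse(result, fill_value=self.fill_value)


result = np.maximum.reduce(ToySparse([0, 10, 1]))
print(type(result), result)
```

With this guard, the reduction gives back a NumPy scalar, while elementwise calls such as `np.add(ToySparse([0, 10, 1]), 1)` still return the wrapper type.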

@jbrockmendel
Member

It looks like there is also a problem with __repr__? Or is the SparseArray constructor accepting something it shouldn't?

@TomAugspurger
Contributor Author

TomAugspurger commented Jun 27, 2019 via email

@jbrockmendel jbrockmendel added Bug Sparse Sparse Data Type labels Jul 21, 2019
@jreback jreback added this to the 1.0 milestone Aug 13, 2019