Support dicts with default values in series.map #16002
pandas/core/series.py
Outdated
if isinstance(arg, (dict, Series)):
if isinstance(arg, dict):
arg = self._constructor(arg, index=arg.keys())
default_dict_types = collections.Counter, collections.defaultdict
instead of actually naming these, use issubclass(type(arg), dict) and not type(arg) is dict
In [17]: f = lambda arg: issubclass(type(arg), dict) and not type(arg) is dict
In [18]: f({})
Out[18]: False
In [19]: f(collections.defaultdict())
Out[19]: True
In [21]: f(collections.Counter())
Out[21]: True
issubclass(type(arg), dict) and not type(arg) is dict would evaluate as True for a collections.OrderedDict, which should behave like a plain dict.
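To make the objection concrete, here is a minimal sketch comparing the subclass check suggested above with a duck-typed check for a `__missing__` method. The helper names `is_strict_dict_subclass` and `defines_missing` are hypothetical and exist only for this illustration; they are not part of pandas.

```python
import collections


def is_strict_dict_subclass(arg):
    # The check suggested in review: any proper subclass of dict.
    return issubclass(type(arg), dict) and type(arg) is not dict


def defines_missing(arg):
    # Duck-typed alternative: does the mapping provide a default-value hook?
    return hasattr(arg, "__missing__")


candidates = {
    "dict": {},
    "OrderedDict": collections.OrderedDict(),
    "defaultdict": collections.defaultdict(int),
    "Counter": collections.Counter(),
}

# Map each type name to (subclass check, __missing__ check).
results = {
    name: (is_strict_dict_subclass(obj), defines_missing(obj))
    for name, obj in candidates.items()
}

for name, (subclass_check, missing_check) in results.items():
    print(name, subclass_check, missing_check)
```

Only the duck-typed check separates OrderedDict (no default values) from defaultdict and Counter (which supply defaults).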
always write the tests first!
Codecov Report
@@ Coverage Diff @@
## master #16002 +/- ##
==========================================
+ Coverage 91.02% 91.02% +<.01%
==========================================
Files 145 145
Lines 50391 50394 +3
==========================================
+ Hits 45870 45873 +3
Misses 4521 4521
Continue to review full report at Codecov.
a3bcb48 to 96d12a6
@jreback good idea (assuming this is so we can check that the tests fail before the enhancement). I rebased to put the testing commit first. Unfortunately, the GitHub PR view orders commits by commit time rather than by their order in the branch. Also, CI only builds the tip commit.
you need to duck type the test rather than hard-coding things
I dug a bit deeper. The
97cfc49 to 961ea46
great. Make sure to test a dict subclass with and without __missing__, document this in the docstring, and add a note in the other API changes section.
@@ -2132,10 +2132,14 @@ def map_f(values, f):
else:
map_f = lib.map_infer

if isinstance(arg, (dict, Series)):
if isinstance(arg, dict):
if isinstance(arg, dict):
can you add a 1-line comment here on what you are doing
doc/source/whatsnew/v0.20.0.txt
Outdated
@@ -449,10 +449,8 @@ Other Enhancements
- Integration with the ``feather-format``, including a new top-level ``pd.read_feather()`` and ``DataFrame.to_feather()`` method, see :ref:`here <io.feather>`.
- ``Series.str.replace()`` now accepts a callable, as replacement, which is passed to ``re.sub`` (:issue:`15055`)
- ``Series.str.replace()`` now accepts a compiled regular expression as a pattern (:issue:`15446`)

- ``Series.map()`` now respects default values of dictionary subclasses with a ``__missing__`` method, such as ``collections.Counter`` (:issue:`15999`, :issue:`16002`)
only list the issue (and not the PR)
this is an api change (and not an enhancement), ignore how I labelled the issue.
I moved the note to "Other API Changes". Is this okay, or should I make a larger entry under:
Backwards incompatible API changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pandas/core/series.py
Outdated
if isinstance(arg, dict):
if hasattr(arg, '__missing__'):
# If a dictionary subclass defines a default value method,
# convert arg to a lookup function (https://git.io/vS7LK).
you can just add GH #15999 instead of a short url
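The conversion described in the code comment above can be sketched in plain Python. This is an illustration under assumptions, not the pandas implementation: `to_lookup` is a hypothetical helper, and a list comprehension stands in for the element-wise mapping.

```python
import collections


def to_lookup(arg):
    """Hypothetical helper: wrap mappings that define __missing__ in a callable."""
    if isinstance(arg, dict) and hasattr(arg, "__missing__"):
        mapping = arg
        # Indexing with [] is what triggers __missing__ for absent keys.
        return lambda key: mapping[key]
    # Mappings without a default-value method are returned unchanged.
    return arg


counts = collections.Counter({"a": 2})
lookup = to_lookup(counts)

# "b" is absent, so Counter.__missing__ supplies 0.
mapped = [lookup(key) for key in ["a", "b"]]
print(mapped)

plain = {"a": 1}
unchanged = to_lookup(plain)
```

Wrapping the mapping in a function means every lookup goes through `mapping[key]`, so the subclass's own default logic decides what a missing key maps to.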
small comments, otherwise lgtm.
Confirmed locally that the following tests fail under the old code for
Removing WIP from title.
@@ -449,10 +449,7 @@ Other Enhancements
- Integration with the ``feather-format``, including a new top-level ``pd.read_feather()`` and ``DataFrame.to_feather()`` method, see :ref:`here <io.feather>`.
- ``Series.str.replace()`` now accepts a callable, as replacement, which is passed to ``re.sub`` (:issue:`15055`)
- ``Series.str.replace()`` now accepts a compiled regular expression as a pattern (:issue:`15446`)
@jreback let me know if you want me to revert the deletion of these blank lines.
pandas/core/series.py
Outdated
@@ -2089,21 +2089,33 @@ def map(self, arg, na_action=None):
two B
three C

Values in Series that are not in the dictionary (as keys) are converted
put this in the Notes section.
pandas/core/series.py
Outdated
defines ``__missing__`` (i.e. provides a method for default values),
then this default is used rather than ``NaN``:

>>> from collections import Counter
put this example at the end
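For readers unfamiliar with how Counter supplies defaults, the following standalone sketch shows the behavior the docstring relies on. It uses only the standard library and does not call pandas; the variable names are chosen for illustration.

```python
from collections import Counter

counter = Counter(["a", "a", "b"])

# Key present: the stored count is returned.
present = counter["a"]

# Key absent: indexing falls through to Counter.__missing__, which returns 0.
absent = counter["z"]

# dict.get bypasses __missing__, so the default is None here.
via_get = counter.get("z")

# Counter.__missing__ does not insert the missing key.
size_after = len(counter)

print(present, absent, via_get, size_after)
```

The difference between `counter["z"]` and `counter.get("z")` is why the default only applies when the lookup uses indexing.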
pandas/tests/series/test_apply.py
Outdated
assert_series_equal(result, expected)

def test_map_dict_subclass_with_missing(self):
class DictWithMissing(dict):
add the issue number here as a comment
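A rough sketch of what testing a dict subclass with and without `__missing__` could look like, written against a plain-Python stand-in rather than `Series.map`. The helper `map_values`, the class `DictWithoutMissing`, and the `"missing"` return value are assumptions made for this illustration; only the `DictWithMissing` name comes from the diff.

```python
import math


class DictWithMissing(dict):
    def __missing__(self, key):
        # Assumed default value, chosen only for illustration.
        return "missing"


class DictWithoutMissing(dict):
    pass


def map_values(values, mapping):
    """Hypothetical stand-in for Series.map operating on a plain list."""
    if hasattr(mapping, "__missing__"):
        # Indexing lets the subclass supply its own default.
        return [mapping[value] for value in values]
    # Without __missing__, absent keys become NaN.
    return [mapping.get(value, math.nan) for value in values]


values = [1, 2, 3]
with_default = map_values(values, DictWithMissing({3: "three"}))
without_default = map_values(values, DictWithoutMissing({3: "three"}))

print(with_default)
print(without_default)
```

The subclass without `__missing__` behaves like a plain dict, which is the case the OrderedDict objection earlier in the thread was concerned with.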
couple minor doc comments. ping on green.
@jreback also known as PR is passing all checks
thanks!
Closes #15999
Still a work in progress.