-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: df.replace over pd.Period columns (#34871) #36867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 11 commits
a26823f
7b40ece
fa20da9
c378787
6f38780
96f35f6
8be0a16
0b2de8f
ac8309b
5df2c69
33a05af
1d798bc
5921fb7
8d16db5
91ad6dd
90ab53e
f7bd1e4
bd1fea8
0fa2a25
bcf61e6
42e8c74
d4c7acf
5cc38c8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -434,6 +434,8 @@ Strings | |
|
||
Interval | ||
^^^^^^^^ | ||
|
||
- Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` where :class:`Interval` dtypes would be converted to object dtypes (:issue:34871) | ||
- Bug in :meth:`IntervalIndex.take` with negative indices and ``fill_value=None`` (:issue:`37330`) | ||
- | ||
- | ||
|
@@ -488,6 +490,11 @@ I/O | |
- Bug in :class:`HDFStore` threw a ``TypeError`` when exporting an empty :class:`DataFrame` with ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`20594`) | ||
- Bug in :class:`HDFStore` was dropping timezone information when exporting :class:`Series` with ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`20594`) | ||
|
||
Period | ||
^^^^^^ | ||
|
||
- Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` where :class:`Period` dtypes would be converted to object dytpes (:issue:34871) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. typo on dtypes There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
typo is still there. BTW in general let the person who made the original comment hit the "conversation resolved" button There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Gotcha thanks for the heads up, haven't used github at all really. Fixed |
||
|
||
Plotting | ||
^^^^^^^^ | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2002,6 +2002,18 @@ class ObjectValuesExtensionBlock(ExtensionBlock): | |
def external_values(self): | ||
return self.values.astype(object) | ||
|
||
def _can_hold_element(self, element: Any) -> bool: | ||
samc1213 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if is_valid_nat_for_dtype(element, self.dtype): | ||
return True | ||
if element is NaT: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jbrockmendel ok here? @samc1213 are all of these branches actually hit? in particular why L2008 and L2010 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback Added these due to unit tests. Specifically, L2010 is required for |
||
return True | ||
if isinstance(element, list) and len(element) == 0: | ||
return True | ||
jbrockmendel marked this conversation as resolved.
Show resolved
Hide resolved
|
||
tipo = maybe_infer_dtype_type(element) | ||
if tipo is not None: | ||
return issubclass(tipo.type, self.dtype.type) | ||
return isinstance(element, self.dtype.type) | ||
|
||
|
||
class NumericBlock(Block): | ||
__slots__ = () | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1517,17 +1517,15 @@ def test_replace_with_duplicate_columns(self, replacement): | |
|
||
tm.assert_frame_equal(result, expected) | ||
|
||
@pytest.mark.xfail( | ||
reason="replace() changes dtype from period to object, see GH34871", strict=True | ||
) | ||
def test_replace_period_ignore_float(self): | ||
@pytest.mark.parametrize("value", [pd.Period("2020-01"), pd.Interval(0, 5)]) | ||
def test_replace_ea_ignore_float(self, value): | ||
""" | ||
Regression test for GH#34871: if df.replace(1.0, 0.0) is called on a df | ||
with a Period column the old, faulty behavior is to raise TypeError. | ||
with a Period/Interval column the old, faulty behavior is to raise TypeError. | ||
""" | ||
df = DataFrame({"Per": [pd.Period("2020-01")] * 3}) | ||
df = DataFrame({"Per": [value] * 3}) | ||
result = df.replace(1.0, 0.0) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. since this is now frame_or_series, can you call it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it, done |
||
expected = DataFrame({"Per": [pd.Period("2020-01")] * 3}) | ||
expected = DataFrame({"Per": [value] * 3}) | ||
tm.assert_frame_equal(expected, result) | ||
|
||
def test_replace_value_category_type(self): | ||
|
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -450,6 +450,18 @@ def test_replace_with_compiled_regex(self): | |||
expected = pd.Series(["z", "b", "c"]) | ||||
tm.assert_series_equal(result, expected) | ||||
|
||||
@pytest.mark.parametrize("value", [pd.Period("2020-01"), pd.Interval(0, 5)]) | ||||
def test_replace_ea_ignore_float(self, value): | ||||
""" | ||||
Regression test for corrolary to GH#34871: if series.replace(1.0, 0.0) | ||||
is called on a Period/Interval Series, the old, faulty behavior | ||||
is to raise TypeError. | ||||
""" | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this can be a comment, not a docstring, just needs to see There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. actually you can share this test using the frame_or_series fixture There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed to a comment, but I grepped for frame_or_series and not finding much in the codebase. I did find There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It's a pytest fixture: Line 296 in aa390ec
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @arw2019 - I needed to merge master in locally to find this |
||||
series = pd.Series([value] * 3) | ||||
result = series.replace(1.0, 0.0) | ||||
expected = pd.Series([value] * 3) | ||||
tm.assert_series_equal(expected, result) | ||||
|
||||
@pytest.mark.parametrize("pattern", ["^.$", "."]) | ||||
def test_str_replace_regex_default_raises_warning(self, pattern): | ||||
# https://github.com/pandas-dev/pandas/pull/24809 | ||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
backticks around the issue number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed