Support NDFrame.shift with EAs #22387
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -400,6 +400,36 @@ def dropna(self): | |
|
||
return self[~self.isna()] | ||
|
||
def shift(self, periods=1): | ||
# type: (int) -> ExtensionArray | ||
""" | ||
Shift values by desired number. | ||
|
||
Newly introduced missing values are filled with | ||
``self.dtype.na_value``. | ||
Review comment: can you add a versionadded tag? |
||
|
||
Parameters | ||
---------- | ||
periods : int, default 1 | ||
The number of periods to shift. Negative values are allowed | ||
for shifting backwards. | ||
|
||
Returns | ||
------- | ||
shifted : ExtensionArray | ||
""" | ||
if periods == 0: | ||
return self.copy() | ||
Review comment: Thoughts on this? Non-zero periods will necessarily be a copy. |
Reply: Although |
Reply: I personally find it strange behaviour not to copy then (it would be more logical to know that the result is always a copy, regardless of the value of periods, IMO). |
Reply: Opened #22397 for that. Will leave this as is then. |
||
empty = self._from_sequence([self.dtype.na_value] * abs(periods), | ||
dtype=self.dtype) | ||
if periods > 0: | ||
a = empty | ||
b = self[:-periods] | ||
else: | ||
a = self[abs(periods):] | ||
b = empty | ||
return self._concat_same_type([a, b]) | ||
|
||
def unique(self): | ||
"""Compute the ExtensionArray of unique values. | ||
|
||
|
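The slicing-and-concatenation logic in the `shift` hunk above can be modelled on plain Python lists. This is a minimal sketch, not the pandas implementation: `shift_values` and its `na_value` parameter are hypothetical names standing in for the method and `self.dtype.na_value`, list concatenation stands in for `_concat_same_type`, and clamping the fill length to `len(values)` is an assumption added here so that an oversized `periods` does not lengthen the result (the diff as shown does not clamp).

```python
def shift_values(values, periods=1, na_value=None):
    """Model of the diff's shift logic on a plain list (illustrative only)."""
    values = list(values)  # always work on a fresh copy
    if periods == 0:
        # Mirrors `return self.copy()`: the caller gets a new object.
        return values
    # Assumption (not in the diff): clamp so the result keeps its length
    # even when abs(periods) exceeds len(values).
    n_fill = min(abs(periods), len(values))
    empty = [na_value] * n_fill
    if periods > 0:
        # Fill at the front, drop the last `periods` values.
        return empty + values[:-periods]
    # Negative periods: drop the first values, fill at the end.
    return values[abs(periods):] + empty


data = [10, 20, 30, 40, 50]
forward = shift_values(data, 2)
backward = shift_values(data, -2)
unchanged = shift_values(data, 0)
```

With `periods=0` the sketch returns a new list rather than the input object, which matches the behaviour discussed in the review thread on `return self.copy()`.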
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2061,6 +2061,12 @@ def interpolate(self, method='pad', axis=0, inplace=False, limit=None, | |
limit=limit), | ||
placement=self.mgr_locs) | ||
|
||
def shift(self, periods, axis=0, mgr=None): | ||
# type: (int, Optional[BlockPlacement]) -> List[ExtensionBlock] | ||
Review comment: can you add a doc-string? |
||
return [self.make_block_same_class(self.values.shift(periods=periods), | ||
placement=self.mgr_locs, | ||
ndim=self.ndim)] | ||
|
||
|
||
class NumericBlock(Block): | ||
__slots__ = () | ||
|
@@ -2684,10 +2690,6 @@ def _try_coerce_result(self, result): | |
|
||
return result | ||
|
||
def shift(self, periods, axis=0, mgr=None): | ||
return self.make_block_same_class(values=self.values.shift(periods), | ||
placement=self.mgr_locs) | ||
|
||
def to_dense(self): | ||
# Categorical.get_values returns a DatetimeIndex for datetime | ||
# categories, so we can't simply use `np.asarray(self.values)` like | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -138,3 +138,28 @@ def test_combine_add(self, data_repeated): | |
expected = pd.Series( | ||
orig_data1._from_sequence([a + val for a in list(orig_data1)])) | ||
self.assert_series_equal(result, expected) | ||
|
||
@pytest.mark.parametrize('frame', [True, False]) | ||
@pytest.mark.parametrize('periods, indices', [ | ||
(-2, [2, 3, 4, -1, -1]), | ||
(0, [0, 1, 2, 3, 4]), | ||
(2, [-1, -1, 0, 1, 2]), | ||
]) | ||
def test_container_shift_negative(self, data, frame, periods, indices): | ||
Review comment: Very minor comment, but is there a reason this is called |
Reply: Leftover from a merger of multiple tests. Will fix the name. I do |
||
# https://github.com/pandas-dev/pandas/issues/22386 | ||
subset = data[:5] | ||
data = pd.Series(subset, name='A') | ||
expected = pd.Series(subset.take(indices, allow_fill=True), name='A') | ||
|
||
if frame: | ||
result = data.to_frame(name='A').assign(B=1).shift(periods) | ||
expected = pd.concat([ | ||
expected, | ||
pd.Series([1] * 5, name='B').shift(periods) | ||
], axis=1) | ||
compare = self.assert_frame_equal | ||
else: | ||
result = data.shift(periods) | ||
compare = self.assert_series_equal | ||
|
||
compare(result, expected) |
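The test above builds its expected value with `take(indices, allow_fill=True)`, where an index of `-1` marks a position to be filled with the missing value. As a rough illustration of why the parametrized `indices` correspond to each `periods`, the sketch below derives them on plain lists. `shift_indices` and `take_with_fill` are hypothetical helpers written for this example, not pandas functions.

```python
def shift_indices(n, periods):
    """Indices that, passed to a fill-aware take, reproduce a shift.

    Position i of the shifted result comes from position i - periods of
    the original; positions with no source are marked with -1.
    """
    return [i - periods if 0 <= i - periods < n else -1 for i in range(n)]


def take_with_fill(values, indices, fill_value=None):
    """Model of take(..., allow_fill=True): -1 means 'insert fill_value'."""
    return [fill_value if i == -1 else values[i] for i in indices]


subset = ["a", "b", "c", "d", "e"]
indices_back = shift_indices(len(subset), -2)
indices_none = shift_indices(len(subset), 0)
indices_fwd = shift_indices(len(subset), 2)
expected_fwd = take_with_fill(subset, indices_fwd)
```

The three index lists computed here match the `(periods, indices)` pairs in the `pytest.mark.parametrize` decorator.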
Review comment: I think this needs an update in the ExtensionArray doc-string?
Reply: Currently, in the class docstring we only mention the methods that either need to be implemented (because they raise AbstractMethodError otherwise) or have a suboptimal implementation because it does the object ndarray roundtrip.
This is not the case here (which is not to say we couldn't also list other methods that can be overridden for specific reasons).