BUG: Rolling apply on DataFrame with Datetime index returns NaN #17156

Merged
merged 11 commits into from
Aug 10, 2017
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.21.0.txt
@@ -296,6 +296,8 @@ Groupby/Resample/Rolling
- Bug in :func:`infer_freq` causing indices with 2-day gaps during the working week to be wrongly inferred as business daily (:issue:`16624`)
- Bug in ``.rolling(...).quantile()`` which incorrectly used different defaults than :func:`Series.quantile()` and :func:`DataFrame.quantile()` (:issue:`9413`, :issue:`16211`)
- Bug in ``groupby.transform()`` that would coerce boolean dtypes back to float (:issue:`16875`)
- Bug in ``.rolling(...).apply(...)`` on a DataFrame with a ``DatetimeIndex`` returning ``NaN`` when ``min_periods`` is greater than 1 (:issue:`15305`)

Sparse
^^^^^^
11 changes: 6 additions & 5 deletions pandas/_libs/window.pyx
@@ -1427,16 +1427,17 @@ def roll_generic(ndarray[float64_t, cast=True] input,
n = len(input)
if n == 0:
return input


counts = roll_sum(np.concatenate([np.isfinite(input).astype(float),
np.array([0.] * offset)]),
win, minp, index, closed)[offset:]

start, end, N, win, minp, is_variable = get_window_indexer(input, win,
minp, index,
closed,
floor=0)
output = np.empty(N, dtype=float)

counts = roll_sum(np.concatenate([np.isfinite(input).astype(float),
np.array([0.] * offset)]),
win, minp, index, closed)[offset:]
output = np.empty(N, dtype=float)

if is_variable:

21 changes: 21 additions & 0 deletions pandas/tests/test_window.py
@@ -423,6 +423,27 @@ def test_constructor_with_timedelta_window(self):
expected = df.rolling('3D').sum()
tm.assert_frame_equal(result, expected)

def test_constructor_with_timedelta_window_and_minperiods(self):
# GH 15305
n = 10
df = pd.DataFrame({'value': np.arange(n)},
index=pd.date_range('2015-12-24',
periods=n,
freq="D"))
expected_data = np.append([np.NaN, 1.], np.arange(3., 27., 3))
for window in [timedelta(days=3), pd.Timedelta(days=3)]:
result_roll_sum = df.rolling(window=window, min_periods=2).sum()
result_roll_generic = df.rolling(window=window,
min_periods=2).apply(sum)
expected = pd.DataFrame({'value': expected_data},
Contributor

you can put the expected_data directly in the frame.

index=pd.date_range('2015-12-24',
periods=n,
freq="D"))
tm.assert_frame_equal(result_roll_sum, expected)
tm.assert_frame_equal(result_roll_generic, expected)
expected = df.rolling('3D', min_periods=2).sum()
Contributor

what is this last part testing?

Contributor Author

This tests the case where window == '3D'. It would be better to fold it into the loop over [timedelta(days=3), pd.Timedelta(days=3)].

tm.assert_frame_equal(result_roll_generic, expected)

def test_numpy_compat(self):
# see gh-12811
r = rwindow.Rolling(Series([2, 4, 6]), window=2)
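
For reference, the consolidation discussed in the review comments (building the expected frame directly and looping over all three window spellings, including '3D') could look roughly like the standalone sketch below. It is not the code that was merged. It assumes a recent pandas in which `Rolling.apply` accepts the `raw` keyword, and it uses the public `pd.testing.assert_frame_equal` and `np.nan` in place of the `tm` helper and `np.NaN` used in the test suite; the names `results`, `sum_result` and `apply_result` are made up for illustration.

```python
from datetime import timedelta

import numpy as np
import pandas as pd

n = 10
df = pd.DataFrame({'value': np.arange(n)},
                  index=pd.date_range('2015-12-24', periods=n, freq='D'))

# The first window holds a single observation (fewer than min_periods=2),
# so it is NaN; every later 3-day window sums the values it covers.
expected = pd.DataFrame(
    {'value': np.append([np.nan, 1.], np.arange(3., 27., 3))},
    index=df.index)

# Three spellings of the same 3-day window, checked in a single loop.
results = []
for window in [timedelta(days=3), pd.Timedelta(days=3), '3D']:
    rolling = df.rolling(window=window, min_periods=2)
    sum_result = rolling.sum()
    # raw=True hands each window to `sum` as a plain ndarray
    # (assumption: the installed pandas supports the `raw` keyword).
    apply_result = rolling.apply(sum, raw=True)
    results.append((sum_result, apply_result))

for sum_result, apply_result in results:
    pd.testing.assert_frame_equal(sum_result, expected)
    pd.testing.assert_frame_equal(apply_result, expected)
```

Keeping `.sum()` and `.apply(sum)` side by side in the same loop means the generic path is always compared against the specialised one for every window type.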