BUG: Using df.groupby(..).rolling(..) on non-monotonic timestamp column does not raise an exception #43909

Closed
@domsmrz

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd

shuffled = [3, 0, 1, 2]
sec = 1_000_000_000
df = pd.DataFrame([{'t': pd.Timestamp(2 * x * sec), 'x': x + 1, 'c': 42} for x in shuffled])

#                     t  x   c
# 0 1970-01-01 00:00:06  4  42
# 1 1970-01-01 00:00:00  1  42
# 2 1970-01-01 00:00:02  2  42
# 3 1970-01-01 00:00:04  3  42

# The following line raises an exception, as expected, because 't' is not monotonic:
df.rolling(on='t', window='3s').x.sum()

# The following call should raise the same exception, but instead it returns meaningless results:
df.groupby('c').rolling(on='t', window='3s').x.sum()
# c   t                  
# 42  1970-01-01 00:00:06    4.0
#     1970-01-01 00:00:00    1.0
#     1970-01-01 00:00:02    3.0
#     1970-01-01 00:00:04    6.0

Issue Description

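When df.groupby(..).rolling(on='...', ...) is called with a timedelta window on a Timestamp column that is not sorted, pandas does not raise and instead returns meaningless results. In the example above the sums are computed in row order rather than time order.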

Expected Behavior

An exception should be raised, as is already the case when df.rolling(on=...) is called without the groupby on a non-monotonic column.
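Until the check is enforced by pandas itself, a user-side guard along these lines could be used. This is only a sketch of a possible workaround, not the pandas implementation: `check_groups_monotonic` is a hypothetical helper name, and the approach assumes that sorting by the time column before grouping is acceptable for the data at hand.

```python
import pandas as pd


def check_groups_monotonic(df, by, on):
    """Hypothetical helper: raise ValueError if `on` is not monotonic within a group."""
    for key, group in df.groupby(by):
        col = group[on]
        if not (col.is_monotonic_increasing or col.is_monotonic_decreasing):
            raise ValueError(f"{on} must be monotonic within group {key!r}")


# Same data as in the reproducible example above.
shuffled = [3, 0, 1, 2]
sec = 1_000_000_000
df = pd.DataFrame(
    [{"t": pd.Timestamp(2 * x * sec), "x": x + 1, "c": 42} for x in shuffled]
)

# The unsorted frame is rejected by the guard.
try:
    check_groups_monotonic(df, by="c", on="t")
    raised = False
except ValueError:
    raised = True

# Sorting by the time column first makes the rolling windows meaningful.
sorted_df = df.sort_values("t")
check_groups_monotonic(sorted_df, by="c", on="t")
result = sorted_df.groupby("c").rolling(on="t", window="3s")["x"].sum()
```

With the rows in time order, each 3-second window covers the current row and the one 2 seconds before it, so the sums follow the timestamps rather than the original row order.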

Installed Versions

INSTALLED VERSIONS
------------------
commit           : 73c68257545b5f8530b7044f56647bd2db92e2ba
python           : 3.8.10.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.11.0-37-generic
Version          : #41~20.04.2-Ubuntu SMP Fri Sep 24 09:06:38 UTC 2021
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.3.3
numpy            : 1.21.2
pytz             : 2021.1
dateutil         : 2.8.2
pip              : 21.2.3
setuptools       : 45.2.0
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.5.0
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.10.1
IPython          : 7.13.0
pandas_datareader: None
bs4              : 4.8.2
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : None
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pyxlsb           : None
s3fs             : None
scipy            : 1.7.1
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : 2.0.1
xlwt             : None
numba            : None

Labels

Bug, Error Reporting (incorrect or improved errors from pandas), Groupby, Window (rolling, ewma, expanding)
