This library is intended as an alternative to pd.Series.rolling and
pd.Series.expanding. It gains a speedup by using numba-optimized
functions that operate on numpy arrays. There are also online classes
for more efficient updates of window statistics.

Install from PyPI:
pip install window-ops

or from conda-forge:
conda install -c conda-forge window-ops

For transformations that map n_samples -> n_samples, you can use
[seasonal_](rolling|expanding)_(mean|max|min|std) on an array.
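
To make the n_samples -> n_samples shape concrete, here is a minimal plain-NumPy sketch of what a rolling mean and a seasonal rolling mean compute. The names `rolling_mean_reference` and `seasonal_rolling_mean_reference` are hypothetical and are not part of the library. The sketch assumes pandas-like semantics (NaN until a full window is available) and assumes that the seasonal variant applies the window separately to every season_length-th sample. It is not the library's numba implementation.

```python
import numpy as np


def rolling_mean_reference(x, window_size):
    """Plain-NumPy rolling mean; the output has the same length as the input."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)  # positions without a full window stay NaN
    if x.size < window_size:
        return out
    # Prefix sums let each window sum be computed as a difference of two entries.
    csum = np.concatenate(([0.0], np.cumsum(x)))
    out[window_size - 1:] = (csum[window_size:] - csum[:-window_size]) / window_size
    return out


def seasonal_rolling_mean_reference(x, season_length, window_size):
    """Rolling mean computed separately over each seasonal position (assumed semantics)."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    for offset in range(season_length):
        # Every season_length-th sample, starting at this offset.
        out[offset::season_length] = rolling_mean_reference(
            x[offset::season_length], window_size
        )
    return out


x = np.arange(10, dtype=float)
rolled = rolling_mean_reference(x, window_size=3)
seasonal = seasonal_rolling_mean_reference(x, season_length=2, window_size=2)
```

Both results have the same length as `x`, which is the property the function names in the pattern above share.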

The timings below were measured with pandas 1.3.5 (the value of pd.__version__).
n_samples = 10_000  # array size
window_size = 8  # for rolling operations
season_length = 7  # for seasonal operations
execute_times = 10  # number of times each function will be executed

Average times in milliseconds.
times.applymap('{:.2f}'.format)

| | window_ops | pandas |
|---|---|---|
| rolling_mean | 0.03 | 0.43 | 
| rolling_max | 0.14 | 0.57 | 
| rolling_min | 0.14 | 0.58 | 
| rolling_std | 0.06 | 0.54 | 
| expanding_mean | 0.03 | 0.31 | 
| expanding_max | 0.05 | 0.76 | 
| expanding_min | 0.05 | 0.47 | 
| expanding_std | 0.09 | 0.41 | 
| seasonal_rolling_mean | 0.05 | 3.89 | 
| seasonal_rolling_max | 0.18 | 4.27 | 
| seasonal_rolling_min | 0.18 | 3.75 | 
| seasonal_rolling_std | 0.08 | 4.38 | 
| seasonal_expanding_mean | 0.04 | 3.18 | 
| seasonal_expanding_max | 0.06 | 3.29 | 
| seasonal_expanding_min | 0.06 | 3.28 | 
| seasonal_expanding_std | 0.12 | 3.89 | 
speedups = times['pandas'] / times['window_ops']
speedups = speedups.to_frame('times faster')
speedups.applymap('{:.0f}'.format)

| | times faster |
|---|---|
| rolling_mean | 15 | 
| rolling_max | 4 | 
| rolling_min | 4 | 
| rolling_std | 9 | 
| expanding_mean | 12 | 
| expanding_max | 15 | 
| expanding_min | 9 | 
| expanding_std | 4 | 
| seasonal_rolling_mean | 77 | 
| seasonal_rolling_max | 23 | 
| seasonal_rolling_min | 21 | 
| seasonal_rolling_std | 52 | 
| seasonal_expanding_mean | 78 | 
| seasonal_expanding_max | 52 | 
| seasonal_expanding_min | 51 | 
| seasonal_expanding_std | 33 | 
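
The "times faster" figures are ratios of average timings. A minimal sketch of how such a comparison could be measured is shown below, assuming a simple `timeit` harness. The helper `average_ms` and the two stand-in functions `loop_mean` and `cumsum_mean` are hypothetical; they replace the pandas and window_ops calls so that the example depends only on NumPy. The warm-up call matters for numba-compiled functions, since their first call includes compilation; whether the original benchmark did this is not stated.

```python
import timeit

import numpy as np

n_samples = 10_000   # array size
window_size = 8      # for rolling operations
execute_times = 10   # number of timed executions per function


def average_ms(fn, *args, times=execute_times):
    """Average wall-clock time of fn(*args) in milliseconds."""
    fn(*args)  # warm-up call, excluded from the measurement
    total_seconds = timeit.timeit(lambda: fn(*args), number=times)
    return 1000 * total_seconds / times


def loop_mean(x, w):
    """Hypothetical slow baseline: one slice and one mean per output element."""
    return np.array([
        x[i - w + 1:i + 1].mean() if i >= w - 1 else np.nan
        for i in range(len(x))
    ])


def cumsum_mean(x, w):
    """Hypothetical fast candidate: window sums from prefix-sum differences."""
    c = np.concatenate(([0.0], np.cumsum(x)))
    out = np.full(len(x), np.nan)
    out[w - 1:] = (c[w:] - c[:-w]) / w
    return out


x = np.random.default_rng(0).random(n_samples)
timings = {
    "baseline": average_ms(loop_mean, x, window_size),
    "candidate": average_ms(cumsum_mean, x, window_size),
}
speedup = timings["baseline"] / timings["candidate"]
print(f"candidate is {speedup:.0f} times faster")
```

The measured ratio depends on the machine and on library versions, so the printed number will differ between runs.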

If you have an array for which you want to compute a window statistic
and then keep updating it as more samples come in, you can use the
classes in the window_ops.online module. They all have a
fit_transform method that takes the array and returns the
transformations defined above, and an update method that takes
a single value and returns the new statistic.
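
The fit_transform/update pattern described above can be illustrated with a small pure-Python stand-in. `RollingMeanSketch` is a hypothetical class written for this example. It only mirrors the two method names mentioned in the text and makes no claim about how the classes in window_ops.online are implemented or what constructor arguments they accept.

```python
from collections import deque

import numpy as np


class RollingMeanSketch:
    """Hypothetical stand-in that mimics the fit_transform/update pattern."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque(maxlen=window_size)  # keeps only the latest samples
        self.total = 0.0

    def _push(self, value):
        if len(self.window) == self.window_size:
            self.total -= self.window[0]  # the oldest sample is about to be evicted
        self.window.append(value)
        self.total += value
        if len(self.window) < self.window_size:
            return float("nan")  # not enough samples for a full window yet
        return self.total / self.window_size

    def fit_transform(self, x):
        """Reset the state, consume the array, return one statistic per sample."""
        self.window.clear()
        self.total = 0.0
        return np.array([self._push(float(v)) for v in x])

    def update(self, value):
        """Consume a single new sample and return the updated statistic."""
        return self._push(float(value))


roll = RollingMeanSketch(window_size=3)
history = roll.fit_transform(np.arange(6, dtype=float))
latest = roll.update(6.0)  # mean of the last three samples: 4, 5, 6
```

Each update touches only the stored window rather than the full history, which is why an online class can be cheaper than recomputing the transformation after every new sample.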
Average time in milliseconds it takes to transform the array and perform 100 updates.
times.to_frame().applymap('{:.2f}'.format)

| | average time (ms) |
|---|---|
| RollingMean | 0.12 | 
| RollingMax | 0.23 | 
| RollingMin | 0.22 | 
| RollingStd | 0.32 | 
| ExpandingMean | 0.10 | 
| ExpandingMax | 0.07 | 
| ExpandingMin | 0.07 | 
| ExpandingStd | 0.17 | 
| SeasonalRollingMean | 0.28 | 
| SeasonalRollingMax | 0.35 | 
| SeasonalRollingMin | 0.38 | 
| SeasonalRollingStd | 0.42 | 
| SeasonalExpandingMean | 0.17 | 
| SeasonalExpandingMax | 0.14 | 
| SeasonalExpandingMin | 0.15 | 
| SeasonalExpandingStd | 0.23 |