PERF: perf regressions vs 0.14.0 #7633

Closed
jreback opened this issue Jul 1, 2014 · 24 comments · Fixed by #7684
Labels
Performance Memory or execution speed performance
Milestone

Comments

@jreback
Contributor

jreback commented Jul 1, 2014

We have many + perf fixes in 0.14.1 yeh!

but need to look at these (resamples could be spurious)

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
dataframe_resample_mean_string               |   4.0483 |   2.4170 |   1.6749 |
dataframe_resample_max_numpy                 |   3.3394 |   1.7230 |   1.9381 |
dataframe_resample_max_string                |   3.3154 |   1.7047 |   1.9448 |
dataframe_resample_min_numpy                 |   3.3020 |   1.6843 |   1.9604 |
dataframe_resample_min_string                |   3.2830 |   1.6587 |   1.9793 |
timeseries_timestamp_downsample_mean         |  11.4803 |   4.0847 |   2.8106 |
timeseries_period_downsample_mean            |  29.7631 |  10.3510 |   2.8754 |
datetimeindex_normalize                      |  77.5570 |   3.0633 |  25.3183 |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster then the baseline.
Seed used: 1234

Target [272dae5] : BUG: doc example in groupby.rst (GH7559 / GH7628)
Base   [da0f7ae] : RLS: 0.14.0 final
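
For anyone reading the table: the ratio column is the head time divided by the base time, so anything above 1.0 is a slowdown relative to 0.14.0. A minimal sketch of recomputing it from a few of the rows above, in plain Python. The helper names `parse_rows` and `regressions` are hypothetical and not part of vbench:

```python
# A few rows copied from the benchmark output: name | head[ms] | base[ms] | ratio |
ROWS = """\
dataframe_resample_mean_string               |   4.0483 |   2.4170 |   1.6749 |
timeseries_period_downsample_mean            |  29.7631 |  10.3510 |   2.8754 |
datetimeindex_normalize                      |  77.5570 |   3.0633 |  25.3183 |
"""


def parse_rows(text):
    """Yield (name, head_ms, base_ms, reported_ratio) for each table row."""
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        if len(cells) < 4 or not cells[0]:
            continue  # skip blank or malformed lines
        yield cells[0], float(cells[1]), float(cells[2]), float(cells[3])


def regressions(text, threshold=1.0):
    """Return (name, head/base) pairs above threshold, worst first."""
    found = []
    for name, head, base, _reported in parse_rows(text):
        ratio = head / base
        if ratio > threshold:
            found.append((name, ratio))
    return sorted(found, key=lambda item: item[1], reverse=True)


results = regressions(ROWS)
for name, ratio in results:
    print(f"{name}: {ratio:.2f}x the base time")
```

The recomputed values can differ from the reported column in the last digit or so, since the table only shows rounded timings.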
@jreback jreback added this to the 0.14.1 milestone Jul 1, 2014
@jreback
Contributor Author

jreback commented Jul 1, 2014

cc @sinhrks

@cpcloud
Member

cpcloud commented Jul 1, 2014

There are some pretty hefty hits when repr-ing DatetimeIndex-ed objects, so much so that I have to KeyboardInterrupt the repr. I'll see if I can track this down. It looks like there's some unnecessary vectorization of timezone manipulation.

@jreback
Contributor Author

jreback commented Jul 1, 2014

can you post your example?

@cpcloud
Member

cpcloud commented Jul 1, 2014

In [1]: period = 1.0 / 2048 * 1e9

In [2]: freq = pd.datetools.Nano(period)

In [3]: df = DataFrame({'a': np.random.randn(6e6)}, index=pd.date_range('now', periods=6e6, freq=freq, tz='EST'))

In [4]: df.a  # takes a very long time

@cpcloud
Member

cpcloud commented Jul 1, 2014

hm, maybe repr-ing is not the issue: repr(df.a) finishes in a timely manner.

@jreback
Contributor Author

jreback commented Jul 1, 2014

what does ipython actually do after executing, maybe it caches it or something?

@jreback
Contributor Author

jreback commented Jul 1, 2014

the answer is in the formatting: when I ctrl-c, it's in trim_zeros (it's probably trimming on the entire frame and not just on the displayed portion)

@cpcloud
Member

cpcloud commented Jul 1, 2014

hm, when i control-c it's in a np.vectorize-d function. vectorize is not really that much faster than just map + lambda, which i think is the issue

@cpcloud
Member

cpcloud commented Jul 1, 2014

okay maybe not ... just cython'd that and it didn't improve the situation

@cpcloud
Member

cpcloud commented Jul 1, 2014

that is where the bottleneck is though

@cpcloud
Member

cpcloud commented Jul 1, 2014

In [4]: df.a
> /home/phillip/Documents/code/py/pandas/pandas/tseries/frequencies.py(674)__init__()
    673             import ipdb; ipdb.set_trace()
--> 674             self.values = tslib.tz_convert_safe_dst(self.values, 'UTC',
    675                                                     index.tz)

ipdb> n
> /home/phillip/Documents/code/py/pandas/pandas/tseries/frequencies.py(675)__init__()
    674             self.values = tslib.tz_convert_safe_dst(self.values, 'UTC',
--> 675                                                     index.tz)
    676             #self.values = np.vectorize(f)(self.values)

ipdb> n  # bueller ..... bueller ..... anybody

@jreback
Contributor Author

jreback commented Jul 1, 2014

yes...that is the same problem as in the vbench above

@cpcloud
Member

cpcloud commented Jul 1, 2014

ok

@cpcloud
Member

cpcloud commented Jul 1, 2014

what does bcdc6d9 actually fix? i don't see any docs on it

@jreback
Contributor Author

jreback commented Jul 1, 2014

If you put the commented-out code back, this fixes the entire problem: https://github.com/pydata/pandas/blob/master/pandas/tseries/frequencies.py#L675

@cpcloud
Member

cpcloud commented Jul 1, 2014

yep

@jreback
Contributor Author

jreback commented Jul 1, 2014

can you figure out where that was changed?

@cpcloud
Member

cpcloud commented Jul 1, 2014

the commit i just put up is where it was changed

@cpcloud
Member

cpcloud commented Jul 1, 2014

git blame -L 671,677 pandas/tseries/frequencies.py

@jreback
Contributor Author

jreback commented Jul 1, 2014

cc @sinhrks bcdc6d9

@jreback
Contributor Author

jreback commented Jul 2, 2014

partially closed by #7652

@cpcloud this also fixes the display issue from above (though no idea how to actually test it)

@jreback
Contributor Author

jreback commented Jul 2, 2014

cc @sinhrks. Seems min/max are doing a fair amount of computation on an Index w/o missing values. This is why the resamples are now slower. Can you take a look?
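
One general way to avoid that kind of overhead is a fast path that skips missing-value handling when the index is known to have none. A minimal sketch of the idea in plain Python, assuming a cached flag is cheap to consult. `ToyIndex` and its `hasnans` flag are hypothetical stand-ins for illustration; this is not the pandas implementation or the eventual fix:

```python
import math


class ToyIndex:
    """Toy stand-in for an index of floats (hypothetical, not a pandas class)."""

    def __init__(self, values):
        self.values = list(values)
        self._hasnans = None  # unknown until first asked, then cached

    @property
    def hasnans(self):
        # One pass over the data, paid once and reused by later reductions.
        if self._hasnans is None:
            self._hasnans = any(math.isnan(v) for v in self.values)
        return self._hasnans

    def _reduce(self, func):
        if not self.values:
            return math.nan
        if not self.hasnans:
            # Fast path: nothing to mask, reduce the raw values directly.
            return func(self.values)
        # Slow path: drop missing values first; NaN if nothing remains.
        kept = [v for v in self.values if not math.isnan(v)]
        return func(kept) if kept else math.nan

    def min(self):
        return self._reduce(min)

    def max(self):
        return self._reduce(max)


clean = ToyIndex([3.0, 1.0, 2.0])
gappy = ToyIndex([3.0, math.nan, 2.0])
print(clean.min(), clean.max())
print(gappy.min(), gappy.max())
```

The flag check itself costs one scan, so the gain in this sketch comes from caching it across repeated min/max calls on the same index.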

@sinhrks
Member

sinhrks commented Jul 2, 2014

Yes, will check.

@sinhrks
Member

sinhrks commented Jul 5, 2014

It looks like offset ops also got a little slower. I'll take a look.

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_ctor_dtindex_Millix2                   |   2.3377 |   1.8246 |   1.2812 |
frame_ctor_dtindex_YearEndx2                 |   2.7140 |   2.1173 |   1.2818 |
frame_ctor_dtindex_Microx2                   |   2.3347 |   1.8183 |   1.2840 |
frame_ctor_dtindex_Hourx2                    |   2.3340 |   1.8170 |   1.2846 |
frame_ctor_dtindex_Microx1                   |   2.3340 |   1.8164 |   1.2850 |
frame_ctor_dtindex_Dayx1                     |   2.3523 |   1.8146 |   1.2963 |
...
timeseries_day_incr                          |   0.0190 |   0.0054 |   3.5147 |
timeseries_day_apply                         |   0.0173 |   0.0033 |   5.1905 |

3 participants