[PERF] Rolling Regressions #58
Comments
+1 @humdings yeah, we should just go down the pandas route for everything that pandas supports, since it's very widely recognized and supported in the industry.
Yeah that sounds great! I am only concerned about the differences. How significant are they? May just be different methods of compounding returns or something.
The differences are not too severe, but they are not negligible. I pushed to a 'pandas-ols' branch, you can check it out there. I want to make sure none of the actual tear sheets break before it hits master.
@humdings great, thanks! Can you do a PR?
Just looked at the code. Not only is it faster but also much more succinct. Should definitely merge this.
Rolling regressions are now deprecated in pandas, and will be removed in a future version. I 100% agree with the point about the performance difference. Closing this issue due to the deprecation (see the statsmodels deprecation issue).
I'd like to suggest we use the pandas implementation for rolling regressions.
https://github.com/pydata/pandas/blob/master/pandas/stats/ols.py
The performance difference is huge.
For a single factor
For multiple factors
Pandas uses some sophisticated caching methods that make its implementation really fast. I'm also slightly suspicious of the differences between the results of the two implementations; I know the pandas method takes float precision into account.
If there are no arguments against this I can swap it out and submit a PR.
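For reference, here is a minimal sketch of what a fast single-factor rolling regression can look like. It does not use `pandas.stats.ols` (which a comment in this thread notes is deprecated) and it is not the code from the 'pandas-ols' branch. It assumes the standard identity that the OLS slope equals cov(x, y) / var(x) and builds it from pandas rolling-window statistics. The function name `rolling_beta`, the window size, and the synthetic data are all made up for illustration.

```python
import numpy as np
import pandas as pd


def rolling_beta(y: pd.Series, x: pd.Series, window: int) -> pd.DataFrame:
    """Rolling single-factor OLS fit of y = alpha + beta * x.

    Hypothetical helper: uses the closed-form slope cov(x, y) / var(x)
    on each trailing window instead of refitting a model per window.
    """
    cov = y.rolling(window).cov(x)   # rolling sample covariance
    var = x.rolling(window).var()    # rolling sample variance
    beta = cov / var
    alpha = y.rolling(window).mean() - beta * x.rolling(window).mean()
    return pd.DataFrame({"alpha": alpha, "beta": beta})


# Synthetic data (assumption: true alpha = 0.5, true beta = 2.0).
rng = np.random.default_rng(0)
x = pd.Series(rng.normal(size=500))
y = 0.5 + 2.0 * x + pd.Series(rng.normal(scale=0.1, size=500))

window = 60
result = rolling_beta(y, x, window)

# Cross-check the final window against a direct least-squares fit.
slope, intercept = np.polyfit(
    x.iloc[-window:].to_numpy(), y.iloc[-window:].to_numpy(), 1
)
```

Comparing the last row of `result` with `slope` and `intercept` is one way to check how large the differences between two implementations really are. The first `window - 1` rows are NaN because the window is not yet full.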