Fama-French multivariate regression #406

eigenfoo · 2017-07-31T17:21:46Z

Fixes #379

We don't just want a multivariate regression; we want a rolling multivariate regression. Pandas used to support this with pd.stats.ols.MovingOLS, but that has unfortunately been deprecated.

A solution is described on StackOverflow, but frustratingly this solution doesn't work since apply only works on Series data (see citynorman's comment on the top answer). So, we will have to write our own rolling multivariate regression...
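For readers following along, a hand-written rolling multivariate regression might look roughly like the sketch below. It is not the code in this PR: the function name rolling_multivariate_ols, the window length, and the synthetic data are all assumptions made for illustration, and it uses plain NumPy least squares rather than statsmodels.

```python
import numpy as np


def rolling_multivariate_ols(y, X, window):
    """Hypothetical helper: regress y on X (plus an intercept) over each
    trailing window of `window` observations.

    Returns an array of shape (n_obs, n_factors + 1). Column 0 holds the
    intercept; rows before the first full window are NaN.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n_obs, n_factors = X.shape
    # Prepend a column of ones so the intercept is estimated, not forced to 0.
    design = np.column_stack([np.ones(n_obs), X])
    coeffs = np.full((n_obs, n_factors + 1), np.nan)
    for end in range(window, n_obs + 1):
        rows = slice(end - window, end)
        beta, *_ = np.linalg.lstsq(design[rows], y[rows], rcond=None)
        coeffs[end - 1] = beta
    return coeffs


# Synthetic data standing in for returns and three factor series (assumption).
rng = np.random.default_rng(0)
factors = rng.normal(size=(250, 3))
true_betas = np.array([0.5, -0.2, 0.1])
rets = 0.001 + factors @ true_betas + rng.normal(scale=1e-3, size=250)

betas = rolling_multivariate_ols(rets, factors, window=60)
```

Each row of betas is the fit for the window ending at that observation, so the first window - 1 rows stay NaN, similar in spirit to the NaN padding mentioned in the commits.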

eigenfoo · 2017-07-31T20:03:53Z

@twiecki @gusgordon the Python 3.4 build seems very sad without the statsmodels module

  File "/home/travis/build/quantopian/pyfolio/pyfolio/timeseries.py", line 24, in <module>
    import statsmodels.formula.api as sm
ImportError: No module named 'statsmodels'

eigenfoo · 2017-07-31T20:06:57Z

pyfolio/timeseries.py

+                        regression_df.index[rolling_window:]):
+        window = regression_df.loc[beg:end]
+        coeffs = sm.ols(formula='rets ~ SMB + HML + UMD - 1', data=window) \
+            .fit().params.values


I also want to make sure that this computation is correct. Are these parameters the Fama French betas?

The - 1 in the formula removes the intercept term, i.e. it forces the intercept to be 0.

We should not force the intercept to be 0.
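To illustrate why this matters, here is a small sketch on synthetic data. It is an assumption-laden stand-in, not the PR's code: it replaces the statsmodels formula with explicit NumPy design matrices, and the intercept is deliberately exaggerated so the effect is visible. Dropping the column of ones plays the role of - 1 in the formula.

```python
import numpy as np

# Synthetic data (assumption): three factors with non-zero means and a
# deliberately large intercept.
rng = np.random.default_rng(1)
factors = rng.normal(loc=0.5, size=(500, 3))
true_alpha = 2.0
true_betas = np.array([0.5, -0.2, 0.1])
rets = true_alpha + factors @ true_betas + rng.normal(scale=0.01, size=500)

# Analogue of 'rets ~ SMB + HML + UMD - 1': no intercept column.
betas_no_intercept, *_ = np.linalg.lstsq(factors, rets, rcond=None)

# Analogue of 'rets ~ SMB + HML + UMD': intercept column included.
design = np.column_stack([np.ones(len(rets)), factors])
params, *_ = np.linalg.lstsq(design, rets, rcond=None)
alpha_hat, betas_hat = params[0], params[1:]

print("betas without intercept:", betas_no_intercept)
print("alpha and betas with intercept:", alpha_hat, betas_hat)
```

With the intercept suppressed, the slopes have to absorb the constant term, so they drift away from the true betas; with the intercept included they are recovered closely.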

twiecki · 2017-07-31T21:22:54Z

Yes, we need to add statsmodels to the requirements file.

eigenfoo · 2017-08-01T04:29:15Z

Added statsmodels to requirements file and fixed bug forcing regression intercept to be 0. The Travis builds seem to be passing, although the Python 3.4 build timed out: I'll try and implement this solution tomorrow morning.

Other than that, the PR is ready for review and merge! @gusgordon @twiecki

eigenfoo · 2017-08-01T18:09:52Z

@twiecki the Travis builds are not failing, but are timing out. See here for an example. I've tried implementing a solution using travis_wait, but that does not seem to help: builds still time out. @gusgordon and I don't really know what's going on. Any thoughts?

twiecki · 2017-08-01T18:48:57Z

@georgh0021 how slow is the regression when you try it locally?

eigenfoo · 2017-08-01T21:53:17Z

@twiecki

In [13]: %timeit rolling_fama_french(returns, factor_returns)
         1 loop, best of 3: 4.52 s per loop

If performance is an issue, it may be worth looking into Pythonic's answer to this forum post. He implements a numpy-only solution with linear algebra...

In [14]: %timeit rolling_regression(factor_returns.iloc[:, 0], returns)
         10 loops, best of 3: 61.8 ms per loop

twiecki · 2017-08-02T08:17:56Z

4.52 secs is quite slow for this simple functionality. How long a time-range did you check this on? The numpy version is not ideal but perhaps our best shot if we want to keep the functionality.

twiecki · 2017-08-02T08:18:36Z

Why do you test on only one factor? factor_returns.iloc[:, 0],

eigenfoo · 2017-08-02T12:34:53Z

Laziness 😛 the code on StackOverflow only took in one variable, I just copied and pasted to see what it would look like. Will refactor and test today.

twiecki · 2017-08-02T13:41:20Z

Not sure I trust that code. Have you tried sklearn.linear_model.LinearRegression?

eigenfoo · 2017-08-02T17:27:41Z

@twiecki sklearn runs much faster!
1 loop, best of 3: 479 ms per loop
And you were right, the regression matches the regression done by statsmodels, while the numpy solution was weird and incorrect.
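A quick way to sanity-check a single window is to compare sklearn's fit against an ordinary least-squares solve. The sketch below does this on synthetic data (the data and variable names are assumptions, and the reference here is NumPy's lstsq rather than the statsmodels fit mentioned above).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic single window (assumption): 120 observations, three factors.
rng = np.random.default_rng(2)
factors = rng.normal(size=(120, 3))
rets = (0.01 + factors @ np.array([0.3, 0.1, -0.4])
        + rng.normal(scale=0.05, size=120))

# LinearRegression fits an intercept by default (fit_intercept=True).
model = LinearRegression().fit(factors, rets)
sk_params = np.concatenate([[model.intercept_], model.coef_])

# Reference solution: least squares on a design matrix with a ones column.
design = np.column_stack([np.ones(len(rets)), factors])
np_params, *_ = np.linalg.lstsq(design, rets, rcond=None)

print(np.allclose(sk_params, np_params))
```

Both approaches solve the same least-squares problem, so the intercept and slopes should agree to numerical precision.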

Travis can't seem to find the sklearn package though. Any ideas?

PackageNotFoundError: Packages missing in current channels:
            
  - sklearn
We have searched for the packages in the following channels:
            
  - https://repo.continuum.io/pkgs/free/linux-64
  - https://repo.continuum.io/pkgs/free/noarch
  - https://repo.continuum.io/pkgs/r/linux-64
  - https://repo.continuum.io/pkgs/r/noarch
  - https://repo.continuum.io/pkgs/pro/linux-64
  - https://repo.continuum.io/pkgs/pro/noarch
            
The command "conda create -q -n testenv --yes python=$TRAVIS_PYTHON_VERSION ipython pyzmq numpy scipy nose matplotlib pandas Cython patsy flake8 seaborn sklearn runipy pytables networkx pandas-datareader matplotlib-tests joblib" failed and exited with 1 during .
Your build has been stopped.

twiecki · 2017-08-02T18:19:15Z

Great! The conda package is called scikit-learn, not sklearn: conda install scikit-learn

eigenfoo · 2017-08-02T19:33:43Z

@twiecki @gusgordon back to square one: Python 3.4 build is still timing out. More help needed, unfortunately.

eigenfoo · 2017-08-17T02:38:43Z

@richafrank @twiecki bump. No idea why the tests are timing out.

twiecki · 2017-08-21T08:33:04Z

Hm, seems like 3.4 and 3.5 are pretty close to the limit. At this point we can probably drop 3.4 altogether.

eigenfoo · 2017-08-21T10:29:07Z

Not sure why the builds aren't timing out now... Something to keep in mind going forward I suppose. Once this becomes a serious problem we can look into finding a solution.

twiecki · 2017-08-21T10:55:46Z

Well, I think it's right on the razor's edge. We should probably just test over a shorter time-period.

eigenfoo added 4 commits July 31, 2017 13:00

DOC updated docstring to indicate separate lin regressions

56715f9

ENH first pass at rolling multivar ff

f16783c

DOC used Mom instead of UMD

e48d5b1

DOC UMD is a better name than Mom

50c15f0

eigenfoo added the bug label Jul 31, 2017

eigenfoo added 3 commits July 31, 2017 15:10

DOC described multivar lin reg in docstring

85ded16

MAINT refactored code, reduced clutter

1302f93

DOC added comment to explain initial nan padding

3dbbfb3

eigenfoo requested a review from twiecki July 31, 2017 20:07

eigenfoo added 6 commits July 31, 2017 21:19

BUG do not force intercept = 0

503b8c1

BLD added statsmodels to requirements

85cb552

BLD added statsmodels requirement to .travis.yml and setup.py

053d10c

BUG made statsmodels version >= 0.6.1

c8f9c73

DOC docstring style

04ccd27

BUG ols returns np ndarray, not pd Series

0919334

eigenfoo mentioned this pull request Aug 1, 2017

[PERF] Rolling Regressions #58

Closed

eigenfoo added 4 commits August 1, 2017 09:37

TST using travis_wait to fix build timeouts

98957d6

TST allow travis_wait 60 mins to finish build

27b6d66

TST allow 90 mins to finish tests...

0fc763e

REV remove travis_wait, pending discussion

a8ae484

eigenfoo added 5 commits August 2, 2017 13:16

ENH use sklearn instead of statsmodels

f0c4121

REV revert to previous commit

13a8b5f

accidentally pushed to wrong directory

ENH use sklearn instead of statsmodels

5afed69

ENH added sklearn to requirements

8746053

END added sklearn, removed statsmodels

51603b6

eigenfoo added 2 commits August 2, 2017 14:30

BUG scikit-learn, not sklearn

19a0545

BUG scikit-learn, not sklearn

cfe4600

eigenfoo requested a review from gusgordon August 2, 2017 19:18

eigenfoo mentioned this pull request Aug 14, 2017

New release for pyfolio #414

Closed

4 tasks

eigenfoo added this to the v1.0.0 milestone Aug 15, 2017

twiecki merged commit 17bbfe3 into master Aug 21, 2017

twiecki deleted the ff_multivar branch August 21, 2017 09:39
