
[MRG] FIX Fix LMNN rollback #101


Merged

Conversation

@wdevazelhes (Member) commented Jul 2, 2018

Fixes #88
Stores L and G, in addition to what was already stored (df, a1 and a2), at the last "good" point (the last point whose objective was better than all previous points), and rolls back to this point if an update worsens the objective.
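For illustration, here is a minimal sketch of that rollback logic (hedged: fit_with_rollback, objective_fn and gradient_fn are placeholder names, not the PR's actual code; the halve-on-failure and grow-by-1% learning-rate rules are the ones the LMNN solver already uses):

def fit_with_rollback(L, objective_fn, gradient_fn,
                      learn_rate=1e-7, max_iter=1000, tol=1e-6):
    # remember the last "good" point: L, G and the best objective so far
    best_obj = objective_fn(L)
    best_L, best_G = L.copy(), gradient_fn(L)
    for _ in range(max_iter):
        # gradient step from the last good point
        L = best_L - 2 * learn_rate * best_L.dot(best_G)
        obj = objective_fn(L)
        delta = obj - best_obj
        if delta > 0:
            # the update worsened the objective: stay at the last good
            # point and shrink the learning rate
            learn_rate /= 2
        else:
            # improvement: this becomes the new reference point
            best_obj, best_L, best_G = obj, L.copy(), gradient_fn(L)
            learn_rate *= 1.01
            if abs(delta) < tol:
                break
    return best_L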
Here is a little code to show the difference between this PR and master:

from sklearn.datasets import make_classification
from metric_learn import LMNN
lmnn = LMNN(verbose=True)
X, y = make_classification(random_state=0)
lmnn.fit(X, y)

This PR:

1 3719.8858456299436 -inf 419 1e-07
2 3719.3690067170437 -0.516838912899857 419 1.0099999999999999e-07
3 3718.8470839936936 -0.5219227233501442 419 1.0201e-07
4 3718.32002830601 -0.5270556876835144 419 1.030301e-07
5 3717.787790041948 -0.5322382640620162 419 1.0406040099999999e-07
/home/will/Code/metric-learn/metric_learn/lmnn.py:62: UserWarning: use_pca does nothing for the python_LMNN implementation
  warnings.warn('use_pca does nothing for the python_LMNN implementation')
6 3717.2503191273736 -0.5374709145744418 419 1.0510100500999999e-07
7 3716.707565022117 -0.5427541052567904 419 1.0615201506009999e-07
8 3716.1594767159886 -0.5480883061281929 419 1.0721353521070098e-07
9 3715.607125166057 -0.5523515499317 418 1.08285670562808e-07
10 3715.0504268415325 -0.5566983245244046 418 1.0936852726843608e-07
[...]
802 295.8600904596078 -0.006394422217169904 587 3.6168492926956634e-05
803 295.8625006630451 0.0024102034373072456 592 3.65301778562262e-05
804 295.8537018319783 -0.006388627629519306 592 1.844773981739423e-05
805 295.85064356510844 -0.003058266869857107 590 1.863221721556817e-05
806 295.84687892031536 -0.0037646447930796967 587 1.8818539387723852e-05
807 295.84384134091164 -0.0030375794037240667 595 1.900672478160109e-05
808 295.84056710633945 -0.0032742345721885613 584 1.9196792029417102e-05
809 295.83776376246647 -0.0028033438729835325 597 1.9388759949711275e-05
810 295.8366193762049 -0.0011443862615578837 582 1.9582647549208387e-05
811 295.8319662001932 -0.004653176011686355 595 1.977847402470047e-05
812 295.82935080014795 -0.002615400045272054 585 1.9976258764947475e-05
813 295.8262825137674 -0.0030682863805395755 596 2.0176021352596948e-05
814 295.8269467992082 0.0006642854407914456 589 2.0377781566122918e-05
LMNN converged with objective 295.8262825137674

Master:
1 3719.8858456299436 -inf 419 1e-07
2 3719.3690067170437 -0.516838912899857 419 1.0099999999999999e-07
3 3718.8470839936936 -0.5219227233501442 419 1.0201e-07
4 3718.32002830601 -0.5270556876835144 419 1.030301e-07
/home/will/Code/metric-learn/metric_learn/lmnn.py:62: UserWarning: use_pca does nothing for the python_LMNN implementation
  warnings.warn('use_pca does nothing for the python_LMNN implementation')
5 3717.787790041948 -0.5322382640620162 419 1.0406040099999999e-07
6 3717.2503191273736 -0.5374709145744418 419 1.0510100500999999e-07
7 3716.707565022117 -0.5427541052567904 419 1.0615201506009999e-07
8 3716.1594767159886 -0.5480883061281929 419 1.0721353521070098e-07
[...]
986 296.9272749579874 0.021928604062395607 593 8.894939764557867e-84
987 296.9272749579874 0.021928604062395607 593 4.4474698822789335e-84
988 296.9272749579874 0.021928604062395607 593 2.2237349411394667e-84
989 296.9272749579874 0.021928604062395607 593 1.1118674705697334e-84
990 296.9272749579874 0.021928604062395607 593 5.559337352848667e-85
991 296.9272749579874 0.021928604062395607 593 2.7796686764243334e-85
992 296.9272749579874 0.021928604062395607 593 1.3898343382121667e-85
993 296.9272749579874 0.021928604062395607 593 6.949171691060834e-86
994 296.9272749579874 0.021928604062395607 593 3.474585845530417e-86
995 296.9272749579874 0.021928604062395607 593 1.7372929227652084e-86
996 296.9272749579874 0.021928604062395607 593 8.686464613826042e-87
997 296.9272749579874 0.021928604062395607 593 4.343232306913021e-87
998 296.9272749579874 0.021928604062395607 593 2.1716161534565105e-87
999 296.9272749579874 0.021928604062395607 593 1.0858080767282552e-87
LMNN didn't converge in 1000 steps.

Stores L and G in addition to what was stored before
@wdevazelhes changed the title from "FIX Fix LMNN rollback" to "[MRG] FIX Fix LMNN rollback" on Jul 2, 2018
@wdevazelhes
Member Author

I'll also add some non-regression tests that fail on master and pass with this PR.

@wdevazelhes changed the title from "[MRG] FIX Fix LMNN rollback" to "[WIP] FIX Fix LMNN rollback" on Jul 3, 2018
- test that LMNN converges on a simple example where it should converge
- test that the objective function never has twice the same value
@wdevazelhes changed the title from "[WIP] FIX Fix LMNN rollback" to "[MRG] FIX Fix LMNN rollback" on Jul 3, 2018
X, y = make_classification(random_state=0)
old_stdout = sys.stdout
sys.stdout = StringIO()
lmnn = LMNN(verbose=True)
Contributor

We should use python_LMNN here and in the other test, so that they don't fail when the Shogun version is available.

Member Author

done

out = sys.stdout.getvalue()
sys.stdout.close()
sys.stdout = old_stdout
assert ("LMNN converged with objective" in out)
Contributor

Style nit: the parens here aren't needed.

Member Author

done

finally:
out = sys.stdout.getvalue()
sys.stdout.close()
sys.stdout = old_stdout
Contributor

Might be nice to have this logic in a context manager. See https://eli.thegreenplace.net/2015/redirecting-all-kinds-of-stdout-in-python/
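(For illustration, one way to do this with the standard library; this is a sketch, not the code that ended up in the PR:)

import io
from contextlib import redirect_stdout

from sklearn.datasets import make_classification
from metric_learn import LMNN

buf = io.StringIO()
with redirect_stdout(buf):
    # everything printed by the verbose fit is captured in buf
    X, y = make_classification(random_state=0)
    LMNN(verbose=True).fit(X, y)
out = buf.getvalue()
assert "LMNN converged with objective" in out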

Member Author

Agreed. Also, I just found that pytest has something that seems to do the job quite nicely: https://docs.pytest.org/en/3.2.1/capture.html
But it breaks the structure of the unittest classes a bit... If it is important to keep the previous structure, I'll use the context manager instead.
Tell me what you think.

sys.stdout.close()
sys.stdout = old_stdout
lines = re.split("\n+", out)
objectives = [re.search("\d* (?:(\d*.\d*))[ | -]\d*.\d*", s)
Contributor

Add a comment explaining this regular expression, with an example of what it should be matching.

Member Author

done
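For reference, a self-contained illustration of the kind of parsing this does (hedged: the regex below is simplified and is not necessarily the one used in the test). Each verbose line has the form iteration, objective, delta, number of active constraints, learning rate:

import re

out = """1 3719.8858456299436 -inf 419 1e-07
2 3719.3690067170437 -0.516838912899857 419 1.0099999999999999e-07"""
lines = re.split("\n+", out)
# e.g. "2 3719.3690067170437 -0.516838912899857 419 1.00999...e-07":
# the second whitespace-separated field is the objective value
objectives = [re.search(r"^\d+ (\d+\.\d+) ", s).group(1) for s in lines]
print(objectives)  # ['3719.8858456299436', '3719.3690067170437']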

# we update L and will see in the next iteration if it does indeed
# do better
L -= learn_rate * 2 * L.dot(G)
learn_rate *= 1.01
Contributor

This seems wrong to me, though it was also wrong before this PR.

Reading through the Shogun implementation here and here, they don't do any rolling back.
They do the L update unconditionally in gradient_step, then compute the objective value for the current iteration, then do the learning rate update based on the change in objective.

In one of the reference Matlab implementations here they do the L update first, then optionally roll back to a saved state when updating the step size.

--
So I think the correct fix would be to move the L update into the "do the gradient update" section, after computing the new G, using the existing learning rate. Then, if the objective didn't improve, we can halve the learning rate and roll back to the last good state (including L and G). Otherwise, we just grow the learning rate by 1% and carry on.

Member

Inverting the order of things as suggested would also improve the readability of the code, I think.

Member Author
@wdevazelhes Jul 20, 2018

I just submitted a new commit inverting the order of things. I have commented the code to make it clearer: basically, it starts from a reference point and tries the next candidate point, retrying with a smaller learning rate if needed, until it finds a new reference point with a better objective.
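In pseudo-Python, the new structure looks roughly like this (a hedged sketch; compute_objective_and_gradient is an illustrative placeholder, not a metric-learn function, and this snippet would sit inside the outer iteration loop):

# look for the next reference point: try a gradient step from the
# current reference point (L, G, objective), retrying with a smaller
# learning rate until the objective improves
while True:
    L_next = L - 2 * learn_rate * L.dot(G)
    objective_next, G_next = compute_objective_and_gradient(L_next)
    delta_obj = objective_next - objective
    if delta_obj > 0:
        # this step made things worse: retry with a smaller learning rate
        learn_rate /= 2
    else:
        # improvement: L_next becomes the new reference point
        break
learn_rate *= 1.01
L, G, objective = L_next, G_next, objective_next

Note that the gradient step in this sketch uses L.dot(G); as the author points out in a later comment below, the merged code used G alone (see #201).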

@@ -72,6 +74,37 @@ def test_iris(self):
self.assertLess(csep, 0.25)


def test_convergence_simple_example(capsys):
Contributor

Where does capsys get passed in from?

Member Author

I think it is automatically provided by pytest (as one of the built-in fixtures) and injected when running test_convergence_simple_example. I verified with pytest -v that these tests pass when they should (and fail when I modify the error message).

assert "LMNN converged with objective" in out


def test_no_twice_same_objective(capsys):
Contributor

Should these be methods of TestLMNN?

Member Author

I read here (pytest-dev/pytest#2504 (comment)) that pytest fixtures cannot be combined with unittest classes, so I extracted these tests from the class hierarchy. But I agree that it is not ideal. They propose a workaround in that link, so maybe it would be better? (adding these lines to TestLMNN, putting the tests back in TestLMNN, and replacing capsys with self.capsys in the tests)

@pytest.fixture(autouse=True)
def capsys(self, capsys):
  self.capsys = capsys
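For context, a sketch of how that workaround could look inside the test class (illustrative only; the test body is abbreviated):

import unittest
import pytest
from sklearn.datasets import make_classification
from metric_learn import LMNN

class TestLMNN(unittest.TestCase):

    @pytest.fixture(autouse=True)
    def capsys(self, capsys):
        # expose pytest's built-in capsys fixture to unittest-style methods
        self.capsys = capsys

    def test_convergence_simple_example(self):
        X, y = make_classification(random_state=0)
        LMNN(verbose=True).fit(X, y)
        out, _ = self.capsys.readouterr()
        assert "LMNN converged with objective" in out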

Contributor

I see. The current solution is fine, then.

# previous L, following the gradient:

# we want to enter the loop for the first try, with the original
# learning rate (hence * 2 since it will be / 2)
Contributor

This is a bit confusing to read. How about a while True: loop that breaks on the delta_obj > 0 condition at the end?

Member Author

I agree it would improve readability. Done.

@wdevazelhes
Member Author

@perimosocordiae I addressed your comments, except for the ones about the pytest functions sitting outside of the unittest class structure.
I was not sure what to do: is it OK to leave it this way, or should I go for the option mentioned in #101 (comment)?
Otherwise, I think the PR is ready to merge.

Contributor
@perimosocordiae left a comment

Looks like there's a keyring.deb file added in this PR, which probably shouldn't be there.

Otherwise, I'm +1 to merge after the conflicts are resolved.

William de Vazelhes added 2 commits August 18, 2018 01:31
@wdevazelhes
Member Author

Oops, you're right about the .deb file, I don't know how it got there :p

@perimosocordiae perimosocordiae merged commit efeab88 into scikit-learn-contrib:master Aug 18, 2018
@wdevazelhes wdevazelhes deleted the fix/lmnn_rollback branch August 22, 2018 06:50
# objective than the previous L, following the gradient:
while True:
    # the next point L_next to try out is found by a gradient step
    L_next = L - 2 * learn_rate * G
Member Author

Here it should be 2*learn_rate*L.dot(G), not 2*learn_rate*G... (see #201)

objective = objective_old
else:
    # update L
    L -= learn_rate * 2 * L.dot(G)
