Skip to content

WIP: figure out appveyor issue #367

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

WIP: figure out appveyor issue #367

wants to merge 6 commits into from

Conversation

bcipolli
Copy link
Contributor

We have a test failure in appveyor. The test is within a loop, has no message, and doesn't reproduce locally, so it's hard to debug.

I pushed this change to my repo (use yield for the test, add a message to be clear what test(s) are failing), and it seems to pass. Bizarre. Submitting a PR as a placeholder for exploration / discussion, and so we can solve this issue in appveyor.

@effigies
Copy link
Member

It's a Win32 precision issue, I believe caused by the upgrade to NumPy 1.10. It does have messages, which is a big sequence of tuples of failing conditions.

I suspect updating this line with a more lenient threshold could resolve the issue. Failing that, updating the print statements to be more informative might be a good step.

@bcipolli
Copy link
Contributor Author

👍 will work on that, thanks for the tips!

@bcipolli
Copy link
Contributor Author

Here's the updated output.
image

I'm not sure changing the ulp is going to matter here, as error is expected to be zero for ints, it seems.

@effigies
Copy link
Member

So it seems to be this case that's actually the problem. Nice cleanup.

Still don't have ideas on how to fix it, but it does look more tractable, now.

@bcipolli
Copy link
Contributor Author

On my machine, whenever slope=1, inter=0. However, in this example, inter looks like it's underflowed somehow (it's huge).

I can't figure out where dataobj.slope and dataobj.inter are coming from, however.

@bcipolli
Copy link
Contributor Author

On my machine, whenever slope=1, inter=0. However, in this example, inter looks like it's underflowed somehow (it's huge).

Actually, that's not always true, sorry.

@effigies
Copy link
Member

Slope and intercept will be coming from AnalyzeImage.to_file_map, I think.

@bcipolli
Copy link
Contributor Author

I tweaked the output; it didn't make much sense to me (seemed to be querying absolute differences on relative failure cases, and vice verse).

This seems more interpretable to me, but still not making sense.
image

@effigies
Copy link
Member

Added frequency information.

Test ID: 0.05; in_type=<i8, out_type=<u8
    slope=1.00000e+00, inter=-3.82645e+18
    ABS FAIL: exp_abs_mx_e=5.12000e+02 < abs_mx_e=6.40000e+02; max_diff=1.28000e+02; num=2/10000
    REL FAIL: rel_thresh  =0.00000e+00 < rel_mx_e=7.51984e-13; max_diff=7.51984e-13; num=7438/10000

Absolute fails 2/10000 times, so we should be able to look at individual cases. Relative fails ~75% of the time, but it's a very small error. Possibly picking an arbitrary case or two to print more details of would be instructive?

@effigies effigies mentioned this pull request Nov 3, 2015
@effigies
Copy link
Member

effigies commented Nov 5, 2015

Rebase issue?

@bcipolli
Copy link
Contributor Author

bcipolli commented Nov 5, 2015

So the last commit seems to solve the issue. It's hard to know if that's the right fix--I don't know the new threshhold value is (ulp(scaling_type(inter))), but I'm pretty sure it's a bug in the test code.

Besides that, I'm not sure how much of the test code changes we want to keep. If that's the bug, just two lines really needed to be changed. The others are all printing / cleanup. Which I think is good, except I'm not 100% sure I understood the code well enough to be sure those changes are right. I think they improve the code, but perhaps I've just misunderstood.

@matthew-brett If you have a moment to look things over, that'd be great!

@effigies
Copy link
Member

effigies commented Nov 6, 2015

Just linking to the old commit (83cbfbf) discussion's been happening on.

Do you think this is going to be resolved soon? Since it seems to be a numpy 1.10 issue, it would be good to have it in for 2.0.2.

@bcipolli
Copy link
Contributor Author

bcipolli commented Nov 9, 2015

I will take another look at this, this morning. I've downloaded a Windows 32-bit virtualbox image, and am downloading Python now.

@bcipolli
Copy link
Contributor Author

bcipolli commented Nov 9, 2015

So... I was able to reproduce this locally. I played around a bit, but ran out of time before really debugging if nibabel can do any of the scaling computations differently to reduce the error (I was poking around in arraywriters.py). Not sure when I can get back to it; I'm off to New Orleans in the morning until next week.

@effigies
Copy link
Member

effigies commented Nov 9, 2015

Thanks, Ben.

@matthew-brett How do you feel about blocking 2.0.2 on this PR? I think this is my only block, at the moment.

@matthew-brett
Copy link
Member

Failure is a bit puzzling - we don't seem to be getting it on the buildbots...

@matthew-brett
Copy link
Member

But then again - if the failure is due to numpy 1.10, we wouldn't get it. I'll try to get to this tomorrow.

@matthew-brett
Copy link
Member

OK - I can't replicate on a Windows 7 64-bit bit numpy 1.10.1 from Christoph Gohlke, for the 2.0.2 branch, so this seems to be specific to conda. Is that possible?

@effigies
Copy link
Member

It's the 32 bit test that's failing. (Apologies if I'm missing something and uselessly stating the obvious.)

@matthew-brett
Copy link
Member

OK - the 32-bit test does fail with Christoph's numpy 1.10.1 wheel - I'll investigate.

@effigies
Copy link
Member

@matthew-brett Looks like the latest commit on master resolved the appveyor error. Should we close this PR and backport to 2.0.2?

@matthew-brett
Copy link
Member

Unfortunately, that commit is not a fix, just a cleanup, and it fixes the failure by accident - just pumping another series of random numbers through. There still seems to be an issue on 32 bit - still looking at it...

@matthew-brett
Copy link
Member

Still exploring, it's slow I'm afraid.

@matthew-brett
Copy link
Member

OK - I think I know what's going on now. I don't know why numpy-1.10.1 exposes it, but what is happening is that the values being tested are right at the extreme end of values that can be saved as np.uint64 after subtracting the minimum. The 32-bit builds we are using are built with MSVC, and they have a bug in converting to np.uint64, worked around here : https://github.com/nipy/nibabel/blob/master/nibabel/casting.py#L17

In practice what this means is the maximum np.uint64 value we can generate from a float is a lot smaller on these systems than on any other system, and we are more likely to hit this threshold with our random numbers.

At the moment I'm checking whether we can get away with this kind of intercept-only scaling here : https://github.com/nipy/nibabel/blob/master/nibabel/arraywriters.py#L562

I need to think a little whether I should relax the threshold or make the check tighter to flip to full slope and intercept scaling in this situation. Will try and do that tomorrow.

@matthew-brett
Copy link
Member

No, sorry, I was wrong - it's a numpy + MKL + Windows bug:

numpy/numpy#6685

So, I think we can close this.

@bcipolli
Copy link
Contributor Author

Thanks for figuring this out @matthew-brett; I agree.

@bcipolli bcipolli closed this Nov 14, 2015
@bcipolli bcipolli deleted the appveyor branch November 14, 2015 05:18
@matthew-brett
Copy link
Member

Chris - Ben - would you mind sending me your public ssh keys over email? I want to give you access to the buildbot machines to make it easier for you to debug if you need to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants