
segfault running /pipeline/engine/tests/test_utils.py #1788


Closed
djarecka opened this issue Jan 26, 2017 · 25 comments

Comments

@djarecka
Collaborator

djarecka commented Jan 26, 2017

Summary

Segfault when running tests from /pipeline/engine/tests/test_utils.py

Actual behavior

See e.g. https://travis-ci.org/nipy/nipype/builds/195258843
It appears randomly, but I have never seen it outside py2.7.
It's probably the same problem as described in #1757, but @satra's temporary fix doesn't always seem to work.

I'm not sure if my recent failures in different tests might be related: https://travis-ci.org/djarecka/nipype/jobs/189379626

How to replicate the behavior

From my experience, it might appear more often on some systems than on others. It requires Python 2.7 (the other Travis settings are not important, IMO). You can use the Docker container nipype/nipype_test:py27 and run pytest -v nipype/pipeline/engine/tests/test_utils.py (after installing pytest) multiple times; 10 runs should be enough.

It usually segfaults when running test_mapnode_crash2 or test_mapnode_crash3, but running either of these tests separately does not give a segfault (at least not as often).
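The repetition recipe above can be sketched as a loop that also detects the segfault exit status. The docker/pytest invocation is the assumed real command (shown as a comment); `run_tests` is a stand-in so the loop logic itself runs anywhere.

```shell
# Sketch of the repeated-run recipe, with segfault detection.
run_tests() {
  # Real invocation (inside nipype/nipype_test:py27, pytest installed):
  #   pytest -v nipype/pipeline/engine/tests/test_utils.py
  true
}

for i in $(seq 10); do
  run_tests
  status=$?
  # A process killed by SIGSEGV exits with 128 + 11 = 139.
  if [ "$status" -eq 139 ]; then
    echo "run $i: Segmentation fault"
  else
    echo "run $i: exit $status"
  fi
done
```

Checking the exit status matters because, as discussed later in this thread, a segfault kills the whole pytest process and leaves no failure in the test report itself.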

@satra
Member

satra commented Jan 27, 2017

Those particular tests will pass if unicode_literals is turned off, so it's some interaction between traits and unicode literals.
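A minimal illustration of the interaction being described; the traits internals are an assumption here, not a confirmed root cause. Under Python 2, the future import silently changes the type of every bare string literal, which is what traits' validators then receive.

```python
from __future__ import unicode_literals

# Under Python 2, this future import turns every bare string literal
# into a `unicode` object instead of a byte `str`; under Python 3,
# literals are already text, so nothing changes.
s = "spam"

# On py2, traits' C-accelerated validators that expect byte strings
# would receive `unicode` objects from any module using this import --
# the kind of type mismatch suspected in this thread.
print(type(s).__name__)  # "unicode" on py2, "str" on py3
```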

@oesteban
Contributor

traits has C-level "fast" checking for some types. It looks to me like that could be the source of the segfault.

@djarecka
Collaborator Author

It looks better after updating traits to 4.6, but I have to test more.

@djarecka
Collaborator Author

Traits 4.6 either solves the problem or hides it well enough that the segfault occurs very rarely (I'm having trouble triggering one). Since I don't understand this segfault, I can't really say which it is.

By the way, I've just found that this is not the first time we've had problems with traits & unicode_literals errors that are not reported by Travis (at least not in every run): #1621

@satra
Member

satra commented Jan 29, 2017

@djarecka - the change you made in interfaces.base for dynamic traits was based on traits 4.6. Perhaps we should set that as the minimum version.

@djarecka
Collaborator Author

Yes, all tests should pass with traits 4.6. I've created a PR, but I would keep this issue open.

@djarecka
Collaborator Author

djarecka commented Feb 3, 2017

It looks like Travis still hits the segfault with traits 4.6.

oesteban added a commit to oesteban/nipype that referenced this issue Feb 3, 2017
@mgxd
Member

mgxd commented Feb 8, 2017

@djarecka @satra how do we proceed with this? Would moving from traits to traitlets help?

@mgxd mgxd added the bug label Feb 8, 2017
@djarecka
Collaborator Author

djarecka commented Feb 9, 2017

@mgxd - I will check traitlets as @satra suggested.

@oesteban
Contributor

The most prominent issue I can foresee switching to traitlets right now is the different use of metadata (https://traitlets.readthedocs.io/en/stable/migration.html#separation-of-metadata-and-keyword-arguments-in-traittype-contructors).

Other than that, I guess the migration will be less painful than it might seem.
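The metadata difference linked above can be sketched briefly. This is a hedged illustration assuming traitlets >= 4.1; the class and trait names are illustrative, not taken from nipype.

```python
from traitlets import HasTraits, Unicode


class NodeSpec(HasTraits):
    # enthought-traits style (what nipype uses today) would be roughly:
    #     name = Str("", desc="node name", mandatory=True)
    # i.e. arbitrary metadata passed as constructor keywords.
    # traitlets separates metadata from keyword arguments: unrecognized
    # constructor keywords are rejected, and metadata must instead be
    # attached via .tag():
    name = Unicode("").tag(desc="node name", mandatory=True)


spec = NodeSpec()
print(spec.trait_metadata("name", "mandatory"))  # → True
```

Every place nipype passes metadata as trait constructor keywords would need this kind of rewrite, which is the bulk of the migration cost being discussed.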

@satra
Member

satra commented Feb 10, 2017

@djarecka - let's do the migration after the API change. For now, let's figure out a fix for the segfault (it will require Python in debug mode plus gdb) and the new API.

@oesteban
Contributor

I have run the tests 10000x, using the new nipype/nipype_test:py27 docker image:

for i in $(seq 10000); do pytest -v nipype/pipeline/engine/tests/test_utils.py >> log.txt; done

No segfaults.

@djarecka
Collaborator Author

@oesteban - which version of traits do you have?
After changing traits version to 4.6 I also couldn't see segfaults in the nipype/nipype_test:py27 docker image.

Unfortunately, if you're patient enough, Travis still might return one. The branch includes your PRs.

@oesteban
Contributor

oesteban commented Feb 17, 2017

@djarecka I tested with traits 4.4, being aware of the possible influence of the traits version.

For some reason, the Travis settings are triggering this segfault; you may want to take a look at Travis' Docker image. Back when I was programming mostly in C++, the number-one reason for a random segfault was generally an uninitialized variable. However, I cannot tell how that applies to our setup.

@djarecka
Collaborator Author

@oesteban - I was double-checking, since the nipype_test:py27 image I pulled and used was giving me a segfault before updating traits to 4.6. I will follow your suggestion and build my own image, but the segfault is not related to the Travis settings only: I was able to reproduce it in at least 3 different environments outside Travis.

@djarecka
Collaborator Author

I will try to debug the segfault on my OSX first; it might be easier.

@oesteban
Contributor

oesteban commented Apr 4, 2017

@mgxd mgxd added the watchlist label Apr 4, 2017
@mgxd
Member

mgxd commented Apr 24, 2017

Rare but still happening - latest #1968

environment variables:

export INSTALL_DEB_DEPENDECIES=false
export NIPYPE_EXTRAS="doc,tests,fmri,profiler"
Python 2.7.12

error:

nipype/pipeline/plugins/sge.py::nipype.pipeline.plugins.sge.qsub_sanitize_job_name /home/travis/.travis/job_stages: line 53:  6816 Segmentation fault

@effigies
Member

effigies commented May 9, 2017

I'm noticing this happening more. I wonder if it's related to the Precise update.

@oesteban
Contributor

oesteban commented May 9, 2017

I agree

@mgxd
Member

mgxd commented May 9, 2017

would it be wrong to simply skip test_mapnode_crash3 for the time being?

@effigies
Member

effigies commented May 9, 2017

Also getting it on test_mapnode_crash2.

I will say these tests are extremely well named.

@djarecka
Collaborator Author

@mgxd - you can get a segfault from different tests in this set, depending on the environment, running order, etc. If for some reason you decide to ignore them, it might be better to use xfail.

@djarecka
Collaborator Author

djarecka commented May 10, 2017

@mgxd, @satra I was wrong: xfail works fine if a test fails, but these tests are not actually failing...
@satra - #2006
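A sketch of why skip (rather than xfail) ends up being the workable marker here, assuming the problem is a segfault rather than a test failure: xfail records an outcome inside the interpreter, but a segfault kills the whole pytest process before any outcome can be recorded. The test names mirror the thread; the reason strings are illustrative.

```python
import pytest

# xfail only helps when a test raises or asserts *inside* the
# interpreter; a segfault terminates the pytest process first, so an
# xfail marker never gets a chance to record anything. Skipping avoids
# running the crashing code at all.

@pytest.mark.skip(reason="may segfault on py2.7 + unicode_literals, see nipype#1788")
def test_mapnode_crash2():
    pass


@pytest.mark.skip(reason="may segfault on py2.7 + unicode_literals, see nipype#1788")
def test_mapnode_crash3():
    pass
```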

@mgxd
Member

mgxd commented Oct 19, 2017

Since we've decided to skip them, I'm closing the issue, but leaving the watchlist label should this reappear down the line.

@mgxd mgxd closed this as completed Oct 19, 2017