Skip to content

First point of LazyTractogram is not saved #586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
nilgoyette opened this issue Dec 13, 2017 · 4 comments
Closed

First point of LazyTractogram is not saved #586

nilgoyette opened this issue Dec 13, 2017 · 4 comments
Labels

Comments

@nilgoyette
Copy link

This test should pass

def test_save_from_generator(self):
    tractogram = Tractogram(DATA['streamlines'],
                            affine_to_rasmm=np.eye(4))

    # Just to create a generator
    filtered = (s for s in tractogram.streamlines if True)
    lazy_tractogram = LazyTractogram(lambda: filtered,
                                     affine_to_rasmm=np.eye(4))

    for ext, _ in FORMATS.items():
        with InTemporaryDirectory():
            filename = 'streamlines' + ext
            nib.streamlines.save(lazy_tractogram, filename)
            tfile = nib.streamlines.load(filename, lazy_load=False)
            assert_tractogram_equal(tfile.tractogram, tractogram)

The problem comes from

first_item = next(iter(self.tractogram))
...
for t in tractogram:

When self.tractogram is a generator, first_item is not in the generator anymore.

I took 10 minutes to fix it but I couldn't find a clean way to do it. A simple itertools.chain([first_item], tractogram) is not enough :)

@effigies effigies added the bug label Dec 13, 2017
@effigies
Copy link
Member

@MarcCote Do you have time to look into this?

@nilgoyette
Copy link
Author

I fixed it, but I really don't like what I did.

reinsert_first_point = isinstance(self.tractogram, LazyTractogram)
...
try:
    first_item = next(iter(self.tractogram))
...
if reinsert_first_point:
    backup = self.tractogram._streamlines

    def chain_first_and_all_other():
        yield from itertools.chain([first_item.streamline], backup())

    self.tractogram._streamlines = chain_first_and_all_other
...
tractogram = self.tractogram.to_world(lazy=True)

If @MarcCote or anyone have a better solution, go for it!

@MarcCote
Copy link
Contributor

@nilgoyette the streamlines you provided to LazyTractogram is not a generator function, i.e. calling it should provide new generators every time. For instance,

In [3]: gen = (s for s in range(5))

In [4]: lmb = lambda: gen

In [5]: print(list(lmb()))
[0, 1, 2, 3, 4]

In [6]: print(list(lmb()))
[]

@MarcCote
Copy link
Contributor

MarcCote commented Dec 21, 2017

As @nilgoyette told me offline, this particular scenario happens when you do tracking and you want to dump streamlines into a file as you generate them. This is obviously a valid reason. We definitively want to support writing in "lazy" mode.

One thing we could do is to add a flag to LazyTractogram that tells us whether the streamlines (and other scalar infos) is a generator function or simply an instanciated generator (that can be consumed only once).

matthew-brett added a commit that referenced this issue Jan 12, 2018
MRG: fix #596 First point of `LazyTractogram` is not saved

With this PR, we now assume tractogram objects can be iterated over only once at saving time.

This PR fixes issue #586.
Thanks to @nilgoyette for providing the missing unit test and part of the solution
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants