Streamlines API #243
Conversation
@MrBago - any thoughts on this?
Hello nibabel devs. Any feedback on this? It's been here since July. The way we save trk files now is ugly and the current API does not fit with the rest of nibabel. This PR, I think, is a great improvement. I hope you can have a look at this soon. Dipy would greatly benefit from this as we use the trk format all the time. Thx!
Sorry - I was hoping for some feedback from streamlines users. @MrBago - any thoughts? I will have a look in the next few days.
This looks good to me overall and we've needed to improve this for some time now, so thanks for working on this. In my opinion, the hardest part of doing this right is dealing with the coordinate systems. Specifically, I think that if a user saves a '.trk' file, for example, they expect the points to be remapped so that "Trackvis" will display the streamlines correctly with respect to their background images. I know this is a PITA and I agree that this is a flaw in the ".trk" format (the format should use the affine information in the header to figure out the coordinate system, but that's just not how this format is used). I don't think points need to get remapped when they are read, as long as there is a consistent way of getting the coordinate information.

I guess my point is that I'd like to push the abstraction here just a little bit more, so that users can save and load files without knowing anything about the implementation of the specific file format they are using. I think you're already 90% of the way there, but we should really get to this tipping point because it'll make the code that much better.
@MarcCote, what's the status of this PR? Can I help in some way?
I'm not convinced yet that '.trk' should always be saved as "Trackvis" wants them. Maybe for the 'trk' and 'tck' formats it is clear they are associated with 'Trackvis' and 'MRtrix', but some software saves its streamlines using the 'vtk' format (e.g. ExploreDTI) and it is not clear what space their streamlines live in. That is why I preferred loading/saving streamlines as is, and was planning on using the name of the software to create transformation functions for streamlines.

As proposed by @jchoude, maybe it is better to modify the …

@MrBago, what do you think about the last approach?
I think the goal in supporting existing file formats (especially file formats that seem kind of half-baked to begin with) is interoperability with other software packages. To that end, I think we should strive to create files that conform to the written specs and unwritten norms that have formed for the given extension types. For example, if we choose to write ".tck" files, I think those files should be usable in any software that knows about and can use ".tck" files. We don't have to support all the features of a file format, but I don't think we should write files which will be misinterpreted by other software tools.

Also, when reading streamlines, I think a user should be able to use the streamlines without knowing what type of file they came from, much the way that nibabel currently allows me to read an analyze, nifti or MGH image and then use the image object pretty much the same way. Please let me know if I've misunderstood what you're proposing.
Knock! Knock!
Sorry it took me so long to get back to you.
Sorry, I didn't realize you were waiting for my reply. I agree with what you said. Using a consistent internal representation is the approach I used when implementing the TractConverter. It required having to specify a reference point, either as an affine matrix, a nifti image or the convention's name (e.g. LPS), so we can project the streamlines into a predetermined space. Some users found it a little bit confusing because, depending on the format, you don't have to specify a reference point (TRK vs TCK).

We should determine and use an internal representation. What is the most convenient representation to manipulate streamlines when used in Dipy? Voxel or mm, RAS or something else, and does the coordinate (0,0,0) represent the center of a voxel or one of its corners?
In dipy we've been using an affine to represent the mapping between voxel coordinates and the streamlines' coordinate system.

Bago
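For readers following the coordinate-system discussion, here is a minimal standalone sketch (not code from this PR) of that convention, using nibabel's existing apply_affine helper. The file name dwi.nii.gz and the example points are made up for illustration.

import numpy as np
import nibabel as nib
from nibabel.affines import apply_affine

img = nib.load('dwi.nii.gz')            # hypothetical reference image
affine = img.affine                      # maps voxel indices -> world (RAS+) mm

# One streamline as an (N, 3) array of voxel coordinates.
streamline_vox = np.array([[0., 0., 0.], [1., 1., 1.], [2., 2., 2.]])

# Project into the space the streamline "lives in" (here: world mm).
streamline_world = apply_affine(affine, streamline_vox)

# Inverse mapping, back to voxel coordinates for interacting with the volume.
streamline_vox_again = apply_affine(np.linalg.inv(affine), streamline_world)

This is essentially the contract described here: as long as the affine travels with the streamlines, the space they are stored in becomes an implementation detail.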
This sounds similar to reading and writing csv files with different encodings, delimiters, etc. Similar to loading volume data, one could define functions like get_affine(), get_vertex_values(), get_streamline_values(), etc., whatever the file formats support. I guess when writing streamline files we can specify to be compatible with some common software packages, and when reading files there might be guesswork. If the file format does not tell which coordinate system it uses, one has to guess or follow a given dialect. And it would be great to have a standard way to transform coordinates after loading. Personally, for vtk and tck files I would prefer the world coordinate system (because this is what e.g. ParaView and mrview use), but internally we should get voxel coordinates for best interaction with volumes.
Following @MrBago's advice, I modified the streamlines API. Specifically, when loading from a file, streamlines will be projected into voxel space. I agree with @Jan-Schreiber that interactions with volumes are simpler that way (which is probably the main reason someone would want to load streamlines using NiBabel). It is worth mentioning that in voxel space, the X coordinate of any streamline point will lie between …

On another point, I'm starting to add support for other streamlines file formats (e.g. TCK) and I was wondering what metadata should be kept when converting streamlines from one format to another? Right now, a …
Just to let you guys know how things work in dipy: streamlines are allowed to be represented in an arbitrary coordinate system. Dipy functions that require streamlines to interact with volumes require the streamlines and the affine which maps the volume voxels to the streamlines' coordinate system. For example, if the streamlines are in the (RAS+) coordinate system of a nifti file, then the affine of the nifti file should be used. Dipy tracking classes also take a data volume and an affine. They return the streamlines in the affine coordinate system, i.e. a point in the center of voxel …

I think on reading, returning the streamlines in voxel coordinates is OK, but returning them in any other coordinate system would also be fine as long as the affine is also returned. Does this PR require that streamlines be moved to voxel coordinates before they are given to the save functions?
Yes, I agree with you: returning streamlines in any coordinate system would be fine as long as you provide the affine matrix (volume voxel coordinates -> streamlines coordinates). I just think having them already in voxel coordinates would be simpler for interacting with volumes (no need to apply the inverse affine matrix). Right now, …

Also, streamlines can be created from a list of points if desired (no need to be saved). For example, creating 10 random streamlines each having 20 points in 3D would look like this:

>>> s = Streamlines(points=np.random.rand(10, 20, 3))
>>> s.header[Field.VOXEL_TO_WORLD]
array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])
>>> len(s)
10

So Streamlines objects are not forced to be in voxel space, and they contain an affine matrix in their header (the default is the identity matrix). However, …

Whenever you have time I would appreciate some feedback about the two main files …

Do you think it is a good approach having a …
Will take a closer look next week. It's a pretty hefty PR, 2000+ lines, but …

Bago
@MrBago do you know if TrackVis (and the Diffusion Toolkit) considers the coordinate (0,0,0) of a streamline's point to be in the middle of the voxel or at the corner? I know that MRtrix uses the former convention.
Trackvis uses it as the corner, i.e. all negative points are outside the …

Bago
@MrBago I refactored a lot of code and added a lot of comments. I'm sure it will be easier to read as it is now. Now, a …
@matthew-brett how can I see exactly what coverage I'm missing (which lines)? When I click on the details link for "coverage/coverall", it brings me to a page where I can see that the file …
@matthew-brett I struggled with that problem for a while, trying symlinks and other solutions; the only solution seemed to be running from the package source directory. So for mne-python we run all tests except one Travis run from the source directory, and for that last test we don't do a coverage report, but it checks to make sure everything still works as an installed package. @MarcCote you should be able to do something like …
# voxel whereas streamlines returned assume (0,0,0) to be the
# center of the voxel. Thus, streamlines are shifted of half
# a voxel.
pts -= np.array(self.header[Field.VOXEL_SIZES])/2.
Is this right? After the points have been divided by VOXEL_SIZE, don't you just want to do -.5? I think you want (pts - voxel_size / 2) / voxel_size or (pts / voxel_size - .5).
Yep, good catch, you are definitely right. Clearly, it is missing unit tests!
Thanks.
For IO stuff I like to include a write-read test to at least make sure that I can at least recreate in python what I started with :).

On Thu, May 14, 2015 at 12:58 PM Marc-Alexandre Côté <[email protected]> wrote:

In nibabel/streamlines/trk.py #243 (comment):
break
# Read number of points of the next streamline.
nb_pts = struct.unpack(i4_dtype.str[:-1], nb_pts_str)[0]
# Read streamline's data
pts, scalars = read_pts_and_scalars(nb_pts)
properties = read_properties()
# TRK's streamlines are in 'voxelmm' space, we send them to voxel space.
pts = pts / self.header[Field.VOXEL_SIZES]
# TrackVis considers coordinate (0,0,0) to be the corner of the
# voxel whereas streamlines returned assume (0,0,0) to be the
# center of the voxel. Thus, streamlines are shifted of half
# a voxel.
pts -= np.array(self.header[Field.VOXEL_SIZES])/2.
Yep, you are definitely right. Clearly, it is missing unit tests!
I'll go with the -.5 fix then.
Yes. I have that test but with an identity as the affine, so np.array(self.header[Field.VOXEL_SIZES])/2 == 0.5.
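To make the fix discussed above concrete, here is a small standalone sketch (mine, not code from the PR) of the corrected voxel-mm to voxel conversion; the voxel sizes and points are made-up values, chosen different from 1 mm precisely so the bug would be visible.

import numpy as np

voxel_sizes = np.array([2., 2., 4.])      # made-up, deliberately not 1 mm

# Points as read from a TRK file: voxel-mm, with (0,0,0) at the corner of the first voxel.
pts_voxmm = np.array([[1., 1., 2.],       # center of voxel (0, 0, 0)
                      [3., 3., 6.]])      # center of voxel (1, 1, 1)

# Correct conversion: divide by the voxel sizes first, then shift by half a voxel (0.5, unitless).
pts_vox = pts_voxmm / voxel_sizes - 0.5

# The original code subtracted half the voxel *size* after dividing,
# which only agrees with the correct result when voxel_sizes == 1.
pts_vox_buggy = pts_voxmm / voxel_sizes - voxel_sizes / 2.

assert np.allclose(pts_vox, [[0, 0, 0], [1, 1, 1]])
assert not np.allclose(pts_vox_buggy, pts_vox)

A write-read round-trip test with a non-identity voxel size, as suggested above, is exactly the kind of test that would have caught this.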
Guys - where are we on this one? Bago - would you be interested in taking charge of this one, and merging when you think it's ready?
I've been improving and refactoring the API a lot and I'm afraid this PR is becoming too big for a thorough review. I propose to split it in two: 1) Tractogram, 2) TractogramFile.
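Roughly, the split being proposed separates a format-agnostic container from format-specific I/O. The following is only a sketch of that idea (not code from the PR); the class names come from the comment above, while the attribute and method names are guesses based on the data_per_point / data_per_streamline vocabulary used elsewhere in this thread.

class Tractogram(object):
    """Format-agnostic container: streamlines plus data attached to them."""
    def __init__(self, streamlines, data_per_streamline=None, data_per_point=None):
        self.streamlines = streamlines                        # sequence of (N_i, 3) arrays
        self.data_per_streamline = data_per_streamline or {}  # e.g. {'mean_curvature': ...}
        self.data_per_point = data_per_point or {}            # e.g. {'fa': ...}


class TractogramFile(object):
    """Format-specific I/O: owns the file header and reads/writes a Tractogram."""
    def __init__(self, tractogram, header=None):
        self.tractogram = tractogram
        self.header = header or {}

    @classmethod
    def load(cls, fileobj):
        raise NotImplementedError()   # provided by a concrete TRK/TCK subclass

    def save(self, fileobj):
        raise NotImplementedError()

Keeping the header on the file class is what lets a Tractogram survive conversion between formats without dragging format-specific metadata along.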
def _read():
    for pts, scals, props in trk_reader:
        data_for_points = {k: scals[:, v] for k, v in data_per_point_slice.items()}
        data_for_streamline = {k: props[v] for k, v in data_per_streamline_slice.items()}
For Python 2.6 compatibility:
data_for_points = dict((k, scals[:, v]) for k, v in data_per_point_slice.items())
data_for_streamline = dict((k, props[v]) for k, v in data_per_streamline_slice.items())
@MarcCote this is excellent work. But I think it is not a good idea to push this completely new design into this older PR, because it has now changed so much that most of the previous discussions are no longer relevant. That makes reviewing this PR hard. I think to help the reviewers it would be better if you make a new PR (and close this one) and write in the description the main new features of this PR and the specific design decisions, with a short explanation. For example, now we go to RAS+ by default; why? Also, we introduce the concept of a Tractogram, which is a higher level of streamlines where you can save more info on the streamlines, but we don't save header information there; that goes to the TractogramFile, etc. But why do we do such a thing? This will help a lot. Stay laconic but give the correct info please.

This PR is absolutely crucial for the work we do in dMRI and it will be used by many people. It also solves an important memory/speed/management issue in Dipy. So, please @jchoude, @mdesco, @gabknight, @MrBago, @Jan-Schreiber, @martcous and @arokem, try this PR and give some nice reviews asap so we can finally have a stable way of reading and saving streamline-related files. We really need to get better at this ...

Also, about splitting it in two, maybe that's not a good idea because it will make it difficult to try this with real data. The PR is indeed quite large but I don't see an easy way of splitting it and trying it with real data.

Finally, @matthew-brett, we worked together with @MarcCote to make this PR and we tried to make the API look as close as possible to what has been done for the Nifti files, so that the API looks homogeneous. This PR should be merged before the 0.11 Dipy release because many new PRs depend on this new API which, as I wrote above, simplifies many current issues, both with tractography file formats and with internal Dipy design. So, I would hope that it can be prioritized and merged relatively soon. Of course with some quality reviews first. But as you can see for yourself, @MarcCote has already done some excellent work here. The day when loading/saving streamlines is as easy as loading/saving nifti files is closer than ever :) 👍
except:
    return False

return True
This will consume generators:
>>> def f():
...     for i in range(5):
...         yield i
...
>>> x = f()
>>> isiterable(x)
True
>>> list(x)
[]
What about:
def isiterable(val):
    try:
        iter(val)
    except TypeError:
        return False
    return True
Edit: Apparently you can also just import collections and use isinstance(val, collections.Iterable).
You are right. However, isinstance(val, collections.Iterable) simply checks if the object has an __iter__ method instead of testing that you can actually iterate through it. The same goes for testing that calling iter is not failing.

I will agree that isiterable might not be the right name for this function, but I lack a better one right now (if you have a suggestion let me know). I use this isiterable function to loop through all elements of a given object, catching exceptions if any (could happen for my objects).
I see. I didn't look deeply into how it was being used. Maybe something like check_iteration?
I like it. I replaced isiterable by check_iteration.
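For context, a possible shape for that helper; this is only a sketch consistent with the behaviour described above (walk every element and treat any exception raised during iteration as failure), not necessarily the code that ended up in the PR.

def check_iteration(obj):
    """Return True only if every element of `obj` can be iterated through
    without an exception being raised. Note this consumes generators."""
    try:
        for _ in obj:
            pass
    except Exception:
        return False
    return True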
Ah yes, playing with different streamline formats is indeed a horrible experience (and I just started playing with it), so prioritising this instead of playing with a ton of converters is a good idea.
Agreeing with @Garyfallidis, I'm closing this one. See PR #391 instead.
This is an attempt at providing a general API to manage different streamlines file formats. This is related to BIAP5.

Right now, only the TRK format is supported but I can easily add support for both TCK and VTK. One difference you might find with nibabel.trackvis is that it loads the streamlines as they were saved. There is no points_space argument; instead, some functions should be used to change the space where streamlines live: change_space(streamlines, ref=affine/nifti), trackvis2mrtrix(streamlines, ref=affine/nifti), etc.

I can't wait to know what you guys think about this. At some point, I really see the TractConverter being part of NiBabel.
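The description names change_space(streamlines, ref=affine/nifti) without showing it. Purely as an illustration of the idea (a sketch with a simplified signature, not the PR's implementation), such a helper could compose the reference affine with the streamlines' current affine and apply the result to every point.

import numpy as np
from nibabel.affines import apply_affine

def change_space(points, current_affine, ref_affine):
    """Map (N, 3) streamline points from the space described by `current_affine`
    (voxel -> current space) into the space described by `ref_affine`
    (voxel -> target space). Both affines are 4x4."""
    # Go back to voxel indices, then forward into the reference space.
    transform = np.dot(ref_affine, np.linalg.inv(current_affine))
    return apply_affine(transform, points)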