
ENH: PAR/REC data ordering and b-vector normalization #278


Closed · wants to merge 18 commits

Conversation


grlee77 commented Nov 18, 2014

Additional PAR/REC functionality was added (building upon the recent merge of #253, #261 and #264 as implemented by @matthew-brett and @Eric89GXL)

Aside from a couple of minor fixes, two new features were added:

1.) ability to normalize b-values & b-vectors

The --norm-bvs argument to parrec2nii enables the auto-scaling; otherwise the old behavior is maintained. If all b-vectors were properly normalized to 1.0 and the b-values were stored correctly, this new functionality would not be needed, so it is disabled by default.

I implemented this because I have some datasets where the .PAR header came from a custom pulse sequence where the b-value label was left at 3000 for all volumes (even the b=0 ones!), and the only way to identify the b=0 volumes was that their gradient vector is [0, 0, 0]. Scaling the b-values by the norm of the gradient vector gives the proper b-value array. (I did not rescale the b-values/vectors where the norm was within [1.0 +/- 1e-2], to avoid the b-value array ending up with nearly identical but distinct b-values such as [2999.99, 3000.01, etc.] due to the numerical precision of the stored b-vector arrays.)

The non-normalized behavior is still required for other cases, e.g. when using a product DWI sequence with 15 different b-values and requesting that only the trace volumes be stored. In this case, the gradient vectors will all be [0, 0, 0], but the b-values are an array of 15 unique values. If the norm of the gradient vectors had been used, the b-value array would have been all zeros instead of the appropriate values.
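
A minimal sketch of the scaling behavior described above (the helper name and exact tolerance handling are illustrative, not the PR's actual implementation):

import numpy as np

def normalize_bvals_bvecs(bvals, bvecs, tol=1e-2):
    # Illustrative sketch only: rescale b-values by the b-vector norms and
    # return unit-length b-vectors, leaving values alone when the vectors are
    # already unit length to within `tol`.
    bvals = np.asarray(bvals, dtype=float).copy()
    bvecs = np.asarray(bvecs, dtype=float).copy()
    norms = np.sqrt(np.sum(bvecs * bvecs, axis=1))  # avoids norm(..., axis=) for old numpy
    rescale = np.abs(norms - 1.0) > tol             # skip vectors that are already ~unit length
    bvals[rescale] *= norms[rescale]                # b=0 volumes ([0, 0, 0]) end up with bval 0
    nonzero = norms > 0
    bvecs[nonzero] /= norms[nonzero, np.newaxis]    # unit-normalize the nonzero vectors
    return bvals, bvecs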

2.) store the labels corresponding to the fourth data dimension
I think something along these lines is extremely important and would be interested in feedback on how best to implement it. For now, I search through a set of potentially varying image_defs keys such as "label type", "echo number", etc. For any whose values vary across volumes, I maintain a dictionary with the corresponding indices. This dictionary can then be used later by the user to provide custom resorting of the data, reshaping to >4D, etc. I did not implement any custom resorting options within the classes; for now parrec2nii just writes the dictionary of orderings out to a .json file and leaves it up to the user to resort in the desired manner.
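
A rough sketch of that idea, assuming image_defs is the numpy structured array from the .PAR header with one row per volume (the candidate key names are just examples, not a definitive list):

import numpy as np

def varying_key_labels(image_defs, candidate_keys=('label type', 'echo number',
                                                   'dynamic scan number')):
    # Illustrative sketch: keep only the keys whose values actually vary,
    # recording one value per volume for each of them.
    labels = {}
    for key in candidate_keys:
        if key not in image_defs.dtype.fields:
            continue                       # key absent in older .PAR versions
        values = image_defs[key]
        if len(np.unique(values)) > 1:     # only keys that vary are informative
            labels[key] = values.tolist()  # lists are convenient for JSON export
    return labels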

An example of where this is useful is a simple ASL experiment with 8 dynamics and 2 label types (dataobj.shape[3] = 16). Without labeling along the fourth dimension, we will not know which volumes correspond to label type = 1 and which correspond to label type = 2 (i.e. control vs. tag volumes). One logical data ordering for the 16 total volumes is:

{
  "dynamic scan number": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8],
  "label type": [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
}

However, depending on how the data were exported, the ordering within the .PAR could equally well have been:

{
  "dynamic scan number": [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8],
  "label type": [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
}

Without saving some sort of ordering info along with the .nii, there is no clean way to tell these two cases apart and properly process the resulting .nii files. I doubt JSON is the optimal way to do this, but it is easily readable and illustrates the desired functionality; I am certainly open to other suggestions. The .json output corresponds to the new PARRECArrayProxy._dim_4_labels property.
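
For example, a user could consume the exported .json to regroup volumes after conversion; the file names below are placeholders, and np.lexsort is just one way to do the resorting:

import json
import numpy as np
import nibabel as nib

img = nib.load('asl.nii')                       # converted image (placeholder name)
with open('asl_ordering.json') as f:            # exported orderings (placeholder name)
    order = json.load(f)

label = np.array(order['label type'])
dyn = np.array(order['dynamic scan number'])
idx = np.lexsort((dyn, label))                  # sort by label type, then dynamic scan number
data = np.asanyarray(img.dataobj)[..., idx]     # volumes regrouped along the 4th dimension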

The sorted_labels function of the PARRECHeader also has the ability to include per-slice info such as slice angulation, and store this into PARRECArrayProxy._dim_3_4_labels. This allows .PAR/.REC files corresponding to scans such as 3-plane localizers, etc. to have their raw array data read in along with the corresponding dimension info. These volumes with varying slice angulation still cannot be read into a PARRECImage object, though, because PARRECArrayProxy.get_slice_orientation only allows unique values for slice angulation so that a unique affine can be determined for the PARRECImage.

…d for sorting vector properties. proper 2D ndarray to list of lists for JSON export in parrec2nii

grlee77 commented Nov 18, 2014

fixed bug in PARRECHeader.sorted_labels for non-axial datasets. all tests are now passing

from copy import deepcopy  # assumed import; not shown in the truncated diff

def sort_info_to_lists(sort_info):
    sort_info = deepcopy(sort_info)
    # convert elements from numpy arrays to lists for easy JSON export
    for key in sort_info:
        sort_info[key] = sort_info[key].tolist()  # loop body assumed; not shown in the diff
    return sort_info

easier to do `for key, value in sort_info.items()` since you effectively use `value` a lot below

@larsoner

Most of my comments are minor style / structure issues; I'll let @matthew-brett address how he wants the API to look (e.g. JSON vs some other format).


grlee77 commented Nov 18, 2014

@Eric89GXL, thanks for the quick and detailed review. All of the suggested changes look reasonable to me. I will wait for any additional comments from @matthew-brett before implementing them.

@larsoner

It might be slightly better if you implement my changes before @matthew-brett looks, since that way he could look at a closer-to-final version of the PR, but it's not a big deal either way.

@@ -223,6 +249,18 @@ def proc_file(infile, opts):
fid.write('%s ' % val)
fid.write('\n')

opts.dim_info = True # TODO: remove hard-coded value

:) needs doing I guess

@matthew-brett

Thanks for doing this. I might have missed it, but there seems to be quite a bit of extra code with not much in the way of tests - is that right?


grlee77 commented Nov 19, 2014

yes, I have not yet added any tests specific to the new functionality. I wanted to see what the general opinion was on adopting the proposed changes before spending time adding tests.

@matthew-brett

I am planning to do a release candidate tomorrow; I think it's too late to get this branch in by then, but we should do another release soon in any case. Gregory - do you feel strongly about that, or are you happy to wait a month or so for the next release?


grlee77 commented Nov 19, 2014

Matthew - I am fine with waiting until the next release. I will be traveling most of today and tomorrow with limited internet access so it may take me a day or two to make the requested changes here.


grlee77 commented Nov 19, 2014

Several things changed based on the reviews above:

  1. I changed the name of the PARRECHeader method "sorted_labels" to the hopefully
     clearer "get_volume_labels". Also, in the case that the individual slice
     indices are requested, they are now reshaped to match the data shape along the
     3rd and 4th dimensions.
  2. I removed the private _dim_4_label, etc. attributes from the PARRECArrayProxy
     object, so there are no longer any changes proposed for that object.
  3. The parrec2nii code was simplified and now only exports labels for the 4th
     dimension. Advanced users can still access the slice-wise labels by calling
     img.header.get_volume_labels(collapse_slices=False).
  4. I added the dimension info command line option to parrec2nii. For now it
     defaults to False. It may be worth instead defaulting to False in the case
     that there is only a single dynamic variable (or, for diffusion scans, a
     combination of bvals/bvecs), but to True if there are multiple dynamic
     variables that could lead to confusion if the ordering info is not exported.
     What do you think?
  5. For compatibility with .PAR versions < 4.2, the following check was added to
     remove any of the dynamic keys that are not present in image_defs:

     # remove dynamic keys that may not be present in older .PAR versions
     for key in dynamic_keys:
         if key not in image_defs.dtype.fields:
             dynamic_keys.remove(key)

  6. I added the optional normalize_bvecs argument to PARRECHeader.get_q_vectors
     so that it can pass this argument on to PARRECHeader.get_bvals_bvecs.


grlee77 commented Nov 19, 2014

I also added a few basic tests for the new functionality.

@larsoner

I'd prefer that parrec2nii only output a .nii(.gz) file by default, so I'm -1 on triaging True/False as the default for outputting an additional .txt / .json file based on whether data is 3D or 4D.

I think Py3k throws a warning if the thing you're iterating over (dynamic_keys) is changed during iteration, so it would be good to refactor that code not to change dynamic_keys during the loop.
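
For instance, one possible refactor (a sketch, not necessarily the final code) is to build a filtered list instead of removing items in place:

# keep only the dynamic keys that are actually present in image_defs
dynamic_keys = [key for key in dynamic_keys
                if key in image_defs.dtype.fields]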

@larsoner

Looks like Travis is unhappy, too


grlee77 commented Nov 20, 2014

I changed the offending dynamic_keys loop. The other problem seems to be that np.linalg.norm did not have the axis argument as an option in numpy 1.5.

@larsoner

You can just compute it manually as np.sqrt(np.sum(x * x, axis=axis)); sometimes it's actually faster anyway.


grlee77 commented Feb 13, 2016

This has been abandoned for a while, but I have just rebased and cleaned it up, and would find it very useful to have something along the lines of the data labeling proposed here available in nibabel. There is a mixture of unrelated things implemented here, so I would prefer to close this PR and resubmit it as 3 smaller PRs that cleanly divide into independent topics to ease review.
1.) add a scaling = None option (first commit above)
2.) b-vector normalization
3.) add an option to export data labels for the 4th dimension


grlee77 commented Feb 13, 2016

I may only submit PRs for items 2 and 3 above. Item 1 (basically the first commit here) isn't really needed anymore, because one can already use par.dataobj.get_unscaled(), which is what parrec2nii currently does for scaling='off'.
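
For reference, a minimal sketch of that existing route (the file name is a placeholder):

import nibabel as nib

img = nib.load('scan.PAR')
raw = img.dataobj.get_unscaled()   # raw stored values, as parrec2nii uses for scaling='off'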

@matthew-brett

Thanks - please do re-submit - I will do my best to review them quickly.


grlee77 commented Mar 16, 2016

The data labeling code was extracted and resubmitted as #427, so I am closing this now.

I could also submit the b-vector normalization as a separate PR, but I think they should already always be normalized.

grlee77 closed this Mar 16, 2016
matthew-brett added a commit that referenced this pull request Mar 18, 2016
MRG: add a function to retrieve .PAR data labels for the 4th dimension

This was pulled out of the previously submitted #278 to make reviewing easier.  This complements the functionality in the recently merged #409.

The primary enhancement is a new method in the PARREC header called `get_volume_labels` that lists which .PAR fields the data stored along the 4th dimension correspond to.  (i.e. all data >4D is still collapsed into 4 dimensions exactly as before, but labels can be retrieved to make the data ordering unambiguous).

The script interface to parrec can write these labels out as a CSV file with one row per volume.
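
A sketch of how the merged functionality might be used from Python; the file names and CSV layout here are illustrative, not necessarily what the parrec2nii script writes:

import csv
import nibabel as nib

img = nib.load('asl.PAR')
labels = img.header.get_volume_labels()   # dict mapping a .PAR field name to one value per volume
keys = sorted(labels)
with open('asl_labels.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(keys)                 # header row with the varying field names
    writer.writerows(zip(*(labels[k] for k in keys)))  # one row per volume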