
ENH: PAR/REC data ordering and b-vector normalization #278


Closed · wants to merge 18 commits

Conversation


grlee77 commented Nov 18, 2014

Additional PAR/REC functionality was added (building upon the recent merge of #253, #261 and #264 as implemented by @matthew-brett and @Eric89GXL)

Aside from a couple of minor fixes, two new features were added:

1.) ability to normalize b-values & b-vectors

The --norm-bvs argument to parrec2nii enables the auto-scaling; otherwise the old behavior is maintained. If all b-vectors were properly normalized to 1.0 and the b-values were stored correctly, this new functionality would not be needed, so it is disabled by default.

I implemented this because I have some datasets where the .PAR header came from a custom pulse sequence where the b-value label was left at 3000 for all volumes (even the b=0 ones!), and the only way to identify the b=0 volumes was that their gradient vector is [0, 0, 0]. Scaling the b-values by the norm of the gradient vector gives the proper b-value array. (I did not rescale the b-values/vectors where the norm was within [1.0 +/- 1e-2], to avoid the b-value array ending up with nearly identical but distinct b-values such as [2999.99, 3000.01, etc.] due to the numerical precision of the stored b-vector arrays.)

The non-normalized behavior is still required for other cases, e.g. when using a product DWI sequence with 15 different b-values and requesting that only the trace volumes be stored. In this case, the gradient vectors will all be [0, 0, 0], but the b-values are an array of 15 unique values. If the norm of the gradient vectors had been used, the b-value array would have been all zeros instead of the appropriate values.
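
A minimal sketch of the scaling behavior described above (the helper name and exact tolerance handling are illustrative, not the PR's actual implementation):

import numpy as np

def normalize_bvals_bvecs(bvals, bvecs, tol=1e-2):
    # Illustrative sketch only: rescale b-values by the b-vector norms and
    # return unit-length b-vectors, leaving values alone when the vectors are
    # already unit length to within `tol`.
    bvals = np.asarray(bvals, dtype=float).copy()
    bvecs = np.asarray(bvecs, dtype=float).copy()
    norms = np.sqrt(np.sum(bvecs * bvecs, axis=1))  # avoids norm(..., axis=) for old numpy
    rescale = np.abs(norms - 1.0) > tol             # skip vectors that are already ~unit length
    bvals[rescale] *= norms[rescale]                # b=0 volumes ([0, 0, 0]) end up with bval 0
    nonzero = norms > 0
    bvecs[nonzero] /= norms[nonzero, np.newaxis]    # unit-normalize the nonzero vectors
    return bvals, bvecs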

2.) store the labels corresponding to the fourth data dimension
I think something along these lines is extremely important and would be interested in feedback on how best to implement it. For now, I search through a set of potentially varying image_defs keys such as "label type", "echo number", etc. For any whose values vary across volumes, I maintain a dictionary with the corresponding indices. This dictionary can then be used later by the user to provide custom resorting of the data, reshaping to >4D, etc. I did not implement any custom resorting options within the classes; for now parrec2nii just writes the dictionary of orderings out to a .json file and leaves it up to the user to resort in the desired manner.
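
A rough sketch of that idea, assuming image_defs is the numpy structured array from the .PAR header with one row per volume (the candidate key names are just examples, not a definitive list):

import numpy as np

def varying_key_labels(image_defs, candidate_keys=('label type', 'echo number',
                                                   'dynamic scan number')):
    # Illustrative sketch: keep only the keys whose values actually vary,
    # recording one value per volume for each of them.
    labels = {}
    for key in candidate_keys:
        if key not in image_defs.dtype.fields:
            continue                       # key absent in older .PAR versions
        values = image_defs[key]
        if len(np.unique(values)) > 1:     # only keys that vary are informative
            labels[key] = values.tolist()  # lists are convenient for JSON export
    return labels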

An example of where this is useful is a simple ASL experiment with 8 dynamics and 2 label types (dataobj.shape[3] = 16). Without labeling along the fourth dimension, we will not know which volumes correspond to label type = 1 and which correspond to label type = 2 (i.e. control vs. tag volumes). One logical data ordering for the 16 total volumes is:

{
  "dynamic scan number": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8],
  "label type": [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
}

However, depending on how the data were exported, the ordering within the .PAR could equally well have been:

{
  "dynamic scan number": [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8],
  "label type": [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
}

Without saving some sort of ordering info along with the .nii, there is no clean way to tell these two cases apart and properly process the resulting .nii files. I doubt JSON is the optimal way to do this, but it is easily readable and illustrates the desired functionality; I am certainly open to other suggestions. The .json output corresponds to the new PARRECArrayProxy._dim_4_labels property.
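
For example, a user could consume the exported .json to regroup volumes after conversion; the file names below are placeholders, and np.lexsort is just one way to do the resorting:

import json
import numpy as np
import nibabel as nib

img = nib.load('asl.nii')                       # converted image (placeholder name)
with open('asl_ordering.json') as f:            # exported orderings (placeholder name)
    order = json.load(f)

label = np.array(order['label type'])
dyn = np.array(order['dynamic scan number'])
idx = np.lexsort((dyn, label))                  # sort by label type, then dynamic scan number
data = np.asanyarray(img.dataobj)[..., idx]     # volumes regrouped along the 4th dimension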

The sorted_labels function of the PARRECHeader also has the ability to include per-slice info such as slice angulation, and store this into PARRECArrayProxy._dim_3_4_labels. This allows .PAR/.REC files corresponding to scans such as 3-plane localizers, etc. to have their raw array data read in along with the corresponding dimension info. These volumes with varying slice angulation still cannot be read into a PARRECImage object, though, because PARRECArrayProxy.get_slice_orientation only allows unique values for slice angulation so that a unique affine can be determined for the PARRECImage.

…d for sorting vector properties. proper 2D ndarray to list of lists for JSON export in parrec2nii

grlee77 commented Nov 18, 2014

fixed bug in PARRECHeader.sorted_labels for non-axial datasets. all tests are now passing

from copy import deepcopy  # assumed import; not shown in the truncated diff

def sort_info_to_lists(sort_info):
    sort_info = deepcopy(sort_info)
    # convert elements from numpy arrays to lists for easy JSON export
    for key in sort_info:
        sort_info[key] = sort_info[key].tolist()  # loop body assumed; not shown in the diff
    return sort_info

easier to do `for key, value in sort_info.items()` since you effectively use `value` a lot below

@larsoner

Most of my comments are minor style / structure issues; I'll let @matthew-brett address how he wants the API to look (e.g. JSON vs some other format).


grlee77 commented Nov 18, 2014

@Eric89GXL, thanks for the quick and detailed review. All of the suggested changes look reasonable to me. I will wait for any additional comments from @matthew-brett before implementing them.

@larsoner

It might be slightly better if you implement my changes before @matthew-brett looks, since that way he could look at a closer-to-final version of the PR, but it's not a big deal either way.

@@ -223,6 +249,18 @@ def proc_file(infile, opts):
fid.write('%s ' % val)
fid.write('\n')

opts.dim_info = True # TODO: remove hard-coded value

:) needs doing I guess

@matthew-brett

Thanks for doing this. I might have missed it, but there seems to be quite a bit of extra code with not much in the way of tests - is that right?


grlee77 commented Nov 19, 2014

yes, I have not yet added any tests specific to the new functionality. I wanted to see what the general opinion was on adopting the proposed changes before spending time adding tests.

@matthew-brett

I am planning to do a release candidate tomorrow; I think it's too late to get this branch in by then, but we should do another release soon in any case. Gregory - do you feel strongly about that, or are you happy to wait a month or so for the next release?


grlee77 commented Nov 19, 2014

Matthew - I am fine with waiting until the next release. I will be traveling most of today and tomorrow with limited internet access so it may take me a day or two to make the requested changes here.


grlee77 commented Nov 19, 2014

Several things changed based on the reviews above:

  1. I changed the name of the PARRECHeader method "sorted_labels" to the hopefully
     clearer "get_volume_labels". Also, in the case that the individual slice
     indices are requested, they are now reshaped to match the data shape along the
     3rd and 4th dimensions.
  2. I removed the private _dim_4_label, etc. attributes from the PARRECArrayProxy
     object, so there are no longer any changes proposed for that object.
  3. The parrec2nii code was simplified and now only exports labels for the 4th
     dimension. Advanced users can still access the slice-wise labels by calling
     img.header.get_volume_labels(collapse_slices=False).
  4. I added the dimension info command line option to parrec2nii. For now it
     defaults to False. It may be worth instead defaulting to False in the case
     that there is only a single dynamic variable (or, for diffusion scans, a
     combination of bvals/bvecs), but to True if there are multiple dynamic
     variables that could lead to confusion if the ordering info is not exported.
     What do you think?
  5. For compatibility with .PAR versions < 4.2, the following check was added to
     remove any of the dynamic keys that are not present in image_defs:

     # remove dynamic keys that may not be present in older .PAR versions
     for key in dynamic_keys:
         if key not in image_defs.dtype.fields:
             dynamic_keys.remove(key)

  6. I added the optional normalize_bvecs argument to PARRECHeader.get_q_vectors
     so that it can pass this argument on to PARRECHeader.get_bvals_bvecs.


grlee77 commented Nov 19, 2014

I also added a few basic tests for the new functionality.

@larsoner

I'd prefer that parrec2nii only output a .nii(.gz) file by default, so I'm -1 on triaging True/False as the default for outputting an additional .txt / .json file based on whether data is 3D or 4D.

I think Py3k throws a warning if the thing you're iterating over (dynamic_keys) is changed during iteration, so it would be good to refactor that code not to change dynamic_keys during the loop.
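
For instance, one possible refactor (a sketch, not necessarily the final code) is to build a filtered list instead of removing items in place:

# keep only the dynamic keys that are actually present in image_defs
dynamic_keys = [key for key in dynamic_keys
                if key in image_defs.dtype.fields]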

@larsoner

Looks like Travis is unhappy, too


grlee77 commented Nov 20, 2014

I changed the offending dynamic_keys loop. The other problem seems to be that np.linalg.norm did not have the axis argument as an option in numpy 1.5.

@larsoner

You can just compute it manually as np.sqrt(np.sum(x * x, axis=axis)); sometimes it's actually faster anyway.


grlee77 commented Feb 13, 2016

This has been abandoned for a while, but I have just rebased and cleaned it up, and would find it very useful to have something along the lines of the data labeling proposed here available in nibabel. There is a mixture of unrelated things implemented here, so I would prefer to close this PR and resubmit it as 3 smaller PRs that cleanly divide into independent topics to ease review.
1.) add a scaling = None option (first commit above)
2.) b-vector normalization
3.) add an option to export data labels for the 4th dimension


grlee77 commented Feb 13, 2016

I may only submit PRs for items 2 and 3 above. Item 1 (basically the first commit here) isn't really needed anymore, because one can already use par.dataobj.get_unscaled(), which is what parrec2nii currently does for scaling='off'.
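
For reference, a minimal sketch of that existing route (the file name is a placeholder):

import nibabel as nib

img = nib.load('scan.PAR')
raw = img.dataobj.get_unscaled()   # raw stored values, as parrec2nii uses for scaling='off'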

@matthew-brett

Thanks - please do re-submit - I will do my best to review them quickly.


grlee77 commented Mar 16, 2016

The data labeling code was extracted and resubmitted as #427, so I am closing this now.

I could also submit the b-vector normalization as a separate PR, but I think they should already always be normalized.

grlee77 closed this Mar 16, 2016
matthew-brett added a commit that referenced this pull request Mar 18, 2016
MRG: add a function to retrieve .PAR data labels for the 4th dimension

This was pulled out of the previously submitted #278 to make reviewing easier.  This complements the functionality in the recently merged #409.

The primary enhancement is a new method in the PARREC header called `get_volume_labels` that lists which .PAR fields the data stored along the 4th dimension correspond to.  (i.e. all data >4D is still collapsed into 4 dimensions exactly as before, but labels can be retrieved to make the data ordering unambiguous).

The script interface to parrec can write these labels out as a CSV file with one row per volume.
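
A sketch of how the merged functionality might be used from Python; the file names and CSV layout here are illustrative, not necessarily what the parrec2nii script writes:

import csv
import nibabel as nib

img = nib.load('asl.PAR')
labels = img.header.get_volume_labels()   # dict mapping a .PAR field name to one value per volume
keys = sorted(labels)
with open('asl_labels.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(keys)                 # header row with the varying field names
    writer.writerows(zip(*(labels[k] for k in keys)))  # one row per volume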