parrec: support additional dimensions, permute, etc. #259
Conversation
Upon posting I see that this has substantial overlap with Eric89GXL's pull request #253. I will take a closer look in the coming days and maybe look into just merging parts of this into that branch. I have not yet had a chance to download Eric's branch and try it on the set of .PAR/.REC files that I had developed this for.
@grlee77 yeah, feel free to open a PR into my branch. It does deal with some of the same issues, and @matthew-brett designed the main way to deal with some of them (a specialized array class).
Alternatively, I think @matthew-brett intends to open a PR into my branch and then (hopefully) merge into master soon. You could wait a few days / a couple of weeks for that to happen and then open a new PR.
Okay, @Eric89GXL. I think it will probably be easiest if I just wait until @matthew-brett merges your changes into master and then open a new PR from there. (Being new to GitHub / collaborative development, I neglected to check the pending pull requests before putting in this one.) I will close this pull request for now.
@grlee77 - did you have a chance to look at the current master branch with the merged parrec changes? If you have test data sets that you can release, that would be a big help. If you can release them as a GitHub repo with a sharing license like PDDL, that would be ideal. I thought about whether to make 5D, 6D files, etc., but I was worried about the case where there was a volume missing. For example, imagine a dynamic with 10 scans and 3 echoes. Imagine the scan stopped after 2 echoes. If we usually reconstruct the dataset as 5D (x y slices 10 3) then we'd have to drop all the echoes for the last volume, to make (x y slices 9 3), whereas if we store the dataset as (x y slices 30) then we could just drop the lost echo for (x y slices 29).
Naive related questions here. First, are there formats in … One potential option is to raise an error in the use case presented above, or add a reading option (along the lines of …).
Nifti handles up to 7D data. nibabel aims to return the image to you in something as close to the original format as practical - so you'll get 7D data arrays from a 7D image, and you can save 7D arrays too. Putting NaNs in to signal lost data is one option, but it would mean that we couldn't distinguish between a NaN inserted for that reason and a NaN for another reason (the image had them, for example). We would also have to store the data as floating point, when we would usually use integers and scaling.
Reopening this one for now, to keep the comments more visible.
Sorry to be MIA for the last month, but there were a ton of other deadlines at work. I will take a look at the new PAR/REC code merged into master and test it on some of the datasets I have. I will also look into making some additional .PAR/.REC files publicly available for testing. I do not think it should be an issue, but I want to confirm before putting anything in a public repo. I agree that the cases with 5D, 6D data, etc. are tricky, and situations like @matthew-brett mentioned where there would be missing values in the output array can definitely occur. Aside from truncated scans, this is likely to occur in cases where a single processed image is saved into the .PAR/.REC along with the higher-dimensional raw data (e.g. a T2* map in a multi-echo acquisition, DTI Trace, etc.).
New pull request #278 supersedes this one.
Hello Matthew,
Thank you very much for developing the nibabel package. Here is a pull request to add support for a number of more general .PAR/.REC files that were not previously being handled correctly. A detailed summary is given below.
I can provide some example .PAR/.REC files for ASL-based fMRI and fieldmap acquisitions that I used during testing if you are interested.
I hope this will be a useful addition to nibabel. I welcome any feedback on the code.
#1. Additional Data Dimensions
PARRECHeader.get_data_shape_in_file() was rewritten to check several other potentially varying dimensions of image_def_dtd.
Previously only the following 4 labels were considered:
'slice number'
'dynamic scan number'
'gradient orientation number'
'diffusion b value number'
This has been expanded to also include the following four explicit dimensions:
'label type' (used in ASL imaging)
'echo number' (e.g. multi-echo fMRI or dual GRE field map acquisition)
'cardiac phase number'
'type' : ('image_type_mr', 'scanning sequence') : different values reflecting whether the data corresponds to Magnitude, Phase, etc. Two fields are combined here: see notes in section 4 below
A warning is issued if there is more than one value for the following, because the affine will not match all frames:
'image angulation' : e.g. MRA MIPs rendered at a range of angles
'slice orientation' : e.g. for Survey/3-Plane Localizer scans
Note: 'slice orientation' is not an independent dimension from 'slice number'
e.g. in a 3-plane localizer with 3 slices per plane:
'slice number' values are [1, 2, 3, 4, 5, 6, 7, 8, 9]
'slice orientation' values are [1, 1, 1, 2, 2, 2, 3, 3, 3]
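The dimension-detection idea above can be sketched by counting unique values per label. This is not the actual nibabel implementation; `varying_dims` and the `image_defs` dict of per-frame value arrays are hypothetical stand-ins for the fields read from the .PAR header.

```python
import numpy as np

def varying_dims(image_defs, labels):
    """Return {label: n_unique} for each label with more than one value."""
    shape = {}
    for label in labels:
        n_unique = len(np.unique(image_defs[label]))
        if n_unique > 1:
            shape[label] = n_unique
    return shape

# The 3-plane localizer example above: 3 slices per orientation.
defs = {'slice number': np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]),
        'slice orientation': np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])}
print(varying_dims(defs, ['slice number', 'slice orientation']))
# -> {'slice number': 9, 'slice orientation': 3}
```

Note how this also illustrates why 'slice orientation' is not independent of 'slice number': 9 × 3 would overcount the 9 actual frames.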
parrec2nii changes
parrec2nii prints a warning if multiple dimensions were collapsed into the 4th dimension. The dimensions and labels prior to restacking are printed for reference.
An error is raised if either 'image angulation' or 'slice orientation' is non-unique because a single affine cannot be defined for the NIFTI volume.
#2. Sorting Data Dimensions
Previously get_data() would return whatever the Fortran ordering of the data was in the .REC file. In the proposed code, the data shape and stride for each of the 8 dimensions above are determined. If a dimension's length is one, it is squeezed out (the only exception is the 'slice' dimension, which is kept
even at length 1). Data is then permuted to always match the order:
('inplane x', 'inplane y', 'slice', ...)
As NIFTI, etc. don't support saving as many as 8 dimensions, image.get_data() and image.header.shape use reshaped data with all dimensions >4 folded into the 4th dimension (order='F').
To get the permuted data array with the full set of dimensions, use:
image.get_data_unraveled()
To get all dimensions prior to any permutation (e.g. reflecting the raw ordering in the .REC), use:
image.raw_data_from_fileobj (unscaled by slope, intercept)
image.data_from_fileobj (scaled)
In addition to the data itself, the methods get_data_unraveled, raw_data_from_fileobj, and data_from_fileobj also return a tuple with labels for each dimension.
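A minimal NumPy sketch of the permute-then-fold behavior described above, using a 5-D array for brevity; the dimension labels, shapes, and variable names are illustrative assumptions, not nibabel internals.

```python
import numpy as np

# Hypothetical raw .REC ordering of dimensions for this example.
raw_labels = ('inplane x', 'inplane y', 'echo number', 'slice',
              'dynamic scan number')
raw = np.zeros((64, 64, 3, 30, 10), order='F')

# Permute so that ('inplane x', 'inplane y', 'slice', ...) leads.
desired = ('inplane x', 'inplane y', 'slice', 'dynamic scan number',
           'echo number')
perm = [raw_labels.index(lab) for lab in desired]
permuted = np.transpose(raw, perm)   # shape (64, 64, 30, 10, 3)

# Fold everything beyond the third dimension into the 4th,
# Fortran-order, to stay within 4-D for NIFTI-oriented consumers.
folded = permuted.reshape(permuted.shape[:3] + (-1,), order='F')
print(folded.shape)                  # (64, 64, 30, 30)
```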
Implementation
The data dimensions and strides are determined within:
PARRECHeader.get_data_shape_in_file()
arrayproxy.PARArrayProxy was defined as a subclass of arrayproxy.ArrayProxy that supports permutation & reshaping
new attributes were added to the PARRECHeader:
image.header.original_shape : stores the original shape prior to any permutation or reshaping
image.header.original_labels : tuple of string labels corresponding to the dimensions in original_shape
image.header.requires_permute : boolean reflecting whether data permutation was required
image.header.permuted_shape : stores the shape after permutation to the desired order, but prior to reshape down to 4D
image.header.permuted_labels : tuple of string labels corresponding to the dimensions in permuted_shape
TODO: The order along each dimension still reflects what is in the .PAR. May want to auto-sort slices into ascending order, etc.
#3. Support for Non-uniform scale, intercept
Some datasets have different data scaling for different frames within the .PAR. A common instance of this is in field map acquisitions where the magnitude and phase data are both stored within the .REC. Each channel will have its own scaling factors.
PARRECHeader.get_data_scaling() determines the scale & intercept ndarrays if multiple scale factors are detected. It also applies the same permutations that were applied to the data to these scale factors before returning them. The first two dimensions of each array are size 1 so they are broadcastable
across the full in-plane frame.
volumeutils.apply_read_scaling() was modified to work with ndarray inputs for scale, inter
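A small sketch of how scale/intercept arrays with leading size-1 dimensions apply per-frame scaling via NumPy broadcasting; the shapes and values here are illustrative, not taken from a real .PAR.

```python
import numpy as np

# x, y, channel (e.g. magnitude/phase), dynamic
data = np.ones((64, 64, 2, 5), dtype=np.int16)

# One slope/intercept per channel; size-1 in-plane dims broadcast
# the factor across each whole frame.
slope = np.array([1.0, 0.01]).reshape(1, 1, 2, 1)
inter = np.array([0.0, -3.14]).reshape(1, 1, 2, 1)

scaled = data * slope + inter        # broadcasts over x and y
print(scaled[0, 0, :, 0])            # [ 1.   -3.13]
```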
For now, parrec2nii was modified to raise an error if the scale or intercept are non-unique.
TODO?: update parrec2nii to split out to separate NIFTI files for each scale?
#4. Misc comments
Image/Sequence Type
Handling of 'image_type_mr' & 'scanning sequence' is a bit tricky:
For a .PAR corresponding to fieldmap data with magnitude and phase saved:
unique values for image_type_mr were: [0, 3] # magnitude/phase
unique values for scanning sequence were: [2, 4] # meaning?
However, these are not independent data dimensions. There are only two unique combinations present in the file. For this reason, they are collapsed into one dimension with size equal to the number of unique pairs.
However, the two fields do not always vary together.
For a PCA angio dataset I looked at:
unique values for image_type_mr were: [0]
unique values for scanning sequence were: [2, 4]
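The pair-collapsing idea can be sketched with NumPy; the per-frame values below are illustrative, echoing the fieldmap example above.

```python
import numpy as np

# Hypothetical per-frame values for a fieldmap with magnitude + phase.
image_type_mr = np.array([0, 0, 3, 3])   # 0 = magnitude, 3 = phase
scanning_seq = np.array([2, 2, 4, 4])

# Collapse the two fields into one dimension sized by the number of
# unique (image_type_mr, scanning sequence) pairs.
pairs = np.stack([image_type_mr, scanning_seq], axis=1)
unique_pairs = np.unique(pairs, axis=0)
print(len(unique_pairs))                 # 2 -> one 'type' dim of size 2
```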
Sorting along each dimension
Although permutation of dimensions was added, reordering of slices or other dimensions into sequential order (rather than the order in the .PAR) is not currently done.
Pre V4.2 .PAR files
The 'label type' dimension did not exist prior to V4.2. Removal of the 'label type' tag was added for that case, but has not been tested.