incorrect datatype in GIFTI output (INT64 instead of INT32) #792

Closed
hyperbolicTom opened this issue Aug 15, 2019 · 2 comments

hyperbolicTom commented Aug 15, 2019

Having an issue with NODE_INDEX. To reproduce:

#! /usr/bin/env python

import numpy as np
import nibabel

print("nibabel.__version__ =", nibabel.__version__)

# Node indices; on a 64-bit machine np.arange() yields an int64 array.
idx = np.arange(10)

# Case 1: default (binary) encoding, datatype declared as INT32.
da = nibabel.gifti.GiftiDataArray(idx,
                                  intent='NIFTI_INTENT_NODE_INDEX',
                                  datatype='NIFTI_TYPE_INT32')

gim = nibabel.gifti.gifti.GiftiImage(darrays=[da])
nibabel.save(gim, "idx.gii")

# Read the file back and print what was actually stored.
gim = nibabel.load("idx.gii")
print(gim.darrays[0].data)

# Case 2: same data and declared datatype, but ASCII encoding.
da = nibabel.gifti.GiftiDataArray(idx, encoding='GIFTI_ENCODING_ASCII',
                                  intent='NIFTI_INTENT_NODE_INDEX',
                                  datatype='NIFTI_TYPE_INT32')

gim = nibabel.gifti.gifti.GiftiImage(darrays=[da])
nibabel.save(gim, "idx.gii")

gim = nibabel.load("idx.gii")
print(gim.darrays[0].data)

What I get here, on a 64-bit machine:

nibabel.__version__ = 3.0.0dev
[0 0 1 0 2 0 3 0 4 0 5 0 6 0 7 0 8 0 9 0]
[0 1 2 3 4 5 6 7 8 9]

The problem is that the node index is stored internally as an int64 array. Although the output datatype is specified as INT32, the array isn't converted before .tostring() is called, so each value is written as 8 bytes and then read back as two 32-bit integers.
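The misread can be reproduced with NumPy alone. This is a minimal sketch, not nibabel code; it assumes little-endian byte order (made explicit in the dtype strings) and uses .tobytes(), the current name for .tostring().

```python
import numpy as np

# Explicit little-endian dtypes so the result doesn't depend on the host.
idx = np.arange(10, dtype="<i8")   # what is actually held in memory
raw = idx.tobytes()                # 80 bytes, 8 per value

# A reader that trusts the declared INT32 datatype sees 20 values:
# the low and high 32-bit halves of each int64.
misread = np.frombuffer(raw, dtype="<i4")

# Casting to the declared type before serializing round-trips cleanly.
fixed = np.frombuffer(idx.astype("<i4").tobytes(), dtype="<i4")

print(misread)  # [0 0 1 0 2 0 3 0 4 0 5 0 6 0 7 0 8 0 9 0]
print(fixed)    # [0 1 2 3 4 5 6 7 8 9]
```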

Here's a fix:

diff --git a/nibabel/gifti/gifti.py b/nibabel/gifti/gifti.py
index 22d6449e..e1149647 100644
--- a/nibabel/gifti/gifti.py
+++ b/nibabel/gifti/gifti.py
@@ -464,7 +464,7 @@ class GiftiDataArray(xml.XmlSerializable):
         # write data array depending on the encoding
         dt_kind = data_type_codes.dtype[self.datatype].kind
         data_array.append(
-            _data_tag_element(self.data,
+            _data_tag_element(self.data.astype(dt_kind),
                               gifti_encoding_codes.specs[self.encoding],
                               KIND2FMT[dt_kind],
                               self.ind_ord))

Perhaps the real bug is that self.data is already int64, but I'm not sure about that.
Anyway, with this change the binary output matches the ASCII-encoded output.
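One thing that may be worth checking about the patch: dt_kind is a one-letter kind code rather than a full dtype. The NumPy-only illustration below (again, not nibabel code) shows what that code does and does not distinguish.

```python
import numpy as np

# Both integer widths share the same one-letter kind code ...
kinds = (np.dtype(np.int32).kind, np.dtype(np.int64).kind)
# ... so the width has to come from the full dtype.
sizes = (np.dtype(np.int32).itemsize, np.dtype(np.int64).itemsize)

print(kinds)  # ('i', 'i')
print(sizes)  # (4, 8)
```

Casting with the full dtype, data_type_codes.dtype[self.datatype], instead of its kind would make the target width explicit.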

effigies added the bug label Aug 15, 2019
effigies added this to the 2.5.1 milestone Aug 15, 2019

effigies (Member) commented

Hi @hyperbolicTom. Thanks for the report. Would you be interested in submitting a PR?

It would be nice to start with some tests that create DataArrays whose inputs do and don't match their datatype, and check that writing either raises an exception (e.g., if you have floats that can't be safely coerced to int) or produces a file that reads back consistently with the datatype code.
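A rough sketch of the two behaviours such tests would pin down, using NumPy only. The helper coerce_for_write is hypothetical and is not part of nibabel; a real test would go through GiftiDataArray and a save/load round trip.

```python
import numpy as np


def coerce_for_write(data, dtype):
    """Hypothetical helper: cast `data` to `dtype`, refusing lossy casts."""
    data = np.asarray(data)
    out = data.astype(dtype)
    # Round-trip check: any value changed by the cast means it was lossy.
    if not np.array_equal(out.astype(data.dtype), data):
        raise ValueError(
            f"cannot safely coerce {data.dtype} data to {np.dtype(dtype)}")
    return out


def test_matching_and_coercible_inputs():
    # Input already matches the declared datatype.
    assert coerce_for_write(np.arange(10, dtype=np.int32), np.int32).dtype == np.int32
    # Input is wider but every value fits.
    assert coerce_for_write(np.arange(10, dtype=np.int64), np.int32).dtype == np.int32
    # Floats holding whole numbers are still safe.
    assert coerce_for_write(np.array([1.0, 2.0]), np.int32).tolist() == [1, 2]


def test_lossy_input_raises():
    # Fractional floats and out-of-range integers must be rejected.
    for bad in (np.array([0.5, 1.5]), np.array([2**40], dtype=np.int64)):
        try:
            coerce_for_write(bad, np.int32)
        except ValueError:
            continue
        raise AssertionError("expected ValueError")
```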

Once the tests are in place and the failures are where we expect, we can apply a fix. I think (though it's better to verify this with tests) that the better fix is to have the datatype parameter of _data_tag_element take the actual datatype, as opposed to KIND2FMT[data_type_codes[self.datatype].kind]. Then, if enclabel is ASCII, it uses the format string, and if it's B64BIN or B64GZ, it uses astype(data_type_codes[self.datatype]).tostring() to coerce the data into an appropriately serializable form. The function in question:

def _data_tag_element(dataarray, encoding, datatype, ordering):
    """ Creates data tag with given `encoding`, returns as XML element
    """
    import zlib
    ord = array_index_order_codes.npcode[ordering]
    enclabel = gifti_encoding_codes.label[encoding]
    if enclabel == 'ASCII':
        da = _arr2txt(dataarray, datatype)
    elif enclabel in ('B64BIN', 'B64GZ'):
        out = dataarray.tostring(ord)
        if enclabel == 'B64GZ':
            out = zlib.compress(out)
        da = base64.b64encode(out).decode()
    elif enclabel == 'External':
        raise NotImplementedError("In what format are the external files?")
    else:
        da = ''
    data = xml.Element('Data')
    data.text = da
    return data
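The approach described above could look roughly like the following standalone sketch. It is not the change that was eventually merged: the name data_tag_element_sketch is hypothetical, it takes the encoding label and a NumPy dtype directly instead of going through nibabel's lookup tables, and its ASCII branch is a simplified stand-in for _arr2txt.

```python
import base64
import zlib
import xml.etree.ElementTree as ET

import numpy as np


def data_tag_element_sketch(dataarray, enclabel, dtype, order="C"):
    """Build a <Data> element, coercing binary output to `dtype`."""
    dataarray = np.asarray(dataarray)
    if enclabel == "ASCII":
        # Text output: pick a format string from the declared dtype.
        fmt = "%d" if np.dtype(dtype).kind in "iu" else "%10.6f"
        text = " ".join(fmt % v for v in dataarray.ravel(order=order))
    elif enclabel in ("B64BIN", "B64GZ"):
        # Binary output: cast to the declared dtype *before* serializing.
        out = dataarray.astype(dtype).tobytes(order)
        if enclabel == "B64GZ":
            out = zlib.compress(out)
        text = base64.b64encode(out).decode()
    else:
        raise NotImplementedError(enclabel)
    data = ET.Element("Data")
    data.text = text
    return data


# Usage: the payload decodes back to ten 32-bit values.
elem = data_tag_element_sketch(np.arange(10), "B64BIN", "<i4")
decoded = np.frombuffer(base64.b64decode(elem.text), dtype="<i4")
print(decoded)  # [0 1 2 3 4 5 6 7 8 9]
```

Passing the full dtype keeps the ASCII and binary branches consistent with the declared datatype, whatever width the in-memory array happens to have.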

effigies (Member) commented

Closed in #806.
