MRG: implementing / testing get_fdata #551
Merged
6 commits
09e1360
WIP: implement get_fdata
matthew-brett 5cd1994
More proxy tests
matthew-brett b5343c1
DOC: typos spotted by Chris M
matthew-brett 178cae2
TST: restore test of invalid caching string
matthew-brett c800493
RF+TST: make caches specific to data type
matthew-brett 33138b5
STY: fix anal PEP8 complaint
matthew-brett
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -41,6 +41,7 @@ def __init__(self, dataobj, header=None, extra=None, file_map=None): | |
super(DataobjImage, self).__init__(header=header, extra=extra, | ||
file_map=file_map) | ||
self._dataobj = dataobj | ||
self._fdata_cache = None | ||
self._data_cache = None | ||
|
||
@property | ||
|
@@ -55,7 +56,19 @@ def _data(self): | |
return self._dataobj | ||
|
||
def get_data(self, caching='fill'): | ||
""" Return image data from image with any necessary scalng applied | ||
""" Return image data from image with any necessary scaling applied | ||
|
||
.. WARNING:: | ||
|
||
We recommend you use the ``get_fdata`` method instead of the | ||
``get_data`` method, because it is easier to predict the return | ||
data type. We will deprecate the ``get_data`` method around April | ||
2018, and remove it around April 2020. | ||
|
||
If you don't care about the predictability of the return data type, | ||
and you want the minimum possible data size in memory, you can | ||
replicate the array that would be returned by ``img.get_data()`` by | ||
using ``np.asanyarray(img.dataobj)``. | ||
|
||
The image ``dataobj`` property can be an array proxy or an array. An | ||
array proxy is an object that knows how to load the image data from | ||
|
@@ -125,7 +138,7 @@ def get_data(self, caching='fill'): | |
(no reference to the array). If the cache is full, "unchanged" leaves | ||
the cache full and returns the cached array reference. | ||
|
||
The cache can effect the behavior of the image, because if the cache is | ||
The cache can affect the behavior of the image, because if the cache is | ||
full, or you have an array image, then modifying the returned array | ||
will modify the result of future calls to ``get_data()``. For example | ||
you might do this: | ||
|
@@ -191,11 +204,160 @@ def get_data(self, caching='fill'): | |
self._data_cache = data | ||
return data | ||
|
||
def get_fdata(self, caching='fill', dtype=np.float64): | ||
""" Return floating point image data with necessary scaling applied | ||
|
||
The image ``dataobj`` property can be an array proxy or an array. An | ||
array proxy is an object that knows how to load the image data from | ||
disk. An image with an array proxy ``dataobj`` is a *proxy image*; an | ||
image with an array in ``dataobj`` is an *array image*. | ||
|
||
The default behavior for ``get_fdata()`` on a proxy image is to read | ||
the data from the proxy, and store in an internal cache. Future calls | ||
to ``get_fdata`` will return the cached array. This is the behavior | ||
selected with `caching` == "fill". | ||
|
||
Once the data has been cached and returned from an array proxy, if you | ||
modify the returned array, you will also modify the cached array | ||
(because they are the same array). Regardless of the `caching` flag, | ||
this is always true of an array image. | ||
|
||
Parameters | ||
---------- | ||
caching : {'fill', 'unchanged'}, optional | ||
See the Notes section for a detailed explanation. This argument | ||
specifies whether the image object should fill in an internal | ||
cached reference to the returned image data array. "fill" specifies | ||
that the image should fill an internal cached reference if | ||
currently empty. Future calls to ``get_fdata`` will return this | ||
cached reference. You might prefer "fill" to save the image object | ||
from having to reload the array data from disk on each call to | ||
``get_fdata``. "unchanged" means that the image should not fill in | ||
the internal cached reference if the cache is currently empty. You | ||
might prefer "unchanged" to "fill" if you want to make sure that | ||
the call to ``get_fdata`` does not create an extra (cached) | ||
reference to the returned array. In this case it is easier for | ||
Python to free the memory from the returned array. | ||
dtype : numpy dtype specifier | ||
A numpy dtype specifier specifying a floating point type. Data is | ||
returned as this floating point type. Default is ``np.float64``. | ||
|
||
Returns | ||
------- | ||
fdata : array | ||
Array of image data of data type `dtype`. | ||
|
||
See also | ||
-------- | ||
uncache: empty the array data cache | ||
|
||
Notes | ||
----- | ||
All images have a property ``dataobj`` that represents the image array | ||
data. Images that have been loaded from files usually do not load the | ||
array data from file immediately, in order to reduce image load time | ||
and memory use. For these images, ``dataobj`` is an *array proxy*; an | ||
object that knows how to load the image array data from file. | ||
|
||
By default (`caching` == "fill"), when you call ``get_fdata`` on a | ||
proxy image, we load the array data from disk, store (cache) an | ||
internal reference to this array data, and return the array. The next | ||
time you call ``get_fdata``, you will get the cached reference to the | ||
array, so we don't have to load the array data from disk again. | ||
|
||
Array images have a ``dataobj`` property that already refers to an | ||
array in memory, so there is no benefit to caching, and the `caching` | ||
keywords have no effect. | ||
|
||
For proxy images, you may not want to fill the cache after reading the | ||
data from disk because the cache will hold onto the array memory until | ||
the image object is deleted, or you use the image ``uncache`` method. | ||
If you don't want to fill the cache, then always use | ||
``get_fdata(caching='unchanged')``; in this case ``get_fdata`` will not | ||
fill the cache (store the reference to the array) if the cache is empty | ||
(no reference to the array). If the cache is full, "unchanged" leaves | ||
the cache full and returns the cached array reference. | ||
|
||
The cache can effect the behavior of the image, because if the cache is | ||
Review comment: affect |
||
full, or you have an array image, then modifying the returned array | ||
will modify the result of future calls to ``get_fdata()``. For example | ||
you might do this: | ||
|
||
>>> import os | ||
>>> import nibabel as nib | ||
>>> from nibabel.testing import data_path | ||
>>> img_fname = os.path.join(data_path, 'example4d.nii.gz') | ||
|
||
>>> img = nib.load(img_fname) # This is a proxy image | ||
>>> nib.is_proxy(img.dataobj) | ||
True | ||
|
||
The array is not yet cached by a call to ``get_fdata``, so: | ||
|
||
>>> img.in_memory | ||
False | ||
|
||
After we call ``get_fdata`` using the default `caching` == 'fill', the | ||
cache contains a reference to the returned array ``data``: | ||
|
||
>>> data = img.get_fdata() | ||
>>> img.in_memory | ||
True | ||
|
||
We modify an element in the returned data array: | ||
|
||
>>> data[0, 0, 0, 0] | ||
0.0 | ||
>>> data[0, 0, 0, 0] = 99 | ||
>>> data[0, 0, 0, 0] | ||
99.0 | ||
|
||
The next time we call ``get_fdata``, the method returns the cached | ||
reference to the (modified) array: | ||
|
||
>>> data_again = img.get_fdata() | ||
>>> data_again is data | ||
True | ||
>>> data_again[0, 0, 0, 0] | ||
99.0 | ||
|
||
If you had *initially* used `caching` == 'unchanged' then the returned | ||
``data`` array would have been loaded from file, but not cached, and: | ||
|
||
>>> img = nib.load(img_fname) # a proxy image again | ||
>>> data = img.get_fdata(caching='unchanged') | ||
>>> img.in_memory | ||
False | ||
>>> data[0, 0, 0, 0] = 99 | ||
>>> data_again = img.get_fdata(caching='unchanged') | ||
>>> data_again is data | ||
False | ||
>>> data_again[0, 0, 0, 0] | ||
0.0 | ||
""" | ||
if caching not in ('fill', 'unchanged'): | ||
Review comment: Want to test |
||
raise ValueError('caching value should be "fill" or "unchanged"') | ||
dtype = np.dtype(dtype) | ||
if not issubclass(dtype.type, np.inexact): | ||
raise ValueError('{} should be floating point type'.format(dtype)) | ||
# Return cache if cache present and of correct dtype. | ||
if self._fdata_cache is not None: | ||
if self._fdata_cache.dtype.type == dtype.type: | ||
return self._fdata_cache | ||
data = np.asanyarray(self._dataobj).astype(dtype) | ||
if caching == 'fill': | ||
self._fdata_cache = data | ||
return data | ||
|
||
@property | ||
def in_memory(self): | ||
""" True when array data is in memory | ||
""" True when any array data is in memory cache | ||
|
||
There are separate caches for `get_data` reads and `get_fdata` reads. | ||
This property is True if either of those caches is set. | ||
""" | ||
return (isinstance(self._dataobj, np.ndarray) or | ||
self._fdata_cache is not None or | ||
self._data_cache is not None) | ||
|
||
def uncache(self): | ||
|
@@ -206,23 +368,24 @@ def uncache(self): | |
* *array images* where the data ``img.dataobj`` is an array | ||
* *proxy images* where the data ``img.dataobj`` is a proxy object | ||
|
||
If you call ``img.get_data()`` on a proxy image, the result of reading | ||
If you call ``img.get_fdata()`` on a proxy image, the result of reading | ||
from the proxy gets cached inside the image object, and this cache is | ||
what gets returned from the next call to ``img.get_data()``. If you | ||
what gets returned from the next call to ``img.get_fdata()``. If you | ||
modify the returned data, as in:: | ||
|
||
data = img.get_data() | ||
data = img.get_fdata() | ||
data[:] = 42 | ||
|
||
then the next call to ``img.get_data()`` returns the modified array, | ||
then the next call to ``img.get_fdata()`` returns the modified array, | ||
whether the image is an array image or a proxy image:: | ||
|
||
assert np.all(img.get_data() == 42) | ||
assert np.all(img.get_fdata() == 42) | ||
|
||
When you uncache an array image, this has no effect on the return of | ||
``img.get_data()``, but when you uncache a proxy image, the result of | ||
``img.get_data()`` returns to its original value. | ||
``img.get_fdata()``, but when you uncache a proxy image, the result of | ||
``img.get_fdata()`` returns to its original value. | ||
""" | ||
self._fdata_cache = None | ||
self._data_cache = None | ||
|
||
@property | ||
|
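For readers who want to try the caching rules from the diff above without loading an image file, here is a minimal sketch. `FakeProxy` and `SketchImage` are hypothetical stand-ins written for this illustration, not nibabel classes; only the body of `get_fdata` mirrors the logic shown in the diff, and the sketch keeps a single `_fdata_cache` slot (it omits the separate `_data_cache`).

```python
import numpy as np


class FakeProxy:
    """Hypothetical array proxy: builds a fresh array each time it is read."""

    def __init__(self, values):
        self._values = values

    def __array__(self, dtype=None, copy=None):
        return np.array(self._values, dtype=dtype)


class SketchImage:
    """Hypothetical image holding a dataobj and one floating point cache."""

    def __init__(self, dataobj):
        self._dataobj = dataobj
        self._fdata_cache = None

    def get_fdata(self, caching='fill', dtype=np.float64):
        if caching not in ('fill', 'unchanged'):
            raise ValueError('caching value should be "fill" or "unchanged"')
        dtype = np.dtype(dtype)
        if not issubclass(dtype.type, np.inexact):
            raise ValueError('{} should be floating point type'.format(dtype))
        # Reuse the cache only if it already holds the requested dtype.
        if self._fdata_cache is not None:
            if self._fdata_cache.dtype.type == dtype.type:
                return self._fdata_cache
        data = np.asanyarray(self._dataobj).astype(dtype)
        if caching == 'fill':
            self._fdata_cache = data
        return data

    @property
    def in_memory(self):
        return (isinstance(self._dataobj, np.ndarray) or
                self._fdata_cache is not None)

    def uncache(self):
        self._fdata_cache = None


img = SketchImage(FakeProxy([[1, 2], [3, 4]]))

# 'unchanged' reads the data but leaves the empty cache empty.
first = img.get_fdata(caching='unchanged')
assert not img.in_memory

# The default 'fill' stores the array; later calls return the same object.
cached = img.get_fdata()
assert img.in_memory and img.get_fdata() is cached

# A different dtype misses the cache, and with 'fill' it replaces the
# cached array, because there is only one cache slot.
as32 = img.get_fdata(dtype=np.float32)
assert as32.dtype == np.float32
assert img.get_fdata() is not cached

# uncache() drops the reference, so the proxy is read again next time.
img.uncache()
assert not img.in_memory
```

One consequence worth noting in this sketch: alternating between two dtypes with `caching='fill'` reloads the data on every call, since each call evicts the other dtype's array.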
any reason why float64 over float32 is the default?
@satra - the default up till now has been to return float64 for images with scaling. Also, some images do have float64 and it seems a shame to downgrade them with this call. So float32 seems to me like an option rather than a default.
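To make the trade-off in this exchange concrete, the following is a small NumPy-only sketch. The `stored` array is a made-up stand-in for float64 image data, not values from any real file. It shows that converting to float32 halves the memory use but can change values that float64 holds exactly.

```python
import numpy as np

# Hypothetical float64 "on-disk" values; 2**24 + 1 needs more than the
# 24 significand bits that float32 provides.
stored = np.array([0.5, 0.1, 2.0 ** 24 + 1], dtype=np.float64)

as64 = np.asanyarray(stored).astype(np.float64)
as32 = np.asanyarray(stored).astype(np.float32)

# float32 uses half the memory of float64 for the same number of elements.
assert as32.nbytes * 2 == as64.nbytes

# The float64 conversion preserves every stored value.
assert np.array_equal(as64, stored)

# The float32 conversion does not: 0.1 and 2**24 + 1 are rounded.
assert not np.array_equal(as32.astype(np.float64), stored)
assert as32[2] == 2.0 ** 24
```

Under the API in this PR, a caller who prefers the smaller footprint can request it explicitly with `img.get_fdata(dtype=np.float32)`.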