| 1 | +################# |
| 2 | +Images and memory |
| 3 | +################# |
| 4 | + |
| 5 | +We saw in :doc:`nibabel_images` that images loaded from disk are usually |
| 6 | +*proxy images*. Proxy images are images that have a ``dataobj`` property that |
| 7 | +is not a numpy array, but an *array proxy* that can fetch the array data from |
| 8 | +disk. |
| 9 | + |
| 10 | +>>> import os |
| 11 | +>>> import numpy as np |
| 12 | +>>> from nibabel.testing import data_path |
| 13 | +>>> example_file = os.path.join(data_path, 'example4d.nii.gz') |
| 14 | + |
| 15 | +>>> import nibabel as nib |
| 16 | +>>> img = nib.load(example_file) |
| 17 | +>>> img.dataobj |
| 18 | +<nibabel.arrayproxy.ArrayProxy object at ...> |
| 19 | + |
| 20 | +Nibabel does not load the image array from the proxy when you ``load`` the |
| 21 | +image. It waits until you ask for the array data. The standard way to ask |
| 22 | +for the array data is to call the ``get_data()`` method: |
| 23 | + |
| 24 | +>>> data = img.get_data() |
| 25 | +>>> data.shape |
| 26 | +(128, 96, 24, 2) |
| 27 | + |
| 28 | +We also saw in :ref:`proxies-caching` that this call to ``get_data()`` will |
| 29 | +(by default) load the array data into an internal image cache. The image |
| 30 | +returns the cached copy on the next call to ``get_data()``: |
| 31 | + |
| 32 | +>>> data_again = img.get_data() |
| 33 | +>>> data is data_again |
| 34 | +True |
| 35 | + |
| 36 | +This behavior is convenient if you want quick and repeated access to the image |
| 37 | +array data. The downside is that the image keeps a reference to the image |
| 38 | +data array, so the array can't be cleared from memory until the image object |
| 39 | +gets deleted. You might prefer to keep loading the array from disk instead of |
| 40 | +keeping the cached copy in the image. |
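The effect of that cached reference can be sketched in plain Python, without nibabel. The ``ToyImage`` and ``Block`` classes below are hypothetical stand-ins, not nibabel classes, and the sketch assumes CPython, where an object is freed once its last reference goes away:

```python
import gc
import weakref


class Block(list):
    """Hypothetical stand-in for a large array (plain lists can't be weak-referenced)."""


class ToyImage:
    """Hypothetical image that caches its data on first access; not a nibabel class."""

    def __init__(self):
        self._cache = None

    def get_data(self):
        if self._cache is None:
            self._cache = Block(range(1000))  # pretend this was read from disk
        return self._cache


img = ToyImage()
data = img.get_data()
watcher = weakref.ref(data)  # lets us observe whether the block is still alive

del data  # drop our own reference
gc.collect()
alive_while_image_exists = watcher() is not None  # the image's cache keeps it alive

del img  # deleting the image drops the cache too
gc.collect()
alive_after_image_deleted = watcher() is not None

print(alive_while_image_exists, alive_after_image_deleted)
```

Deleting your own ``data`` name is not enough to free the block; it only goes away once the object holding the cache lets go of it.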
| 41 | + |
| 42 | +This page describes ways of using the image array proxies to save memory and |
| 43 | +time. |
| 44 | + |
| 45 | +*************************************************** |
| 46 | +Using ``in_memory`` to check the state of the cache |
| 47 | +*************************************************** |
| 48 | + |
| 49 | +You can use the ``in_memory`` property to check if the image has cached the |
| 50 | +array. |
| 51 | + |
| 52 | +The ``in_memory`` property is always True for array images, because the image |
| 53 | +data is always an array in memory: |
| 54 | + |
| 55 | +>>> array_data = np.arange(24, dtype=np.int16).reshape((2, 3, 4)) |
| 56 | +>>> affine = np.diag([1, 2, 3, 1]) |
| 57 | +>>> array_img = nib.Nifti1Image(array_data, affine) |
| 58 | +>>> array_img.in_memory |
| 59 | +True |
| 60 | + |
| 61 | +For a proxy image, the ``in_memory`` property is False when the array is not |
| 62 | +in cache, and True when it is in cache: |
| 63 | + |
| 64 | +>>> img = nib.load(example_file) |
| 65 | +>>> img.in_memory |
| 66 | +False |
| 67 | +>>> data = img.get_data() |
| 68 | +>>> img.in_memory |
| 69 | +True |
| 70 | + |
| 71 | + |
| 72 | +***************** |
| 73 | +Using ``uncache`` |
| 74 | +***************** |
| 75 | + |
| 76 | +When the proxy image has the array in cache, ``get_data()`` returns |
| 77 | +the cached array: |
| 78 | + |
| 79 | +>>> data_again = img.get_data() |
| 80 | +>>> data_again is data # same array returned from cache |
| 81 | +True |
| 82 | + |
| 83 | +You can uncache a proxy image with the ``uncache()`` method: |
| 84 | + |
| 85 | +>>> img.uncache() |
| 86 | +>>> img.in_memory |
| 87 | +False |
| 88 | +>>> data_once_more = img.get_data() |
| 89 | +>>> data_once_more is data # a new copy read from disk |
| 90 | +False |
| 91 | + |
| 92 | +``uncache()`` has no effect if the image is an array image, or if the cache is |
| 93 | +already empty. |
| 94 | + |
| 95 | +You need to be careful when you modify arrays returned by ``get_data()`` on |
| 96 | +proxy images, because ``uncache`` will then change the result you get back |
| 97 | +from ``get_data()``: |
| 98 | + |
| 99 | +>>> proxy_img = nib.load(example_file) |
| 100 | +>>> data = proxy_img.get_data() # array cached and returned |
| 101 | +>>> data[0, 0, 0, 0] |
| 102 | +0 |
| 103 | +>>> data[0, 0, 0, 0] = 99 # modify returned array |
| 104 | +>>> data_again = proxy_img.get_data() # return cached array |
| 105 | +>>> data_again[0, 0, 0, 0] # cached array modified |
| 106 | +99 |
| 107 | + |
| 108 | +So far the proxy image behaves the same as an array image. ``uncache()`` has |
| 109 | +no effect on an array image, but it does have an effect on the returned array |
| 110 | +of a proxy image: |
| 111 | + |
| 112 | +>>> proxy_img.uncache() # cached array discarded from proxy image |
| 113 | +>>> data_once_more = proxy_img.get_data() # new copy of array loaded |
| 114 | +>>> data_once_more[0, 0, 0, 0] # array modifications discarded |
| 115 | +0 |
| 116 | + |
| 117 | +************* |
| 118 | +Saving memory |
| 119 | +************* |
| 120 | + |
| 121 | +Uncache the array |
| 122 | +================= |
| 123 | + |
| 124 | +If you do not want the image to keep the array in its internal cache, you can |
| 125 | +use the ``uncache()`` method: |
| 126 | + |
| 127 | +>>> img.uncache() |
| 128 | + |
| 129 | +Use the array proxy instead of ``get_data()`` |
| 130 | +============================================= |
| 131 | + |
| 132 | +The ``dataobj`` property of a proxy image is an array proxy. We can ask the |
| 133 | +proxy to return the array directly by passing ``dataobj`` to the numpy |
| 134 | +``asarray`` function: |
| 135 | + |
| 136 | +>>> proxy_img = nib.load(example_file) |
| 137 | +>>> data_array = np.asarray(proxy_img.dataobj) |
| 138 | +>>> type(data_array) |
| 139 | +<class 'numpy.ndarray'> |
| 140 | + |
| 141 | +This also works for array images, because ``np.asarray`` returns the array: |
| 142 | + |
| 143 | +>>> array_img = nib.Nifti1Image(array_data, affine) |
| 144 | +>>> data_array = np.asarray(array_img.dataobj) |
| 145 | +>>> type(data_array) |
| 146 | +<class 'numpy.ndarray'> |
| 147 | + |
| 148 | +If you want to avoid caching you can avoid ``get_data()`` and always use |
| 149 | +``np.asarray(img.dataobj)``. |
| 150 | + |
| 151 | +Use the ``caching`` keyword to ``get_data()`` |
| 152 | +============================================= |
| 153 | + |
| 154 | +The default behavior of the ``get_data()`` method is to fill the cache |
| 155 | +if it is empty. This corresponds to the default ``'fill'`` value |
| 156 | +of the ``caching`` keyword. So, this: |
| 157 | + |
| 158 | +>>> proxy_img = nib.load(example_file) |
| 159 | +>>> data = proxy_img.get_data() # default caching='fill' |
| 160 | +>>> proxy_img.in_memory |
| 161 | +True |
| 162 | + |
| 163 | +is the same as this: |
| 164 | + |
| 165 | +>>> proxy_img = nib.load(example_file) |
| 166 | +>>> data = proxy_img.get_data(caching='fill') |
| 167 | +>>> proxy_img.in_memory |
| 168 | +True |
| 169 | + |
| 170 | +Sometimes you may want to avoid filling the cache, if it is empty. In this |
| 171 | +case, you can use ``caching='unchanged'``: |
| 172 | + |
| 173 | +>>> proxy_img = nib.load(example_file) |
| 174 | +>>> data = proxy_img.get_data(caching='unchanged') |
| 175 | +>>> proxy_img.in_memory |
| 176 | +False |
| 177 | + |
| 178 | +``caching='unchanged'`` will leave the cache full if it is already full: |
| 179 | + |
| 180 | +>>> data = proxy_img.get_data(caching='fill') |
| 181 | +>>> proxy_img.in_memory |
| 182 | +True |
| 183 | +>>> data = proxy_img.get_data(caching='unchanged') |
| 184 | +>>> proxy_img.in_memory |
| 185 | +True |
| 186 | + |
| 187 | +See the :meth:`get_data() docstring |
| 188 | +<nibabel.spatialimages.SpatialImage.get_data>` for more detail. |
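The cache rules described on this page can be summarized as a small pure-Python model. This is only a sketch of the behavior shown in the examples above; ``ToyProxyImage`` is a hypothetical class, and nothing here is nibabel's actual implementation:

```python
class ToyProxyImage:
    """Hypothetical model of a proxy image's cache; not nibabel's implementation."""

    def __init__(self, values):
        self._on_disk = tuple(values)  # stands in for the file on disk
        self._cache = None

    @property
    def in_memory(self):
        return self._cache is not None

    def get_data(self, caching='fill'):
        if caching not in ('fill', 'unchanged'):
            raise ValueError("caching should be 'fill' or 'unchanged'")
        if self._cache is not None:
            return self._cache  # cache full: always return the cached object
        fresh = list(self._on_disk)  # "read from disk": a new copy each time
        if caching == 'fill':
            self._cache = fresh
        return fresh

    def uncache(self):
        self._cache = None


img = ToyProxyImage([0, 1, 2])
first = img.get_data(caching='unchanged')
state_after_unchanged = img.in_memory  # cache was empty, so it stays empty
second = img.get_data()  # default 'fill' populates the cache
state_after_fill = img.in_memory
second[0] = 99  # edits the cached list in place
seen_via_cache = img.get_data()[0]
img.uncache()
seen_after_uncache = img.get_data()[0]  # fresh copy, so the edit is gone
print(state_after_unchanged, state_after_fill, seen_via_cache, seen_after_uncache)
```

The model makes the two surprises explicit: a full cache is returned whatever the ``caching`` value, and in-place edits live only as long as the cached object does.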
| 189 | + |
| 190 | +********************** |
| 191 | +Saving time and memory |
| 192 | +********************** |
| 193 | + |
| 194 | +You can use the array proxy to get slices of data from disk in an efficient |
| 195 | +way. |
| 196 | + |
| 197 | +The array proxy API allows you to do slicing on the proxy. In most cases this |
| 198 | +will mean that you only load the data from disk that you actually need, often |
| 199 | +saving both time and memory. |
| 200 | + |
| 201 | +For example, let us say you only wanted the second volume from the example |
| 202 | +dataset. You could do this: |
| 203 | + |
| 204 | +>>> proxy_img = nib.load(example_file) |
| 205 | +>>> data = proxy_img.get_data() |
| 206 | +>>> data.shape |
| 207 | +(128, 96, 24, 2) |
| 208 | +>>> vol1 = data[..., 1] |
| 209 | +>>> vol1.shape |
| 210 | +(128, 96, 24) |
| 211 | + |
| 212 | +The problem is that you had to load the whole data array into memory before |
| 213 | +throwing away the first volume and keeping the second. |
| 214 | + |
| 215 | +You can use array proxy slicing to do this more efficiently: |
| 216 | + |
| 217 | +>>> proxy_img = nib.load(example_file) |
| 218 | +>>> vol1 = proxy_img.dataobj[..., 1] |
| 219 | +>>> vol1.shape |
| 220 | +(128, 96, 24) |
| 221 | + |
| 222 | +The slicing call in ``proxy_img.dataobj[..., 1]`` will only load the data from |
| 223 | +disk that you need to fill the memory of ``vol1``. |
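To see why a slice on the last axis can be cheap, here is a rough standard-library illustration. It assumes an uncompressed raw file in which whole volumes are stored one after another; the file layout, ``VOL_SIZE`` and ``read_volume`` are hypothetical, and this is not how nibabel implements proxy slicing:

```python
import array
import os
import tempfile

N_VOLS = 2
VOL_SIZE = 6  # hypothetical number of values per volume

# Write the volumes back to back: volume 0 holds 0..5, volume 1 holds 100..105.
values = array.array('h')
for vol_index in range(N_VOLS):
    values.extend(100 * vol_index + i for i in range(VOL_SIZE))
fd, path = tempfile.mkstemp(suffix='.raw')
with os.fdopen(fd, 'wb') as fobj:
    values.tofile(fobj)


def read_volume(fname, index):
    """Read one volume by seeking past the volumes stored before it."""
    vol = array.array('h')
    with open(fname, 'rb') as fobj:
        fobj.seek(index * VOL_SIZE * vol.itemsize)  # skip earlier volumes
        vol.fromfile(fobj, VOL_SIZE)  # read only the values we need
    return vol.tolist()


vol1 = read_volume(path, 1)
os.remove(path)
print(vol1)
```

Only ``VOL_SIZE`` values are ever read into memory here, however many volumes the file contains.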
| 224 | + |
| 225 | +.. include:: links_names.txt |