
Commit 26aa100

DOC: move image signature discussion up a level
Move discussion - of keeping track of image object correspondence to filenames - out of usecases directory into devel directory, and remove usecases directory. Rewrite the discussion, although it's still not very clear.
1 parent 1964b07 commit 26aa100

4 files changed

+125
-122
lines changed

doc/source/devel/devdiscuss.rst

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -13,11 +13,12 @@
1313
Developer discussions
1414
*********************
1515

16-
Some discussions of usecases and other design issues. Most of these don't apply
17-
to the current codebase, and are only for discussion of future directions.
16+
Some miscellaneous documents on background, future development and work in
17+
progress.
1818

1919
.. toctree::
2020
:maxdepth: 2
2121

22-
usecases/index
22+
spm_use
23+
modified_images
2324
data_pkg_design

doc/source/devel/modified_images.rst

Lines changed: 121 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,121 @@
1+
.. -*- rst -*-
2+
3+
#############################################################
4+
Keeping track of whether images have been modified since load
5+
#############################################################
6+
7+
*******
8+
Summary
9+
*******
10+
11+
This is a discussion of a missing feature in nibabel: the ability to keep
12+
track of whether an image object in memory still corresponds to an image file
13+
(or files) on disk.
14+
15+
**********
16+
Motivation
17+
**********
18+
19+
We may need to know whether the image in memory corresponds to the image file
20+
on disk.
21+
22+
For example, we often need to get filenames for images when passing
23+
images to external programs. Imagine a realignment, in this case, in nipy_
24+
(the package)::
25+
26+
import nibabel
import nipy
27+
img1 = nibabel.load('meanfunctional.nii')
28+
img2 = nibabel.load('anatomical.nii')
29+
realigner = nipy.interfaces.fsl.flirt()
30+
params = realigner.run(source=img1, target=img2)
31+
32+
In ``nipy.interfaces.fsl.flirt.run`` there may at some point be calls like::
33+
34+
source_filename = nipy.as_filename(source_img)
35+
target_filename = nipy.as_filename(target_img)
36+
37+
As the authors of the ``flirt.run`` method, we need to make sure that the
38+
``source_filename`` corresponds to the ``source_img``.
39+
40+
Of course, in the general case, if ``source_img`` has no corresponding
41+
filename (from ``source_img.get_filename()``), then we will have to save a copy
42+
to disk, maybe with a temporary filename, and return that temporary name as
43+
``source_filename``.
44+
45+
In our particular case, ``source_img`` does have a filename
46+
(``meanfunctional.nii``). We would like to return that as
47+
``source_filename``. The question is, how can we be sure that the user has
48+
done nothing to ``source_img`` to make it diverge from its original state?
49+
Could ``source_img`` have diverged, in memory, from the state recorded in
50+
``meanfunctional.nii``?
51+
52+
If the image and file have not diverged, we return ``meanfunctional.nii`` as
53+
the ``source_filename``, otherwise we will have to do something like::
54+
55+
import tempfile
56+
fd, fname = tempfile.mkstemp('.nii')  # mkstemp returns (descriptor, path)
57+
source_img.to_filename(fname)
58+
59+
and return ``fname`` as ``source_filename``.
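
The decision described above can be sketched as follows. This is only an illustration under stated assumptions, not nipy or nibabel API: ``ToyImage``, its ``dirty`` attribute and the ``as_filename`` helper are hypothetical stand-ins, and the toy ``to_filename`` writes raw bytes rather than a real NIfTI file.

```python
import os
import tempfile


class ToyImage:
    """Hypothetical stand-in for an image object (not a nibabel class)."""

    def __init__(self, data, filename=None, dirty=False):
        self.data = data          # bytes standing in for the image array
        self.filename = filename  # None if the image never came from disk
        self.dirty = dirty        # True if memory may differ from the file

    def to_filename(self, fname):
        # Toy save: write the raw bytes to the given path.
        with open(fname, 'wb') as fobj:
            fobj.write(self.data)


def as_filename(img):
    """Return a filename whose contents match ``img`` (hypothetical helper).

    Reuse the existing file when it is known to match the image;
    otherwise save a temporary copy and return the temporary name.
    """
    if img.filename is not None and not img.dirty:
        return img.filename
    # mkstemp returns (file descriptor, path); close the descriptor
    # because to_filename opens the path itself.
    fd, fname = tempfile.mkstemp('.nii')
    os.close(fd)
    img.to_filename(fname)
    return fname
```

In this sketch the caller is responsible for deleting any temporary file that ``as_filename`` creates.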
60+
61+
Another situation where we might like to pass around image objects that are
62+
known to correspond to images on disk is when working in parallel. A set of
63+
nodes may have fast common access to a filesystem on which the images are
64+
stored. A master node farming out images to worker nodes might want to
65+
check whether the image is identical to
66+
something on file and pass around a lightweight (proxied) image (with the data
67+
not loaded into memory), relying on the node pulling the image from disk when
68+
it uses it.
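
As a rough sketch of the parallel case, the master could choose between sending a filename and sending the data itself. The function name ``make_job_payload`` and the dictionary layout are assumptions made for illustration; nothing like this exists in nibabel.

```python
def make_job_payload(filename, data, is_dirty):
    """Build the message a master sends to a worker (hypothetical).

    ``filename`` is the path on the shared filesystem, or None.
    ``is_dirty`` says whether the in-memory image may differ from the file.
    """
    if filename is not None and not is_dirty:
        # Lightweight message: the worker loads the data from shared storage.
        return {'filename': filename}
    # Fallback: the file cannot be trusted, so ship the data itself.
    return {'data': data}
```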
69+
70+
***********************
71+
Possible implementation
72+
***********************
73+
74+
One implementation is to have a ``dirty`` flag, which, if set, would tell
75+
you that the image might not correspond to the disk file. We set this
76+
flag when anyone asks for the data, on the basis that the user may then
77+
do something to the data and you can't know if they have::
78+
79+
img = nibabel.load('some_image.nii')
80+
data = img.get_data()
81+
data[:] = 0
82+
img2 = nibabel.load('some_image.nii')
83+
assert not np.all(img2.get_data() == img.get_data())
84+
85+
The image consists of the data, the affine and a header. In order to
86+
keep track of the header and affine, we could cache them when loading
87+
the image::
88+
89+
img = nibabel.load('some_image.nii')
90+
hdr = img.header
91+
assert img._cache['header'] == img.header
92+
hdr.set_data_dtype(np.complex64)
93+
assert img._cache['header'] != img.header
94+
95+
When we need to know whether the image object and image file correspond, we
96+
could check the current header and current affine (the header may be separate
97+
from the affine for an SPM Analyze image) against their cached copies. If they
98+
are the same and the 'dirty' flag has not been set by a previous call to
99+
``get_data()``, we know that the image file does correspond to the image
100+
object.
101+
102+
This may be OK for small bits of memory like the affine and the header,
103+
but would quickly become prohibitive for larger image metadata such as
104+
large nifti header extensions. We could just always assume that images
105+
with large header extensions are *not* the same as on disk.
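
A minimal sketch of this scheme, using plain Python containers in place of real header and affine objects, might look like the following. ``TrackedImage``, ``MAX_CACHE_CHARS`` and the use of ``len(repr(...))`` as a crude size estimate are all assumptions for illustration; none of them is part of nibabel.

```python
import copy


class TrackedImage:
    """Hypothetical image wrapper that tracks possible modification."""

    # Assumed cutoff: metadata whose repr is longer than this is not cached.
    MAX_CACHE_CHARS = 4096

    def __init__(self, data, header, affine):
        self._data = data
        self.header = header   # dict standing in for a header object
        self.affine = affine   # nested lists standing in for an affine array
        self._data_requested = False
        self._cache = {}
        for name in ('header', 'affine'):
            value = getattr(self, name)
            # Cache a private copy only when the metadata is small.
            if len(repr(value)) <= self.MAX_CACHE_CHARS:
                self._cache[name] = copy.deepcopy(value)

    def get_data(self):
        # Once the data is handed out we cannot know whether it was changed.
        self._data_requested = True
        return self._data

    @property
    def is_dirty(self):
        if self._data_requested:
            return True
        for name in ('header', 'affine'):
            if name not in self._cache:
                return True  # too large to cache: assume modified
            if self._cache[name] != getattr(self, name):
                return True
        return False
```

The property errs on the side of reporting the image as modified, so a caller that trusts ``is_dirty`` re-saves more often than strictly necessary but never reuses a stale file.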
106+
107+
The user might be able to override the result of these checks directly::
108+
109+
img = nibabel.load('some_image.nii')
110+
assert img.is_dirty == False
111+
hdr = img.header
112+
hdr.set_data_dtype(np.complex64)
113+
assert img.is_dirty == True
114+
img.is_dirty = False
115+
116+
The checks are behind-the-scenes machinery that performs a safe optimization
117+
(in the sense that we do not re-save the data when that is unnecessary),
118+
but falls back to the default (re-saving the data) if there is any
119+
uncertainty, or if the check itself would be too costly.
120+
121+
.. include:: ../links_names.txt

doc/source/devel/usecases/index.rst

Lines changed: 0 additions & 8 deletions
This file was deleted.

doc/source/devel/usecases/loading_saving.rst

Lines changed: 0 additions & 111 deletions
This file was deleted.
