| 1 | +.. _data-package-design: |
| 2 | + |
| 3 | +Design of data packages for the nipy suite |
| 4 | +========================================== |
| 5 | + |
| 6 | +When developing or using nipy, many data files can be useful. We divide |
| 7 | +the data files nipy uses into at least three categories: |
| 8 | + |
| 9 | +#. *test data* - data files required for routine code testing |
| 10 | +#. *template data* - data files required for algorithms to function, |
| 11 | + such as templates or atlases |
| 12 | +#. *example data* - data files for running examples, or optional tests |
| 13 | + |
| 14 | +Files used for routine testing are typically very small data files. They are |
| 15 | +shipped with the software, and live in the code repository. For example, in the |
| 16 | +case of ``nipy`` itself, there are some test files that live in the module path |
| 17 | +``nipy.testing.data``. |
| 18 | + |
| 19 | +*template data* and *example data* are examples of *data packages*. What |
| 20 | +follows is a discussion of the design and use of data packages. |
| 21 | + |
| 22 | +Use cases for data packages |
| 23 | ++++++++++++++++++++++++++++ |
| 24 | + |
| 25 | +Using the data package |
| 26 | +`````````````````````` |
| 27 | + |
| 28 | +The programmer will want to use the data something like this: |
| 29 | + |
| 30 | +.. testcode:: |
| 31 | + |
| 32 | + from nibabel.data import make_datasource |
| 33 | + |
| 34 | + templates = make_datasource('nipy', 'templates') |
| 35 | + fname = templates.get_filename('ICBM152', '2mm', 'T1.nii.gz') |
| 36 | + |
| 37 | +where ``fname`` will be the absolute path to the template image |
| 38 | +``ICBM152/2mm/T1.nii.gz``. |
| 39 | + |
| 40 | +The programmer can insist on a particular version of a ``datasource``: |
| 41 | + |
| 42 | +.. testcode:: |
| 43 | + |
| 44 | + if templates.version < '0.4': |
| 45 | +     raise ValueError('Need datasource version at least 0.4') |
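
Note that the check above compares version strings lexically, which would order ``0.10`` before ``0.4``. A minimal sketch of a numeric comparison follows, assuming version strings made only of dot-separated integers. The helper ``version_tuple`` is hypothetical and is not part of nibabel.

```python
def version_tuple(version):
    """Convert a dotted version string such as '0.10' to a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


# Plain string comparison orders '0.10' before '0.4' ...
assert '0.10' < '0.4'
# ... whereas comparing integer tuples gives the intended ordering.
assert version_tuple('0.10') > version_tuple('0.4')
```

With such a helper the check might read ``if version_tuple(templates.version) < (0, 4):``, assuming ``templates.version`` is a string of this form.
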
| 46 | + |
| 47 | +If the repository cannot find the data, then: |
| 48 | + |
| 49 | +>>> make_datasource('nipy', 'implausible') |
| 50 | +Traceback (most recent call last): |
| 51 | + ... |
| 52 | +nibabel.data.DataError |
| 53 | + |
| 54 | +where ``DataError`` gives a helpful warning about why the data was not |
| 55 | +found, and how it should be installed. |
| 56 | + |
| 57 | +Warnings during installation |
| 58 | +```````````````````````````` |
| 59 | + |
| 60 | +The example data and template data may be important, and it would be |
| 61 | +useful to warn the user if NIPY cannot find either of the two sets of |
| 62 | +data when installing the package. Thus:: |
| 63 | + |
| 64 | + python setup.py install |
| 65 | + |
| 66 | +will import nipy after installation to check whether these raise an error: |
| 67 | + |
| 68 | +>>> from nibabel.data import make_datasource |
| 69 | +>>> template = make_datasource('nipy', 'templates') |
| 70 | +>>> example_data = make_datasource('nipy', 'data') |
| 71 | + |
| 72 | +and warn the user accordingly, with some basic instructions for how to |
| 73 | +install the data. |
| 74 | + |
| 75 | +.. _find-data: |
| 76 | + |
| 77 | +Finding the data |
| 78 | +```````````````` |
| 79 | + |
| 80 | +The routine ``make_datasource`` will need to be able to find the data |
| 81 | +that has been installed. For the following call: |
| 82 | + |
| 83 | +>>> templates = make_datasource('nipy', 'templates') |
| 84 | + |
| 85 | +We propose to: |
| 86 | + |
| 87 | +#. Get a list of paths where data is known to be stored with |
| 88 | + ``nipy.data.get_data_path()`` |
| 89 | +#. For each of these paths, search for directory ``nipy/templates``. If |
| 90 | + found, and of the correct format (see below), return a datasource, |
| 91 | + otherwise raise an Exception |
| 92 | + |
| 93 | +The paths collected by ``nipy.data.get_data_path()`` will be |
| 94 | +constructed from ':' (Unix) or ';' (Windows) separated strings. The source of the |
| 95 | +strings (in the order in which they will be used in the search above) |
| 96 | +are: |
| 97 | + |
| 98 | +#. The value of the ``NIPY_DATA_PATH`` environment variable, if set |
| 99 | +#. A section = ``DATA``, parameter = ``path`` entry in a |
| 100 | + ``config.ini`` file in ``nipy_dir`` where ``nipy_dir`` is |
| 101 | + ``$HOME/.nipy`` or equivalent. |
| 102 | +#. Section = ``DATA``, parameter = ``path`` entries in configuration |
| 103 | + ``.ini`` files, where the ``.ini`` files are found by |
| 104 | + ``glob.glob(os.path.join(etc_dir, '*.ini'))`` and ``etc_dir`` is |
| 105 | + ``/etc/nipy`` on Unix, and some suitable equivalent on Windows. |
| 106 | +#. The result of ``os.path.join(sys.prefix, 'share', 'nipy')`` |
| 107 | +#. If ``sys.prefix`` is ``/usr``, we add ``/usr/local/share/nipy``. We |
| 108 | + need this because Python 2.6 in Debian / Ubuntu does default installs |
| 109 | + to ``/usr/local``. |
| 110 | +#. The result of ``get_nipy_user_dir()`` |
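
The search order listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: the names ``paths_from_ini`` and ``sketch_data_path`` are hypothetical, ``$HOME/.nipy`` stands in for the result of ``get_nipy_user_dir()``, and ``/etc/nipy`` is used as the default ``etc_dir``.

```python
import configparser
import glob
import os
import sys


def paths_from_ini(ini_file):
    """Return the DATA/path entries of one .ini file, or [] if absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(ini_file)  # a missing file is skipped without error
    if not parser.has_option('DATA', 'path'):
        return []
    return [p for p in parser.get('DATA', 'path').split(os.pathsep) if p]


def sketch_data_path(environ=None, user_dir=None, etc_dir='/etc/nipy',
                     prefix=None):
    """Collect candidate data directories in the order listed above."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    if user_dir is None:
        # Assumed stand-in for get_nipy_user_dir()
        user_dir = os.path.join(os.path.expanduser('~'), '.nipy')
    paths = []
    # 1. NIPY_DATA_PATH environment variable, split on ':' or ';'
    env_value = environ.get('NIPY_DATA_PATH', '')
    paths += [p for p in env_value.split(os.pathsep) if p]
    # 2. config.ini in the user's nipy directory
    paths += paths_from_ini(os.path.join(user_dir, 'config.ini'))
    # 3. every .ini file in the system configuration directory
    for ini_file in sorted(glob.glob(os.path.join(etc_dir, '*.ini'))):
        paths += paths_from_ini(ini_file)
    # 4. share/nipy under the Python prefix
    paths.append(os.path.join(prefix, 'share', 'nipy'))
    # 5. Debian / Ubuntu default installs go to /usr/local
    if prefix == '/usr':
        paths.append('/usr/local/share/nipy')
    # 6. the user directory itself
    paths.append(user_dir)
    return paths
```

Passing ``environ``, ``user_dir``, ``etc_dir`` and ``prefix`` as arguments is a choice made here so that the ordering can be exercised without touching the real system configuration.
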
| 111 | + |
| 112 | +Requirements for a data package |
| 113 | +``````````````````````````````` |
| 114 | + |
| 115 | +To be a valid NIPY project data package, you need to satisfy: |
| 116 | + |
| 117 | +#. The installer installs the data in some place that can be found using |
| 118 | + the method defined in :ref:`find-data`. |
| 119 | + |
| 120 | +We recommend that: |
| 121 | + |
| 122 | +#. By default, you install data in a standard location such as |
| 123 | + ``<prefix>/share/nipy`` where ``<prefix>`` is the standard Python |
| 124 | + prefix obtained by ``>>> import sys; print(sys.prefix)`` |
| 125 | + |
| 126 | +Remember that there is a distinction between the NIPY project - the |
| 127 | +umbrella of neuroimaging in Python - and the NIPY package - the main |
| 128 | +code package in the NIPY project. Thus, if you want to install data |
| 129 | +under the NIPY *package* umbrella, your data might go to |
| 130 | +``/usr/share/nipy/nipy/packagename`` (on Unix). Note ``nipy`` twice - |
| 131 | +once for the project, once for the package. If you want to install data |
| 132 | +under - say - the ``pbrain`` package umbrella, that would go in |
| 133 | +``/usr/share/nipy/pbrain/packagename``. |
| 134 | + |
| 135 | +Data package format |
| 136 | +``````````````````` |
| 137 | + |
| 138 | +The following tree is an example of the kind of pattern we would expect |
| 139 | +in a data directory, where the ``nipy-data`` and ``nipy-templates`` |
| 140 | +packages have been installed:: |
| 141 | + |
| 142 | + <ROOT> |
| 143 | + `-- nipy |
| 144 | +     |-- data |
| 145 | +     |   |-- config.ini |
| 146 | +     |   `-- placeholder.txt |
| 147 | +     `-- templates |
| 148 | +         |-- ICBM152 |
| 149 | +         |   `-- 2mm |
| 150 | +         |       `-- T1.nii.gz |
| 151 | +         |-- colin27 |
| 152 | +         |   `-- 2mm |
| 153 | +         |       `-- T1.nii.gz |
| 154 | +         `-- config.ini |
| 155 | + |
| 156 | +The ``<ROOT>`` directory is the directory that will appear somewhere in |
| 157 | +the list from ``nipy.data.get_data_path()``. The ``nipy`` subdirectory |
| 158 | +signifies data for the ``nipy`` package (as opposed to other |
| 159 | +NIPY-related packages such as ``pbrain``). The ``data`` subdirectory of |
| 160 | +``nipy`` contains files from the ``nipy-data`` package. In the |
| 161 | +``nipy/data`` or ``nipy/templates`` directories, there is a |
| 162 | +``config.ini`` file, that has at least an entry like this:: |
| 163 | + |
| 164 | + [DEFAULT] |
| 165 | + version = 0.1 |
| 166 | + |
| 167 | +giving the version of the data package. |
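
As a rough sketch of how a directory of this format could be located and its version read, the following walks a list of candidate roots and returns the first ``package/name`` directory whose ``config.ini`` carries a version. The function name ``find_data_package`` is hypothetical, and ``LookupError`` is only a stand-in for whatever exception the real code raises.

```python
import configparser
import os


def find_data_package(data_path, package, name):
    """Return (directory, version) for the first valid package/name found.

    A directory counts as valid here when it holds a config.ini file
    with a ``version`` entry in its [DEFAULT] section.
    """
    for root in data_path:
        pkg_dir = os.path.join(root, package, name)
        parser = configparser.ConfigParser()
        parser.read(os.path.join(pkg_dir, 'config.ini'))
        # Options in [DEFAULT] are available through defaults()
        version = parser.defaults().get('version')
        if version is not None:
            return pkg_dir, version
    raise LookupError('no %s/%s data package found in %r'
                      % (package, name, list(data_path)))
```

For the tree above, ``find_data_package([root], 'nipy', 'templates')`` would return the ``nipy/templates`` directory under ``root`` together with the version string from its ``config.ini``.
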
| 168 | + |
| 169 | +.. _data-package-design-install: |
| 170 | + |
| 171 | +Installing the data |
| 172 | +``````````````````` |
| 173 | + |
| 174 | +We will use Python distutils to install data packages, and the |
| 175 | +``data_files`` mechanism to install the data. On Unix, with the |
| 176 | +following command:: |
| 177 | + |
| 178 | + python setup.py install --prefix=/my/prefix |
| 179 | + |
| 180 | +data will go to:: |
| 181 | + |
| 182 | + /my/prefix/share/nipy |
| 183 | + |
| 184 | +For the example above this will result in these subdirectories:: |
| 185 | + |
| 186 | + /my/prefix/share/nipy/nipy/data |
| 187 | + /my/prefix/share/nipy/nipy/templates |
| 188 | + |
| 189 | +because ``nipy`` is both the project, and the package to which the data |
| 190 | +relates. |
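
A minimal sketch of how the ``data_files`` argument for such a package might be assembled is given below. The helper ``collect_data_files`` is hypothetical. It relies on the fact that relative target directories in ``data_files`` are installed beneath the installation prefix.

```python
import os


def collect_data_files(source_dir, install_root):
    """Build a data_files list of (target directory, [files]) pairs."""
    data_files = []
    for dirpath, dirnames, filenames in os.walk(source_dir):
        dirnames.sort()  # make the walk order deterministic
        if not filenames:
            continue
        rel_dir = os.path.relpath(dirpath, source_dir)
        # rel_dir is '.' for the top directory; normpath removes it
        target = os.path.normpath(os.path.join(install_root, rel_dir))
        sources = [os.path.join(dirpath, fname) for fname in sorted(filenames)]
        data_files.append((target, sources))
    return data_files


# In a setup.py for nipy-templates this might be used as, for example:
#   setup(name='nipy-templates', version='0.1',
#         data_files=collect_data_files('templates',
#                                       'share/nipy/nipy/templates'))
```

With ``--prefix=/my/prefix``, the relative target ``share/nipy/nipy/templates`` would then resolve to ``/my/prefix/share/nipy/nipy/templates``.
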
| 191 | + |
| 192 | +If you install to a particular location, you will need to add that location to |
| 193 | +the output of ``nipy.data.get_data_path()`` using one of the mechanisms above, |
| 194 | +for example, in your system configuration:: |
| 195 | + |
| 196 | + export NIPY_DATA_PATH=/my/prefix/share/nipy |
| 197 | + |
| 198 | +Packaging for distributions |
| 199 | +``````````````````````````` |
| 200 | + |
| 201 | +For a particular data package - say ``nipy-templates`` - distributions |
| 202 | +will want to: |
| 203 | + |
| 204 | +#. Install the data in a set location. The default from ``python setup.py install`` for the data packages will be ``/usr/share/nipy`` on Unix. |
| 205 | +#. Point a system installation of NIPY to these data. |
| 206 | + |
| 207 | +For the latter, the most obvious route is to copy an ``.ini`` file named |
| 208 | +for the data package into the NIPY ``etc_dir``. In this case, on Unix, |
| 209 | +we will want a file called ``/etc/nipy/nipy_templates.ini`` with |
| 210 | +contents:: |
| 211 | + |
| 212 | + [DATA] |
| 213 | + path = /usr/share/nipy |
| 214 | + |
| 215 | +Current implementation |
| 216 | +`````````````````````` |
| 217 | + |
| 218 | +This section describes how we (the nibabel package) implement data packages |
| 219 | +at the moment. |
| 220 | + |
| 221 | +The data in the data packages will not usually be under source control. This is |
| 222 | +because images don't compress very well, and any change in the data will result |
| 223 | +in a large extra storage cost in the repository. If you're pretty clear that |
| 224 | +the data files aren't going to change, then a repository could work OK. |
| 225 | + |
| 226 | +The data packages will be available at a central release location. For |
| 227 | +now this will be: http://nipy.sourceforge.net/data-packages/ . |
| 228 | + |
| 229 | +A package, such as ``nipy-templates-0.1.tar.gz``, will have the following |
| 230 | +sort of structure:: |
| 231 | + |
| 232 | + |
| 233 | + <ROOT> |
| 234 | + |-- setup.py |
| 235 | + |-- README.txt |
| 236 | + |-- MANIFEST.in |
| 237 | + `-- templates |
| 238 | +     |-- ICBM152 |
| 239 | +     |   `-- 2mm |
| 240 | +     |       `-- T1.nii.gz |
| 241 | +     |-- colin27 |
| 242 | +     |   `-- 2mm |
| 243 | +     |       `-- T1.nii.gz |
| 244 | +     `-- config.ini |
| 245 | + |
| 246 | + |
| 247 | +There should be only one ``nipy/packagename`` directory delivered by a |
| 248 | +particular package. For example, this package installs |
| 249 | +``nipy/templates``, but does not contain ``nipy/data``. |
| 250 | + |
| 251 | +Making a new package tarball is simply: |
| 252 | + |
| 253 | +#. Downloading and unpacking e.g. ``nipy-templates-0.1.tar.gz`` to form |
| 254 | + the directory structure above. |
| 255 | +#. Making any changes to the directory |
| 256 | +#. Running ``setup.py sdist`` to recreate the package. |
| 257 | + |
| 258 | +The process of making a release should be: |
| 259 | + |
| 260 | +#. Increment the major or minor version number in the ``config.ini`` file |
| 261 | +#. Make a package tarball as above |
| 262 | +#. Upload to distribution site |
| 263 | + |
| 264 | +There is an example nipy data package ``nipy-examplepkg`` in the |
| 265 | +``examples`` directory of the NIPY repository. |
| 266 | + |
| 267 | +The machinery for creating and maintaining data packages is available with:: |
| 268 | + |
| 269 | + svn co https://nipy.svn.sourceforge.net/svnroot/nipy/data-packaging/trunk |
| 270 | + |
| 271 | +See the ``README.txt`` file there for more information. |