
Commit 399f845

committed
[new] v0.8.0, h5/snirf reader/writer, neuroj client, cmd, new tests
1 parent 45eb1a9 commit 399f845

File tree

17 files changed

+3947
-462
lines changed


Makefile

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,3 @@
1-
21
PY=python3
32

43
all: pretty test build
@@ -13,4 +12,4 @@ build:
1312

1413

1514
.DEFAULT_GOAL=all
16-
.PHONY: all pretty test
15+
.PHONY: all pretty test build

README.md

Lines changed: 200 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,26 +1,57 @@
11
![](https://neurojson.org/wiki/upload/neurojson_banner_long.png)
22

3-
# JData for Python - lightweight and serializable data annotations for Python
3+
# JData - NeuroJSON client with fast parsers for JSON, binary JSON, NIFTI, SNIRF, CSV/TSV, HDF5 data files
44

55
- Copyright: (C) Qianqian Fang (2019-2025) <q.fang at neu.edu>
66
- License: Apache License, Version 2.0
7-
- Version: 0.7.1
7+
- Version: 0.8.0
88
- URL: https://github.com/NeuroJSON/pyjdata
99
- Acknowledgement: This project is supported by US National Institutes of Health (NIH)
1010
grant [U24-NS124027](https://reporter.nih.gov/project-details/10308329)
1111

1212
![Build Status](https://github.com/NeuroJSON/pyjdata/actions/workflows/run_test.yml/badge.svg)
1313

14-
The [JData Specification](https://github.com/NeuroJSON/jdata/) defines a lightweight
14+
## Table of Contents
15+
16+
- [Introduction](#introduction)
17+
- [File formats](#file-formats)
18+
- [Submodules](#submodules)
19+
- [How to install](#how-to-install)
20+
- [How to build](#how-to-build)
21+
- [How to use](#how-to-use)
22+
- [Advanced interfaces](#advanced-interfaces)
23+
- [Reading JSON via REST-API](#reading-json-via-rest-api)
24+
- [Using JSONPath to access and query complex datasets](#using-jsonpath-to-access-and-query-complex-datasets)
25+
- [Downloading and caching `_DataLink_` referenced external data files](#downloading-and-caching-_datalink_-referenced-external-data-files)
26+
- [Utility](#utility)
27+
- [How to contribute](#how-to-contribute)
28+
- [Test](#test)
29+
30+
## Introduction
31+
32+
`jdata` is a lightweight and fast neuroimaging data file parser, with built-in
33+
support for NIfTI-1/2 (`.nii`, `.nii.gz`), two-part Analyze 7.5 (`.img/.hdr`, `.img.gz`),
34+
HDF5 (`.h5`), SNIRF (`.snirf`), MATLAB .mat files (`.mat`), CSV/TSV (`.csv`, `.csv.gz`,
35+
`.tsv`, `.tsv.gz`), JSON (`.json`), and various binary-JSON data formats, including
36+
BJData (`.bjd`), UBJSON (`.ubj`), and MessagePack (`.msgpack`) formats. `jdata` can
37+
load data files both from local storage and REST-API via URLs. To maximize portability,
38+
the outputs of `jdata` data parsers are intentionally based upon only the **native Python**
39+
data structures (`dict/list/tuple`) plus `numpy` arrays. The entire package is less than
40+
60KB in size and is platform-independent.
41+
42+
`jdata` is highly compatible with the [JSONLab toolbox](https://github.com/NeuroJSON/jsonlab)
43+
for MATLAB/Octave, serving as the reference library for Python for the
44+
[JData Specification](https://github.com/NeuroJSON/jdata/).
45+
The JData Specification defines a lightweight
1546
language-independent data annotation interface enabling easy storing
1647
and sharing of complex data structures across different programming
1748
languages such as MATLAB, JavaScript, Python etc. Using JData formats, a
1849
complex Python data structure, including numpy objects, can be encoded
1950
as a simple `dict` object that is easily serialized as a JSON/binary JSON
2051
file and shared between programs written in different languages.
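
As a rough illustration of the annotation idea described above, the sketch below wraps a small 2-D array in a plain `dict` and round-trips it through JSON using only the standard library. The key names `_ArrayType_`, `_ArraySize_` and `_ArrayData_` follow the JData specification as commonly documented; the helper names `annotate_array` and `restore_array` are hypothetical, row-major ordering is an assumption, and this is not how the `jdata` module itself is implemented.

```python
import json


def annotate_array(values, shape, dtype="uint8"):
    """Wrap a flat list of numbers in JData-style array annotation keys.

    Hypothetical helper for illustration only; row-major order is assumed.
    """
    rows, cols = shape
    if rows * cols != len(values):
        raise ValueError("shape does not match the number of values")
    return {
        "_ArrayType_": dtype,
        "_ArraySize_": [rows, cols],
        "_ArrayData_": list(values),
    }


def restore_array(node):
    """Rebuild a nested (row-major) list from an annotated dict."""
    rows, cols = node["_ArraySize_"]
    flat = node["_ArrayData_"]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# The annotated record is an ordinary dict, so the stock json module can store it
record = {"name": "demo", "volume": annotate_array([1, 2, 3, 4, 5, 6], (2, 3))}
text = json.dumps(record)
restored = restore_array(json.loads(text)["volume"])
```

Because the annotated form contains only strings, numbers and lists, any JSON parser in any language can read it and reconstruct the array from the size and type fields.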
2152

22-
Since 2021, the development of PyJData module and the underlying data format specificaitons
23-
[JData](https://neurojson.org/jdata/draft3) and [BJData](https://neurojson.org/bjdata/draft2)
53+
Since 2021, the development of the `jdata` module and the underlying data format specifications
54+
[JData](https://neurojson.org/jdata/draft3) and [BJData](https://neurojson.org/bjdata/draft3)
2455
have been funded by the US National Institutes of Health (NIH) as
2556
part of the NeuroJSON project (https://neurojson.org and https://neurojson.io).
2657

@@ -30,6 +61,62 @@ produced from the NeuroJSON project will be using JSON/Binary JData formats as t
3061
underlying serialization standards and the lightweight JData specification as
3162
language-independent data annotation standard.
3263

64+
## File formats
65+
66+
The supported data formats are listed in the table below. All file types
67+
support reading and writing, except those marked as read-only.
68+
69+
| Format | Name | | Format | Name |
70+
| ------ | ------ | --- |-----------------------------------| ------ |
71+
| **JSON-compatible files** | | | **Binary JSON (same format)** **[1]** | |
72+
|`.json` | ✅ JSON files | |`.bjd` | ✅ binary JSON (BJD) files |
73+
|`.jnii` | ✅ JSON-wrapper for NIfTI data (JNIfTI)| |`.bnii` | ✅ BJD-wrapper for NIfTI data |
74+
|`.jnirs` | ✅ JSON-wrapper for SNIRF data (JSNIRF)| |`.bnirs` | ✅ BJD-wrapper for SNIRF data |
75+
|`.jmsh` | ✅ JSON-encoded mesh data (JMesh) | |`.bmsh` | ✅ BJD-encoded mesh data |
76+
|`.jdt` | ✅ JSON files with JData annotations | |`.jdb` | ✅ BJD files with JData annotations |
77+
|`.jdat` | ✅ JSON files with JData annotations | |`.jbat` | ✅ BJD files with JData annotations |
78+
|`.jbids` | ✅ JSON digest of a BIDS dataset | |`.pmat` | ✅ BJD encoded .mat files |
79+
| **NIfTI formats** | | | **CSV/TSV formats** | |
80+
|`.nii` | ✅ uncompressed NIfTI-1/2 files | |`.csv` | ✅ CSV files |
81+
|`.nii.gz` | ✅ compressed NIfTI files | |`.csv.gz` | ✅ compressed CSV files |
82+
|`.img/.hdr` | ✅ Analyze 7.5 two-part files | |`.tsv` | ✅ TSV files |
83+
|`.img.gz` | ✅ compressed Analyze files | |`.tsv.gz` | ✅ compressed TSV files |
84+
| **HDF5 formats** **[2]** | | | **Other formats (read-only)** | |
85+
|`.h5` | ✅ HDF5 files | |`.mat` | ✅ MATLAB .mat files **[3]** |
86+
|`.hdf5` | ✅ HDF5 files | |`.bval` | ✅ EEG .bval files |
87+
|`.snirf` | ✅ HDF5-based SNIRF data | |`.bvec` | ✅ EEG .bvec files |
88+
|`.nwb` | ✅ HDF5-based NWB files | |`.msgpack`| ✅ Binary JSON MessagePack format **[4]** |
89+
90+
- [1] requires `bjdata` Python module when needed, `pip install bjdata`
91+
- [2] requires `h5py` Python module when needed, `pip install h5py`
92+
- [3] requires `scipy` Python module when needed, `pip install scipy`
93+
- [4] requires `msgpack` Python module when needed, `pip install msgpack`
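
The footnotes above can be turned into a small pre-flight check. The sketch below maps file suffixes to the optional module named in footnotes [1] to [4] and reports which module, if any, is missing on the current system. The table `OPTIONAL_DEPS` and the helpers `required_module` and `missing_module` are hypothetical names introduced here; they are not part of the `jdata` API.

```python
import importlib.util

# Suffix -> optional module, transcribed from footnotes [1]-[4] above
OPTIONAL_DEPS = {
    ".bjd": "bjdata", ".bnii": "bjdata", ".bnirs": "bjdata", ".bmsh": "bjdata",
    ".jdb": "bjdata", ".jbat": "bjdata", ".pmat": "bjdata",
    ".h5": "h5py", ".hdf5": "h5py", ".snirf": "h5py", ".nwb": "h5py",
    ".mat": "scipy",
    ".msgpack": "msgpack",
}


def required_module(filename):
    """Return the optional module needed for this file, or None."""
    name = filename.lower()
    for suffix, module in OPTIONAL_DEPS.items():
        if name.endswith(suffix):
            return module
    return None


def missing_module(filename):
    """Return the name of a required-but-absent module, or None if all is well."""
    module = required_module(filename)
    if module is None or importlib.util.find_spec(module) is not None:
        return None
    return module
```

For example, `missing_module('scan.snirf')` returns `'h5py'` on a system without `h5py` installed and `None` otherwise, which makes it easy to print the matching `pip install` hint before attempting to load a file.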
94+
95+
## Submodules
96+
97+
The `jdata` module further partitions its functions into smaller submodules, including
98+
- **jdata.jfile** provides `loadjd`, `savejd`, `load`, `save`, `loadt`, `savet`, `loadb`, `saveb`, `loadts`, `loadbs`, `jsoncache`, `jdlink`, ...
99+
- **jdata.jdata** provides `encode`, `decode`, `jdataencode`, `jdatadecode`, `{zlib,gzip,lzma,lz4,base64}encode`, `{zlib,gzip,lzma,lz4,base64}decode`
100+
- **jdata.jpath** provides `jsonpath`
101+
- **jdata.jnifti** provides `load{jnifti,nifti}`, `save{jnifti,nifti,jnii,bnii}`, `nii2jnii`, `jnii2nii`, `nifticreate`, `jnifticreate`, `niiformat`, `niicodemap`
102+
- **jdata.neurojson** provides `neuroj`, `neurojgui`
103+
- **jdata.h5** provides `loadh5`, `saveh5`, `regrouph5`, `aos2soa`, `soa2aos`, `jsnirfcreate`, `snirfcreate`, `snirfdecode`
104+
105+
All these functions can be found in the MATLAB/GNU Octave equivalent, the JSONLab toolbox. Each function can be imported individually:
106+
```
107+
# individually imported
108+
from jdata.jfile import loadjd
109+
data=loadjd(...)
110+
111+
# import everything
112+
from jdata import *
113+
data=loadjd(...)
114+
115+
# import under jdata namespace
116+
import jdata as jd
117+
data=jd.loadjd(...)
118+
```
119+
33120
## How to install
34121

35122
* Github: download from https://github.com/NeuroJSON/pyjdata
@@ -51,17 +138,23 @@ Dependencies:
51138
* **numpy**: PIP: run `pip install numpy` or `sudo apt-get install python3-numpy`
52139
* (optional) **bjdata**: PIP: run `pip install bjdata` or `sudo apt-get install python3-bjdata`, see https://pypi.org/project/bjdata/, only needed to read/write BJData/UBJSON files
53140
* (optional) **lz4**: PIP: run `pip install lz4`, only needed when encoding/decoding lz4-compressed data
54-
* (optional) **backports.lzma**: PIP: run `sudo apt-get install liblzma-dev` and `pip install backports.lzma` (needed for Python 2.7), only needed when encoding/decoding lzma-compressed data
141+
* (optional) **h5py**: PIP: run `pip install h5py`, only needed when reading/writing .h5 and .snirf files
142+
* (optional) **scipy**: PIP: run `pip install scipy`, only needed when loading MATLAB .mat files
143+
* (optional) **msgpack**: PIP: run `pip install msgpack`, only needed when loading MessagePack .msgpack files
55144
* (optional) **blosc2**: PIP: run `pip install blosc2`, only needed when encoding/decoding blosc2-compressed data
145+
* (optional) **backports.lzma**: PIP: run `sudo apt-get install liblzma-dev` and `pip install backports.lzma` (needed for Python 2.7), only needed when encoding/decoding lzma-compressed data
146+
* (optional) **python3-tk**: run `sudo apt-get install python3-tk` to install Tk support on Linux in order to run the `neurojgui` function
56147

57148
Replace `pip` with `pip3` if you are using Python 3.x. If either `pip` or `pip3`
58149
does not exist on your system, please run
59150
```
60-
sudo apt-get install python3-pip
151+
sudo apt-get install python3-pip
61152
```
62153
Please note that in some OS releases (such as Ubuntu 20.04), python2.x and python-pip
63154
are no longer supported.
64155

156+
## How to build
157+
65158
One can also install this module from the source code. To do this, you first
66159
check out a copy of the latest code from Github by
67160
```
@@ -76,17 +169,72 @@ or, if you prefer, install to the system folder for all users by
76169
```
77170
sudo python3 setup.py install
78171
```
79-
Please replace `python` by `python3` if you want to install it for Python 3.x instead of 2.x.
80172

81173
Instead of installing the module, you can also import the jdata module directly from
82174
your local copy by changing (`cd`) to the root folder of the unzipped pyjdata package and running
83175
```
84176
import jdata as jd
85177
```
86178

179+
87180
## How to use
88181

89-
The PyJData module is easy to use. You can use the `encode()/decode()` functions to
182+
The `jdata` module provides a unified data parsing and saving interface: `jd.loadjd()` and `jd.savejd()`.
183+
These two functions support all file formats described in the "File formats" section above.
184+
The `jd.loadjd()` function also supports loading online data via URLs.
185+
186+
```
187+
import jdata as jd
188+
nii = jd.loadjd('/path/to/img.nii.gz')
189+
snirf = jd.loadjd('/path/to/mydata.snirf')
190+
nii2 = jd.loadjd('https://example.com/data/vol.nii.gz')
191+
jsondata = jd.loadjd('https://example.com/rest/api/')
192+
matlabdata = jd.loadjd('matlabdata.mat')
193+
jd.savejd(matlabdata, 'newdata.mat')
194+
jd.savejd(matlabdata, 'newdata.jdb', compression='zlib')
195+
196+
jd.savejd(nii2, 'newdata.jnii', compression='lzma')
197+
jd.savejd(nii, 'newdata.bnii', compression='gzip')
198+
jd.savejd(nii, 'newdata.nii.gz')
199+
```
200+
201+
The `jdata` module also serves as the front-end for the free data resources hosted at
202+
NeuroJSON.io. The NeuroJSON client (`neuroj()`) can be started in the GUI mode using
203+
204+
```
205+
import jdata as jd
206+
jd.neuroj('gui')
207+
```
208+
209+
The above command will pop up a window displaying the databases, datasets and data
210+
records for the more than 1500 datasets currently hosted on NeuroJSON.io.
211+
212+
The `neuroj` client also supports command-line mode, using the format below
213+
214+
```
215+
import jdata as jd
216+
help(jd.neuroj) # print help info for jd.neuroj()
217+
jd.neuroj('list') # list all databases on NeuroJSON.io
218+
[db['id'] for db in jd.neuroj('list')['database']] # list all database IDs
219+
jd.neuroj('list', 'openneuro') # list all datasets under the `openneuro` database
220+
jd.neuroj('list', 'openneuro', limit=5, skip=5) # list the 6th to 10th datasets under the `openneuro` database
221+
jd.neuroj('list', 'openneuro', 'ds000001') # list all versions for the `openneuro/ds000001` dataset
222+
jd.neuroj('get', 'openneuro', 'ds000001') # download and parse the `openneuro/ds000001` dataset as a Python object
223+
jd.neuroj('info', 'openneuro', 'ds000001') # lightweight header information of the `openneuro/ds000001` dataset
224+
jd.neuroj('find', '/abide/') # find both abide-1 and abide-2 databases using filters
225+
jd.neuroj('find', 'openneuro', '/00[234]$/') # use regular expression to filter all openneuro datasets
226+
jd.neuroj('find', 'mcx', {'selector': ..., 'find': ...}) # use CouchDB _find API to search data
228+
jd.neuroj('info', db='mcx', ds='colin27') # use named inputs
229+
jd.neuroj('get', db='mcx', ds='colin27', file='att1') # download the attachment `att1` for the `mcx/colin27` dataset
230+
jd.neuroj('put', 'sandbox1d', 'test', '{"obj":1}') # update `sandbox1d/test` dataset with a new JSON string (need admin account)
231+
jd.neuroj('delete', 'sandbox1d', 'test') # delete `sandbox1d/test` dataset (need admin account)
232+
```
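
To make the `limit`/`skip` and `/regex/` arguments above concrete, here is a minimal sketch of their semantics applied to a plain Python list. The dataset IDs are made up for illustration, and the helpers `page` and `filter_by_regex` are hypothetical; the source does not say whether `neuroj` applies these filters on the client or on the server, so this only models the expected result.

```python
import re

# Made-up dataset IDs; real names would come from jd.neuroj('list', ...)
dataset_ids = [
    "ds000001", "ds000002", "ds000003", "ds000004", "ds000005",
    "ds000006", "ds000007", "ds000008", "ds000009", "ds000010",
    "ds000102",
]


def page(items, limit, skip=0):
    """Return `limit` items after skipping the first `skip` items."""
    return items[skip:skip + limit]


def filter_by_regex(items, pattern):
    """Keep items matching a '/.../'-delimited regular expression."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        pattern = pattern[1:-1]  # drop the surrounding slashes
    rx = re.compile(pattern)
    return [item for item in items if rx.search(item)]


second_page = page(dataset_ids, limit=5, skip=5)      # the 6th to 10th entries
matches = filter_by_regex(dataset_ids, "/00[234]$/")  # IDs ending in 002, 003 or 004
```

Note that `ds000102` is not matched by `/00[234]$/`, because the pattern requires two zeros immediately before the final digit.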
233+
234+
235+
## Advanced interfaces
236+
237+
The `jdata` module is easy to use. You can use the `encode()/decode()` functions to
90238
encode Python data into JData annotation format, or decode JData structures into
91239
native Python data, for example
92240

@@ -97,9 +245,9 @@ import numpy as np
97245
a={'str':'test','num':1.2,'list':[1.1,[2.1]],'nan':float('nan'),'np':np.arange(1,5,dtype=np.uint8)}
98246
jd.encode(a)
99247
jd.decode(jd.encode(a))
100-
d1=jd.encode(a,{'compression':'zlib','base64':1})
248+
d1=jd.encode(a, compression='zlib', base64=True)
101249
d1
102-
jd.decode(d1,{'base64':1})
250+
jd.decode(d1,base64=True)
103251
```
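
The `compression='zlib'` and `base64=True` options above can be pictured as two stacked transformations on the raw array bytes. The sketch below reproduces that idea with the standard library for a small `uint8` array. The `_ArrayZip*_` key names follow the JData specification as I understand it, the `[1, n]` value used for `_ArrayZipSize_` is an assumption, and `compress_array`/`decompress_array` are hypothetical helpers, not the `jdata` implementation.

```python
import base64
import zlib


def compress_array(values):
    """Encode a list of uint8 values as a zlib + base64 annotated dict."""
    raw = bytes(values)  # one byte per uint8 element
    packed = base64.b64encode(zlib.compress(raw)).decode("ascii")
    return {
        "_ArrayType_": "uint8",
        "_ArraySize_": [len(values)],
        "_ArrayZipType_": "zlib",
        "_ArrayZipSize_": [1, len(values)],  # assumed layout of the uncompressed buffer
        "_ArrayZipData_": packed,            # text-safe, so it fits in a JSON string
    }


def decompress_array(node):
    """Reverse compress_array: base64-decode, inflate, and unpack the bytes."""
    raw = zlib.decompress(base64.b64decode(node["_ArrayZipData_"]))
    return list(raw)


encoded = compress_array([1, 2, 3, 4])  # same values as np.arange(1, 5, dtype=np.uint8)
decoded = decompress_array(encoded)
```

Base64 is what makes the compressed payload safe to embed in a text JSON file; a binary JSON container can store the compressed bytes directly.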
104252

105253
One can further save the JData annotated data into JSON or binary JSON (UBJSON) files using
@@ -235,10 +383,43 @@ jd.jdlink(extlinks) # download all links
235383
One can convert from JSON based data files (`.json, .jdt, .jnii, .jmsh, .jnirs`) to binary-JData
236384
based binary files (`.bjd, .jdb, .bnii, .bmsh, .bnirs`) and vice versa using command
237385
```
238-
python3 -mjdata /path/to/text/json/file.json # convert to /path/to/text/json/file.jdb
239-
python3 -mjdata /path/to/text/json/file.jdb # convert to /path/to/text/json/file.json
240-
python3 -mjdata -h # show help info
386+
python3 -m jdata /path/to/file.json # convert to /path/to/file.jdb
387+
python3 -m jdata /path/to/file.jdb # convert to /path/to/file.json
388+
python3 -m jdata /path/to/file.jdb -t 2 # convert to /path/to/file.json with indentation of 2 spaces
389+
python3 -m jdata file1 file2 ... # batch convert multiple files
390+
python3 -m jdata file1 -f # force overwrite output files if exist (`-f`/`--force`)
391+
python3 -m jdata file1 -O /output/dir # save output files to /output/dir (`-O`/`--outdir`)
392+
python3 -m jdata file1.json -s .bnii # force output suffix/file type (`-s`/`--suffix`)
393+
python3 -m jdata file1.json -c zlib # set compression method (`-c`/`--compression`)
394+
python3 -m jdata -h # show help info (`-h`/`--help`)
395+
```
396+
397+
## How to contribute
398+
399+
`jdata` uses an open-source license - the Apache 2.0 license. This license is a "permissive" license
400+
and can be used in commercial products without needing to release the source code.
401+
402+
To contribute to the `jdata` source code, you can modify the Python units inside the `jdata/` folder. Please
403+
minimize the dependencies to external 3rd party packages. Please use Python's built-in packages whenever
404+
possible.
405+
406+
All jdata source code has been formatted using `black`. To reformat all units, please type
407+
```
408+
make pretty
241409
```
410+
inside the top-level folder of the source repository.
411+
412+
For every newly added function, please add a unittest unit or test inside the files under `test/`, and run
413+
```
414+
make test
415+
```
416+
to make sure the modified code can pass all tests.
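
For contributors new to `unittest`, a test unit can be as small as the sketch below. The function under test, `to_json_and_back`, is a hypothetical stand-in for a newly added function, and the class and method names are illustrative; the actual layout of the files under `test/` is not shown in this document.

```python
import json
import unittest


def to_json_and_back(obj):
    """Hypothetical function under test: serialize to JSON text and parse it back."""
    return json.loads(json.dumps(obj))


class TestRoundTrip(unittest.TestCase):
    def test_nested_dict(self):
        data = {"str": "test", "num": 1.2, "list": [1.1, [2.1]]}
        self.assertEqual(to_json_and_back(data), data)

    def test_tuple_becomes_list(self):
        # JSON has no tuple type, so tuples come back as lists
        self.assertEqual(to_json_and_back((1, 2)), [1, 2])


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRoundTrip)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A class like this, saved in a module under `test/`, would be picked up by `python3 -m unittest discover -v test` provided its file name matches the discovery pattern.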
417+
418+
To build a local installer, please install the `build` python module, and run
419+
```
420+
make build
421+
```
422+
The output wheel can be found inside the `dist/` folder.
242423

243424
## Test
244425

@@ -247,3 +428,8 @@ To see additional data type support, please run the built-in test using below co
247428
```
248429
python3 -m unittest discover -v test
249430
```
431+
or one can run an individual set of unit tests by calling
432+
```
433+
python3 -m unittest -v test.testnifti
434+
python3 -m unittest -v test.testsnirf
435+
```
