
Commit 5cf2d5a

[release] bump version number to v0.6.0
1 parent 57cb6cb commit 5cf2d5a

6 files changed: +133 -15 lines changed


.github/workflows/run_test.yml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ jobs:
     runs-on: ubuntu-20.04
     strategy:
       matrix:
-        python-version: ["3.7.17", "3.7", "3.8", "3.9", "3.10", "3.11", "3.12"]
+        python-version: ["3.7", "3.8", "3.9", "3.10", "3.11", "3.12"]
 
     steps:
       - uses: actions/checkout@v3

README.md

Lines changed: 124 additions & 6 deletions
@@ -4,18 +4,31 @@
 
 - Copyright: (C) Qianqian Fang (2019-2024) <q.fang at neu.edu>
 - License: Apache License, Version 2.0
-- Version: 0.5.5
+- Version: 0.6.0
 - URL: https://github.com/NeuroJSON/pyjdata
+- Acknowledgement: This project is supported by the US National Institutes of Health (NIH)
+grant [U24-NS124027](https://reporter.nih.gov/project-details/10308329)
 
 ![Build Status](https://github.com/NeuroJSON/pyjdata/actions/workflows/run_test.yml/badge.svg)
 
 The [JData Specification](https://github.com/NeuroJSON/jdata/) defines a lightweight
-language-independent data annotation interface targetted at
-storing and sharing complex data structures across different programming
+language-independent data annotation interface enabling easy storing
+and sharing of complex data structures across different programming
 languages such as MATLAB, JavaScript, Python etc. Using JData formats, a
-complex Python data structure can be encoded as a `dict` object that is easily
-serialized as a JSON/binary JSON file and share such data between
-programs of different languages.
+complex Python data structure, including numpy objects, can be encoded
+as a simple `dict` object that is easily serialized as a JSON/binary JSON
+file and shared between programs of different languages.
+
+Since 2021, the development of the PyJData module and the underlying data format specifications
+[JData](https://neurojson.org/jdata/draft3) and [BJData](https://neurojson.org/bjdata/draft2)
+have been funded by the US National Institutes of Health (NIH) as
+part of the NeuroJSON project (https://neurojson.org and https://neurojson.io).
+
+The goal of the NeuroJSON project is to develop scalable, searchable, and
+reusable neuroimaging data formats and data sharing platforms. All data
+produced from the NeuroJSON project will use JSON/Binary JData formats as the
+underlying serialization standards and the lightweight JData specification as
+the language-independent data annotation standard.
 
 ## How to install
 
@@ -102,6 +115,13 @@ newdata=jd.load('test.json')
 newdata
 ```
 
+One can use `loadt` or `savet` to read/write JSON-based data files and `loadb` and `saveb` to
+read/write binary-JSON based data files. By default, JData annotations are automatically decoded
+after loading and encoded before saving. One can set `{'encode': False}` in the save functions
+or `{'decode': False}` in the load functions as the `opt` to disable further processing of JData
+annotations. We also provide `loadts` and `loadbs` for parsing a string-buffer made of text-based
+JSON or binary JSON stream.
+
 PyJData supports multiple N-D array data compression/decompression methods (i.e. codecs), similar
 to HDF5 filters. Currently supported codecs include `zlib`, `gzip`, `lz4`, `lzma`, `base64` and various
 `blosc2` compression methods, including `blosc2blosclz`, `blosc2lz4`, `blosc2lz4hc`, `blosc2zlib`,
@@ -111,6 +131,104 @@ decompress the data based on the `_ArrayZipType_` annotation present in the data
 compression methods support multi-threading. To set the thread number, one should define an `nthread`
 value in the option (`opt`) for both encoding and decoding.
 
+## Reading JSON via REST-API
+
+If a REST-API (URL) is given as the first input of `load`, it reads the JSON data directly
+from the URL and parses the content into native Python data structures. To avoid repetitive downloads,
+`load` automatically caches the downloaded file so that future calls directly load the
+locally cached file. If one prefers to always load from the URL without a local cache, one should
+use `loadurl()` instead. Here is an example
+
+```
+import jdata as jd
+data = jd.load('https://neurojson.io:7777/openneuro/ds000001');
+data.keys()
+```
+
+## Using JSONPath to access and query complex datasets
+
+Starting from v0.6.0, PyJData provides a lightweight implementation of [JSONPath](https://goessner.net/articles/JsonPath/),
+a widely used format for querying and accessing a hierarchical dict/list structure, such as those
+parsed by `load` or `loadurl`. Here is an example
+
+```
+import jdata as jd
+
+data = jd.loadurl('https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json');
+jd.jsonpath(data, '$.age')
+jd.jsonpath(data, '$.address.city')
+jd.jsonpath(data, '$.phoneNumber')
+jd.jsonpath(data, '$.phoneNumber[0]')
+jd.jsonpath(data, '$.phoneNumber[0].type')
+jd.jsonpath(data, '$.phoneNumber[-1]')
+jd.jsonpath(data, '$.phoneNumber..number')
+jd.jsonpath(data, '$[phoneNumber][type]')
+jd.jsonpath(data, '$[phoneNumber][type][1]')
+```
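
As a rough illustration of what the `..` (deep-scan) operator does in queries such as `$.phoneNumber..number`, the sketch below walks a nested dict/list structure and collects every value stored under a given key. It uses only the standard library. The `deep_scan` helper and the sample record are hypothetical and are not PyJData's actual `jsonpath` implementation.

```python
def deep_scan(obj, key):
    """Collect the values stored under `key` at any depth of a dict/list tree."""
    found = []
    if isinstance(obj, dict):
        for name, value in obj.items():
            if name == key:
                found.append(value)
            # keep descending: nested containers may hold the key again
            found.extend(deep_scan(value, key))
    elif isinstance(obj, list):
        for item in obj:
            found.extend(deep_scan(item, key))
    return found


# hypothetical record, loosely modeled on a contact entry
record = {
    "name": "Alice",
    "phoneNumber": [
        {"type": "home", "number": "555-0100"},
        {"type": "mobile", "number": "555-0101"},
    ],
}

numbers = deep_scan(record["phoneNumber"], "number")
print(numbers)
```

A full JSONPath engine also handles filters, slices and wildcards; this sketch only covers the recursive-descent idea.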
+
+The `jd.jsonpath` function does not support all JSONPath features. If more complex JSONPath
+queries are needed, one should install `jsonpath_ng` or other more advanced JSONPath support.
+Here is an example using `jsonpath_ng`
+
+```
+import jdata as jd
+from jsonpath_ng.ext import parse
+
+data = jd.loadurl('https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json');
+
+val = [match.value for match in parse('$.address.city').find(data)]
+val = [match.value for match in parse('$.phoneNumber').find(data)]
+```
+
+## Downloading and caching `_DataLink_` referenced external data files
+
+Similar to [JSONLab](https://github.com/fangq/jsonlab?tab=readme-ov-file#jsoncachem),
+PyJData also provides external data file downloading/caching capability.
+
+The `_DataLink_` annotation in the JData specification permits linking of external data files
+in a JSON file. To make downloading/parsing externally linked data files efficient, such as
+processing large neuroimaging datasets hosted on http://neurojson.io, we have developed a system
+to download files on demand and cache them locally. `jsoncache.m` is responsible for searching
+the local cache folders: if the requested file is found, it returns the path to the local cache;
+if not found, it returns a SHA-256 hash of the URL as the file name, and the possible cache folders.
+
+When loading a file from a URL, the cache folders are searched in the following order
+```
+global-variable NEUROJSON_CACHE | if defined, this path will be searched first
+[pwd '/.neurojson'] | on all OSes
+/home/USERNAME/.neurojson | on all OSes (per-user)
+/home/USERNAME/.cache/neurojson | if on Linux (per-user)
+/var/cache/neurojson | if on Linux (system wide)
+/home/USERNAME/Library/neurojson| if on MacOS (per-user)
+/Library/neurojson | if on MacOS (system wide)
+C:\ProgramData\neurojson | if on Windows (system wide)
+```
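
The search order listed above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `candidate_cache_dirs` and `find_cached` are hypothetical helper names, and the sketch assumes the `NEUROJSON_CACHE` "global variable" is read from the process environment. PyJData's real lookup may differ.

```python
import os
import platform


def candidate_cache_dirs(env=None, system=None):
    """Return cache folders in the listed search order (hypothetical helper)."""
    env = os.environ if env is None else env
    system = platform.system() if system is None else system
    home = os.path.expanduser("~")

    dirs = []
    # assumption: the "global variable" is taken from the environment
    if env.get("NEUROJSON_CACHE"):
        dirs.append(env["NEUROJSON_CACHE"])
    dirs.append(os.path.join(os.getcwd(), ".neurojson"))
    dirs.append(os.path.join(home, ".neurojson"))
    if system == "Linux":
        dirs.append(os.path.join(home, ".cache", "neurojson"))
        dirs.append("/var/cache/neurojson")
    elif system == "Darwin":
        dirs.append(os.path.join(home, "Library", "neurojson"))
        dirs.append("/Library/neurojson")
    elif system == "Windows":
        dirs.append(r"C:\ProgramData\neurojson")
    return dirs


def find_cached(filename, dirs):
    """Return the first existing cached copy of `filename`, or None."""
    for folder in dirs:
        path = os.path.join(folder, filename)
        if os.path.exists(path):
            return path
    return None
```

Passing `env` and `system` explicitly keeps the helper easy to exercise on any platform.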
+When saving a file from a URL, under the root cache folder, subfolders can be created;
+if the URL is one of the standard NeuroJSON.io URLs below
+```
+https://neurojson.org/io/stat.cgi?action=get&db=DBNAME&doc=DOCNAME&file=sub-01/anat/datafile.nii.gz
+https://neurojson.io:7777/DBNAME/DOCNAME
+https://neurojson.io:7777/DBNAME/DOCNAME/datafile.suffix
+```
+the file datafile.nii.gz will be downloaded to the /home/USERNAME/.neurojson/io/DBNAME/DOCNAME/sub-01/anat/ folder.
+If a URL does not follow the neurojson.io format, the cache folder has the form below
+```
+CACHEFOLDER{i}/domainname.com/XX/YY/XXYYZZZZ...
+```
+where XXYYZZZZ.. is the SHA-256 hash of the full URL, XX is its first two digits, and YY is its third and fourth digits.
+
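
A minimal sketch of the `CACHEFOLDER{i}/domainname.com/XX/YY/XXYYZZZZ...` layout for URLs outside neurojson.io, using only the standard library. The helper name `hashed_cache_path`, the UTF-8 encoding of the URL, the hexadecimal digest, and the use of the URL's host name as the domain folder are all assumptions made for illustration. The README text does not spell these details out.

```python
import hashlib
from urllib.parse import urlparse


def hashed_cache_path(url, cache_root):
    """Build cache_root/domain/XX/YY/<sha256> for a URL (hypothetical helper)."""
    # assumption: the hash is the hex SHA-256 digest of the UTF-8 encoded URL
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    domain = urlparse(url).hostname or "unknown"
    # XX = hex digits 1-2, YY = hex digits 3-4, file name = the full digest
    return "/".join([cache_root, domain, digest[:2], digest[2:4], digest])


path = hashed_cache_path(
    "https://example.com/data/file.json", "/home/USERNAME/.neurojson"
)
print(path)
```

Splitting on the leading hash digits keeps any single folder from accumulating too many entries.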
+In PyJData, we provide the `jdata.jdlink()` function to dynamically download and locally cache
+externally linked data files. `jdata.jdlink()` only parses files with JSON/binary JSON suffixes that
+`load` supports. Here is an example
+
+```
+import jdata as jd
+
+data = jd.load('https://neurojson.io:7777/openneuro/ds000001');
+extlinks = jd.jsonpath(data, '$..anat.._DataLink_') # deep-scan of all anatomical folders and find all linked NIfTI files
+jd.jdlink(extlinks, {'regex': 'sub-0[12]_.*nii'}) # download only the nii files for sub-01 and sub-02
+jd.jdlink(extlinks) # download all links
+```
 
 ## Utility
 
jdata/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@
 from .jdata import encode, decode, jdtype, jsonfilter
 from .jpath import jsonpath
 
-__version__ = "0.5.5"
+__version__ = "0.6.0"
 __all__ = [
     "load",
     "save",

jdata/jfile.py

Lines changed: 3 additions & 4 deletions
@@ -437,10 +437,9 @@ def jdlink(uripath, opt={}, **kwargs):
 
     if isinstance(uripath, list):
         if "regex" in opt:
-            haspattern = [
-                True if re.search(opt["regex"], x) is None else False for x in uripath
-            ]
-            uripath = [x for i, x in enumerate(uripath) if haspattern[i]]
+            pat = re.compile(opt["regex"])
+            uripath = [uri for uri in uripath if pat.search(uri)]
+            print(uripath)
         if "showsize" in opt:
             totalsize = 0
             nosize = 0
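
To make the effect of this hunk concrete, the sketch below runs both versions of the filter on a made-up link list. As written, the removed lines set the flag to `True` when `re.search` returns `None`, so they kept the URIs that did not match the pattern. The added lines keep the ones that do. The URIs are hypothetical; only the two filtering expressions come from the diff.

```python
import re

# hypothetical link list; real values would come from `_DataLink_` annotations
uripath = [
    "https://example.org/ds/sub-01_T1w.nii.gz",
    "https://example.org/ds/sub-02_T1w.nii.gz",
    "https://example.org/ds/sub-03_T1w.nii.gz",
    "https://example.org/ds/sub-01_events.tsv",
]
regex = r"sub-0[12]_.*nii"

# removed logic: the flag is True when the pattern is NOT found
haspattern = [re.search(regex, x) is None for x in uripath]
old_result = [x for i, x in enumerate(uripath) if haspattern[i]]

# added logic: compile once, keep the URIs that match
pat = re.compile(regex)
new_result = [uri for uri in uripath if pat.search(uri)]

print(old_result)
print(new_result)
```

With the added logic, the `{'regex': 'sub-0[12]_.*nii'}` option selects the sub-01 and sub-02 NIfTI links, which is the behavior the README example describes.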

setup.py

Lines changed: 3 additions & 2 deletions
@@ -6,7 +6,7 @@
 setup(
     name="jdata",
     packages=["jdata"],
-    version="0.5.5",
+    version="0.6.0",
     license="Apache license 2.0",
     description="Encoding and decoding Python data structrues using portable JData-annotated formats",
     long_description=readme,
@@ -15,7 +15,7 @@
     author_email="[email protected]",
     maintainer="Qianqian Fang",
     url="https://github.com/NeuroJSON/pyjdata",
-    download_url="https://github.com/NeuroJSON/pyjdata/archive/v0.5.5.tar.gz",
+    download_url="https://github.com/NeuroJSON/pyjdata/archive/v0.6.0.tar.gz",
     keywords=[
         "JSON",
         "JData",
@@ -44,6 +44,7 @@
         "Programming Language :: Python :: 3.8",
         "Programming Language :: Python :: 3.9",
         "Programming Language :: Python :: 3.10",
+        "Programming Language :: Python :: 3.11",
         "Topic :: Software Development :: Libraries",
         "Topic :: Software Development :: Libraries :: Python Modules",
     ],

test/testjd.py

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
 
 in the root folder.
 
-Copyright (c) 2019 Qianqian Fang <q.fang at neu.edu>
+Copyright (c) 2019-2024 Qianqian Fang <q.fang at neu.edu>
 """
 
 import unittest
