Commit c15ba20

merge
2 parents 004e596 + 526e549 commit c15ba20

9 files changed: 81 additions and 23 deletions

.github/workflows/Documenter.yml

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ jobs:
  name: Documentation
  runs-on: ubuntu-latest
  steps:
- - uses: actions/checkout@v2
+ - uses: actions/checkout@v4
  - run: |
  sudo apt-get install python3-matplotlib
  - uses: julia-actions/julia-buildpkg@latest

.github/workflows/ci.yml

Lines changed: 4 additions & 3 deletions
@@ -23,8 +23,8 @@ jobs:
  arch:
  - x64
  steps:
- - uses: actions/checkout@v2
- - uses: julia-actions/setup-julia@v1
+ - uses: actions/checkout@v4
+ - uses: julia-actions/setup-julia@v2
  with:
  version: ${{ matrix.version }}
  arch: ${{ matrix.arch }}
@@ -41,6 +41,7 @@ jobs:
  - uses: julia-actions/julia-buildpkg@v1
  - uses: julia-actions/julia-runtest@v1
  - uses: julia-actions/julia-processcoverage@v1
- - uses: codecov/codecov-action@v1
+ - uses: codecov/codecov-action@v4
  with:
  file: lcov.info
+ token: ${{ secrets.CODECOV_TOKEN }}

README.md

Lines changed: 2 additions & 4 deletions
@@ -24,7 +24,7 @@ function copernicus_marine_catalog(product_id,dataset_id,
  stac_url = "https://stac.marine.copernicus.eu/metadata/catalog.stac.json",
  asset = "timeChunked")

- cat = STAC.Catalog(stac_url);
+ cat = STAC.Catalog(stac_url);
  item_canditates = filter(startswith(dataset_id),keys(cat[product_id].items))
  # use last version per default
  dataset_version_id = sort(item_canditates)[end]
@@ -36,9 +36,7 @@ product_id = "MEDSEA_MULTIYEAR_PHY_006_004"
  dataset_id = "med-cmcc-ssh-rean-d"

  url = copernicus_marine_catalog(product_id,dataset_id)
- # surprisingly requesting missing data chunks results in the HTTP error
- # code 403 (permission denied) rather than 404 (not found) for the CMEMS server.
- ds = ZarrDataset(url,_omitcode=[404,403]);
+ ds = ZarrDataset(url);

  # longitude, latitude and time are the coordinate variables defined in the
  # zarr dataset

docs/src/index.md

Lines changed: 45 additions & 2 deletions
@@ -1,14 +1,57 @@

  ## ZarrDatasets

+
+ See the [documentation of JuliaGeo/CommonDataModel.jl](https://juliageo.org/CommonDataModel.jl/stable/) for the full documentation of the API. As a quick reference, here is an example of how to create and read a Zarr file store.
+
+ ### Create a Zarr file store
+
+ The following example creates a Zarr file store in the directory `"/tmp/test-zarr"`:
+
+ ```julia
+ using ZarrDatasets
+
+ # sample data
+ data = [i+j for i = 1:3, j = 1:5]
+
+ directoryname = "/tmp/test-zarr"
+ mkdir(directoryname)
+
+ ds = ZarrDataset(directoryname,"c")
+ defDim(ds,"lon",size(data,1))
+ defDim(ds,"lat",size(data,2))
+ zv = defVar(ds,"varname",Int64,("lon","lat"))
+ zv[:,:] = data
+ zv.attrib["units"] = "m"
+ close(ds)
+ ```
+
+ ### Loading a Zarr file store
+
+ The data and units can be loaded by indexing the data set structure `ds`.
+
+ ```julia
+ using ZarrDatasets
+ directoryname = "/tmp/test-zarr"
+ ds = ZarrDataset(directoryname)
+ data = ds["varname"][:,:]
+ data_units = ds["varname"].attrib["units"]
+ ```
+
+
+
  ```@autodocs
  Modules = [ZarrDatasets]
  ```


+
+
+
  ### Differences between Zarr and NetCDF files

  * All metadata (in particular attributes) is stored in JSON files for the Zarr format with the following implications:
- * JSON does not distinguish between integers and real numbers. They are all considered as generic numbers. Whole numbers are loaded as `Int64` and decimal numbers `Float64`. It is not possible to store the number `1.0` as a real number.
+ * JSON does not distinguish between integers and real numbers. They are all considered as generic numbers. Whole numbers are loaded as `Int64` and real numbers as `Float64`. It is not possible to store the number `1.0` as a real number.
  * The order of keys in a JSON document is undefined. It is therefore not possible to have a consistent ordering of the attributes or variables.
- * The JSON standard does not allow NaN, +Inf, -Inf (https://github.com/capnproto/capnproto/issues/261).
+ * The JSON standard does not allow the values NaN, +Inf, -Inf, which is problematic for attributes ([zarr-python #412](https://github.com/zarr-developers/zarr-python/issues/412), [zarr-specs #81](https://github.com/zarr-developers/zarr-specs/issues/81)). However, there is a special case for the fill-value to handle NaN, +Inf and -Inf.
+ * All dimensions must be associated with Zarr variables.

src/dataset.jl

Lines changed: 17 additions & 8 deletions
@@ -61,16 +61,21 @@ CDM.maskingvalue(ds::ZarrDataset) = ds.maskingvalue

  """
  ds = ZarrDataset(url::AbstractString,mode = "r";
- _omitcode = 404,
+ _omitcode = [404,403],
  maskingvalue = missing)
  ZarrDataset(f::Function,url::AbstractString,mode = "r";
  maskingvalue = missing)

- Open the zarr dataset at the url or path `url`. Only the read-mode is
- currently supported. `ds` supports the API of the
+ Open the zarr dataset at the url or path `url`. The mode can only be `"r"` (read-only)
+ or `"c"` (create). `ds` supports the API of the
  [JuliaGeo/CommonDataModel.jl](https://github.com/JuliaGeo/CommonDataModel.jl).
- The experimental `_omitcode` allows to work-around servers that return
- HTTP error different than 404 for missing chunks.
+ The experimental `_omitcode` defines which HTTP error codes are treated as
+ missing chunks. For compatibility with Python's Zarr, the HTTP error 403
+ (permission denied) is also treated as a missing chunk, in addition to 404 (not
+ found).
+
+ The parameter `maskingvalue` defines which special value is used
+ as a replacement for fill values. The default is `missing`.

  Example:

@@ -101,11 +106,10 @@ zos1 = ZarrDataset(url) do ds
  ds["zos"][:,:,end,1]
  end # implicit call to close(ds)
  ```
-
  """
  function ZarrDataset(url::AbstractString,mode = "r";
  parentdataset = nothing,
- _omitcode = 404,
+ _omitcode = [404,403],
  maskingvalue = missing,
  attrib = Dict(),
  )
@@ -134,7 +138,7 @@ function ZarrDataset(url::AbstractString,mode = "r";
  end
  elseif mode == "c"
  store = Zarr.DirectoryStore(url)
- zg = zgroup(store, "",attrs = Dict(attrib))
+ zg = zgroup(store, "",attrs = Dict{String,Any}(attrib))
  iswritable = true
  end
  ZarrDataset(parentdataset,zg,dimensions,iswritable,maskingvalue)
@@ -153,3 +157,8 @@ function ZarrDataset(f::Function,args...; kwargs...)
  close(ds)
  end
  end
+
+ export ZarrDataset
+ export defDim
+ export defVar
+ #export defGroup

src/variable.jl

Lines changed: 2 additions & 1 deletion
@@ -52,7 +52,7 @@ function CDM.defVar(ds::ZarrDataset,name::SymbolOrString,vtype::DataType,dimensi
  fillvalue = get(attrib,"_FillValue",nothing)
  end

- _attrib = Dict(attrib)
+ _attrib = Dict{String,Any}(attrib)
  _attrib["_ARRAY_DIMENSIONS"] = reverse(dimensionnames)

  _size = ntuple(length(dimensionnames)) do i
@@ -62,6 +62,7 @@ function CDM.defVar(ds::ZarrDataset,name::SymbolOrString,vtype::DataType,dimensi
  if isnothing(chunksizes)
  chunksizes = _size
  end
+
  zarray = zcreate(
  vtype, ds.zgroup, name, _size...;
  chunks = chunksizes,

test/Project.toml

Lines changed: 1 addition & 0 deletions
@@ -8,5 +8,6 @@ Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

  [compat]
  Aqua = "0.8"
+ CommonDataModel = "0.3.6"
  NCDatasets = "0.14"
  julia = "1"

test/test_aqua.jl

Lines changed: 3 additions & 0 deletions
@@ -1,4 +1,7 @@
  using Aqua
  using ZarrDatasets

+
  Aqua.test_ambiguities(ZarrDatasets)
+ # some internal ambiguities in DiskArray 0.3 probably fixed in 0.4
+ Aqua.test_all(ZarrDatasets, ambiguities = false)

test/test_write.jl

Lines changed: 6 additions & 4 deletions
@@ -1,3 +1,4 @@
+ using Test
  using ZarrDatasets
  using ZarrDatasets:
  defDim,
@@ -7,13 +8,14 @@ data = rand(Int32,3,5)

  fname = tempname()
  mkdir(fname)
- gattrib = Dict{String,Any}("title" => "this is the title")
+ gattrib = Dict("title" => "this is the title")
  ds = ZarrDataset(fname,"c",attrib = gattrib)

+ ds.attrib["number"] = 1
  defDim(ds,"lon",3)
  defDim(ds,"lat",5)

- attrib = Dict{String,Any}(
+ attrib = Dict(
  "units" => "m/s",
  "long_name" => "test",
  )
@@ -25,7 +27,7 @@ vtype = Int32

  zv = defVar(ds,varname,vtype,dimensionnames, attrib = attrib)
  zv[:,:] = data
- zv.attrib["lala"] = 12
+ zv.attrib["number"] = 12
  zv.attrib["standard_name"] = "test"
  ds.attrib["history"] = "test"
  close(ds)
@@ -34,7 +36,7 @@ ds = ZarrDataset(fname)

  zv = ds[varname]

- @test zv.attrib["lala"] == 12
+ @test zv.attrib["number"] == 12
  @test zv.attrib["standard_name"] == "test"
  @test ds.attrib["history"] == "test"
