Commit dc6cc91

Merge pull request zarr-developers#61 from MSanKeys963/add_meeting_notes
Add meeting notes
2 parents 3a3a6e2 + ad875ae commit dc6cc91

13 files changed

+502
-3
lines changed

meetings/2024/2024-01-11.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: 11th January
-description: ZEPs Meeting Notes for 2023-01-11
+description: ZEPs Meeting Notes for 2024-01-11
 grand_parent: ZEP meetings
 parent: 2024 meetings
 nav_order: 1

meetings/2024/2024-01-25.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: 25th January
-description: ZEPs Meeting Notes for 2023-01-25
+description: ZEPs Meeting Notes for 2024-01-25
 grand_parent: ZEP meetings
 parent: 2024 meetings
 nav_order: 2

meetings/2024/2024-02-08.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 layout: default
 title: 8th February
-description: ZEPs Meeting Notes for 2023-02-08
+description: ZEPs Meeting Notes for 2024-02-08
 grand_parent: ZEP meetings
 parent: 2024 meetings
 nav_order: 3

meetings/2024/2024-02-22.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+---
+layout: default
+title: 22nd February
+description: ZEPs Meeting Notes for 2024-02-22
+grand_parent: ZEP meetings
+parent: 2024 meetings
+nav_order: 4
+---
+
+# 2024-02-22
+
+**Attending:** Sanket Verma (SV), Ward Fisher (WF), Josh Moore (JM), Martin Durant (MD), Tom Nicholas (TN)
+
+## TL;DR:
+
+The meeting covered the use of LLMs in training, feedback on the Zarr-Specs website redesign, progress on the V3 refactor, and discussions on integrating kerchunk into Zarr, focusing on chunk manifest standardization and virtualized array concatenation.
+
+**Updates:**
+
+- HTTP Extension meeting: <https://docs.google.com/document/d/14TJfrjbfU1R2REjrZ35GjV74MJ18j_m6geWdj0oB83Y/edit?usp=sharing>
+
+**Meeting Notes:**
+
+- LLMs and how WF is using them in trainings
+- Feedback for new design for Zarr-Specs website (combines ZEP and Zarr-Specs together)
+- Link: <https://docs-test-sanket.readthedocs.io/en/latest/>
+- MD: How's V3 refactor work going on these days?
+- JM: Quite good progress taking place these days
+- SV: V3 PRs can be found here - <https://github.com/zarr-developers/zarr-python/pulls?q=is%3Apr+is%3Aopen+label%3AV3>
+- MD: <https://zarr.dev/zeps/draft/ZEP0003.html>
+- TN: Been discussing → <https://github.com/zarr-developers/zarr-specs/issues/287>
+- interested in integrating kerchunk into zarr, especially two ZEPs
+- (1) chunk manifest (Joe) - standardizing what chunk json files do
+- (2) concatenation - <https://github.com/zarr-developers/zarr-specs/issues/288>
+- 1. manifest: opinion that it's an incredible idea that is very popular
+- fsspec relationship makes things complicated
+- move to the zarr spec for other implementations?
+- goal is readable in any language
+- difficult position
+- three things to think about
+- read byte ranges
+- write JSON
+- combine module
+- roadmap:
+- standardize json for the chunks. manifest file?
+- JM: storage in zarr array itself
+- JM: log file anytime you read a full file into memory
+- Josh: virtual zarr (access pattern)
+- 2. concatenation
+- multi-zarr-to-zarr leads to a loop
+- more sense to think of concat of virtualized arrays objects
+- see kerchunk array notebook
+- read in byte ranges with kerchunk. array class which only stores byte-offset arrays in memory
+- can be done in xarray. concat-classes can be put into xarray and can use higher-order API
+- JM: store that xarray as a zarr :smile: (but need additional metadata for realizing the array)
+- TN: part of notebook that isn't done. exactly.
+- common case in geo. multiple NC files, concat those array.
+- possibly compression options change over time.
+- prevents it from being one zarr array
+- JM: or just always serialize to the chunk manifest
+- JM: i.e. where do we stop? (when does Zarr become Turing Complete?)
+- TN: thought at concat (clear use case). but jeremy thought indexing (also clear use case)
+- JM: starting to sound like transforms (<https://github.com/ome/ngff/pull/138#issuecomment-1948424000>)
+- WF: periodically get requests for operations on the data
+- no one has come close to making the argument for adding that into the storage
+- so many math libraries that would do it better
+- TN: no computation since you don't need the values. can do some subset of concat & indexing without values.
+- TN: have now become a zarr producer :tada:
+- JM: cross-language motivation
+- SV: pyramiding ZEP discussions
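
The chunk manifest and concatenation ideas in the 2024-02-22 notes above (read byte ranges, write JSON, combine without touching values) can be illustrated with a minimal sketch. This is an illustration only, not the format proposed in zarr-specs issue 287: the field names `path`, `offset`, `length`, the dotted chunk-key convention, and the helpers `read_chunk` and `concat_manifests` are all assumptions made up for this example.

```python
import json
import os
import tempfile

# Hypothetical manifest layout: chunk key -> location of the chunk's bytes.
# The field names ("path", "offset", "length") are illustrative assumptions,
# not taken from any Zarr or kerchunk specification.


def read_chunk(manifest, key):
    """Return the raw (still encoded) bytes of one chunk via a byte-range read."""
    entry = manifest[key]
    with open(entry["path"], "rb") as f:
        f.seek(entry["offset"])
        return f.read(entry["length"])


def concat_manifests(first, second, first_len, axis=0):
    """Concatenate two manifests along `axis` by shifting the keys of `second`.

    `first_len` is the number of chunks `first` has along `axis`.
    No chunk values are read; only the chunk keys are rewritten.
    """
    combined = dict(first)
    for key, entry in second.items():
        idx = [int(i) for i in key.split(".")]
        idx[axis] += first_len
        combined[".".join(str(i) for i in idx)] = entry
    return combined


# Two stand-in "legacy" files whose chunks sit at known byte offsets.
tmpdir = tempfile.mkdtemp()
path_a = os.path.join(tmpdir, "a.bin")
path_b = os.path.join(tmpdir, "b.bin")
with open(path_a, "wb") as f:
    f.write(b"HEADERchunk0chunk1")  # 6-byte header, then two 6-byte chunks
with open(path_b, "wb") as f:
    f.write(b"chunk2")

manifest_a = {
    "0.0": {"path": path_a, "offset": 6, "length": 6},
    "1.0": {"path": path_a, "offset": 12, "length": 6},
}
manifest_b = {"0.0": {"path": path_b, "offset": 0, "length": 6}}

# Combine along axis 0, then round-trip through JSON as a manifest file would.
combined = concat_manifests(manifest_a, manifest_b, first_len=2, axis=0)
manifest_json = json.dumps(combined, indent=2)
restored = json.loads(manifest_json)
```

Because the combine step only rewrites keys, it matches TN's point that a subset of concatenation can be done without reading any array values. A real design would also need to handle differing codecs or chunk shapes across files, which the notes flag as an open problem.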

meetings/2024/2024-03-07.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+---
+layout: default
+title: 7th March
+description: ZEPs Meeting Notes for 2024-03-07
+grand_parent: ZEP meetings
+parent: 2024 meetings
+nav_order: 5
+---
+
+# 2024-03-07
+
+**Attending:** Sanket Verma (SV), Ward Fisher (WF), Davis Bennett (DB), Josh Moore (JM), Thomas Nicholas (TN), Jeremy Maitin-Shepard (JMS)
+
+## TL;DR:
+
+The meeting discussed enhancing Zarr store browsing, pushing Kerchunk functionality into Zarr, and the potential for a chunk manifest as a ZEP. Additionally, the group explored ideas for revising ZEP0, creating a Zarr specification IETF standard, and shared updates on upcoming Zarr HTTP Extension and SEA conference events.
+
+**Updates:**
+
+- Zarr HTTP Extension Meeting next week
+- Check here: <https://zarr.dev/community-calls>
+- TN: <https://hackmd.io/t9Myqt0HR7O0nq6wiHWCDA?view>
+- Had a conversation with folks over at Development Seed, NASA, Earthmover
+
+**Meeting Minutes:**
+
+- TN: Jed wants to have nice ways to browse the Zarr stores - they have nice ways to browse `.tiff` files already
+- Wants to propose an extension to add more information in the metadata
+- The end result would look more like an Xarray HTML wrapper
+- TN: <https://hackmd.io/t9Myqt0HR7O0nq6wiHWCDA?view>
+- Had a conversation with folks over at Development Seed, NASA, Earthmover
+- DB: Pushing Kerchunk functionality into Zarr stores
+- DB: Whether the feature could be file format agnostic?
+- TN: Argues that it should be a ZEP - and can be read by every Zarr implementation
+- JM: Having same thing implemented in FSSPEC
+- DB: Would ZEP
+- WF: HDF5 group may be open to a conversation
+- SV: <https://zarr.dev/zeps/meetings/2023/2023-08-10.html> might have some useful information
+- TN: _recaps the conversation for JMS_
+- TN: Should concatenation be a part of the current ZEP?
+- DB: Any reason you don't want to concatenate HDF5 and other file formats?
+- TN: Chunk manifest would point inside the arrays - chunk manifest could let you create a Zarr store over other formats as well
+- DB: This would make Zarr an API/access pattern
+- TN: Can be created and tested fairly separately from Zarr - personally think chunk manifest is a neat feature - implementations can support/not support it
+- DB: Array mutation can break the concatenation - having guidelines for archival arrays would help
+- TN: Currently we're thinking about read-only case
+- TN: Virtualisation in Kerchunk is a spotlight feature
+- JMS: Manifest is a good idea and keeping it separate would be a minor difference - needs to align with Kerchunk
+- JM: report/ZEP idea (time permitting)
+- <https://w3id.org/ro/crate/>
+- JM: Putting RO-Crate inside Zarr - or making Zarr specification an IETF standard
+- JM: Would probably go ahead and write a convention in NGFF space
+- <https://fairdo.org/>
+- JM: Have a mechanism for going up/down the hierarchy - useful for the HTTP extension discussions
+- Revising ZEP0
+- <https://github.com/zarr-developers/zeps/pull/59> - comments/feedback welcome
+- DB: :+1:
+- DB: Would be easy to have a single PR for my ZEP
+- JMS: Putting narrative document in PR description
+- JM: Weird for commenting on the PR description and for the public visibility
+- JMS: Rationale can be put down as a footnote
+- JMS: Having numeric numbering is something Python follows
+- JMS: The actual specification change can also serve as a ZEP narrative
+- SV: We can pick out certain sections out of the ZEP narrative document
+- JMS: Having a PR template similar to ZEP's narrative could also help us
+- WF: <https://sea.ucar.edu/conference/2024>
+- In-person and virtual registrations are available

meetings/2024/2024-03-21.md

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+---
+layout: default
+title: 21st March
+description: ZEPs Meeting Notes for 2024-03-21
+grand_parent: ZEP meetings
+parent: 2024 meetings
+nav_order: 6
+---
+
+# 2024-03-21
+
+**Attending:** Sanket Verma (SV), Thomas Nicholas (TN), Ward Fisher (WF)
+
+## TL;DR:
+
+**Updates:**
+
+- Join ZulipChat: <https://ossci.zulipchat.com/>
+- HTTP Extension meeting took place on 3/14
+- Trying to figure out the best way forward, i.e. a ZEP or not
+- Gauging interest and use cases from others in the community
+
+**Meeting Minutes:**
+
+- HTTP Extension
+- WF: Can see the shape of it, and I think it would be useful
+- SV: Existing thread: <https://ossci.zulipchat.com/#narrow/stream/423692-Zarr/topic/HTTP.20Extension>
+- TN: Tom's company may have a use case for the HTTP work
+- Showing [VirtualiZarr](https://github.com/TomNicholas/VirtualiZarr) (related to the "chunk manifest" ZEP)
+- TN: Been working on the packages for the last 2 weeks - could potentially replace Kerchunk
+- TN: _code walkthrough via screen sharing_
+- TN: Storing the virtual Zarr manifests, not the actual array values
+- TN: Could move `class ManifestArray` to Zarr-Python - arguments in favour and against it
+- TN: Could see donating VirtualiZarr to zarr-developers
+- SV: **Action items**
+- TN to create a topic for VirtualiZarr to gather feedback/comments
+- SV to try VirtualiZarr
+- TN and SV to work on ZEP Extension proposal for virtual Zarr manifest and formally present it for broader feedback
+- TABLED
+- Revising ZEP0
+- <https://github.com/zarr-developers/zeps/pull/59>

meetings/2024/2024-04-04.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+---
+layout: default
+title: 4th April
+description: ZEPs Meeting Notes for 2024-04-04
+grand_parent: ZEP meetings
+parent: 2024 meetings
+nav_order: 7
+---
+
+# 2024-04-04
+
+**Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF)
+
+## TL;DR:
+
+**Updates:**
+
+- CZI EOSS6 Application not funded
+
+**Meeting Minutes:**
+
+- NASA Grant (WF)
+- <https://nspires.nasaprs.com/external/solicitations/summary.do?solId=%7b910CC61E-4616-9958-C26F-F8D9BC5AB8D9%7d&path=&method=init>
+- Townhall meeting slides: <https://docs.google.com/presentation/d/14g5UPUQFsk4QW3gqwB4gtNSHm8vcdUVAmwzSFjtdVN4/edit?usp=sharing>
+- Looking towards sustaining the already established open source software
+- NetCDF is looking for collaboration for their application
+- JM: Collaborators in US could be NF, OpenCollective, NVIDIA, Columbia etc.
+- JM: Will reach out to NF for their NASA grants' experience

meetings/2024/2024-04-18.md

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+---
+layout: default
+title: 18th April
+description: ZEPs Meeting Notes for 2024-04-18
+grand_parent: ZEP meetings
+parent: 2024 meetings
+nav_order: 8
+---
+
+# 2024-04-18
+
+**Attending:** Josh Moore (JM), Vicent Immler (VI), Sanket Verma (SV), Ward Fisher (WF), Altay Sansal (AS), Jeremy Maitin-Shepard (JMS)
+
+## TL;DR:
+
+The meeting covered the proposal to remove implicit groups in Zarr, progress on ZEP4 and ZEP3, and updates on V3 implementation. Additionally, discussions included async read optimizations for Zarr and the impact on performance, especially concerning large datasets and parallel data ingestion.
+
+**Updates:**
+
+- Davis wants to remove implicit groups: <https://github.com/zarr-developers/zarr-specs/pull/292>
+- Activity going-on at ZEP4 Review PR
+
+**Meeting Minutes:**
+
+- Introductions w/ last gift you got
+- Sanket - cologne and clothes
+- Vincent - wooden board forged with family crescent
+- Ward - camping tent
+- Josh - pecan nuts
+- Altay - lead data scientist - lego
+- Removing Implicit groups
+- JM: Discussed at community meeting - needs to go back to root node to figure out the group
+- JM: Tensorstore doesn't use Zarr groups at all
+- WF: Supposition from my side
+- WF: Dennis completed the V3 implementation!
+- JM: Are we closer to parity in V3 work - a question for Dennis!
+- VI: How do implicit groups affect performance?
+- JM: No, implicit groups means performance improvement
+- VI: Working on a new software implementation for students
+- JMS: No experience in working with groups
+- JM: Lot of callbacks
+- JMS: You'd definitely want to remove the looking upward
+- AS: Couldn't see a use-case for parallel creation of groups
+- JMS: You're ingesting a lot of data in S3 and they read group metadata and have implicit groups
+- AS: `.zattrs` would have race condition?
+- JMS: Kind of a niche use-case
+- AS: Do multi-processing locks concern metadata?
+- JMS: Multiple machines can leverage this!
+- AS: Removing would be a good idea!
+- AS: ZEP4 and ZEP3 progress
+- SV: AS, are you using V2 or V3?
+- AS: Using V2 and would love to move to V3 - have 20-30 PB data
+- AS: Want to work on `dimension_names` - what would be the best time to do it?
+- SV: After V3 release
+- VI: _explains GSoC application_
+- AS: Hacked Zarr to submit reads in an async manner to the machine to circumvent the problem
+- AS: Zarr V3 is going to be fully async, so it helps alleviate the problem
+- VI: Would be good to have a way to improve the read speeds for Zarr
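
The async-read workaround mentioned at the end of the 2024-04-18 notes is not described in detail, so the following is only a generic sketch of the pattern: submit several blocking chunk reads concurrently instead of one after another. `fetch_chunk` is a hypothetical stand-in for a store read, and nothing here reflects the actual Zarr-Python V3 API or AS's implementation.

```python
import asyncio
import time


def fetch_chunk(key):
    """Hypothetical blocking chunk read, e.g. one GET against object storage."""
    time.sleep(0.05)  # simulate I/O latency
    return key, f"bytes-for-{key}".encode()


async def read_chunks(keys):
    """Run all blocking reads in worker threads and wait for them together."""
    tasks = [asyncio.to_thread(fetch_chunk, key) for key in keys]
    return dict(await asyncio.gather(*tasks))


# Four reads overlap instead of running back to back.
chunks = asyncio.run(read_chunks(["0.0", "0.1", "1.0", "1.1"]))
```

`asyncio.to_thread` requires Python 3.9 or later. With a store client that is natively asynchronous, the thread hop would be unnecessary and the coroutines could be gathered directly.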

meetings/2024/2024-05-02.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+---
+layout: default
+title: 2nd May
+description: ZEPs Meeting Notes for 2024-05-02
+grand_parent: ZEP meetings
+parent: 2024 meetings
+nav_order: 9
+---
+
+# 2024-05-02
+
+**Attending:** Josh Moore (JM), Sanket Verma (SV), Ward Fisher (WF), Jeremy Maitin-Shepard (JMS), Thomas Nicholas (TN)
+
+## TL;DR:
+
+The meeting discussed the upcoming Zarr V3 release candidate, status and integration of the chunk manifest ZEP, and potential revisions for combining ZEPs. They also covered progress on ZEP0 and plans to move ZEP2 to "Active" status, while tabling the discussion on removing implicit groups.
+
+**Updates:**
+
+**Meeting Minutes:**
+
+- WF: Dennis has a PR coming up to revise the Zarr V3 - 6 months long work - release candidate coming soon!
+- TN: Status of unfinished ZEP w.r.t. chunk manifest?
+- E.g. Have a sharded chunk manifest?
+- JMS: Chunk manifest could refer to entire shard - use case might not be clear
+- JMS: For viz tools you would not load entire shard at once
+- TN: Changing chunk manifest ZEP or accommodate chunk manifest in sharding codec?
+- JMS: Not really need to change
+- JM: Maybe there's a way to re-write the ZEP in a way which the existing ZEPs are composable - basically how extensions would interact with each other
+- TN:
+- JMS: There might be cases where combination of codecs may not work well
+- TN: Combining codecs seems straightforward compared to variable chunking which specifies what is allowed and what not
+- JMS: The proposals which change the data model are tricky - wanted to add non-zero origin
+- TN: Interested in variable chunking
+- SV: Would you be interested in contributing to ZEP3?
+- JM: We could also start thinking about ZEP2+ZEP3, ZEP3+ZEP4, ZEP4+ZEP2...
+- JMS: Any reason for not using Kerchunk?
+- TN: Chunk manifest is clearly defined Zarr store compared to kerchunk (which kinda looks like Zarr) - reference file-systems are not defined - there's value in getting chunk manifest into Zarr specification as Kerchunk is more than Zarr
+- JMS: The actual implementation of the file-system would be same across the various libraries
+- TN: Relying on single maintainer code is not an ideal situation
+- SV: ZP V3 implementation was outdated which led to creation of Zarrita and then finally re-using Zarrita for ZP V3 refactor
+- TN: Working on nit-picking Xarray for Virtuali-Zarr
+- TN: Zarr arrays are kind-of lazy arrays - when you index into Z-arrays they provide you with bytes not the actual Zarr arrays - Xarray has lazy-loading hidden inside in codebase and there has been discussion to make it a standalone library
+- JMS: We could have two sizes for chunks - stored size and actual size for variable chunking strategy
+- Move ZEP2 from `Accepted` to `Active`
+- JM: Would be good to move ZEP1 and ZEP2 both at the same time
+- SV: ZP V3 refactor would be a good time to move ZEP1 to active
+- Finalise ZEP0 revisions
+- <https://github.com/zarr-developers/zeps/pull/59>
+- Re-start the conversation and finalise it
+- **TABLED**
+- Removing implicit groups - <https://github.com/zarr-developers/zarr-specs/pull/292>

meetings/2024/2024-05-16.md

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+---
+layout: default
+title: 16th May
+description: ZEPs Meeting Notes for 2024-05-16
+grand_parent: ZEP meetings
+parent: 2024 meetings
+nav_order: 10
+---
+
+# 2024-05-16
+
+**Attending:** Dennis Heimbigner (DH), Sanket Verma (SV), Josh Moore (JM), Jeremy Maitin-Shepard (JMS)
+
+## TL;DR:
+
+The meeting discussed the release of Zarr-Python 2.18.0, a new blog post by Joe Hamman, and updates on the sharding support in the Rust implementation. They also covered the implementation of manifest storage transformers, standardizing URLs for Zarr, and the removal of implicit groups in Zarr-Python V3.
+
+**Updates:**
+
+**Meeting Minutes:**
+
+- Zarr-Python 2.18.0 out now: <https://github.com/zarr-developers/zarr-python/releases/tag/v2.18.0>
+- One of the last few releases for Zarr Spec 2 - if there's anything you want to get in, please reply/tag us in the PRs/issues
+- New blog post by Joe Hamman: <https://zarr.dev/blog/zarr-python-v3-update/>
+- Zarr-Python developers meeting new schedule - check here: <https://zarr.dev/community-calls/>
+- Lachlan Deakin added support for sharding in his Rust implementation: <https://github.com/LDeakin/zarrs/releases/tag/v0.13.1>
+
+**Open agenda (add here 👇🏻):**
+
+- JM: In sharding you can recurse and browse through the chunks - something like _chunks([x, y])_
+- DH: Treating sub-chunks as regular chunks - like what we decided during the storage transformers proposal -
+- _DH understands this proposal better and favours it_
+- The relevant issue: <https://github.com/zarr-developers/zarr-specs/issues/220>
+- DH: Time to get hands dirty with the implementation and figure out any problems we have
+- JMS: Using storage transformers and codecs in Neuroglancer to achieve sharding
+- DH:
+- SV: Manifest storage transformers - <https://github.com/zarr-developers/zarr-specs/issues/287> - defines and implements on top of the storage transformer in V3 core spec - discussion 👇🏻
+- JMS: Good to define the `JSON` and add other formats later on
+- DH: FSSPEC interprets the URL in Kerchunk
+- DH: Having complete key values in URL would help in the long run - DAP made a mistake earlier and we fixed it - having a complete URL is a better option and you can replace the contents within it later on
+- JM: Having a complete URL in manifest storage transformer for Zarr would help us but there's a question of backward compatibility
+- JMS: <https://github.com/zarr-developers/zeps/pull/48/> standardise the URL
+- DH: URL spec defines the format and correct way of defining a URL - if you consider things other than FSSPEC you should have a more standardised URL
+- DH: Conforming to the [URL Spec](https://www.w3.org/Addressing/URL/url-spec.txt) should be avoided actively
+- JM: Having URL defined in the storage transformer would help - currently not defined
+- Fix typo - <https://github.com/zarr-developers/zarr-specs/pull/294> - **MERGED**
+- Updated Zarr-Specs license to CC-BY-4.0 - <https://github.com/zarr-developers/zarr-specs/pull/295>
+- Implicit groups removed in Zarr-Python V3 via <https://github.com/zarr-developers/zarr-python/pull/1827>
+- Corresponding PR in Zarr-Specs - <https://github.com/zarr-developers/zarr-specs/pull/292>
