Skip to content

Conversation

d-v-b
Copy link
Contributor

@d-v-b d-v-b commented Jan 30, 2025

while working on #2665, I had the following experience in the group.py module:

  • I write a collection of async functions that together support some useful public functionality.
  • I want to expose some (but not all) of these async functions as public synchronous functions.
  • Because the async functions and the sync functions are tightly coupled (the sync version is a light wrapper around the async version), I want them to be close together in code space, so that the relationship between the two is obvious.
  • I do not want the async and sync functions to have different names. So they need to be in different namespaces.

We already have something like this in our api module, which is split into asynchronous and synchronous modules, where the latter imports from the former. I am proposing to use the same logical arrangement inside the group and array modules.

This PR is a rough outline of what that would look like. tests will not pass. My request is that you look at how the code in the new group module is arranged, and weigh in on whether you think this direction makes sense.

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Jan 30, 2025
@d-v-b
Copy link
Contributor Author

d-v-b commented Jan 30, 2025

in this PR, i split group into core, sync, and metadata modules.

  • metadata contains the group metadata classes.
  • core contains async code and necessary utilities. The main export here is the AsyncGroup class, but also functions that do useful things to AsyncGroup instances.
  • sync contains blocking versions of a subset of the async stuff in core, including the Group class definition.

I also sketched out what it might look like if we partitioned array.py into sub-modules (core, sync, indexing, etc).

This layout would allow me to define a useful async function called do_something_with_group in group.core, then import that function to group.sync, and wrap it inside group.sync.do_something_with_group, e.g.

# group/sync.py
# looks dumb but it's explicit
import zarr.core.group.core as core
from zarr.core.sync import sync
def do_something_with_group():
  return sync(core.do_something_with_group())

Instead of defining these synchronous wrappers of async functions in api.synchronous, we would define them in group.sync or array.sync; api.synchronous would just re-export these routines.

@d-v-b
Copy link
Contributor Author

d-v-b commented Jan 31, 2025

after getting positive feedback in the community meeting, I'm going to close this and go ahead with an implementation of this layout in #2665

@d-v-b d-v-b closed this Jan 31, 2025
@d-v-b d-v-b deleted the experiment/refactor_core branch January 31, 2025 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs release notes Automatically applied to PRs which haven't added release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant