Requirements for data loaders for scivision #511
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,92 @@ | ||
| # Template for new SCIPs | ||
|
|
||
| ## Metadata | ||
|
|
||
| Editor: | ||
| ... | ||
|
|
||
| Status (raw | draft | stable | deprecated | retired): | ||
| raw | ||
|
|
||
| ## Description | ||
|
|
||
| ... | ||
|
|
||
| ## Requirements | ||
|
|
||
| ### What is included in the catalog entry for a datasource? | ||
|
|
||
| - A URL to a remote location (as given below) | ||
| - The URL should be a browsable location, structured according to one of the supported 'data-sharing patterns' (see below) | ||
| - | ||
|
|
||
| - An indication of additional scivision plugins required to load the data, if not (?) | ||
|
|
||
| ### Image formats | ||
|
|
||
| - Built-in support for any common format (via a library, such as skimage) | ||
| - Built-in support for formats common across scientific domains, not included in the above | ||
| - Whether to support a given format should be weighed against the cost and burden of the additional dependencies it requires (e.g. something that makes core scivision less portable, or adds an extra installation step, might be rejected, but a single Python-only dependency might be considered acceptable) | ||
|
|
||
| - A 'plugin' system for extending to additional formats | ||
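One way the 'plugin' idea above could look is a small registry keyed by file extension. This is only a sketch under stated assumptions: `register_format`, `find_loader` and `load_common` are hypothetical names, not part of scivision, and the loader body is a placeholder rather than a real image decoder.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict

# Hypothetical registry: maps a lowercase file extension to a loader callable.
_LOADERS: Dict[str, Callable[[bytes], object]] = {}


def register_format(*extensions: str):
    """Decorator that registers a loader for one or more file extensions."""
    def decorator(loader):
        for ext in extensions:
            _LOADERS[ext.lower()] = loader
        return loader
    return decorator


def find_loader(filename: str):
    """Return the loader registered for the file's extension, or raise."""
    ext = PurePosixPath(filename).suffix.lower()
    try:
        return _LOADERS[ext]
    except KeyError:
        raise ValueError(f"no loader registered for {ext!r}") from None


# Built-in formats and plugins would register in the same way. The loader
# here is a placeholder that only reports the payload size; a real one
# would decode the image (e.g. via a library such as skimage).
@register_format(".png", ".jpg")
def load_common(data: bytes):
    return {"format": "common", "size": len(data)}
```

With this shape, a plugin adds a format by importing the decorator and registering its own loader, so core scivision does not need the plugin's dependencies.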
|
|
||
|
||
| #### Additional image formats | ||
|
|
||
| Below is a list of additional image formats to consider for built-in support | ||
|
|
||
| - | ||
| - | ||
|
|
||
|
|
||
| ### Supported data services | ||
|
|
||
| #### Notes | ||
|
|
||
| - 'Core' scivision (without additional plugins) should maintain support for several remote locations where data is commonly archived. | ||
|
||
|
|
||
| - Locations are often specified using URLs with an http/https scheme, but HTTP does not support directory browsing, for example, which limits the generality and usefulness of this approach. | ||
|
|
||
| - One possibility that is supported by plain http is a direct link to an 'archive' file system (e.g. a zip file containing one of the patterns below). | ||
|
This way does seem like it would be easy for a data provider to comply with

This is at least partly supported already, but it seemed worth being explicit about the issue, and that a zip file (or other archive) is available as a single-file option. One motivation for having special handling of other data hosting services is so data providers have a better chance of including their data as-is, but with this option as a fallback, potentially. |
||
|
|
||
| - Examples consisting of a single image are supported for the same reason, but might not be particularly interesting | ||
|
|
||
| - A single file containing some metadata for Intake or a scivision plugin is another possibility | ||
|
|
||
| #### Particular services to support | ||
|
|
||
| - Automatic support for single image files and archives | ||
| - The URL of an Intake catalog | ||
| - The URL of some data-plugin metadata | ||
| - Zenodo | ||
| - GitHub | ||
| - ... | ||
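To make the list above more concrete, here is a rough sketch of how a URL might be sorted into one of these categories before a loader is chosen. Everything in it is an assumption for illustration: the function name `classify_url`, the category labels, the extension sets and the example URLs are hypothetical and not taken from scivision.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Illustrative extension sets only; the real lists are for the SCIP to decide.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}
ARCHIVE_EXTS = {".zip", ".tar", ".tgz", ".gz"}
YAML_EXTS = {".yml", ".yaml"}


def classify_url(url: str) -> str:
    """Guess which kind of supported location a URL points at."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    suffix = PurePosixPath(parsed.path).suffix.lower()

    # File extensions are checked first, so a direct link to a file wins
    # over the hosting service it happens to live on.
    if suffix in IMAGE_EXTS:
        return "single-image"
    if suffix in ARCHIVE_EXTS:
        return "archive"
    if suffix in YAML_EXTS:
        # Could be an Intake catalog or data-plugin metadata; telling them
        # apart would need the file contents, not just the URL.
        return "yaml-metadata"
    if host == "zenodo.org" or host.endswith(".zenodo.org"):
        return "zenodo"
    if host == "github.com":
        return "github"
    return "unknown"


# Placeholder URLs, not real datasets.
print(classify_url("https://example.org/data/images.zip"))
print(classify_url("https://zenodo.org/records/0000000"))
```

Checking the extension before the host is a design choice in this sketch: it keeps the single-file fallback (image or archive) working regardless of where the file is hosted.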
|
||
|
|
||
| #### Pull requests accepted | ||
|
|
||
| - Improving, updating and maintaining the existing supported services (e.g. fixing the library to work after an API change) | ||
|
|
||
| - Adding support for other common remote locations (a test might be: are there two or more independent data sources in the catalog that) | ||
|
|
||
|
|
||
| ### Native support for common data sharing patterns | ||
|
|
||
| #### Directory of image files | ||
|
|
||
| #### Image + csv labels | ||
|
|
||
| #### An Intake yaml file catalog | ||
|
|
||
| #### A yaml file, with metadata for a custom data plugin | ||
|
|
||
|
|
||
| ## High-level software design | ||
|
|
||
| For Scivision.Py | ||
|
|
||
| - Consider using fsspec for handling remote locations (get archive support, variety of URL schemes) | ||
|
|
||
| - Abstract base class for a `DataService` | ||
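A minimal sketch of what an abstract `DataService` might look like follows. Only the class name comes from the text above; the methods `can_handle` and `list_files`, the `ZipArchiveService` subclass, the `resolve` helper and the injected `fetch` callable are all hypothetical. The sketch uses only the standard library, and the `fetch` callable marks the point where an fsspec-backed reader could be substituted.

```python
import io
import zipfile
from abc import ABC, abstractmethod
from typing import Callable, Iterable, List
from urllib.parse import urlparse


class DataService(ABC):
    """Hypothetical interface: one subclass per supported remote location."""

    @abstractmethod
    def can_handle(self, url: str) -> bool:
        """Return True if this service knows how to read the URL."""

    @abstractmethod
    def list_files(self, url: str) -> List[str]:
        """Return the file names available at the URL."""


class ZipArchiveService(DataService):
    """Treats a direct link to a zip file as a browsable location."""

    def __init__(self, fetch: Callable[[str], bytes]):
        # 'fetch' downloads the raw bytes for a URL; it is injected so the
        # transport (plain http, fsspec, a local cache) stays swappable.
        self._fetch = fetch

    def can_handle(self, url: str) -> bool:
        return urlparse(url).path.lower().endswith(".zip")

    def list_files(self, url: str) -> List[str]:
        with zipfile.ZipFile(io.BytesIO(self._fetch(url))) as zf:
            # Skip directory entries, which end with a slash.
            return sorted(n for n in zf.namelist() if not n.endswith("/"))


def resolve(url: str, services: Iterable[DataService]) -> DataService:
    """Pick the first registered service that accepts the URL."""
    for service in services:
        if service.can_handle(url):
            return service
    raise ValueError(f"no data service can handle {url!r}")
```

Services for Zenodo, GitHub or an Intake catalog would, under this assumption, be further subclasses tried in order by `resolve`.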
|
|
||
| ## Remaining questions | ||
|
|
||
| ... | ||