Modify dset/attr builders based on sidecar JSON #677
base: dev
Changes from 22 commits
9e4ba60
1f53919
dafc650
de5fefe
3f1f8f2
036fa1e
b4b5419
151c69d
32d1397
933ef40
393e5b3
2fda06d
28c6893
6da168d
618ab1c
393ffdf
729e989
ecd244d
168f4a9
1c57573
62ed248
7078ca1
9faf7a2
827d61d
ef22dc5
2bb7185
fee5245
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,154 @@ | ||
| .. _modifying_with_sidecar: | ||
| | ||
| Modifying an HDMF File with a Sidecar JSON File | ||
| =============================================== | ||
| | ||
| Users may want to update part of an HDMF file without rewriting the entire file. | ||
I think it would be useful to elaborate a little bit on this to clarify the intent and scope of the sidecar file, i.e., this is for small updates and corrections only.
| delete a dataset or attribute but cannot create a new dataset or attribute. | |
| hide a dataset or attribute so that it will not be read by HDMF but cannot create a new dataset or attribute. |
I think delete is misleading since we are not actually deleting any data from a file but the JSON file can only indicate that the dataset/attribute should be ignored on read (maybe hide or invalid would be more precise).
Does delete also apply to groups?
Good point. I'll make the change. For now, I have not allowed hiding of groups because the use case is unclear. But it is technically not very different from hiding of datasets.
I think a main use case for hiding groups would be instances of a data_type, e.g., to hide a TimeSeries that for some reason contains bad data. If it's trivial, then I think hiding groups is something we could allow, but if it adds a lot of complexity then I would hold off until a specific need arises.
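To make the "hide, not delete" point in this thread concrete, here is a minimal sketch. It is not the PR's implementation: `apply_operations` and the flat dict of fields are hypothetical stand-ins for the builders. It only assumes what the docs above state, namely that an operation may replace or hide an existing dataset or attribute but may not create a new one, and that the data on disk is never changed.

```python
import copy


def apply_operations(fields, operations):
    """Return a modified copy of ``fields``; the input dict is left untouched.

    ``fields`` is a hypothetical flat mapping of relative path -> value that
    stands in for what was read from the file.
    """
    result = copy.deepcopy(fields)
    for op in operations:
        path = op["relative_path"]
        if path not in result:
            # Operations may only target existing fields, never create new ones.
            raise KeyError(f"no dataset or attribute at {path!r}")
        if op["type"] == "replace":
            result[path] = op["value"]
        elif op["type"] == "delete":
            # The field is only dropped from the in-memory view (i.e., hidden).
            del result[path]
        else:
            raise ValueError(f"unsupported operation type: {op['type']!r}")
    return result


# Stand-in for the contents of the file on disk.
on_disk = {"data": [1, 2, 3], "data/unit": "volts", "description": "raw trace"}
ops = [
    {"type": "replace", "relative_path": "data/unit", "value": "millivolts"},
    {"type": "delete", "relative_path": "description"},
]
view = apply_operations(on_disk, ops)
```

Here `view` no longer contains `description`, while `on_disk` still does, which is why "hide" describes the behavior better than "delete".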
Are sidecar files automatically validated by the validator as well?
| Operations can result in invalid files, i.e., files that do not conform to the specification. It is strongly | |
| recommended that the file is validated against the schema after loading the sidecar JSON. In some cases, the | |
| file cannot be read because the file is invalid. | |
| .. warning:: | |
| Modifying a file via a sidecar file can result in a file that is no longer compliant with its format | |
| specification. For example, we may ``delete`` a required dataset via a sidecar operation, resulting | |
| in an invalid file that, in the worst case, may no longer be readable because required arguments are missing. | |
| It is strongly recommended that the file is validated against the schema after loading the sidecar JSON. | |
rly marked this conversation as resolved.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,194 @@ | ||
| { | ||
| "$schema": "http://json-schema.org/draft-07/schema#", | ||
| "$id": "sidecar.schema.json", | ||
| "title": "Schema for the sidecar JSON file", | ||
| "description": "A schema for validating HDMF sidecar JSON files", | ||
| "version": "0.1.0", | ||
| "type": "object", | ||
| "additionalProperties": false, | ||
| "required": [ | ||
| "description", | ||
| "author", | ||
| "contact", | ||
| "operations", | ||
| "schema_version" | ||
| ], | ||
| "properties": { | ||
| "description": { | ||
| "description": "A free-form string describing the modifications specified in this file.", | ||
| "type": "string" | ||
| }, | ||
| "author": { | ||
| "description": "A list of free-form strings containing the names of the people who created this file.", | ||
| "type": "array", | ||
| "items": {"type": "string"} | ||
| }, | ||
| "contact": { | ||
| "description": "A list of email addresses for the people who created this file. Each author listed in the 'author' key *should* have a corresponding email address.", | ||
| "type": "array", | ||
| "items": { | ||
| "type": "string", | ||
| "pattern": "^.*@.*$" | ||
| } | ||
| }, | ||
| "operations": { | ||
| "description": "A list of operations to perform on the data in the file.", | ||
| "type": "array", | ||
| "items": { | ||
| "type": "object", | ||
| "additionalProperties": false, | ||
| "required": [ | ||
| "type", | ||
| "description", | ||
| "object_id", | ||
| "relative_path" | ||
| ], | ||
| "properties": { | ||
| "type": { | ||
| "description": "The type of modification to perform.", | ||
| "member_region": { | ||
| "type": ["replace", "delete"] | ||
| } | ||
| }, | ||
| "description": { | ||
| "description": "A description of the specified modification.", | ||
| "type": "string" | ||
| }, | ||
| "object_id": { | ||
| "description": "The object ID (UUID) of the data type that is closest in the file hierarchy to the field being modified. Must be in the UUID-4 format with hyphens.", | ||
| "type": "string", | ||
| "pattern": "^[0-9a-f]{8}\\-[0-9a-f]{4}\\-4[0-9a-f]{3}\\-[89ab][0-9a-f]{3}\\-[0-9a-f]{12}$" | ||
| }, | ||
| "relative_path": { | ||
| "description": "The relative path from the data type with the given object ID to the field being modified.", | ||
| "type": "string" | ||
| }, | ||
| "element_type": { | ||
| "anyOf": [ | ||
| { | ||
| "type": "string", | ||
| "enum": [ | ||
| "group", | ||
| "dataset", | ||
| "attribute" | ||
| ] | ||
| } | ||
| ] | ||
| }, | ||
| "value": { | ||
| "description": "The new value for the dataset/attribute.", | ||
| "member_region": { | ||
| "type": ["array", "string", "number", "boolean", "null"] | ||
| } | ||
| }, | ||
| "dtype": {"$ref": "#/definitions/dtype"} | ||
| }, | ||
| "allOf": [ | ||
| { | ||
| "description": "if type==replace, then value is required.", | ||
| "if": { | ||
| "properties": { "type": { "const": "replace" } } | ||
| }, | ||
| "then": { | ||
| "required": [ "value" ] | ||
| } | ||
| }, | ||
| { | ||
| "description": "if type==delete, then value and dtype are not allowed.", | ||
| "if": { | ||
| "properties": { "type": { "const": "delete" } } | ||
| }, | ||
| "then": { | ||
| "properties": { | ||
| "value": false, | ||
| "dtype": false | ||
| } | ||
| } | ||
| }, | ||
| { | ||
| "description": "if type==create, then element_type is required.", | ||
| "if": { | ||
| "properties": { "type": { "const": "create" } } | ||
| }, | ||
| "then": { | ||
| "required": [ "element_type" ] | ||
| } | ||
| } | ||
| ] | ||
| } | ||
| }, | ||
| "schema_version": { | ||
| "description": "The version of the sidecar JSON schema that the file conforms to. Must conform to Semantic Versioning v2.0.", | ||
| "type": "string", | ||
| "pattern": "^(0|[1-9]\\d*)\\.(0|[1-9]\\d*)\\.(0|[1-9]\\d*)(?:-((?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\\.(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\\+([0-9a-zA-Z-]+(?:\\.[0-9a-zA-Z-]+)*))?$" | ||
| } | ||
| }, | ||
| "definitions": { | ||
| "dtype": { | ||
| "anyOf": [ | ||
| {"$ref": "#/definitions/flat_dtype"}, | ||
| {"$ref": "#/definitions/compound_dtype"} | ||
| ] | ||
| }, | ||
| "flat_dtype": { | ||
| "description": "String describing the data type of the dataset or attribute.", | ||
| "anyOf": [ | ||
| { | ||
| "type": "string", | ||
| "enum": [ | ||
| "float", | ||
| "float32", | ||
| "double", | ||
| "float64", | ||
| "long", | ||
| "int64", | ||
| "int", | ||
| "int32", | ||
| "int16", | ||
| "int8", | ||
| "uint", | ||
| "uint32", | ||
| "uint16", | ||
| "uint8", | ||
| "uint64", | ||
| "text", | ||
| "utf", | ||
| "utf8", | ||
| "utf-8", | ||
| "ascii", | ||
| "bool", | ||
| "isodatetime" | ||
| ] | ||
| }, | ||
| {"$ref": "#/definitions/ref_dtype"} | ||
| ] | ||
| }, | ||
| "ref_dtype": { | ||
| "type": "object", | ||
| "required": ["target_type", "reftype"], | ||
| "properties": { | ||
| "target_type": { | ||
| "description": "Describes the data_type of the target that the reference points to", | ||
| "type": "string" | ||
| }, | ||
| "reftype": { | ||
| "description": "Describes the kind of reference", | ||
| "type": "string", | ||
| "enum": ["ref", "reference", "object", "region"] | ||
| } | ||
| } | ||
| }, | ||
| "compound_dtype": { | ||
| "type": "array", | ||
| "items": { | ||
| "type": "object", | ||
| "required": ["name", "doc", "dtype"], | ||
| "properties": { | ||
| "name": {"$ref": "#/definitions/protectedString"}, | ||
| "doc": {"type": "string"}, | ||
| "dtype": {"$ref": "#/definitions/flat_dtype"} | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } |
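For readers who want to see what a conforming sidecar file might look like, the following is a rough sketch rather than the validator used in this PR. It uses only the Python standard library to check a subset of the rules in the schema above: the required keys, the `object_id` and `schema_version` patterns, and the `replace`/`delete` conditions. The function name `check_sidecar` and all example values (author, email, UUID, path) are made up for illustration. Since `member_region` is not a JSON Schema draft-07 keyword, the sketch does not try to check the allowed values of `type` or `value`.

```python
import re

REQUIRED_TOP = {"description", "author", "contact", "operations", "schema_version"}
REQUIRED_OP = {"type", "description", "object_id", "relative_path"}

# Same patterns as in the schema, written as Python raw strings.
UUID4 = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
)
SEMVER = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def check_sidecar(doc):
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    missing = REQUIRED_TOP - set(doc)
    if missing:
        problems.append(f"missing top-level keys: {sorted(missing)}")
    if not SEMVER.match(str(doc.get("schema_version", ""))):
        problems.append("schema_version is not a semantic version")
    for i, op in enumerate(doc.get("operations", [])):
        missing = REQUIRED_OP - set(op)
        if missing:
            problems.append(f"operation {i}: missing keys {sorted(missing)}")
        if not UUID4.match(str(op.get("object_id", ""))):
            problems.append(f"operation {i}: object_id is not a UUID-4")
        if op.get("type") == "replace" and "value" not in op:
            problems.append(f"operation {i}: replace requires a value")
        if op.get("type") == "delete" and {"value", "dtype"} & set(op):
            problems.append(f"operation {i}: delete must not set value or dtype")
    return problems


# A hypothetical sidecar document; every value here is illustrative only.
example = {
    "description": "Correct the unit of one dataset.",
    "author": ["A. Researcher"],
    "contact": ["a.researcher@example.org"],
    "operations": [
        {
            "type": "replace",
            "description": "Fix the unit attribute.",
            "object_id": "595735ab-5cf1-4d8b-9d0a-6b3b0b0c7e11",
            "relative_path": "data/unit",
            "value": "millivolts",
            "dtype": "text",
        }
    ],
    "schema_version": "0.1.0",
}

problems = check_sidecar(example)
```

A full check against the schema would need a JSON Schema validator that supports draft-07 `if`/`then`; this sketch only mirrors the rules that are easy to express by hand.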
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1,2 @@ | ||
| from . import hdf5 | ||
| from .builderupdater import SidecarValidationError |