Pseudo-versioning of metadata changes #1513
tmchartrand
started this conversation in
General
Replies: 1 comment
-
One point to consider: there may be performance implications for queries that need to run quite frequently in the analysis architecture. I haven't thought this through thoroughly, but it might help to bubble the change date up to a higher schema level as well and index the DB on it, for instance.
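To make the indexing idea concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for whatever database actually backs the metadata, and the record layout (`quality_control.evaluations[].date`), the `last_modified` column, and the `s3://example-bucket/...` URIs are all hypothetical names made up for illustration, not the real schema.

```python
import json
import sqlite3

# Hypothetical records: the change date is buried in nested QC metadata.
EXAMPLE_RECORDS = [
    ("s3://example-bucket/asset-a",
     {"quality_control": {"evaluations": [{"date": "2024-01-10"}, {"date": "2024-03-02"}]}}),
    ("s3://example-bucket/asset-b",
     {"quality_control": {"evaluations": [{"date": "2024-05-20"}, {"date": "2024-07-15"}]}}),
]


def build_demo_db(records=EXAMPLE_RECORDS):
    """Store each record with its latest change date copied to a top-level, indexed column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assets ("
        "s3_uri TEXT PRIMARY KEY, last_modified TEXT NOT NULL, metadata TEXT NOT NULL)"
    )
    # The index is what keeps frequent "what changed since X?" queries cheap.
    conn.execute("CREATE INDEX idx_assets_last_modified ON assets (last_modified)")
    for s3_uri, metadata in records:
        # "Bubble up": take the most recent nested date. ISO 8601 date strings
        # in the same format sort correctly as plain text.
        last_modified = max(e["date"] for e in metadata["quality_control"]["evaluations"])
        conn.execute(
            "INSERT INTO assets VALUES (?, ?, ?)",
            (s3_uri, last_modified, json.dumps(metadata)),
        )
    conn.commit()
    return conn


def changed_since(conn, cutoff_date):
    """Return the URIs of assets whose metadata changed after cutoff_date."""
    rows = conn.execute(
        "SELECT s3_uri FROM assets WHERE last_modified > ? ORDER BY s3_uri",
        (cutoff_date,),
    )
    return [row[0] for row in rows]
```

With this layout, `changed_since(build_demo_db(), "2024-06-01")` only has to compare one indexed column instead of parsing the nested metadata of every record.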
-
Writing down some discussions with @dbirman and @dyf:
In general we treat data assets as immutable, and this is reflected in the processing schema, which identifies a DataAsset input by its S3 URI only. However, we actually expect QC/Curation metadata to change, and even core metadata can change for a hotfix or schema update.
This creates a reproducibility challenge for downstream analysis that uses metadata, which we expect to be common for Curation at least (e.g., the average of some cell-wise metric across a curated population of cells within an asset).
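A toy illustration of the problem, not a proposed design: the same S3 URI gives a different analysis result once curation changes, and a content hash of the metadata is one possible way to notice that. The metadata layout (`cells`, `curated`, `snr`) and the URI are hypothetical and not taken from the actual schema.

```python
import hashlib
import json

ASSET_URI = "s3://example-bucket/asset-a"  # hypothetical; identical for both versions

# Hypothetical curation metadata before and after a curator excludes cell "b".
METADATA_V1 = {"cells": [
    {"id": "a", "curated": True, "snr": 2.0},
    {"id": "b", "curated": True, "snr": 4.0},
    {"id": "c", "curated": False, "snr": 9.0},
]}
METADATA_V2 = {"cells": [
    {"id": "a", "curated": True, "snr": 2.0},
    {"id": "b", "curated": False, "snr": 4.0},
    {"id": "c", "curated": False, "snr": 9.0},
]}


def mean_curated_metric(metadata, metric):
    """Average a cell-wise metric over the curated cells only."""
    values = [cell[metric] for cell in metadata["cells"] if cell["curated"]]
    return sum(values) / len(values)


def metadata_fingerprint(metadata):
    """SHA-256 of a canonical JSON dump, so key order does not affect the hash."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def describe_input(s3_uri, metadata):
    """What an analysis could record if the URI alone is not enough to pin its input."""
    return {"s3_uri": s3_uri, "metadata_sha256": metadata_fingerprint(metadata)}
```

Here `mean_curated_metric(METADATA_V1, "snr")` is 3.0 and `mean_curated_metric(METADATA_V2, "snr")` is 2.0, while the URI is unchanged. The two `describe_input` results differ only in `metadata_sha256`, which is the piece of information the URI-only reference loses.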
Open questions: