
FAIR data decisions: Lossy or lossless #27

@hrzepa

Description

One of the issues often confronted by depositors of aspiring FAIR data is how much data loss to tolerate. I give just one example: crystallographic data in chemistry (often described as the gold standard in chemical data). There is the following hierarchy, with increasing data loss:

  1. The raw instrument data
  2. The processed instrument data, including "hkl" information
  3. The processed instrument data, including rich structure information but excluding "hkl" data
  4. The processed minimum dataset, which suffices for perhaps 90% of most users' needs
  5. A graphical representation of the minimum dataset, as a JPEG or PDF, which itself can be lossy

So most consumers of, say, category 4 would find it adequately FAIR for their needs, but some specialist users would find it too lossy and might need to go all the way up to category 1. The trouble is that this type of data might be as much as 10,000 times larger than the minimal set.

Unfortunately, there is no easy way of specifying the degree of data loss in an aspiring FAIR dataset as metadata. This, remember, is for data considered the "gold" standard; one finds similar situations with other types of chemical data.
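One way the degree of loss could be made machine-readable is to declare a "loss tier" in the deposition metadata, together with a link back to a less-lossy parent deposit. The following is only a minimal sketch of that idea; the field names (`lossTier`, `derivedFrom`, etc.) are invented for illustration and do not correspond to any existing metadata standard:

```python
import json
from typing import Optional

# The five tiers from the hierarchy above, most complete to most lossy.
LOSS_TIERS = {
    1: "raw instrument data",
    2: "processed data including hkl information",
    3: "processed data, rich structure, excluding hkl",
    4: "processed minimum dataset",
    5: "graphical representation (JPEG/PDF)",
}

def describe_deposition(tier: int, parent_doi: Optional[str] = None) -> str:
    """Return a JSON metadata fragment declaring the dataset's loss tier."""
    if tier not in LOSS_TIERS:
        raise ValueError(f"unknown loss tier: {tier}")
    record = {
        "lossTier": tier,
        "lossTierLabel": LOSS_TIERS[tier],
        # Optional pointer to a less-lossy parent deposition, if one exists.
        "derivedFrom": parent_doi,
    }
    return json.dumps(record, indent=2)

print(describe_deposition(4, parent_doi="10.0000/example-raw-deposit"))
```

A consumer could then decide from the metadata alone whether a category 4 deposit is FAIR *enough* for their purpose, or whether to follow `derivedFrom` back toward the raw data.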
