Dictionary of metadata
Any proposal for an interchange format involves two categories of content in the data file: (1) a table of numbers representing the spectral data, and (2) a listing of metadata organized in a format that can be read by both computers and humans.
On this page, we list a dictionary of metadata. Each entry in the dictionary consists of three components:
- The name representing the datum
- The meaning of the datum
- The format of representing its value
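As a rough sketch of how one dictionary entry with these three components might be held in a program, the following uses a small Python dataclass. The class name `DictionaryEntry`, its field names, and the wording of the two sample entries are illustrative assumptions, not part of any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """One dictionary entry (hypothetical structure, for illustration only)."""
    name: str          # the name representing the datum, e.g. "Scan.edge"
    meaning: str       # what the datum means
    value_format: str  # how its value is to be written


# Sample entries; the meanings and formats shown are assumed wording.
ENTRIES = {
    entry.name: entry
    for entry in [
        DictionaryEntry("Scan.element", "The absorbing atom", "Element symbol, e.g. Au"),
        DictionaryEntry("Scan.edge", "The absorption edge measured", "Edge label, e.g. L3"),
    ]
}

print(ENTRIES["Scan.edge"].value_format)
```

Keying the entries by name makes it easy to look up the meaning and format of a metadatum found in a data file.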
We recommend that metadata be gathered into related categories, here called namespaces, so that related items of metadata are also related syntactically. As an example, one namespace might be called `Scan` and denote metadata related to the choice of absorbing atom and the choice of parameters in the data acquisition program used to perform that scan.
Using the syntax of the XDI suggestion, the name of the metadatum consists of the word `Scan`, followed by a dot, followed by another word. The dot tells the reader that the second word is related to the `Scan` namespace.
    Scan.element = Au
    Scan.edge = L3
    Scan.edge_energy = 11919 eV
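A minimal sketch of how a reader might split such lines into namespace, word, and value is given below. It assumes one metadatum per line, an `=` sign between name and value, and a single dot separating the namespace from the rest of the name. The function names `parse_metadata_line` and `group_by_namespace` are hypothetical, and values are kept as plain strings without interpreting units.

```python
def parse_metadata_line(line):
    """Split 'Namespace.word = value' into (namespace, word, value)."""
    name, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"no '=' found in metadata line: {line!r}")
    # Split on the first dot only; any further dots stay in the second part.
    namespace, dot, word = name.strip().partition(".")
    if not dot or not namespace or not word:
        raise ValueError(f"name is not of the form Namespace.word: {name.strip()!r}")
    return namespace, word, value.strip()


def group_by_namespace(lines):
    """Collect parsed metadata into a dict of dicts keyed by namespace."""
    grouped = {}
    for line in lines:
        namespace, word, value = parse_metadata_line(line)
        grouped.setdefault(namespace, {})[word] = value
    return grouped


lines = [
    "Scan.element = Au",
    "Scan.edge = L3",
    "Scan.edge_energy = 11919 eV",
]
metadata = group_by_namespace(lines)
print(metadata["Scan"]["edge_energy"])
```

Grouping by namespace mirrors the idea that metadata related in meaning are also related syntactically.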
Decisions have to be made about the allowed character set of names, whether internationalization will be supported, and how deeply names can be nested (i.e. whether more than one dot is allowed).
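To make these choices concrete, here is a sketch of a name check under one possible set of answers. The rules are purely hypothetical: each word is ASCII only, starts with a letter, and may contain letters, digits, and underscores, and the nesting depth is a parameter. The names `WORD` and `is_valid_name` are invented for illustration.

```python
import re

# Hypothetical rule: an ASCII letter followed by ASCII letters, digits, or underscores.
WORD = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name, max_dots=1):
    """Return True if name is 'Namespace.word' with at most max_dots dots."""
    words = name.split(".")
    dots = len(words) - 1
    if dots < 1 or dots > max_dots:
        return False
    # An empty word (e.g. from 'Scan..edge') fails the pattern match.
    return all(WORD.fullmatch(word) is not None for word in words)


print(is_valid_name("Scan.edge_energy"))
print(is_valid_name("Scan.edge.energy"))
print(is_valid_name("Scan.edge.energy", max_dots=2))
```

A different decision on internationalization or nesting would change only the pattern or the `max_dots` default, not the overall shape of the check.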
One of the charges of the Data Format Working Group is to identify a set of metadata to be encoded in the specification of a data interchange format and to assign names to each meaningful concept. This effort must take a broad view, capturing metadata concepts as broadly as they are used in the community. It must also be open-ended: there must be a mechanism for adding new forms of metadata that were not considered up front.
Again, decisions must be made about character sets and internationalization. Other decisions include:
- Identification of standard units and whether units must be specified in a compliant file.
- Representations of numerical values and special data types like timestamps.
- Standards for identifying facilities and beamlines.
- Representations of deeply nested data.
Namespaces:
- Beamline
- Scan
- Mono
- Facility
- Detector
- Sample
- Column