Dictionary of metadata
Any proposal for an interchange format involves two categories of content in the data file -- (1) a table of numbers representing the spectral data, and (2) a listing of metadata organized in a format that can be read by both computers and humans.
On this page, we list a dictionary of metadata. Each entry in the dictionary consists of three components:
- The name representing the datum
- The meaning of the datum
- The format of representing its value
We recommend that metadata be gathered into related categories, here called namespaces. Thus related bits of metadata will also be related syntactically. As an example, one namespace might be called Scan and denote metadata related to choice of absorbing atom and the choice of parameters in the data acquisition program related to effecting that scan.
Using the syntax of the XDI suggestion, the name of the metadatum consists of the word Scan, followed by a dot, followed by another word. The dot tells the reader that the second word is related to the Scan namespace.
Scan.element = Au
Scan.edge = L3
Scan.edge_energy = 11919 eV
Decisions have to be made about the allowed character set of the name, whether efforts at internationalization will be supported, and how deeply names can be nested (i.e. whether more than one dot is allowed).
One of the charges of the Data Format Working Group is to identify a set of metadata to be encoded in the specification of a data interchange format and to assign names to each meaningful concept. This effort must take a broad view, capturing metadata concepts as broadly as they are used in the community. This effort must also be open ended in that there must be a mechanism for providing new forms of metadata not considered up front.
Again, decisions must be made about character sets and internationalization. Among other decisions:
- Identification of standard units and whether units must be specified in a compliant file.
- Representations of numerical values and special data types like timestamps.
- Standards for identifying facilities and beamlines
- Representations of deeply nested data
The purpose of namespaces is to provide sensible, widely understood, semantic groupings of defined metadata tags. All tags associated with conveying information about sample preparation and the measurement environment of the sample belong in the Sample namespace. Similarly, all tags associated with the configuration of the beamline optics belong in the Beamline namespace.
Here is a list of all such semantic groupings:
- Beamline
- Scan
- Mono
- Facility
- Detector
- Sample
- Column
A tag is specified by inclusion in a namespace and identified by a name. An example is Mono.d_spacing which gives the d-spacing of the monochromator under operating conditions. In a tag, the namespace and tag name are separated by a dot (.). The tag and its value are separated by a colon and a single space. Here is an example:
# Mono.d_spacing: 3.13525
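The tag syntax described above can be illustrated with a small parser. This is only a sketch: the function name `parse_tag` and the regular expression are assumptions made for illustration, and the character classes in particular are a guess, since the allowed character set for names is still an open decision.

```python
import re

# Assumed line shape: "# Namespace.name: value".
# The character classes are an assumption, not part of any agreed syntax.
TAG_LINE = re.compile(r"^#\s*([A-Za-z]\w*)\.(\w+):\s(.*)$")


def parse_tag(line):
    """Return (namespace, name, value) for a metadata line, or None."""
    match = TAG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    namespace, name, value = match.groups()
    return namespace, name, value.strip()


print(parse_tag("# Mono.d_spacing: 3.13525"))
```

The value is returned as a string. Converting it to a number, or splitting off units, is left open here because those representations have not yet been decided.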
We identify three items that are essential to the interchange and successful interpretation of XAS data. These are required of an XDI file.
- The d-spacing of the monochromator. A correction to the energy axis of measured data is required in the case of a miscalibration due to inaccuracies in the translation from angular position of the monochromator to energy. See Mono.d_spacing.
- The element of the absorbing atom. The periodic table is replete with examples of atoms that have absorption edges with very similar edge energies. For example, the tabulated values of the Cr K edge and the Ba L1 edge are both 5989 eV. Without the species of the absorbing atom and the absorption edge measured, some data cannot be unambiguously identified. See Scan.element.
- The absorption edge measured. The reasoning given for the absorbing element applies here as well. See Scan.edge.
All other metadata definitions that follow are optional, but recommended.
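A check for the three required items might look like the following sketch. It assumes the metadata have already been read into a dictionary keyed by the full tag name. That dictionary layout and the name `missing_required` are hypothetical and chosen only for illustration.

```python
# The three tags identified above as required of an XDI file.
REQUIRED_TAGS = ("Mono.d_spacing", "Scan.element", "Scan.edge")


def missing_required(metadata):
    """Return the required tags that are absent or empty in a {tag: value} dict."""
    return [tag for tag in REQUIRED_TAGS if not metadata.get(tag, "").strip()]


metadata = {"Scan.element": "Cu", "Scan.edge": "K"}
print(missing_required(metadata))  # lists Mono.d_spacing as missing
```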
- Beamline.name: The name by which the beamline is known
- Beamline.collimation: A concise statement of how beam collimation is provided
- Beamline.focusing: A concise statement about how beam focusing is provided
- Beamline.harmonic_rejection: A concise statement about how harmonic rejection is accomplished
# Beamline.name: NSLS X11-A
# Beamline.collimation: none
# Beamline.focusing: No
# Beamline.harmonic_rejection: detuned
- Scan.edge: The measured absorption edge. This is a required parameter.
- Scan.element: The species of the measured element. This is a required parameter.
- Scan.edge_energy: The value of the edge energy used in the data acquisition software.
# Scan.edge: K
# Scan.element: Cu
# Scan.edge_energy: 8980.0
- Mono.name: A string identifying the material and diffracting plane or grating spacing of the monochromator
- Mono.d_spacing: The known d-spacing of the monochromator under operating conditions. This is a required parameter.
# Mono.name: Si 111
# Mono.d_spacing: 3.13525
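To show why the d-spacing matters for correcting the energy axis, here is a sketch that uses Bragg's law, E = hc / (2 d sin θ), to recover the monochromator angle from a recorded energy and then recompute the energy with a different d-spacing. It assumes a first-order reflection, a d-spacing in Ångstroms, and energies in eV. The function names are hypothetical, and other sources of miscalibration, such as an angular offset, are not handled.

```python
import math

HC_EV_ANGSTROM = 12398.419843  # h*c in eV * Angstrom


def energy_to_angle(energy_ev, d_spacing):
    """Bragg angle in radians for a first-order reflection."""
    return math.asin(HC_EV_ANGSTROM / (2.0 * d_spacing * energy_ev))


def angle_to_energy(theta, d_spacing):
    """Energy in eV diffracted at Bragg angle theta (radians)."""
    return HC_EV_ANGSTROM / (2.0 * d_spacing * math.sin(theta))


def recalibrate_energy(energy_ev, d_recorded, d_corrected):
    """Recompute an energy that was derived using d_recorded with d_corrected."""
    theta = energy_to_angle(energy_ev, d_recorded)
    return angle_to_energy(theta, d_corrected)


# A slightly larger d-spacing shifts the recomputed energy downward.
print(recalibrate_energy(8980.0, 3.13525, 3.13550))
```

Without Mono.d_spacing in the file, the angle cannot be recovered from the energy column, so this kind of correction cannot be applied after the fact.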
- How do we describe a polychromator?
- Should the Mono namespace be rolled into the Beamline namespace? That is, should Mono be a separate namespace?
- Facility.name: The name of synchrotron or other X-ray facility.
- Facility.xray_source: A string identifying the source of X-ray generation.
# Facility.name: NSLS
# Facility.xray_source: bend magnet
- Detector.i0: A description of how the incident flux was measured
- Detector.it: A description of how the transmitted flux was measured
- Detector.if: A description of how the fluorescent flux was measured
- Detector.ir: A description of how the reference flux was measured
# Detector.i0: 10cm N2
# Detector.it: 10cm N2
- What goes into the description? Fill gases? Applied voltage? Gain?
- How are energy discriminating detectors described?
- How are wavelength dispersive detectors described?
- How is the detection for energy dispersive XAS described?
- Sample.name: A string identifying the measured sample
- Sample.formula: The stoichiometric formula of the measured sample
- Sample.prep: A string summarizing the method of sample preparation
- Sample.temperature: The temperature at which the sample was measured
- Sample.pressure: The pressure at which the sample was measured
# Sample.name: Hematite
# Sample.formula: Fe2O3
# Sample.prep: Powder spread on polyimide tape
# Sample.temperature: room temperature
- Formula stoichiometry will require a specification of syntax
- Do we want to define further tags for things like electrochemical potential or other in situ parameters?
All tag names in the Column namespace are integers and refer to columns in the data portion of the file. The columns are numbered from left to right starting at 1. The first column must be the abscissa, either energy or wavenumber.
Column.1: energy
Subsequent columns are numbered sequentially and contain a description of the contents of the columns. For example
Column.1: energy
Column.2: mu
Column.3: i0
Entries in the Column namespace are optional, but it is recommended to include a Column entry for every column of data.
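As an illustration of how a reader might use the Column entries, the sketch below builds a map from column number to label and looks up the position of a labelled column. It assumes the metadata are held in a dictionary keyed by full tag name. The function names are hypothetical.

```python
def column_labels(metadata):
    """Map 1-based column numbers to labels taken from Column.N entries."""
    labels = {}
    for tag, value in metadata.items():
        namespace, _, name = tag.partition(".")
        if namespace == "Column" and name.isdigit():
            labels[int(name)] = value
    return labels


def column_index(metadata, label):
    """Return the 0-based index of the data column carrying this label."""
    for number, value in sorted(column_labels(metadata).items()):
        if value == label:
            return number - 1  # Column tags count from 1
    raise KeyError(label)


metadata = {"Column.1": "energy", "Column.2": "mu", "Column.3": "i0"}
print(column_index(metadata, "mu"))
```

Because Column tags are numbered from 1, the lookup subtracts one to give an index suitable for a zero-based array of parsed data.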
- Describe extensions within existing namespaces
- Describe extension field syntax (i.e. identical to defined field syntax)
- Explain relationship between application metadata and extension fields