Dictionary of metadata

bruceravel edited this page Sep 9, 2012 · 29 revisions

This page is deprecated!

Please see https://github.com/XraySpectroscopy/XAS-Data-Interchange/wiki/Dictionary-of-metadata


Any proposal for an interchange format requires a precise definition of the meaning of each tag appearing in the data file.

On this page, we list a dictionary of metadata. Each entry in the dictionary consists of three components:

  1. The name representing the datum
  2. The meaning of the datum
  3. The format of representing its value

Note that elements of the tag name are separated by period characters. This is not a requirement.
Different concrete syntaxes may alter this separator or even arrange the elements into a hierarchy.
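
Because the separator is not fixed, software that reads tag names may want to accept more than one form. The following is a minimal sketch in Python, assuming only the period and slash separators that appear on this page. The function name `split_tag` is hypothetical and not part of any specification.

```python
def split_tag(name, separators="./"):
    """Split a tag name such as 'Mono.d_spacing' into (namespace, tag).

    Only the first separator found is used, so any later separator
    characters stay in the tag part.
    """
    positions = [name.find(s) for s in separators if s in name]
    if not positions:
        raise ValueError(f"no namespace separator in {name!r}")
    cut = min(positions)
    namespace, tag = name[:cut], name[cut + 1:]
    if not namespace or not tag:
        raise ValueError(f"empty namespace or tag in {name!r}")
    return namespace, tag


print(split_tag("Mono.d_spacing"))      # ('Mono', 'd_spacing')
print(split_tag("Sample/temperature"))  # ('Sample', 'temperature')
```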

Words used to signify the requirements in the specification shall follow the practice of RFC 2119.

Overview

The meaning of metadata

One of the charges of the Data Format Working Group is to identify a set of metadata to be encoded in the specification of a data interchange format and to assign names to each meaningful concept. This effort must take a broad view, capturing metadata concepts as broadly as they are used in the community. This effort must also be open ended in that there must be a mechanism for providing new forms of metadata not considered up front.

The format of the value

Decisions must be made about character sets and internationalization. Among other decisions:

  1. Identification of standard units and whether units must be specified in a compliant file.
  2. Representations of numerical values and special data types like timestamps.
  3. Standards for identifying facilities and beamlines
  4. Representations of deeply nested data

The dictionary

Name spaces

The purpose of namespaces is to provide sensible, widely understood, semantic groupings of defined metadata tags. All tags associated with conveying information about sample preparation and the measurement environment of the sample belong in the Sample namespace. Similarly, all tags associated with the configuration of the beamline optics belong in the Beamline namespace.

Here is a list of all such semantic groupings:

  1. Beamline: Tags related to the structure of the beamline and its optics
  2. Element: Tags related to the absorbing atom
  3. Scan: Tags related to the parameters of the scan
  4. Mono: Tags related to the monochromator
  5. Facility: Tags related to the synchrotron or other facility at which the measurement was made
  6. Detector: Tags related to the details of the photon detection system
  7. Sample: Tags related to the details of sample preparation and measurement
  8. Data: Tags used for representing the scan data

Required metadata

We identify three items that are essential to the interchange and successful interpretation of XAS data. These are required in all files.

  • The d-spacing of the monochromator. The energy axis of measured data must be corrected when it is miscalibrated because of inaccuracies in the translation from the angular position of the monochromator to energy, and that correction cannot be made without the d-spacing. See Mono/d_spacing.

  • The element of the absorbing atom. The periodic table is replete with examples of atoms that have absorption edges with very similar edge energies. For example, the tabulated values of the Cr K edge and the Ba L1 edge are both 5989 eV. Without identification of the species of the absorbing atom and of the absorption edge measured, some data cannot be unambiguously identified. See Element/symbol.

  • The absorption edge measured. See above. See Element/edge.
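
To illustrate why the d-spacing matters, here is a sketch of one possible energy-axis correction. It assumes the monochromator obeys Bragg's law in first order, E = hc / (2 d sin θ), and that the only error is an incorrect d-spacing used when converting angle to energy. The function names and the numerical d-spacing values are illustrative assumptions, not values taken from this dictionary.

```python
import math

HC_EV_ANGSTROM = 12398.41984  # h*c in eV·Å


def bragg_angle(energy_ev, d_spacing):
    """Bragg angle in radians for a first-order reflection."""
    return math.asin(HC_EV_ANGSTROM / (2.0 * d_spacing * energy_ev))


def bragg_energy(theta, d_spacing):
    """Energy in eV diffracted at angle theta (radians), first order."""
    return HC_EV_ANGSTROM / (2.0 * d_spacing * math.sin(theta))


def recalibrate(energy_ev, d_assumed, d_actual):
    """Recompute an energy that was derived using the wrong d-spacing.

    The recorded energy is converted back to the monochromator angle with
    the d-spacing assumed at acquisition time, then forward to energy with
    the corrected d-spacing.
    """
    theta = bragg_angle(energy_ev, d_assumed)
    return bragg_energy(theta, d_actual)


# Hypothetical d-spacings in Å, chosen only to show the size of the shift.
corrected = recalibrate(5989.0, d_assumed=3.1356, d_actual=3.1355)
print(f"{corrected:.2f} eV")
```

Other sources of miscalibration, such as an angular offset of the monochromator, would need a different correction, but they too depend on knowing the d-spacing.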

All other metadata definitions that follow are optional, but recommended.
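
A reader or validator could check for the three required items along these lines. This is only a sketch: it assumes the metadata has already been parsed into a dict keyed by (namespace, tag) pairs, and the names `REQUIRED_TAGS` and `missing_required` are hypothetical.

```python
# The three required items named above, as (namespace, tag) pairs.
REQUIRED_TAGS = (
    ("Mono", "d_spacing"),
    ("Element", "symbol"),
    ("Element", "edge"),
)


def missing_required(metadata):
    """Return the required (namespace, tag) pairs absent from metadata.

    `metadata` is assumed to be a dict keyed by (namespace, tag) tuples.
    """
    return [key for key in REQUIRED_TAGS if key not in metadata]


header = {
    ("Element", "symbol"): "Cr",
    ("Element", "edge"): "K",
    ("Beamline", "name"): "hypothetical beamline",
}
print(missing_required(header))  # [('Mono', 'd_spacing')]
```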


Defined items in the Beamline namespace

  • Namespace: Beamline -- Tag: name

    • Description: The name by which the beamline is known
    • Units: none
    • Format: free form string
  • Namespace: Beamline -- Tag: collimation

    • Description: A concise statement of how beam collimation is provided
    • Units: none
    • Format: free form string
  • Namespace: Beamline -- Tag: focusing

    • Description: A concise statement about how beam focusing is provided
    • Units: none
    • Format: free form string
  • Namespace: Beamline -- Tag: harmonic_rejection

    • Description: A concise statement about how harmonic rejection is accomplished
    • Units: none
    • Format: free form string

Click here to see issues related to the Beamline namespace

Examples of tags in the Beamline namespace


Defined items in the Element namespace

  • Namespace: Element -- Tag: symbol

    • Description: The element of the absorbing atom. This is a required parameter.

    • Units: none

    • Format: one of these 112 1 or 2 character strings for the standard atomic symbols (not case sensitive):

        H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn 
        Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag 
        Cd In Sn Sb Te I Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm 
        Yb Lu Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa 
        U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs Mt Ds Rg Cn
      

      See Wikipedia's list of element symbols.

  • Namespace: Element -- Tag: edge

    • Description: The measured absorption edge. This is a required parameter.

    • Units: none

    • Format: one of these 27 1 or 2 character strings (not case sensitive):

        K L L1 L2 L3  M M1 M2 M3 M4 M5 N N1 N2 N3 N4 N5 N6 N7 O O1 O2 O3 O4 O5 O6 O7
      

    See table 10.10 at IUPAC notation for X-ray absorption edges for further explanation. The use of the generic edges L, M, N, and O is discouraged, but may be used for spectra spanning multiple edges.

  • Namespace: Element -- Tag: edge_energy

    • Description: The energy of the measured absorption edge.
    • Units: eV (perhaps keV or inverse Å)
    • Format: float
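
Since both Element/symbol and Element/edge are matched without regard to case, a reader might normalize them before comparison. The sketch below uses the two lists given above. The choice of canonical spelling (capitalized symbols, upper-case edges) and the function names are assumptions made for illustration.

```python
ELEMENT_SYMBOLS = frozenset("""
    H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn
    Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag
    Cd In Sn Sb Te I Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm
    Yb Lu Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa
    U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs Mt Ds Rg Cn
""".split())

EDGES = frozenset("""
    K L L1 L2 L3 M M1 M2 M3 M4 M5 N N1 N2 N3 N4 N5 N6 N7
    O O1 O2 O3 O4 O5 O6 O7
""".split())


def normalize_symbol(text):
    """Return the symbol in canonical case, or raise ValueError."""
    symbol = text.strip().capitalize()
    if symbol not in ELEMENT_SYMBOLS:
        raise ValueError(f"unknown element symbol: {text!r}")
    return symbol


def normalize_edge(text):
    """Return the edge label in upper case, or raise ValueError."""
    edge = text.strip().upper()
    if edge not in EDGES:
        raise ValueError(f"unknown absorption edge: {text!r}")
    return edge


print(normalize_symbol("cr"), normalize_edge("k"))   # Cr K
print(normalize_symbol("BA"), normalize_edge("l1"))  # Ba L1
```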

Click here to see issues related to the Element namespace

Examples of tags in the Element namespace


Defined items in the Scan namespace

Click here to see issues related to the Scan namespace

Examples of tags in the Scan namespace


Defined items in the Mono namespace

  • Namespace: Mono -- Tag: name

    • Description: A string identifying the material and diffracting plane or grating spacing of the monochromator
    • Units: none
    • Format: string
  • Namespace: Mono -- Tag: d_spacing

    • Description: The known d-spacing of the monochromator under operating conditions. This is a required parameter.
    • Units: Å
    • Format: float

Click here to see issues related to the Mono namespace

Examples of tags in the Mono namespace


Defined items in the Facility namespace

  • Namespace: Facility -- Tag: name

    • Description: The name of synchrotron or other X-ray facility.
    • Units: none
    • Format: string
  • Namespace: Facility -- Tag: source

    • Description: A string identifying the source of X-ray generation, such as "bend magnet", "undulator", or "rotating copper anode".
    • Units: none
    • Format: string

Examples of tags in the Facility namespace


Defined items in the Detector namespace

  • Namespace: Detector -- Tag: i0

    • Description: A description of how the incident flux was measured
    • Units: none
    • Format: string
  • Namespace: Detector -- Tag: it

    • Description: A description of how the transmitted flux was measured
    • Units: none
    • Format: string
  • Namespace: Detector -- Tag: if

    • Description: A description of how the fluorescence flux was measured
    • Units: none
    • Format: string
  • Namespace: Detector -- Tag: ir

    • Description: A description of how the reference flux was measured
    • Units: none
    • Format: string

(The formatting for this namespace requires attention. This was one of the areas for which James advocated the use of tables.)

Click here to see issues related to the Detector namespace

Examples of tags in the Detector namespace


Defined items in the Sample namespace

  • Namespace: Sample -- Tag: name

    • Description: A string identifying the measured sample
    • Units: none
    • Format: string
  • Namespace: Sample -- Tag: stoichiometry

  • Namespace: Sample -- Tag: prep

    • Description: A string summarizing the method of sample preparation
    • Units: none
    • Format: string
  • Namespace: Sample -- Tag: temperature

    • Description: The temperature at which the sample was measured
    • Units: K (or perhaps degrees C)
    • Format: float

The Sample namespace is rather open-ended. It is probably impossible to anticipate all the kinds of sample-related metadata that may be useful to attach to data. That said, it would be useful to suggest tags for a number of common kinds of extrinsic parameters.

Here are some other possible tags denoting extrinsic parameters of the experiment along the line of Sample/temperature.

  • Sample/pressure
  • Sample/ph
  • Sample/eh
  • Sample/volume
  • Sample/porosity
  • Sample/density
  • Sample/resistivity
  • Sample/viscosity
  • Sample/magnetic_field
  • Sample/magnetic_moment
  • Sample/crystal_structure
  • Sample/opacity
  • Sample/electrochemical_potential

Click here to see issues related to the Sample namespace

Examples of tags in the Sample namespace


Defined items in the Data namespace

Items in the Data namespace describe single columns of the data table. These items must be tabulated together, and at least one of the columns must be the energy.

  • Namespace: Data -- Tag: energy

    • Description: The energy at which the data were collected
    • Units: eV (see above)
    • Format: float
  • Namespace: Data -- Tag: mu

    • Description: The observed μ(E)
    • Units: none
    • Format: float
  • Namespace: Data -- Tag: i0

    • Description: Counts in the monitor detector
    • Units: none
    • Format: float
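
The two constraints stated for this namespace (columns tabulated together, with an energy column present) can be checked when the table is read. The following is a sketch under assumptions: the column labels are taken to be the bare tag names, each row arrives as a list of strings, and the function name and sample numbers are invented for illustration.

```python
def read_columns(labels, rows):
    """Turn tabulated rows into a dict of float columns keyed by label.

    Raises ValueError if there is no energy column or if a row does not
    supply exactly one value per label.
    """
    if "energy" not in labels:
        raise ValueError("the data table has no energy column")
    columns = {label: [] for label in labels}
    for number, row in enumerate(rows, start=1):
        if len(row) != len(labels):
            raise ValueError(
                f"row {number} has {len(row)} values, expected {len(labels)}"
            )
        for label, value in zip(labels, row):
            columns[label].append(float(value))
    return columns


# Made-up numbers, only to show the shape of the input.
table = read_columns(
    ["energy", "i0", "mu"],
    [["5980.0", "10000", "0.51"], ["5990.0", "10050", "1.32"]],
)
print(table["energy"])  # [5980.0, 5990.0]
```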

Click here to see issues related to the Data namespace


Extension fields

Metadata tags carry syntax and may carry semantics. That is, it is possible to have syntactically correct tags that have no definition. Such tags could carry information considered useful by the user or the author of software that, at some point, touches the data.

Such a tag could be an extension within an existing namespace. This has already been discussed in the context of the Sample namespace.

Such a tag could be part of a new namespace. One application of a new namespace would be to tie a group of metadata tags to a particular application. For example, the data processing program Athena might attach tags associated with the parameters for normalizing the data. That might look something like this:

 # Athena.pre1: -150
 # Athena.pre2: -30
 # Athena.nor1: 150
 # Athena.nor2: 800

These define the boundaries of the pre- and post-edge lines used to define the edge step in the μ(E) spectrum.

The use of such extension tags by authors of controls, data acquisition, data analysis, and data archiving software is encouraged.

If an extension tag is not understood due to its lack of defined semantics, the default behavior for software touching the data should be to silently preserve the metadata.
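
One way to get this preserving behavior is to store every header line that parses, whether or not its namespace is recognized, and write them all back out. The sketch below assumes the `# Namespace.tag: value` line syntax used in the Athena example above, which is only one possible concrete syntax. The function names are hypothetical.

```python
import re

# Matches lines such as "# Athena.pre1: -150".
HEADER_LINE = re.compile(r"^#\s*(\w+)\.(\w+):\s*(.*)$")


def read_header(lines):
    """Collect every 'Namespace.tag: value' header line, understood or not."""
    metadata = {}
    for line in lines:
        match = HEADER_LINE.match(line.strip())
        if match:
            namespace, tag, value = match.groups()
            metadata[(namespace, tag)] = value.strip()
    return metadata


def write_header(metadata):
    """Write the metadata back out, one header line per tag, in stored order."""
    return [f"# {ns}.{tag}: {value}" for (ns, tag), value in metadata.items()]


original = [
    "# Element.symbol: Cr",
    "# Athena.pre1: -150",
    "# Athena.pre2: -30",
]
# Nothing here interprets the Athena tags, yet they survive the round trip.
print(write_header(read_header(original)) == original)  # True
```

Values are kept as strings so that a tag with unknown semantics is never reinterpreted on the way through.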
