
Uniform (implementation-neutral) syntax to access arbitrary content within a mapping set #466

@gouttegd


I am thinking about building a compliance test suite that SSSOM implementations could use to check their coverage of the standard (which slots and features they support or not).

One part of such a test suite would consist of tests in which:

  1. we provide a SSSOM input file;
  2. we use the tested implementation to retrieve one particular value from the file (e.g. one particular slot from either the set metadata or one of the mapping records);
  3. we compare the retrieved value with what we expect.
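The three steps above could be captured in a very small harness. This is only a sketch: `ComplianceCase`, `Evaluator` and `run_case` are hypothetical names, and the `evaluate` callable stands for whatever adapter each implementation would provide to resolve an expression against a file.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ComplianceCase:
    """One test of the kind described above (hypothetical structure)."""
    input_file: str           # step 1: the SSSOM input file
    expression: str           # step 2: what to retrieve from that file
    expected: Optional[str]   # step 3: the value we expect (None = no value)


# Assumption: each implementation exposes a callable taking
# (input_file, expression) and returning the retrieved value as a string,
# or None if there is no such value.
Evaluator = Callable[[str, str], Optional[str]]


def run_case(case: ComplianceCase, evaluate: Evaluator) -> bool:
    """Return True if the implementation retrieved the expected value."""
    actual = evaluate(case.input_file, case.expression)
    return actual == case.expected
```

The harness does not care which syntax the expression uses, which is why a syntax shared by all implementations is needed.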

In order to do that, it would be useful if we had a common syntax, that would be understood by all implementations (it would be part of the spec), to unambiguously refer to an arbitrary piece of content within a SSSOM mapping set.

This syntax should allow specifying three elements:

  1. the object from which we want to retrieve a value (either the mapping set, or one of the mapping records; if the latter, then it should also allow specifying which mapping record, e.g. the first record in the set, the 7th record in the set, etc.);
  2. the type of content we want to retrieve (either a standard slot, a non-standard slot aka an extension slot, or a “special” value);
  3. the name of the exact content we want to retrieve:
    1. if we want to retrieve a standard slot, then it should be the name of the slot; in the case of multi-valued slots, it should also indicate which item in the list (the first item, the second item, etc.);
    2. if we want to retrieve an extension slot, then it should be the property associated with the extension slot in the extension definition;
    3. if we want to retrieve a “special” value, then it should be a keyword representing that special value (for now I can think of two: sexpr to retrieve the canonical S-expression representing a mapping record, or hash to retrieve the hash of a mapping record).

Examples

Using some kind of “path-like” syntax:

  • set/slot/mapping_set_id: retrieve the value of mapping_set_id slot from the mapping set;
  • set/slot/creator_id/1: retrieve the first value in the creator_id slot from the mapping set;
  • mapping/5/slot/subject_id: retrieve the subject_id slot from the 5th mapping record;
  • mapping/last/extension/https://example.org/fooProperty: retrieve the extension slot associated with the property https://example.org/fooProperty in the last mapping record;
  • mapping/1/special/hash: retrieve the SSSOM standard hash of the first mapping record
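To get a feel for how such path-like expressions could be parsed, here is a minimal sketch. It assumes 1-based numbering (as in the examples above) and that the index of a multi-valued slot is always the last component; the `Selector` class and the `parse_path` function are hypothetical names, not something any implementation currently provides.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Selector:
    target: str                    # "set" or "mapping"
    record: Union[int, str, None]  # 1-based record number, "last", or None for the set
    kind: str                      # "slot", "extension" or "special"
    name: str                      # slot name, extension property, or special keyword
    item: Optional[int] = None     # 1-based item number in a multi-valued slot


def parse_path(expr: str) -> Selector:
    """Parse a path-like expression such as 'mapping/5/slot/subject_id'."""
    target, _, rest = expr.partition("/")
    record: Union[int, str, None] = None
    if target == "mapping":
        rec, _, rest = rest.partition("/")
        record = "last" if rec == "last" else int(rec)
        if record != "last" and record < 1:
            raise ValueError(f"record numbers start at 1: {expr!r}")
    elif target != "set":
        raise ValueError(f"unknown object {target!r} in {expr!r}")

    # Only split off the content type: an extension property is an IRI
    # and will itself contain slashes.
    kind, _, name = rest.partition("/")
    if kind not in ("slot", "extension", "special") or not name:
        raise ValueError(f"malformed expression: {expr!r}")

    item = None
    if kind == "slot":
        # Optional trailing item number, e.g. 'creator_id/1'.
        name, sep, index = name.partition("/")
        if sep:
            item = int(index)
    return Selector(target, record, kind, name, item)
```

With this sketch, `parse_path("set/slot/creator_id/1")` gives a selector for the first item of creator_id in the set, and the property in `mapping/last/extension/https://example.org/fooProperty` is kept whole because splitting stops after the content type.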

Alternatively, using some pseudo JsonPath-like¹ syntax:

  • set.slot.mapping_set_id
  • set.slot.creator_id(1)
  • mapping(5).slot.subject_id
  • mapping(last).extension(https://example.org/fooProperty)
  • mapping(1).special.hash
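For comparison, the dotted variant could be recognised with a single regular expression. Again this is only a sketch under assumptions: `Selector` and `parse_dotted` are hypothetical names, slot names and special keywords are assumed to consist of word characters only, and the extension property is assumed not to contain parentheses or whitespace.

```python
import re
from typing import NamedTuple, Optional, Union


class Selector(NamedTuple):
    target: str                    # "set" or "mapping"
    record: Union[int, str, None]  # 1-based record number, "last", or None for the set
    kind: str                      # "slot", "extension" or "special"
    name: str                      # slot name, extension property, or special keyword
    item: Optional[int]            # 1-based item number in a multi-valued slot


_EXPR = re.compile(
    r"(?:set|mapping\((?P<record>\d+|last)\))"          # object
    r"\.(?:slot\.(?P<slot>\w+)(?:\((?P<item>\d+)\))?"   # standard slot, optional item
    r"|extension\((?P<property>[^()\s]+)\)"             # extension slot, by property
    r"|special\.(?P<special>\w+))"                      # special value
)


def parse_dotted(expr: str) -> Selector:
    """Parse a dotted expression such as 'mapping(5).slot.subject_id'."""
    m = _EXPR.fullmatch(expr)
    if m is None:
        raise ValueError(f"malformed expression: {expr!r}")

    rec = m.group("record")
    # The record group can only match as part of 'mapping(...)'.
    target = "set" if rec is None else "mapping"
    record: Union[int, str, None] = rec
    if rec is not None and rec != "last":
        record = int(rec)
        if record < 1:
            raise ValueError(f"record numbers start at 1: {expr!r}")

    if m.group("slot") is not None:
        item = m.group("item")
        return Selector(target, record, "slot", m.group("slot"),
                        int(item) if item is not None else None)
    if m.group("property") is not None:
        return Selector(target, record, "extension", m.group("property"), None)
    return Selector(target, record, "special", m.group("special"), None)
```

One thing this makes visible: the property has to be enclosed in parentheses because it contains dots itself, so a property that contains parentheses would need some escaping rule that this sketch does not attempt to define.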

Thoughts?


¹ Why not use JsonPath itself instead of devising a new syntax? Well, first, we cannot assume that all implementations use a JSON representation of the data under the hood: SSSOM-Java doesn’t, for example, and AFAIK SSSOM-Py only does it for the set metadata, not the mappings metadata. Second, this would not allow access to non-standard slots or special values.
