Initial draft of the SSSOM/RDF spec. #469
Original file line number | Diff line number | Diff line change | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,366 @@ | ||||||||||||||||||||
# The SSSOM/RDF serialisation format | ||||||||||||||||||||
|
||||||||||||||||||||
This section defines how to represent a SSSOM mapping set as a [RDF | ||||||||||||||||||||
model](https://www.w3.org/TR/rdf11-concepts/). | ||||||||||||||||||||
|
||||||||||||||||||||
|
||||||||||||||||||||
## RDF formats | ||||||||||||||||||||
The RDF model that represents a SSSOM mapping set is independent of the | ||||||||||||||||||||
concrete format that may be used to serialise the model. | ||||||||||||||||||||
|
||||||||||||||||||||
It is RECOMMENDED that implementations support reading and writing a | ||||||||||||||||||||
SSSOM set from and to the [RDF Turtle](https://www.w3.org/TR/turtle/) | ||||||||||||||||||||
format at least. They MAY support any other RDF concrete format (e.g. | ||||||||||||||||||||
RDF/XML, TriG, N-Triples, etc.). | ||||||||||||||||||||
|
||||||||||||||||||||
This specification does not mandate how a concrete RDF syntax is to be | ||||||||||||||||||||
used. For example, if the RDF syntax allows named resources and | ||||||||||||||||||||
predicates to be serialised as either IRIs or CURIEs, it is left to the | ||||||||||||||||||||
discretion of the implementations (or their users) to decide which form | ||||||||||||||||||||
to use. | ||||||||||||||||||||
|
||||||||||||||||||||
<a id="sssom-slots"></a> | ||||||||||||||||||||
## Representation of slots | ||||||||||||||||||||
A metadata slot on any given SSSOM object (such as a `Mapping` or a | ||||||||||||||||||||
`MappingSet`) is represented as a RDF triple where: | ||||||||||||||||||||
|
||||||||||||||||||||
* the subject is the resource representing the SSSOM object; | ||||||||||||||||||||
* the predicate is either: | ||||||||||||||||||||
    * the property indicated by the `URI` field in the LinkML | ||||||||||||||||||||
      description of the slot, if such a field is present; | ||||||||||||||||||||
    * or a property constructed by concatenating the | ||||||||||||||||||||
      `https://w3id.org/sssom/` namespace and the name of the slot; | ||||||||||||||||||||
* the object is the value of the slot. | ||||||||||||||||||||
|
||||||||||||||||||||
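As a minimal sketch of the two ways a predicate is chosen, the fragment below reuses values from the example at the end of this document. It assumes that the LinkML description of `license` carries a `URI` field pointing to `dcterms:license` (as that example suggests) and that `subject_label` has no such field; the blank node label `_:mapping` is only a placeholder.

```ttl
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix sssom:   <https://w3id.org/sssom/> .

# `license`: the slot description provides a URI (assumed: dcterms:license),
# so that property is used as the predicate.
<https://example.org/sample-set> dcterms:license <https://creativecommons.org/licenses/by/4.0/> .

# `subject_label`: no URI field is assumed here, so the predicate is the
# SSSOM namespace followed by the slot name.
_:mapping sssom:subject_label "apple" .
```

In both triples the subject is the resource standing for the SSSOM object and the object is the slot value.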
### Representation of slot values | ||||||||||||||||||||
The following rules determine how the value of a slot is represented as | ||||||||||||||||||||
the object of a RDF triple. | ||||||||||||||||||||
|
||||||||||||||||||||
#### For slots typed as `sssom:EntityReference` | ||||||||||||||||||||
(e.g. `subject_id`, `mapping_justification`, `subject_source`…) | ||||||||||||||||||||
|
||||||||||||||||||||
The value is rendered as a named RDF resource (IRI). | ||||||||||||||||||||
|
||||||||||||||||||||
#### For slots typed as `sssom:NonRelativeURI` | ||||||||||||||||||||
(e.g. `license`, `mapping_provider`, `issue_tracker`…) | ||||||||||||||||||||
|
||||||||||||||||||||
The value is rendered as a named RDF resource (IRI). | ||||||||||||||||||||
why use "is" instead of |
||||||||||||||||||||
|
||||||||||||||||||||
#### For slots typed as `linkml:date` | ||||||||||||||||||||
(e.g. `mapping_date`, `publication_date`) | ||||||||||||||||||||
|
||||||||||||||||||||
The value is represented as a `xsd:date` literal. | ||||||||||||||||||||
|
||||||||||||||||||||
#### For slots typed as `linkml:double` | ||||||||||||||||||||
(e.g. `mapping_set_confidence`, `confidence`, `similarity_score`) | ||||||||||||||||||||
|
||||||||||||||||||||
The value is represented as a `xsd:double` literal. | ||||||||||||||||||||
|
||||||||||||||||||||
#### For slots typed as an enumeration | ||||||||||||||||||||
(e.g. `sssom_version`, `mapping_cardinality`, `subject_type`…) | ||||||||||||||||||||
|
||||||||||||||||||||
If the permissible values for the enumeration are defined in the LinkML | ||||||||||||||||||||
model as having an associated `meaning` property, then the value is | ||||||||||||||||||||
represented as a named RDF resource with the indicated property. | ||||||||||||||||||||
Otherwise, the value is represented as a string literal. | ||||||||||||||||||||
|
||||||||||||||||||||
#### For slots typed as a SSSOM object | ||||||||||||||||||||
(e.g. `mappings`, `extension_definitions`) | ||||||||||||||||||||
|
||||||||||||||||||||
The value is represented as a RDF resource. Whether the resource is | ||||||||||||||||||||
named (IRI) or not (blank node) will depend on the type of the object, | ||||||||||||||||||||
see the [section on representing SSSOM objects](#sssom-objects) below | ||||||||||||||||||||
for details. | ||||||||||||||||||||
|
||||||||||||||||||||
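The value rules above can be illustrated with a small fragment. This is a sketch only: the values are borrowed from the example at the end of this document, the mapping is deliberately incomplete (it has no subject, predicate, or object), and the predicates shown are the ones used in that example.

```ttl
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix semapv:  <https://w3id.org/semapv/vocab/> .
@prefix sssom:   <https://w3id.org/sssom/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

<https://example.org/sample-set> a sssom:MappingSet ;
    # sssom:NonRelativeURI value: a named resource
    dcterms:license <https://creativecommons.org/licenses/by/4.0/> ;
    # SSSOM object value: here a Mapping without record_id, so a blank node
    sssom:mappings [ a owl:Axiom ;
        # sssom:EntityReference value: a named resource
        sssom:mapping_justification semapv:ManualMappingCuration ;
        # linkml:date value: an xsd:date literal
        dcterms:created "2025-07-14"^^xsd:date ;
        # linkml:double value: an xsd:double literal
        sssom:confidence "0.95"^^xsd:double ;
        # enumeration value with an associated meaning: a named resource
        sssom:predicate_modifier sssom:NegatedPredicate
    ] .
```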
### Representation of multi-valued slots | ||||||||||||||||||||
(e.g. `creator_id`, `see_also`, `object_match_field`…) | ||||||||||||||||||||
|
||||||||||||||||||||
Although a slot is generally represented by a single RDF triple, a | ||||||||||||||||||||
multi-valued slot is represented by one triple per value: all the | ||||||||||||||||||||
triples share the same subject and predicate, and each value is the | ||||||||||||||||||||
object of exactly one triple. | ||||||||||||||||||||
i stumbled across this sentence multiple times. Maybe this can be written clearer like: for each value {v1,v..,vn} represented by a set of individual triples

Also, is this true for

Does not sound any clearer to me. On the contrary, it sounds like each value is represented by a set of triples, which is certainly not the case.

Of course it is. Mappings are represented as follows:

    ?mappingset sssom:mappings [ a owl:Axiom ;
                                 owl:annotatedSource ...
                               ] ,
                               [ a owl:Axiom ;
                                 owl:annotatedSource ...
                               ] .

which fits the description for multi-valued slots: one triple per value. This is what SSSOM-Py has always done, so I had assumed you were fine with that. Let me guess: you are no longer happy with that and want to radically change the format?

No, it seems I misunderstood
I thought this literally meant:
Which is why I was confused. Maybe just add an example here? |
||||||||||||||||||||
|
||||||||||||||||||||
> Non-normative notes: | ||||||||||||||||||||
> | ||||||||||||||||||||
> 1. This means, in particular, that RDF complex structures intended to | ||||||||||||||||||||
> represent collections of values, such as `rdfs:Container` or | ||||||||||||||||||||
> `rdf:List`, MUST NOT be used to represent multi-valued SSSOM | ||||||||||||||||||||
> slots. | ||||||||||||||||||||
> 2. This also implies that values in multi-valued slots are _not_ | ||||||||||||||||||||
> ordered. | ||||||||||||||||||||
|
||||||||||||||||||||
The other rules above apply to determine how each single value is to be | ||||||||||||||||||||
represented. | ||||||||||||||||||||
|
||||||||||||||||||||
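A short sketch of a multi-valued slot, assuming `author_id` is rendered with `pav:authoredBy` as in the example at the end of this document. The second ORCID identifier is a made-up placeholder.

```ttl
@prefix ORCID: <https://orcid.org/> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix pav:   <http://purl.org/pav/> .

# Two values for `author_id` yield two triples with the same subject and
# predicate. The comma is Turtle shorthand for repeating both.
[] a owl:Axiom ;
    pav:authoredBy ORCID:0000-0002-7356-1779 ,
                   ORCID:0000-0000-0000-0000 .   # hypothetical second author
```

Writing the values as a Turtle collection, `pav:authoredBy ( ... )`, would instead produce an `rdf:List` structure, which is what the first note above rules out.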
<a id="extension-slots"></a> | ||||||||||||||||||||
### Representation of extension slots | ||||||||||||||||||||
An [extension slot](spec-model.md#non-standard-slots) is represented in | ||||||||||||||||||||
a similar way to a standard slot, with the following specific rules. | ||||||||||||||||||||
|
||||||||||||||||||||
The predicate is the property associated with the extension slot, as | ||||||||||||||||||||
indicated by the `property` slot in the set’s | ||||||||||||||||||||
[definition](ExtensionDefinition.md) of the extension. | ||||||||||||||||||||
|
||||||||||||||||||||
The value of the extension is represented: | ||||||||||||||||||||
|
||||||||||||||||||||
* as a named RDF resource, if the `type_hint` of the extension | ||||||||||||||||||||
definition is `linkml:uriOrCurie`; | ||||||||||||||||||||
* otherwise, as a literal of the type indicated by the `type_hint`. | ||||||||||||||||||||
matentzn marked this conversation as resolved.
|
||||||||||||||||||||
|
||||||||||||||||||||
|
||||||||||||||||||||
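The following fragment sketches both cases for extension values. `EXT:isFooable` and its `xsd:boolean` type hint come from the example at the end of this document; `EXT:relatedDataset` and its value are hypothetical, introduced only to show an extension whose `type_hint` is `linkml:uriOrCurie`.

```ttl
@prefix EXT: <https://example.org/properties/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a owl:Axiom ;
    # type_hint is xsd:boolean: the value is a literal of that type
    EXT:isFooable "true"^^xsd:boolean ;
    # type_hint is linkml:uriOrCurie (hypothetical extension): the value is
    # a named resource
    EXT:relatedDataset <https://example.org/datasets/42> .
```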
<a id="sssom-objects"></a> | ||||||||||||||||||||
## Representation of SSSOM objects | ||||||||||||||||||||
|
||||||||||||||||||||
### Representation of a `Mapping` object | ||||||||||||||||||||
The RDF type of a `Mapping` object is `owl:Axiom`. | ||||||||||||||||||||
|
||||||||||||||||||||
If the `Mapping` object has a `record_id` slot, then the value of that | ||||||||||||||||||||
slot is used as the named RDF resource that represents the object (and | ||||||||||||||||||||
consequently, that slot MUST NOT be represented using the [general | ||||||||||||||||||||
rules](#sssom-slots) for the representation of slots as defined above). | ||||||||||||||||||||
maybe better to phrase this positively, like "rules don't apply". |
||||||||||||||||||||
Otherwise, the `Mapping` object is represented as a blank node. | ||||||||||||||||||||
|
||||||||||||||||||||
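To make the two cases concrete, here is a sketch with one mapping that has a `record_id` and one that does not. The IRI `https://example.org/mappings/0001` is a hypothetical `record_id` value; the other identifiers come from the example at the end of this document.

```ttl
@prefix FOODON:  <http://purl.obolibrary.org/obo/FOODON_> .
@prefix KF_FOOD: <https://kewl-foodie.inc/food/> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .

# With a record_id: its value names the resource, and no separate
# record_id triple is written.
<https://example.org/mappings/0001> a owl:Axiom ;
    owl:annotatedSource   KF_FOOD:F001 ;
    owl:annotatedProperty skos:exactMatch ;
    owl:annotatedTarget   FOODON:00002473 .

# Without a record_id: the mapping is a blank node.
[] a owl:Axiom ;
    owl:annotatedSource   KF_FOOD:F002 ;
    owl:annotatedProperty skos:exactMatch ;
    owl:annotatedTarget   FOODON:00003348 .
```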
### Representation of a `MappingSet` object | ||||||||||||||||||||
The RDF type of a `MappingSet` object is `sssom:MappingSet`. | ||||||||||||||||||||
|
||||||||||||||||||||
A `MappingSet` object is represented by a named RDF resource | ||||||||||||||||||||
corresponding to the value of the `mapping_set_id` slot (which | ||||||||||||||||||||
consequently MUST NOT be represented using the [general | ||||||||||||||||||||
rules](#sssom-slots) for the representation of slots as defined above). | ||||||||||||||||||||
|
||||||||||||||||||||
The `curie_map` slot MUST NOT be represented using the [general | ||||||||||||||||||||
rules](#sssom-slots). Instead, if it is needed it MUST be represented | ||||||||||||||||||||
using whatever mechanism is provided by the concrete RDF serialisation | ||||||||||||||||||||
format (e.g. `@prefix` declarations in [RDF | ||||||||||||||||||||
Turtle](https://www.w3.org/TR/turtle/) or [RDF | ||||||||||||||||||||
TriG](https://www.w3.org/TR/trig/), or `xmlns` namespace declarations in | ||||||||||||||||||||
[RDF/XML](https://www.w3.org/TR/rdf-syntax-grammar/)). | ||||||||||||||||||||
I am a bit against any relationship between curie map and the various RDF prefix syntaxes you list. cc @cthoyt

My main concern is this: two of the most important serialisations (RDF/XML, OWL/XML) can't even accurately represent a sssom:curie_map. Because in these XML serializations the prefix system is hooked into the XML namespacing system, the local identifier part has severe syntactic constraints. In particular, it must correspond to an NCNAME, which means it MUST start with a letter (not a number, so you cant actually represent UBERON:123 in RDF XML).

My vote is to represent the prefix map using the SHACL prefixmap:
This ensures 100% faithful roundtripping.

Point taken, but that’s a limitation of RDF/XML. Other prefix-supporting formats such as Turtle don’t have that limitation. You want to preserve the CURIE map accurately and preserve the ability to roundtrip? Then don’t use RDF/XML.

You are joking. You are joking, right? You are not seriously proposing something completely different from what we currently have? Something for which we don’t have the inkling of the beginning of support in neither SSSOM-Py nor SSSOM-Java? Because if you are not joking, I give up. Design your dream format all you want, and give me a sign when you’ll be done moving the goalposts from one day to the next.

I did not expect such a strong disagreement :P Alright, instead of this suggestion, we should then make SSSOM-RDF explicitly write only format.. I am totally fine with this as well, and we don't need to concern ourselves then at all with the curie_map.. This way we can circumvent the limitations outlined above?

Try sparing some thoughts for the people who have to implement your last-minute ideas that you sprout out of nowhere, maybe you’ll understand. Two days ago you were reluctant to just changing the predicates to use to represent the core triple (e.g. changing from
And now, you’re here suggesting that we should in fact do something completely different, that has never even been casually mentioned in any discussion related to the SSSOM/RDF format. Because all of a sudden you are concerned about roundtripping back from RDF/XML, even though the RDF/XML produced by SSSOM-Py has never been roundtrippable and AFAIK nobody has ever complained about that. So yeah, I disagree with your proposal, because it contradicts everything that you seemed to want just two days ago. So what do you want now?

A. A SSSOM/RDF format that is minimally different from what we already have, that can be supported rapidly (that is already supported by one implementation), but that (oh, the horror!) does not guarantee that a set written in RDF/XML can be roundtripped back to another SSSOM format?

B. A SSSOM/RDF format that is a clean break from the existing stuff, that will initially not be supported by any implementation (and in fact I doubt it will ever be implemented by SSSOM-Py, given the lack of activity on that implementation)?

If you want B, fine. But then I’ll leave you to design the format. I won’t get involved in any of it, I’ll just wait until you have designed the perfect format of your dream, and then I promise I’ll do my best to implement it.

Now you are throwing the baby with the bathwater. Just because the RDF/XML concrete serialisation may not guarantee that the prefix map is preserved does not mean that we should give up on SSSOM/RDF being a read/write format. Again: As currently written, the spec does allow a SSSOM/RDF set to be fully converted back to any other SSSOM format, provided that:
As I said in another comment below, I wrote the spec to be flexible (“à la carte”): if you want the ability to roundtrip between RDF and another format, you can have it; if you are not interested in that ability, you can ignore it.

In fact even if you do serialise into RDF/XML and write identifiers as IRIs, you will still be able convert back to SSSOM/TSV, unless you happen to run your RDF/XML file into a RDF processor that decides to strip away any unused namespace declarations. Not sure if that is a common behaviour among RDF tools, but Jena and RDFLib do not seem to do it – they are happy to let unused namespace declarations pass through unchanged.

Lets tackle the NCname issue with a comment in the documentation: XML formats might not be able to roundtrip (@gouttegd says: longest URI extension will win) Maybe just document a rule on conflicting uri prefixes for roundtrip
|
||||||||||||||||||||
|
||||||||||||||||||||
> Non-normative notes | ||||||||||||||||||||
> | ||||||||||||||||||||
> 1. The CURIE map may not be needed at all if all named resources and | ||||||||||||||||||||
> predicates are always serialised as full-length IRIs. | ||||||||||||||||||||
This is only true if we decide that SSSOM-RDF is export only format. I know we said this for SSSOM-OWL, forgot where we ended up with RDF.

This is why there is a subsection ”serialisation of identifiers” in the “Special considerations” section. The SSSOM/RDF format can be both a read/write format that is equivalent to SSSOM/TSV or SSSOM/JSON (meaning that can roundtrip between all those formats without loss of information) and an export format. It all depends on what you want to do with the output file – something the spec cannot know in advance. In a sense, SSSOM/RDF is a “à la carte” format. You want to preserve the ability to roundtrip back to SSSOM/TSV? You can, just make sure to include the CURIE map and any extension definitions. You are not interested in being able to come back (say, because all you want is to load the set into a graph database and forget about it)? Then you don’t have to worry about the CURIE map or extension definitions at all. |
||||||||||||||||||||
> 2. If at least some named resources or predicates are serialised as | ||||||||||||||||||||
> CURIEs, every prefix name used must be declared (using the | ||||||||||||||||||||
> appropriate mechanism for the chosen concrete syntax), as RDF | ||||||||||||||||||||
> requires. This applies even to prefix names that are considered | ||||||||||||||||||||
> [built-in](spec-intro.md#iri-prefixes) in the context of SSSOM, | ||||||||||||||||||||
> whose declarations could otherwise be omitted. | ||||||||||||||||||||
I would compact this sentence which has many redundant parts (the RDF requirement that all used prefix names must be declared) to:
|
||||||||||||||||||||
|
||||||||||||||||||||
### Representation of an `ExtensionDefinition` object | ||||||||||||||||||||
The RDF type of an `ExtensionDefinition` object is | ||||||||||||||||||||
`sssom:ExtensionDefinition`. | ||||||||||||||||||||
|
||||||||||||||||||||
An `ExtensionDefinition` object has no identifier of any kind and is | ||||||||||||||||||||
always represented by a blank node. | ||||||||||||||||||||
|
||||||||||||||||||||
## Special considerations for serialising to RDF | ||||||||||||||||||||
mark as not normative? |
||||||||||||||||||||
When serialising a mapping set to SSSOM/RDF, implementations should | ||||||||||||||||||||
consider how the resulting RDF file is intended to be used. In | ||||||||||||||||||||
particular, they should ponder whether it is expected that the RDF | ||||||||||||||||||||
serialisation can at any time be converted back to any other SSSOM | ||||||||||||||||||||
format (e.g. SSSOM/TSV), or if it is only intended to be used by | ||||||||||||||||||||
“generic”, non-SSSOM-aware RDF applications. | ||||||||||||||||||||
|
||||||||||||||||||||
Depending on that intended usage (if it is known), implementations may | ||||||||||||||||||||
adopt slightly different behaviours as described in the following | ||||||||||||||||||||
subsections. | ||||||||||||||||||||
|
||||||||||||||||||||
### Serialisations of identifiers | ||||||||||||||||||||
If the serialisation is intended to be convertible back to another SSSOM | ||||||||||||||||||||
format (especially the SSSOM/TSV format), implementations MUST | ||||||||||||||||||||
serialise identifiers as CURIEs and include the required prefix | ||||||||||||||||||||
declarations. | ||||||||||||||||||||
matentzn marked this conversation as resolved.
|
||||||||||||||||||||
> Non-normative explanation | ||||||||||||||||||||
> | ||||||||||||||||||||
> This is because, if all identifiers are serialised as full-length | ||||||||||||||||||||
> IRIs, then even if the RDF file includes prefix declarations, they may | ||||||||||||||||||||
> be stripped away by a RDF reader, since they are not needed. And | ||||||||||||||||||||
> without those prefix declarations, it would not be possible to | ||||||||||||||||||||
> serialise the set back as a SSSOM/TSV file (remember that the | ||||||||||||||||||||
> SSSOM/TSV format _requires_ that identifiers be serialised as CURIEs). | ||||||||||||||||||||
|
||||||||||||||||||||
Conversely, if the ability to convert the RDF file back to another SSSOM | ||||||||||||||||||||
format is not required, implementations can freely decide whether to | ||||||||||||||||||||
serialise identifiers as IRIs or CURIEs (assuming the concrete RDF | ||||||||||||||||||||
syntax allows that of course). | ||||||||||||||||||||
|
||||||||||||||||||||
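As an illustration of the two identifier forms, the sketch below writes the same `subject_id` value once as a CURIE and once as a full-length IRI, using the `KF_FOOD` prefix from the example at the end of this document. Both statements yield the same object IRI once parsed; they differ only in whether the prefix declaration is actually needed.

```ttl
@prefix KF_FOOD: <https://kewl-foodie.inc/food/> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .

# CURIE form: relies on the KF_FOOD prefix declaration above, which a
# reader must therefore keep.
[] a owl:Axiom ;
    owl:annotatedSource KF_FOOD:F001 .

# IRI form: the KF_FOOD declaration is not needed for this statement and
# might be dropped by a RDF tool.
[] a owl:Axiom ;
    owl:annotatedSource <https://kewl-foodie.inc/food/F001> .
```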
### Extension definitions | ||||||||||||||||||||
Extension definitions MAY be omitted if the RDF file is only intended to | ||||||||||||||||||||
be used by RDF applications. | ||||||||||||||||||||
|
||||||||||||||||||||
Conversely, they SHOULD be included if the set is intended to be | ||||||||||||||||||||
convertible back to another SSSOM format. | ||||||||||||||||||||
|
||||||||||||||||||||
> Non-normative explanation | ||||||||||||||||||||
> | ||||||||||||||||||||
> The whole point of an extension definition in SSSOM is to provide (1) | ||||||||||||||||||||
> a property that confers some meaning to the extension, and (2) the | ||||||||||||||||||||
> type of the expected values. In RDF, as described | ||||||||||||||||||||
> [above](#extension-slots), those two bits of information are already | ||||||||||||||||||||
> contained in the triple that represents the extension slot, so there is | ||||||||||||||||||||
> no need for an additional definition. | ||||||||||||||||||||
> | ||||||||||||||||||||
> But the extension definition also provides the `slot_name` which is | ||||||||||||||||||||
> used to represent the extension slot in other formats (especially | ||||||||||||||||||||
> SSSOM/TSV), so if conversion back to other SSSOM formats is required, | ||||||||||||||||||||
> ensuring that the extension definitions are present in the RDF | ||||||||||||||||||||
> serialisation is helpful. | ||||||||||||||||||||
|
||||||||||||||||||||
### Propagation and condensation | ||||||||||||||||||||
Propagatable slots can be represented in RDF indifferently in their | ||||||||||||||||||||
propagated or condensed form, following the [normal | ||||||||||||||||||||
rules](spec-model.md#propagation-of-mapping-set-slots) for propagation | ||||||||||||||||||||
and condensation. | ||||||||||||||||||||
|
||||||||||||||||||||
But if the RDF file is intended to be used by generic, non-SSSOM-aware | ||||||||||||||||||||
RDF applications, then implementations SHOULD serialise propagatable | ||||||||||||||||||||
slots in their propagated form. | ||||||||||||||||||||
|
||||||||||||||||||||
> Non-normative explanation | ||||||||||||||||||||
> | ||||||||||||||||||||
> Propagation is a SSSOM-specific concept. If a RDF application is | ||||||||||||||||||||
> provided with a RDF file representing a set with condensed slots, the | ||||||||||||||||||||
> application will not know to propagate the condensed slots at the set | ||||||||||||||||||||
> level down to the level of the individual mappings, which will result | ||||||||||||||||||||
> in the application having an incomplete view of the mappings. | ||||||||||||||||||||
|
||||||||||||||||||||
|
||||||||||||||||||||
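The difference between the two forms can be sketched as follows, assuming `mapping_date` is a propagatable slot rendered with `dcterms:created` (as the example at the end of this document suggests). The fragment shows the condensed form, with the mappings reduced to their subject for brevity.

```ttl
@prefix KF_FOOD: <https://kewl-foodie.inc/food/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix sssom:   <https://w3id.org/sssom/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

# Condensed form: the date is stated once, on the set, and the individual
# mappings carry no dcterms:created triple of their own.
<https://example.org/sample-set> a sssom:MappingSet ;
    dcterms:created "2025-07-14"^^xsd:date ;
    sssom:mappings [ a owl:Axiom ;
        owl:annotatedSource KF_FOOD:F001
    ] , [ a owl:Axiom ;
        owl:annotatedSource KF_FOOD:F002
    ] .
```

In the propagated form, the `dcterms:created` triple would instead be repeated on each of the two mappings, so that a query over the mappings alone finds the date.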
## Compatibility with pre-standard RDF representations | ||||||||||||||||||||
The present specification of the SSSOM/RDF format differs slightly from | ||||||||||||||||||||
what several implementations of SSSOM have been producing before the | ||||||||||||||||||||
format was formally specified. | ||||||||||||||||||||
|
||||||||||||||||||||
In the name of backward compatibility, implementations MAY support the | ||||||||||||||||||||
alternative rules described in the following subsections when | ||||||||||||||||||||
deserialising from RDF. | ||||||||||||||||||||
|
||||||||||||||||||||
Implementations MUST NOT follow these rules when serialising to RDF. | ||||||||||||||||||||
|
||||||||||||||||||||
### Representation of slots typed as `sssom:NonRelativeURI` | ||||||||||||||||||||
Implementations MAY accept a value represented as a `xsd:anyURI` | ||||||||||||||||||||
literal. | ||||||||||||||||||||
|
||||||||||||||||||||
### Representation of slots typed as an enumeration | ||||||||||||||||||||
Implementations MAY accept a value represented as a string literal, even | ||||||||||||||||||||
if the value is defined in the LinkML model as having an associated | ||||||||||||||||||||
`meaning` property. | ||||||||||||||||||||
|
||||||||||||||||||||
For example, implementations MAY accept | ||||||||||||||||||||
|
||||||||||||||||||||
```ttl | ||||||||||||||||||||
?mapping sssom:predicate_modifier "Not"^^xsd:string . | ||||||||||||||||||||
``` | ||||||||||||||||||||
|
||||||||||||||||||||
as an alternative to | ||||||||||||||||||||
|
||||||||||||||||||||
```ttl | ||||||||||||||||||||
?mapping sssom:predicate_modifier sssom:NegatedPredicate . | ||||||||||||||||||||
``` | ||||||||||||||||||||
why not just decide on one and say "this is the standard"?

That’s what we did. This: `?mapping sssom:predicate_modifier sssom:NegatedPredicate .` is the standard. But the decision to standardize that form has only been made a few weeks ago. Before that, both SSSOM-Java and SSSOM-Py have been producing the string literal form (SSSOM-Java for the past 8 months – since version 1.1, which introduced RDF support – and SSSOM-Py for as long as it has existed). In fact SSSOM-Py still produces the string literal form to this day. So for backwards compatibility (which is the entire point of this section, “compatibility with pre-standard representations”), implementations MAY support the old string literal form, even though it is not the standard form.

Makes sense. |
||||||||||||||||||||
|
||||||||||||||||||||
### Representation of a `MappingSet` object | ||||||||||||||||||||
Implementations MAY accept a `MappingSet` object represented as a blank | ||||||||||||||||||||
node, with the `mapping_set_id` slot being represented as any other | ||||||||||||||||||||
slot. | ||||||||||||||||||||
|
||||||||||||||||||||
For example, instead of | ||||||||||||||||||||
|
||||||||||||||||||||
```ttl | ||||||||||||||||||||
<https://example.org/myset> a sssom:MappingSet . | ||||||||||||||||||||
``` | ||||||||||||||||||||
|
||||||||||||||||||||
implementations MAY accept | ||||||||||||||||||||
|
||||||||||||||||||||
```ttl | ||||||||||||||||||||
[] a sssom:MappingSet ; | ||||||||||||||||||||
sssom:mapping_set_id <https://example.org/myset> . | ||||||||||||||||||||
``` | ||||||||||||||||||||
|
||||||||||||||||||||
or even (by also applying the alternative rule regarding the | ||||||||||||||||||||
representation of slots typed as `sssom:NonRelativeURI`) | ||||||||||||||||||||
|
||||||||||||||||||||
```ttl | ||||||||||||||||||||
[] a sssom:MappingSet ; | ||||||||||||||||||||
sssom:mapping_set_id "https://example.org/myset"^^xsd:anyURI . | ||||||||||||||||||||
``` | ||||||||||||||||||||
|
||||||||||||||||||||
## Examples | ||||||||||||||||||||
|
||||||||||||||||||||
> This section is non-normative. | ||||||||||||||||||||
|
||||||||||||||||||||
Considering the following set in the SSSOM/TSV format: | ||||||||||||||||||||
|
||||||||||||||||||||
``` | ||||||||||||||||||||
#curie_map: | ||||||||||||||||||||
# EXT: https://example.org/properties/ | ||||||||||||||||||||
# FOODON: http://purl.obolibrary.org/obo/FOODON_ | ||||||||||||||||||||
# KF_FOOD: https://kewl-foodie.inc/food/ | ||||||||||||||||||||
# ORCID: https://orcid.org/ | ||||||||||||||||||||
#mapping_set_id: https://example.org/sample-set | ||||||||||||||||||||
#mapping_set_description: Manually curated alignment of KEWL FOODIE INC internal food and nutrition database with Food Ontology (FOODON). Intended to be used for ontological analysis and grouping of KEWL FOODIE INC related data. | ||||||||||||||||||||
#license: https://creativecommons.org/licenses/by/4.0/ | ||||||||||||||||||||
#mapping_date: 2025-07-14
#extension_definitions:
#  - slot_name: ext_fooable
#    property: EXT:isFooable
#    type_hint: xsd:boolean
subject_id subject_label predicate_id object_id object_label mapping_justification author_id confidence ext_fooable
KF_FOOD:F001 apple skos:exactMatch FOODON:00002473 apple (whole) semapv:ManualMappingCuration ORCID:0000-0002-7356-1779 0.95 true
KF_FOOD:F002 gala skos:exactMatch FOODON:00003348 Gala apple (whole) semapv:ManualMappingCuration ORCID:0000-0002-7356-1779 1 false
```

A valid serialisation of that set in RDF/Turtle would be:

```ttl
@prefix EXT: <https://example.org/properties/> .
@prefix FOODON: <http://purl.obolibrary.org/obo/FOODON_> .
@prefix KF_FOOD: <https://kewl-foodie.inc/food/> .
@prefix ORCID: <https://orcid.org/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix pav: <http://purl.org/pav/> .
@prefix semapv: <https://w3id.org/semapv/vocab/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix sssom: <https://w3id.org/sssom/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://example.org/sample-set> a sssom:MappingSet;
  dcterms:description "Manually curated alignment of KEWL FOODIE INC internal food and nutrition database with Food Ontology (FOODON). Intended to be used for ontological analysis and grouping of KEWL FOODIE INC related data.";
  dcterms:license <https://creativecommons.org/licenses/by/4.0/>;
  sssom:extension_definitions [
      sssom:property EXT:isFooable;
      sssom:slot_name "ext_fooable";
      sssom:type_hint xsd:boolean
    ];
  sssom:mappings [ a owl:Axiom;
      pav:authoredBy ORCID:0000-0002-7356-1779;
      dcterms:created "2025-07-14"^^xsd:date;
      owl:annotatedProperty skos:exactMatch;
      owl:annotatedSource KF_FOOD:F001;
      owl:annotatedTarget FOODON:00002473;
      EXT:isFooable true;
      sssom:confidence 9.5E-1;
      sssom:mapping_justification semapv:ManualMappingCuration;
      sssom:object_label "apple (whole)";
      sssom:subject_label "apple"
    ], [ a owl:Axiom;
      pav:authoredBy ORCID:0000-0002-7356-1779;
      dcterms:created "2025-07-14"^^xsd:date;
      owl:annotatedProperty skos:exactMatch;
      owl:annotatedSource KF_FOOD:F002;
      owl:annotatedTarget FOODON:00003348;
      EXT:isFooable false;
      sssom:confidence 1.0E0;
      sssom:mapping_justification semapv:ManualMappingCuration;
      sssom:object_label "Gala apple (whole)";
      sssom:subject_label "gala"
    ] .
```
I would expect the triples represented by the axioms to also show up somewhere in the RDF, though it's not clear what to do for negated triples.

They appear, but only as reified OWL axioms. This has been the RDF output produced by SSSOM-Py since the beginning.

It's a valid question though. Without the direct triple, triple stores might not be able to return all terms mapped to

In the OWL serialisation it seems I have injected them in sssom py:

If that is needed, once the set has been exported to RDF it shouldn’t be hard to process it with some SPARQL to construct a

I wouldn’t mind adding that as an OPTIONAL behaviour for RDF writers, on the condition that it is really optional – that is, if those

Something like:

If we do allow that, this raises the question (as hinted by @cthoyt) of what to do about negated mappings. Two possibilities:

(A) Don’t care. Negated mappings are treated in the same way as any other mappings. If users don’t want

(B) Explicitly exclude negated mappings, as in

There is also the question of mappings with

Rendering them as

    MYMAP:1 a owl:Axiom ;
        owl:annotatedSource HP:1234 ;
        owl:annotatedProperty skos:exactMatch ;
        owl:annotatedTarget sssom:NoTermFound ;
        sssom:object_source obo:doid.owl .

should be perfectly fine, but do we also want a `HP:1234 skos:exactMatch sssom:NoTermFound .` triple as well? I’d say we should do for them the same thing as we do for negated mappings (the two possibilities outlined in my previous message).

I am now on board with:

With regards to the special cases:

I agree. Because the two are very different use cases, I would favour implementations injecting triples for `sssom:NoTermFound` and for negated mappings based on separate conditionals; I would probably never add either one, but there may be some use cases to do so.
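The options discussed in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions, not the behaviour of SSSOM-Py or of the draft: mappings are modelled as plain dicts keyed by SSSOM slot names, the function `direct_triples` and its two flags are hypothetical, and the second and third sample mappings are made up for the example. Each special case gets its own switch, as suggested in the last comment.

```python
# Hypothetical sketch: derive optional "direct" triples from mappings.
# Mappings are plain dicts keyed by SSSOM slot names (an assumption made
# for this example only; real implementations have their own data model).

NO_TERM_FOUND = "sssom:NoTermFound"


def direct_triples(mappings, include_negated=False, include_no_term_found=False):
    """Yield (subject, predicate, object) CURIE triples, one per mapping.

    Negated mappings and mappings to sssom:NoTermFound are skipped by
    default; each case is controlled by a separate flag.
    """
    for m in mappings:
        negated = m.get("predicate_modifier") == "Not"
        no_term = NO_TERM_FOUND in (m["subject_id"], m["object_id"])
        if negated and not include_negated:
            continue  # option (B): explicitly exclude negated mappings
        if no_term and not include_no_term_found:
            continue
        # With include_negated=True this is option (A): the un-negated
        # triple is emitted like any other.
        yield (m["subject_id"], m["predicate_id"], m["object_id"])


mappings = [
    {"subject_id": "KF_FOOD:F001", "predicate_id": "skos:exactMatch",
     "object_id": "FOODON:00002473"},
    # Made-up negated mapping, for illustration only.
    {"subject_id": "KF_FOOD:F999", "predicate_id": "skos:exactMatch",
     "object_id": "FOODON:00003348", "predicate_modifier": "Not"},
    # Made-up "no term found" mapping, for illustration only.
    {"subject_id": "HP:1234", "predicate_id": "skos:exactMatch",
     "object_id": NO_TERM_FOUND},
]

default_triples = list(direct_triples(mappings))
```

With the defaults, only the first mapping produces a direct triple; turning on either flag adds the corresponding special case without affecting the other.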

Note that the two `Mapping` objects are represented as blank nodes,
since the original set does not contain any `record_id` slot.

Note also that (1) identifiers are serialised as CURIEs whenever
possible, and (2) the definition for the `EXT:isFooable` extension is
included. This means that the set can be fully converted back to
SSSOM/TSV without any loss of information.
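
As a rough illustration of why CURIE serialisation round-trips, the following Python sketch expands a CURIE to an IRI and compacts it back. The prefix map is copied from the Turtle example above; the helper names `expand` and `compact` are hypothetical and are not taken from any SSSOM library.

```python
# Hypothetical helpers illustrating CURIE <-> IRI round-tripping.
# The prefix map reuses a few prefixes from the Turtle example.
PREFIX_MAP = {
    "EXT": "https://example.org/properties/",
    "FOODON": "http://purl.obolibrary.org/obo/FOODON_",
    "KF_FOOD": "https://kewl-foodie.inc/food/",
    "ORCID": "https://orcid.org/",
}


def expand(curie, prefix_map=PREFIX_MAP):
    """Expand a CURIE to a full IRI; return the input if the prefix is unknown."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in prefix_map:
        return prefix_map[prefix] + local
    return curie


def compact(iri, prefix_map=PREFIX_MAP):
    """Compact an IRI to a CURIE whenever possible, else return it unchanged."""
    candidates = [p for p, ns in prefix_map.items() if iri.startswith(ns)]
    if not candidates:
        return iri
    # Prefer the longest matching namespace if several prefixes match.
    best = max(candidates, key=lambda p: len(prefix_map[p]))
    return f"{best}:{iri[len(prefix_map[best]):]}"


iri = expand("FOODON:00002473")
curie = compact(iri)
```

An IRI that matches no declared prefix is left as a full IRI, which corresponds to the "whenever possible" qualification above.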