This project provides SHACL shapes to validate metadata against the eCH-0200 standard.
- ech-0200.shacl.ttl: This file models the constraints defined in eCH-0200 with the SHACL vocabulary.
- examples/: This directory contains RDF Turtle files that can be used to test ech-0200.shacl.ttl. By convention, files ending with .valid.ttl will validate, while files ending with .fail.ttl will not.
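The naming convention above lends itself to a small regression check. The following is only a sketch: it uses the shaclvalidate command from this README, and it assumes that the validator's report contains `sh:conforms true` when the data conforms. The function names are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Derive the expected outcome from the file name, following the convention:
# *.valid.ttl should conform, *.fail.ttl should not.
expected_outcome() {
  case "$1" in
    *.valid.ttl) echo conforms ;;
    *.fail.ttl)  echo violates ;;
    *)           echo unknown ;;
  esac
}

# Validate one file and report what actually happened.
# ASSUMPTION: a conforming report contains "sh:conforms true";
# adjust the pattern if your validator prints something else.
actual_outcome() {
  if shaclvalidate -shapesfile ech-0200.shacl.ttl -datafile "$1" 2>/dev/null \
      | grep -Eq 'sh:conforms[[:space:]]+true'; then
    echo conforms
  else
    echo violates
  fi
}

# Compare expectation and result for every example file.
# Returns non-zero if at least one file behaves unexpectedly.
check_examples() {
  status=0
  for f in examples/*.ttl; do
    [ -e "$f" ] || continue
    want=$(expected_outcome "$f")
    [ "$want" = unknown ] && continue
    got=$(actual_outcome "$f")
    if [ "$want" != "$got" ]; then
      echo "MISMATCH: $f (expected $want, got $got)"
      status=1
    fi
  done
  return $status
}
```

After sourcing the script from the repository root, `check_examples && echo "all examples behave as expected"` runs the check.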
To validate data with the shapes, you need a SHACL validator such as the TopBraid SHACL API. With this validator you can validate RDF Turtle files as follows:
$ shaclvalidate -shapesfile ech-0200.shacl.ttl -datafile data.ttl
For example (from this directory):
$ shaclvalidate -shapesfile ech-0200.shacl.ttl -datafile examples/minimal.valid.ttl
Because the TopBraid SHACL validator only supports the Turtle RDF format, files in other formats, such as RDF/XML, must be converted first.
First you need Apache Jena. You can download and extract it with these commands:
$ wget https://www-eu.apache.org/dist/jena/binaries/apache-jena-3.9.0.tar.gz
$ tar xvzf apache-jena-3.9.0.tar.gz
This creates a folder named apache-jena-3.9.0. To convert your RDF/XML file (e.g. file.rdf) you can use this command:
$ ./apache-jena-3.9.0/bin/riot --syntax=rdfxml --output=turtle file.rdf > file.ttl
The converted result is written to file.ttl.
Jena supports these RDF formats: turtle, ntriples, nquads, trig and rdfxml.
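If several RDF/XML files need converting, the single command above can be wrapped in a loop. This is a minimal sketch, assuming the files end in .rdf (so that riot can guess the input syntax from the extension) and that Jena was extracted as described above. The `RIOT` variable and both function names are hypothetical conveniences.

```shell
#!/bin/sh
# Map an RDF/XML file name to its Turtle counterpart,
# e.g. data/file.rdf -> data/file.ttl
turtle_name() {
  echo "${1%.rdf}.ttl"
}

# Convert every .rdf file in a directory (default: current directory).
# ASSUMPTION: RIOT may point to another riot binary; it defaults to the
# path created by the extraction step described in this README.
convert_all() {
  dir="${1:-.}"
  riot_bin="${RIOT:-./apache-jena-3.9.0/bin/riot}"
  for src in "$dir"/*.rdf; do
    [ -e "$src" ] || continue
    "$riot_bin" --output=turtle "$src" > "$(turtle_name "$src")"
  done
}
```

For example, `convert_all data` would write a .ttl file next to each .rdf file in a directory named data.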
This project is similar to and partially based on the EU DCAT-AP SHACL constraint definitions.
While the eCH-0200 specification is available in German and French, the SHACL shapes are documented in English to better align with other shape files and tools that are likely to be used alongside them.
- The specification mandates the use of schema:url as a class. This seems to be a mistake, so we assume that schema:URL is what is meant.
- The SHACL file also accepts xsd:dateTime where the spec mandates xsd:date.
- Inference: The specification isn't explicit about whether inference should be allowed and, if so, which kind. We assume that where vcard:Kind is allowed, its subclasses (Individual, Organization, Group, Location) should be allowed too. SHACL only allows specifying ontological statements in the data and not in the shape graph, so using a subclass is currently only accepted if the respective rdfs:subClassOf statement is also present in the data. We could of course explicitly allow some named subclasses in the shape file, but this doesn't seem to be intended by the spec.
- The type (foaf:Document) does not need to be explicitly specified for a document to validate; the type can be inferred from the rdfs:range of foaf:Document.
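To illustrate the inference note above, the following sketch writes a small Turtle file in which the rdfs:subClassOf statement travels with the data. The file name and the example.org IRI are made up for illustration, and the snippet is not a complete eCH-0200 dataset, so it is not expected to pass validation on its own.

```shell
#!/bin/sh
# Write a data fragment that states the subclass relation explicitly.
# ASSUMPTION: file name and example.org IRI are illustrative only.
cat > subclass-example.ttl <<'EOF'
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .

# This statement has to be part of the data; without it the validator
# cannot tell that a vcard:Organization is also a vcard:Kind.
vcard:Organization rdfs:subClassOf vcard:Kind .

<https://example.org/contact/1> a vcard:Organization ;
    vcard:fn "Example Office" .
EOF
```

The same pattern would apply to the other subclasses mentioned above (Individual, Group, Location).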
- Shouldn't we require a dataset to be named (using a standard IRI) rather than requiring a proprietary dct:identifier?
- Also, shouldn't the dct:publisher be named, rather than being an instance of foaf:Agent? Analogous questions can be asked for dcat:themeTaxonomy and foaf:homepage.
- It seems inconsistent to forbid adms:status on distributions while generally allowing arbitrary properties.
As a prospective part of an eCH standard, the code and documentation in this repository can be used, distributed and further developed without any restriction by patents or licenses.