Linking StatDCAT-AP datasets to SDMX Dataflow #24
Description
As raised during the last webinar (10/03) by OECD, a discussion needs to explicitly consider the SDMX concept of a Dataflow, which is currently absent from the StatDCAT-AP model.
Notion of Dataflow
The SDMX Glossary defines a Dataflow as "an abstract concept of the Data Sets, i.e. a structure without any data". This aligns exactly with the DCAT-AP definition of dcat:Dataset as "a conceptual entity that represents the information published". The two definitions are semantically equivalent, so the Dataflow might be the canonical SDMX source from which a dcat:Dataset description is derived. A Dataflow corresponds to a Product in the GSIM 2.0 Exchange Group.
A Dataflow associates a DSD (Data Structure Definition) with one or more Categories. This means the Dataflow is the bridge between:
- the structural definition (DSD): a Dataflow always points to exactly one DSD, while one DSD can be used by multiple Dataflows
- the thematic organisation (a Category Scheme on the SDMX side, dcat:theme on the DCAT side)
Proposal for linking dcat:Dataset to its Dataflow
dct:conformsTo is proposed as a candidate property for linking a dcat:Dataset to its SDMX Dataflow.
A key question open for community feedback: should dct:conformsTo be attached to dcat:Dataset or to dcat:Distribution?
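The two attachment points under discussion can be sketched in Turtle as follows. This is a minimal illustration, not a recommendation: every `ex:` URI is hypothetical, and typing the Dataflow resource as dct:Standard is an assumption based only on the declared range of dct:conformsTo. Options A and B are alternatives and appear in one graph only so they can be compared.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .

# Hypothetical resource standing for the SDMX Dataflow.
# dct:Standard is the range of dct:conformsTo (assumed typing).
ex:dataflow-ABC a dct:Standard ;
    dct:title "SDMX Dataflow ABC (hypothetical)"@en .

# Option A: the link is attached to the conceptual dataset.
ex:dataset-1 a dcat:Dataset ;
    dct:conformsTo ex:dataflow-ABC ;
    dcat:distribution ex:dist-1 .

# Option B: the link is attached to the distribution instead.
ex:dist-1 a dcat:Distribution ;
    dct:conformsTo ex:dataflow-ABC ;
    dcat:accessURL <http://example.org/data/abc.csv> .
```

With Option A the link is stated once per dataset. With Option B it would have to be repeated on every distribution derived from the same Dataflow.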
References: Usage of dct:conformsTo in DCAT and its extensions:
DCAT 3.0 (attached to dcat:Dataset or dcat:DataService)
- Definition: An established standard to which the described resource conforms.
- Usage note: This property SHOULD be used to indicate the model, schema, ontology, view or profile that the cataloged resource content conforms to.
- See 14.2.1 Conformance to a standard
DCAT-AP-3.0 (attached to dcat:Dataset or dcat:DataService)
- Definition (dcat:Dataset): An implementing rule or other specification.
- Definition (dcat:DataService): An established (technical) standard to which the Data Service conforms.
HealthDCAT-AP (attached to dcat:Dataset, dcat:DataService and dcat:Distribution)
- Definition (dcat:Distribution): Linked Schema: An established schema to which the described Distribution conforms
Good and bad practices
The Eurostat example below on data.europa.eu represents the Dataflow as a dcat:Distribution. This directly contradicts both the SDMX Glossary (a Dataflow has no data) and the DCAT-AP specification (a Distribution is a physical embodiment). StatDCAT-AP should provide explicit guidance to prevent this pattern.
Dataset
<http://data.europa.eu/88u/dataset/dpyj6oz3pcclvrsdkyp1w>
rdf:type dcat:Dataset;
dct:title "Intelligence Artificielle , par activité de la NACE Rév. 2"@fr , "Artificial intelligence by NACE Rev. 2 activity"@en , "Künstliche Intelligenz, nach Aktivitäten der NACE Rev.2"@de;
dct:type <http://publications.europa.eu/resource/authority/dataset-type/STATISTICAL>;
adms:identifier <https://doi.org/10.2908/ISOC_EB_AIN2>;
dcat:distribution <http://data.europa.eu/88u/distribution/b7e3e222-3189-4902-966e-c35aef83013b> , <http://data.europa.eu/88u/distribution/ee684489-260c-40bb-88fa-9d10b58aaacb> , <http://data.europa.eu/88u/distribution/f6714666-1c9d-4cfa-b402-959da5e0fb12> , <http://data.europa.eu/88u/distribution/e1ccb4e1-7789-46bb-8684-07b6f39f92f8> , <http://data.europa.eu/88u/distribution/85559983-36eb-4488-880a-5c57f442bb02>;
Distribution (dataflow)
<http://data.europa.eu/88u/distribution/e1ccb4e1-7789-46bb-8684-07b6f39f92f8>
rdf:type dcat:Distribution;
dcat:compressFormat <http://publications.europa.eu/resource/authority/file-type/GZIP>;
dct:format <http://publications.europa.eu/resource/authority/file-type/XML>;
dct:identifier "https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/dataflow/ESTAT/isoc_eb_ain2?references=descendants&detail=referencepartial&format=sdmx_2.1_generic&compressed=true";
dct:license <http://publications.europa.eu/resource/authority/licence/CC_BY_4_0>;
dct:rights <http://publications.europa.eu/resource/authority/access-right/PUBLIC>;
dct:title "Download metadata in SDMX 2.1 format"@en;
dct:type <http://publications.europa.eu/resource/authority/distribution-type/DOWNLOADABLE_FILE>;
spdx:checksum [ rdf:type spdx:Checksum;
spdx:algorithm spdx:checksumAlgorithm_sha256;
spdx:checksumValue "caa364f06885eac84ac4f639da5efb694bb111769d079d8fd15ac8db752ba899"
];
dcat:accessURL <https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/dataflow/ESTAT/isoc_eb_ain2?references=descendants&detail=referencepartial&format=sdmx_2.1_generic&compressed=true>;
dcat:byteSize "18407";
dcat:downloadURL <https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/dataflow/ESTAT/isoc_eb_ain2?references=descendants&detail=referencepartial&format=sdmx_2.1_generic&compressed=true>;
dcat:mediaType <http://www.iana.org/assignments/media-types/application/xml> .
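For comparison, here is a possible rework of the same record, assuming the community settles on attaching dct:conformsTo to dcat:Dataset. It is a sketch only. Using the SDMX REST URL without its query parameters as the Dataflow identifier is an assumption. The four remaining distributions are copied unchanged from the record above, and their content was not checked.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

<http://data.europa.eu/88u/dataset/dpyj6oz3pcclvrsdkyp1w>
    a dcat:Dataset ;
    dct:title "Artificial intelligence by NACE Rev. 2 activity"@en ;
    # Assumption: the Dataflow is identified by its REST URL, query string removed.
    dct:conformsTo <https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/dataflow/ESTAT/isoc_eb_ain2> ;
    # The Dataflow download (e1ccb4e1-7789-46bb-8684-07b6f39f92f8) is no longer
    # listed here, since a Dataflow carries structure but no data.
    dcat:distribution
        <http://data.europa.eu/88u/distribution/b7e3e222-3189-4902-966e-c35aef83013b> ,
        <http://data.europa.eu/88u/distribution/ee684489-260c-40bb-88fa-9d10b58aaacb> ,
        <http://data.europa.eu/88u/distribution/f6714666-1c9d-4cfa-b402-959da5e0fb12> ,
        <http://data.europa.eu/88u/distribution/85559983-36eb-4488-880a-5c57f442bb02> .
```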