|
| 1 | +<!-- DO NOT EDIT: This file is auto-generated. Any changes will be overwritten. --> |
| 2 | + |
| 3 | +<span style="display:inline-block; background:#eee; color:#333; padding:4px 8px; border-radius:4px;">Schema Mapping</span> <span style="display:inline-block; background:#eee; color:#333; padding:4px 8px; border-radius:4px;">Discovery</span> <span style="display:inline-block; background:#eee; color:#333; padding:4px 8px; border-radius:4px;">Integration</span> <span style="display:inline-block; background:#eee; color:#333; padding:4px 8px; border-radius:4px;">Transformation</span> |
| 4 | + |
| 5 | +!!! info "DataCite to Schema.org Case Study Infobox" |
| 6 | + |
| 7 | + - **Author:** TBD (TBD) |
| 8 | + - **Last updated:** 2025-01-13 |
| 9 | + - **Mapping Type:**  |
| 10 | + - **Status of this case study:**  |
| 11 | + |
| 12 | +Mapping from DataCite to Schema.org for datasets. |
| 13 | + |
| 14 | +### Summary |
| 15 | + |
| 16 | +The schema.org metadata that the DataCite API serves for data (and software) does not validate in the schema.org validator. In addition, several contributor types are mapped incorrectly, and terms from the ESIP standard for describing datasets are either missing from the mapping or not updated to that standard. DataCite is interested in implementing an updated mapping in its API, both for mapping to schema.org and for mapping from schema.org. |
| 17 | + |
| 18 | +### Domain |
| 19 | + |
| 20 | +Science data |
| 21 | + |
| 22 | +### Use case category |
| 23 | + |
| 24 | +- Discovery (finding related data across resources) |
| 25 | +- Integration (connecting data across disparate resources) |
| 26 | +- Transformation (translating source data into a target schema) |
| 27 | + |
| 28 | +### Purpose of the mapping |
| 29 | + |
| 30 | +- Translating DataCite metadata into schema.org in a standard way centralizes metadata generation in a trusted resource. Once the mapping is corrected, valid schema.org metadata can be pulled from DataCite's API and automatically inserted into dataset landing pages, increasing the discoverability of data across all sciences. |
| 31 | +- This grounds the generated schema.org in DataCite metadata, which publishers regard as the most trusted source of metadata about data (and software). Schema.org metadata is commonly embedded in landing pages and websites to support discovery. |
| 32 | +- Data repositories are interested in measuring any increase in traffic to the data landing pages where the updated schema.org is embedded, as an aid to search engine optimization, particularly as searching with AI tools becomes more commonplace. |
| 33 | + |
| 34 | +### Type of mapped resources |
| 35 | + |
| 36 | +Science datasets are supported by this mapping. Science software will be supported by another mapping effort led by CodeMeta. |
| 37 | + |
| 38 | +### Links to existing mappings |
| 39 | + |
| 40 | +- [SSSOM file (Google Sheets)](https://docs.google.com/spreadsheets/d/1pQ8Nfxx6nZ_5-GWmWKDglrx-xbIYmfGxbQGclhi-wLo/edit?usp=sharing) |
| 41 | +- [Related GitHub issue (ESIP Science-on-schema.org)](https://github.com/ESIPFed/science-on-schema.org/issues/265) |
| 42 | + |
| 43 | +The sections of the mapping are also split out into multiple GitHub issues hosted by the ESIP Science-on-schema.org group. |
| 44 | + |
| 45 | +### Tools used for creating the mapping |
| 46 | + |
| 47 | +- Manual curation (done by hand) |
| 48 | + |
| 49 | +### Type of mapping relations |
| 50 | + |
| 51 | +The mapping aims to cover all metadata fields in the [ESIP recommended practices for expressing science datasets on schema.org](https://github.com/ESIPFed/science-on-schema.org/blob/main/guides/Dataset.md). |
| 52 | + |
| 53 | +### Examples (samples) of different types of mapping implementations |
| 54 | + |
| 55 | +- [DataCite's API](https://commons.datacite.org/doi.org) - provides schema.org metadata for datasets |
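
The workflow described under "Purpose of the mapping" (take DataCite metadata, express it as schema.org, embed it in a landing page) can be illustrated with a minimal sketch. This is not the mapping from the SSSOM file or DataCite's implementation: the field pairings below cover only a small subset chosen for illustration, the function names `datacite_to_schemaorg` and `to_jsonld_script` are hypothetical, the DOI and sample values are made up, and treating a creator without an "Organizational" `nameType` as a `Person` is an assumption.

```python
import json


def datacite_to_schemaorg(attrs: dict) -> dict:
    """Map a small, illustrative subset of DataCite attributes to a schema.org Dataset.

    `attrs` is assumed to look like the `attributes` object of a DataCite
    JSON record. The pairings here are assumptions, not the curated mapping.
    """
    doi_url = f"https://doi.org/{attrs['doi']}"
    dataset = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": doi_url,
        "identifier": doi_url,
        "name": attrs["titles"][0]["title"],
    }

    # Assumption: creators without nameType "Organizational" are people.
    creators = []
    for creator in attrs.get("creators", []):
        is_org = creator.get("nameType") == "Organizational"
        creators.append(
            {"@type": "Organization" if is_org else "Person", "name": creator["name"]}
        )
    if creators:
        dataset["creator"] = creators

    # The publisher may be a plain string or an object with a "name" key.
    publisher = attrs.get("publisher")
    if isinstance(publisher, dict):
        publisher = publisher.get("name")
    if publisher:
        dataset["publisher"] = {"@type": "Organization", "name": publisher}

    if attrs.get("publicationYear"):
        dataset["datePublished"] = str(attrs["publicationYear"])

    # Use the first description typed as an abstract, if there is one.
    for desc in attrs.get("descriptions", []):
        if desc.get("descriptionType") == "Abstract":
            dataset["description"] = desc["description"]
            break

    return dataset


def to_jsonld_script(dataset: dict) -> str:
    """Wrap the JSON-LD in a script element suitable for a landing page."""
    # Escape "</" so the payload cannot close the script element early.
    payload = json.dumps(dataset, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical record used only to exercise the sketch.
sample = {
    "doi": "10.1234/example",
    "titles": [{"title": "Example dataset"}],
    "creators": [
        {"name": "Doe, Jane", "nameType": "Personal"},
        {"name": "Example Lab", "nameType": "Organizational"},
    ],
    "publisher": "Example Repository",
    "publicationYear": 2024,
    "descriptions": [
        {"description": "An illustrative abstract.", "descriptionType": "Abstract"}
    ],
}

dataset = datacite_to_schemaorg(sample)
snippet = to_jsonld_script(dataset)
print(snippet)
```

Keeping the mapping step separate from the embedding step means the same dictionary can be checked against the schema.org validator before it is written into a page.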