Provide long-term hosting solutions for RDF/Linked Data #81

@chiarcos

This concerns academic data and demonstrators in particular, and it is less about technology or formalisms than about politics.

Long-term availability is a problem for Linked Data technology in general. Infrastructure providers (e.g., libraries) can be hesitant to provide SPARQL endpoints, and long-term maintenance of such services can probably not be expected at all. A case in point is the infamous British Museum SPARQL endpoint, once hailed in the Digital Humanities but now largely defunct. There are challenges here that can be addressed on a technical level, but that is not what this issue is about.

A minimal requirement to ensure long-term (re-)usability of Linked Data and the feasibility of federation (as unique selling points for RDF technology) would be to guarantee the accessibility of the data itself, ideally in a way that allows resolvable URIs.
Indeed, major infrastructures for hosting research data have emerged in different fields, e.g., Zenodo, DSpace, or CLARIN, and these could provide exactly that, at least for the academic sector. However, they currently do not seem to support RDF formats as MIME types for deposited data (for metadata, they do). See zenodo/zenodo#1515 for the discussion on Zenodo. As a result, RDF data is normally served as text/plain. This means that applications need to guess the format if they attempt to resolve URIs against a resource. This can work, but it is unreliable. In particular, it will fail if URIs do not include the file extension (omitting it is the recommended practice, because content negotiation is supposed to select the format, except that it is not available here), or if the data URI carries any query parameters after the file extension (e.g., "...?download=1").

Example:
FAILURE (using Zenodo-provided data link)
http://www.sparql.org/sparql?query=SELECT+*%0D%0AFROM+%3Chttps%3A%2F%2Fzenodo.org%2Frecord%2F4444132%2Ffiles%2Fcrmtex.owl%3Fdownload%3D1%3E%0D%0AWHERE+%7B+%3Fa+%3Fb+%3Fc+%7D+LIMIT+10&default-graph-uri=&output=xml&stylesheet=%2Fxml-to-html.xsl

SUCCESS (dropping the download parameter, so that the URI ends in the full file extension)
http://www.sparql.org/sparql?query=SELECT+*%0D%0AFROM+%3Chttps%3A%2F%2Fzenodo.org%2Frecord%2F4444132%2Ffiles%2Fcrmtex.owl%3E%0D%0AWHERE+%7B+%3Fa+%3Fb+%3Fc+%7D+LIMIT+10&default-graph-uri=&output=xml&stylesheet=%2Fxml-to-html.xsl
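
The difference between the two requests can be illustrated with a minimal sketch of how a client might choose an RDF parser. This is not the logic of any particular SPARQL engine: the function name `guess_rdf_format`, the two lookup tables, and the mapping of `.owl` to RDF/XML are assumptions made for illustration only.

```python
# Hypothetical sketch: how a client might pick an RDF parser for a URL.
# Names and tables are illustrative, not taken from any real library.

# Registered media types for common RDF serializations.
RDF_MIME_TYPES = {
    "text/turtle": "turtle",
    "application/rdf+xml": "rdfxml",
    "application/n-triples": "ntriples",
    "application/ld+json": "jsonld",
}

# Assumed extension table; ".owl" is treated as RDF/XML here, which is
# common but not guaranteed.
EXTENSIONS = {
    ".ttl": "turtle",
    ".rdf": "rdfxml",
    ".owl": "rdfxml",
    ".nt": "ntriples",
    ".jsonld": "jsonld",
}


def guess_rdf_format(url, content_type):
    """Return a parser name, or None if the format cannot be determined."""
    # 1. Trust the declared media type when it names an RDF serialization.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in RDF_MIME_TYPES:
        return RDF_MIME_TYPES[media_type]
    # 2. Otherwise fall back to a naive look at how the URL string ends.
    for ext, fmt in EXTENSIONS.items():
        if url.lower().endswith(ext):
            return fmt
    return None


base = "https://zenodo.org/record/4444132/files/crmtex.owl"
print(guess_rdf_format(base, "text/plain"))                           # rdfxml
print(guess_rdf_format(base + "?download=1", "text/plain"))           # None
print(guess_rdf_format(base + "?download=1", "application/rdf+xml"))  # rdfxml
```

Under these assumptions, guessing only works while the URL happens to end in a known extension. A declared RDF media type would make the answer independent of how the URL is written.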

Actually, this is very easy to fix: we just need to petition the maintainers and developers of such infrastructures, repeatedly and in large numbers, to allow data to be declared as text/turtle (etc.) rather than just text/plain.
