Commit d527cf6

Merge pull request #111 from TREEcg/split-discovery
Rewrite of the spec
2 parents 4e2027a + 011d2a7 commit d527cf6

26 files changed

+512
-890
lines changed

.github/workflows/Build-ShapeTopologies-spec.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ jobs:
2121

2222
# if your doc isn’t in the root folder,
2323
# or Bikeshed otherwise can’t find it:
24-
SOURCE: shape-topologies.bs
24+
SOURCE: 02-shape-topologies.bs
2525

2626
# output filename defaults to your input
2727
# with .html extension instead,

.github/workflows/Build-TREE-spec.yml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ jobs:
2121

2222
# if your doc isn’t in the root folder,
2323
# or Bikeshed otherwise can’t find it:
24-
SOURCE: spec.bs
24+
SOURCE: 01-tree-specification.bs
2525

2626
# output filename defaults to your input
2727
# with .html extension instead,
Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
name: Build TREE discovery spec
2+
on:
3+
workflow_dispatch: {}
4+
pull_request: {}
5+
push:
6+
branches: [master]
7+
jobs:
8+
main:
9+
name: Build, Validate and Deploy
10+
runs-on: ubuntu-20.04
11+
permissions:
12+
contents: write
13+
steps:
14+
- uses: actions/checkout@v3
15+
- uses: w3c/spec-prod@v2
16+
with:
17+
TOOLCHAIN: bikeshed
18+
19+
# Modify as appropriate
20+
GH_PAGES_BRANCH: gh-pages
21+
22+
# if your doc isn’t in the root folder,
23+
# or Bikeshed otherwise can’t find it:
24+
SOURCE: 03-discovery-specification.bs
25+
26+
# output filename defaults to your input
27+
# with .html extension instead,
28+
# but if you want to customize it:
29+
DESTINATION: discovery.html

01-tree-specification.bs

Lines changed: 334 additions & 0 deletions
Large diffs are not rendered by default.

03-discovery-specification.bs

Lines changed: 124 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,124 @@
1+
<pre class='metadata'>
2+
Title: TREE Discovery and Context Information
3+
Shortname: TREEDiscovery
4+
Level: 1
5+
Status: w3c/CG-DRAFT
6+
Markup Shorthands: markdown yes
7+
Group: TREE hypermedia community group
8+
URL: https://w3id.org/tree/specification/discovery
9+
Repository: https://github.com/treecg/specification
10+
Mailing List: public-treecg@w3.org
11+
Mailing List Archives: https://lists.w3.org/Archives/Public/public-treecg/
12+
Editor: Pieter Colpaert, https://pietercolpaert.be
13+
Abstract:
14+
This specification defines how a client selects a specific dataset and search tree, as well as extracts relevant context information.
15+
</pre>
16+
17+
# Definitions # {#overview}
18+
19+
A `tree:Collection` is a subclass of `dcat:Dataset` ([[!vocab-dcat-3]]).
20+
The specialization is that this particular dataset is a collection of _members_.
21+
22+
A `tree:SearchTree` is a subclass of `dcat:Distribution`.
23+
The specialization is that it uses the main TREE specification to publish a search tree.
24+
25+
A node from which all other nodes can be found is a `tree:RootNode`.
26+
27+
Note: The `tree:SearchTree` and the `tree:RootNode` MAY be identified by the same IRI when no disambiguation is needed.
28+
29+
A TREE client MUST be provided with a URL to start from, which we call the _entrypoint_.
30+
31+
# Initializing a client with a URL # {#starting-from}
32+
33+
The goal of the client is to understand what `tree:Collection` it is using, and to find a `tree:RootNode` to start the traversal phase from.
34+
This discovery specification extends the initialization step in the TREE specification, for the cases in which multiple options are possible.
35+
36+
The client MUST dereference the URL, which will result in a set of quads. The client MUST then perform the init step from the main specification.
37+
If that did not return any result, then the client MUST check whether the URL before redirects (`E`) has been used in one of the following discovery patterns described in the subsections:
38+
1. `E` is a `tree:Collection`: then the client needs to [select the right search tree](#tree-search-trees)
39+
2. `E` is a `dcat:Dataset`: then the client needs to [select the right distribution or dataservice from a catalog](#dcat-dataset)
40+
3. `E` is a `ldes:EventStream`: then the client MAY take into account [LDES specific properties](#ldes)
41+
4. `E` is a `dcat:Distribution`: then the client needs to [process it accordingly](#dcat-distribution)
42+
5. `E` is a `dcat:DataService`: then the client needs to [process it accordingly](#dcat-dataservice)
43+
6. `E` is a catalog or is not explicitly mentioned: then the client needs to select a dataset based on [shape information](#tree-collection-shapes) and [DCAT Catalog information](#dcat-catalog)
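
The case distinction above can be sketched as a small dispatch function. This is a minimal illustration, not part of the specification: the function name, the pattern labels, the tuple representation of quads, and the rule that the first matching type in list order wins are all assumptions made for the example.

```python
# Hypothetical helper: names and labels are illustrative, not defined by the spec.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
TREE = "https://w3id.org/tree#"
LDES = "https://w3id.org/ldes#"
DCAT = "http://www.w3.org/ns/dcat#"

# Ordered as in the list above; "first match wins" is an assumption.
PATTERNS = [
    (TREE + "Collection", "select-search-tree"),
    (DCAT + "Dataset", "select-distribution-or-dataservice"),
    (LDES + "EventStream", "ldes-properties"),
    (DCAT + "Distribution", "process-distribution"),
    (DCAT + "DataService", "process-dataservice"),
]


def discovery_pattern(entrypoint, quads):
    """Return a label for the discovery pattern that applies to `entrypoint`.

    `quads` is an iterable of (subject, predicate, object, graph) tuples
    holding IRI strings, as obtained by dereferencing the entrypoint.
    """
    types = {o for s, p, o, _g in quads if s == entrypoint and p == RDF_TYPE}
    for type_iri, label in PATTERNS:
        if type_iri in types:
            return label
    # Pattern 6: E is a catalog, or E is not explicitly mentioned.
    return "select-dataset-from-catalog"
```

A client would call this only after the init step from the main specification returned no result, passing the URL before redirects as `entrypoint`.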
44+
45+
## Selecting a collection via shapes ## {#tree-collection-shapes}
46+
47+
When multiple collections are found by a client, it can choose to prune the collections based on the `tree:shape` property.
48+
The `tree:shape` property will refer to a first `sh:NodeShape`.
49+
The collection MAY be pruned in case there is no overlap with the properties the client needs.
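
As an illustration only, the overlap test could be sketched as follows. The specification does not (yet) define the precise algorithm, so everything here is an assumption: the function names, the plain-tuple triple representation, the restriction to `sh:path` values that are simple IRIs, and the choice to keep a collection that has no shape information at all.

```python
SH = "http://www.w3.org/ns/shacl#"
TREE = "https://w3id.org/tree#"


def shape_paths(collection, triples):
    """Collect the sh:path IRIs of the property shapes linked via tree:shape.

    Assumption: only simple IRI paths are considered; SHACL property
    path expressions (sequences, alternatives, ...) are ignored.
    """
    shapes = {o for s, p, o in triples
              if s == collection and p == TREE + "shape"}
    property_shapes = {o for s, p, o in triples
                       if s in shapes and p == SH + "property"}
    return {o for s, p, o in triples
            if s in property_shapes and p == SH + "path"}


def keep_collection(collection, triples, needed_properties):
    """Return False only when the shape shows no overlap with the needed properties."""
    paths = shape_paths(collection, triples)
    if not paths:
        # Assumption: missing shape information is not a reason to prune.
        return True
    return bool(paths & set(needed_properties))
```

Keeping collections without a shape is a conservative choice: pruning is optional (MAY), so a client loses nothing but time by inspecting them.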
50+
51+
Issue: Will we document the precise algorithm to use? Should we extend shapes with cardinality approximations as well?
52+
53+
## Selecting a collection via a catalog ## {#dcat-catalog}
54+
55+
A DCAT Catalog is an overview of datasets, data services and distributions.
56+
As TREE clients first need to select a dataset, and then a search tree to use, it aligns with how DCAT-AP works.
57+
DCAT discovery builds upon the previous section in which a collection or dataset can be selected based on the `tree:shape` property.
58+
59+
For now, we will assume the DCAT information is available in subject pages.
60+
61+
Issue: Do we need more text on how to handle different types of DCAT interfaces?
62+
63+
The dataset descriptions can be used for filtering the datasets available in a catalog to a list of datasets that can be useful for the client.
64+
Such properties may include the spatial extent, the time extent, or how it is possibly a part of another `dcat:Dataset`.
65+
66+
Issue: How precise do we need to be in this specification?
67+
68+
When the `dcat:Dataset` is a `tree:Collection`, the DCAT catalog is going to contain a `dct:type` property with `https://w3id.org/tree#Collection` or `https://w3id.org/ldes#EventStream` as the object.
69+
70+
## Choosing from multiple SearchTrees with TREE ## {#tree-search-trees}
71+
72+
Issue: This is yet to be done
73+
74+
## Selecting a search tree via a DCAT dataset ## {#dcat-dataset}
75+
76+
There are two ways in which you can find a search tree from a dataset: via the distributions and via the data services. Both need to be tested.
77+
Selecting a distribution or data service when multiple are available needs to be done based on [the search tree description](#tree-search-trees).
78+
If nothing is available, all need to be tested by processing them as exemplified in the next subsections.
79+
80+
### Selecting a search tree via DCAT Distribution ### {#dcat-distribution}
81+
82+
If `E dcat:distribution ?D . ?D dcat:downloadURL ?N .` matches, then `?N` is a root node of `E`.
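
A minimal sketch of this pattern match, assuming the dereferenced data is available as plain `(subject, predicate, object)` tuples of IRI strings. The function name is hypothetical and not part of the specification.

```python
DCAT = "http://www.w3.org/ns/dcat#"


def root_nodes_via_distribution(entrypoint, triples):
    """Return every ?N matching: E dcat:distribution ?D . ?D dcat:downloadURL ?N ."""
    distributions = {o for s, p, o in triples
                     if s == entrypoint and p == DCAT + "distribution"}
    nodes = {o for s, p, o in triples
             if s in distributions and p == DCAT + "downloadURL"}
    # Sorted only to make the result deterministic for callers.
    return sorted(nodes)
```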
83+
84+
Issue: This is yet to be done
85+
86+
### Selecting a search tree from a DCAT data service ### {#dcat-dataservice}
87+
88+
* If `?DS dcat:servesDataset E ; dcat:endpointURL ?U` or `E dcat:endpointURL ?U` matches, then the client MUST repeat the algorithm with `?U` as the entrypoint.
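
The two patterns can be sketched as a lookup that yields the next entrypoints to try. This is an illustrative sketch under assumptions: triples are plain `(subject, predicate, object)` tuples of IRI strings, and the function name is hypothetical.

```python
DCAT = "http://www.w3.org/ns/dcat#"


def next_entrypoints(entrypoint, triples):
    """Return every ?U with which the discovery algorithm should be repeated.

    Covers both patterns:
      ?DS dcat:servesDataset E ; dcat:endpointURL ?U
      E dcat:endpointURL ?U
    """
    services = {s for s, p, o in triples
                if p == DCAT + "servesDataset" and o == entrypoint}
    subjects = services | {entrypoint}
    endpoints = {o for s, p, o in triples
                 if s in subjects and p == DCAT + "endpointURL"}
    # Sorted only to make the result deterministic for callers.
    return sorted(endpoints)
```

Since each returned URL restarts discovery, a client would likely want to remember which entrypoints it has already visited to avoid looping; that bookkeeping is left out here.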
89+
90+
Issue: This is yet to be done
91+
92+
## Linked Data Event Streams ## {#ldes}
93+
94+
In case the client is not made for query answering, but only for setting up a replication and synchronization system, then there is a special type that can be used to indicate the search tree is made for this purpose: the `ldes:EventSource`.
95+
Clients that want to prioritize taking a _full_ copy MAY give full priority to this server hint.
96+
97+
<div class="example">
98+
```turtle
99+
E a ldes:EventSource ;
100+
tree:rootNode|dcat:downloadURL </node1> .
101+
```
102+
</div>
103+
104+
# Extracting context information # {#context}
105+
106+
Issue: This is yet to be done
107+
108+
Context information enables a client to understand who the creator of a certain dataset is, when it was last changed, what other datasets it was derived from, etc.
109+
110+
## DCAT and dcterms ## {#context-dcat}
111+
112+
Issue: This is yet to be done
113+
114+
## Provenance ## {#context-prov}
115+
116+
Issue: This is yet to be done
117+
118+
## Linked Data Event Streams ## {#context-ldes}
119+
120+
Issue: This is yet to be done
121+
122+
LDES (https://w3id.org/ldes/specification) is a way to evolve search trees in a consistent way. It defines every member as immutable, and a collection as append-only.
123+
Therefore, one can make sure to only process each member once.
124+
Extra terms are added, such as the concept of an EventStream, retention policies and a timestampPath.

TREE-overview.svg

Lines changed: 1 addition & 0 deletions
File renamed without changes.
Lines changed: 0 additions & 4 deletions
@@ -1,9 +1,5 @@
11
# Compatibility # {#compatibility}
22

3-
## DCAT ## {#dcat}
4-
5-
[[!VOCAB-DCAT-2]] is the standard for Open Data Portals by W3C. In order to find TREE compliant datasets in data portals, there SHOULD be a <code>dcat:endpointDescription</code> from the <code>dcat:DataService</code> to the entrypoint where the <code>tree:Collection</code>s and the <code>tree:ViewDescription</code>s are listed. Furthermore, there SHOULD be a <code>dct:conformsTo</code> this URI: <code>https://w3id.org/tree/specification</code>.
6-
73
## Hydra ## {#hydra}
84

95
A <code>tree:Collection</code> is compatible with the [Hydra Collections specification](https://www.hydra-cg.com/spec/latest/core/#collections). However, instead of <code>hydra:view</code>, we use <code>tree:view</code> and do not link to a <code>hydra:PartialCollectionView</code> but to a <code>tree:Node</code>.
File renamed without changes.
