feat: proposed Dataset API changes #3060
@@ -0,0 +1,155 @@
Incorporate the changes proposed by Martynas, with the exception of graphs(), which would now return a dictionary mapping graph names (URIRef or BNode) to Graph objects (as the graph's identifier would be removed).

```
add add_named_graph(uri: IdentifiedNode, graph: Graph) method
add has_named_graph(uri: IdentifiedNode) method
add remove_named_graph(uri: IdentifiedNode) method
add replace_named_graph(uri: IdentifiedNode, graph: Graph) method
add graphs() method as an alias for contexts()
add default_graph property as an alias for default_context
add get_named_graph as an alias for get_graph
deprecate graph(graph) method
deprecate remove_graph(graph) method
deprecate contexts() method
Using IdentifiedNode as a super-interface for URIRef and BNode (since both are allowed as graph names in RDF 1.1).
```
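
To make the intent of the named-graph methods above concrete, here is a minimal sketch using a toy, pure-Python stand-in. `ToyDataset` is hypothetical and is not rdflib code: graph names are plain strings, graphs are sets of string tuples, and the choice to have `add_named_graph` merge into an existing graph is an assumption that the proposal does not settle.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


class ToyDataset:
    """Hypothetical stand-in for the proposed Dataset (not rdflib code)."""

    def __init__(self) -> None:
        self.default_graph: Set[Triple] = set()
        self._named: Dict[str, Set[Triple]] = {}

    def add_named_graph(self, uri: str, graph: Set[Triple]) -> None:
        # Assumption: adding to an existing name merges the triples.
        self._named.setdefault(uri, set()).update(graph)

    def has_named_graph(self, uri: str) -> bool:
        return uri in self._named

    def remove_named_graph(self, uri: str) -> None:
        # Removing an unknown name is a no-op in this sketch.
        self._named.pop(uri, None)

    def replace_named_graph(self, uri: str, graph: Set[Triple]) -> None:
        # Discard whatever was stored under this name and store a copy.
        self._named[uri] = set(graph)

    def graphs(self) -> Dict[str, Set[Triple]]:
        # A dictionary of graph names to graphs; the default graph is excluded.
        return dict(self._named)
```

In this sketch a graph carries no name of its own; the name lives only in the dataset's dictionary, which mirrors the idea of removing the graph's identifier.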

Make the following enhancements to the triples, quads, and subject/predicate/object APIs.

Major changes:
P1. Remove the `default_union` attribute and make the Dataset inclusive.
P2. Remove the Default Graph URI ("urn:x-rdflib:default").
P3. Remove the Graph class's "identifier" attribute to align with the W3C spec, impacting Dataset methods which use the Graph class.
P4. Make the graphs() method of Dataset return a dictionary mapping graph names to Graph objects.
Enhancements:
P5. Support passing iterables of Terms to triples, quads, and related methods, similar to the triples_choices method.
P6. Default the pattern of the triples method to `(None, None, None)`, so that calling it with no arguments iterates over everything.
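
As a rough illustration of P5 and P6, the sketch below matches triples against a pattern whose slots may each be `None` (wildcard), a single term, or an iterable of terms. It is a toy over plain strings, not rdflib's implementation; the names `triples` and `_normalise` and the `Slot` type are assumptions made for this example.

```python
from typing import FrozenSet, Iterable, Iterator, Optional, Tuple, Union

Term = str
Slot = Union[None, Term, Iterable[Term]]
Triple = Tuple[Term, Term, Term]


def _normalise(slot: Slot) -> Optional[FrozenSet[Term]]:
    """Turn a slot into a set of accepted terms, or None for a wildcard."""
    if slot is None:
        return None
    if isinstance(slot, str):
        return frozenset([slot])
    return frozenset(slot)


def triples(
    data: Iterable[Triple],
    pattern: Tuple[Slot, Slot, Slot] = (None, None, None),
) -> Iterator[Triple]:
    """Yield the triples in data whose terms are accepted by every slot."""
    wanted = [_normalise(slot) for slot in pattern]
    for triple in data:
        if all(w is None or term in w for w, term in zip(wanted, triple)):
            yield triple
```

Called with no pattern, `triples(data)` yields everything; passing a list in one slot behaves like a per-slot "any of these" filter.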

With all of the above changes, including those proposed by Martynas, here are some examples:

```python
from rdflib import Dataset, Graph, URIRef, Literal
from rdflib.namespace import RDFS

# ============================================
# Adding Data to the Dataset
# ============================================

# Initialize the dataset
d = Dataset()

# Add a single triple to the Default Graph, and a single triple to a Named Graph
g1 = Graph()
g1.add(
    (
        URIRef("http://example.com/subject-a"),
        URIRef("http://example.com/predicate-a"),
        Literal("Triple A"),
    )
)
d.add_graph(g1)

# Add a Graph to a Named Graph in the Dataset.
g2 = Graph()
g2.add(
    (
        URIRef("http://example.com/subject-b"),
        URIRef("http://example.com/predicate-b"),
        Literal("Triple B"),
    )
)
d.add_named_graph(uri=URIRef("http://example.com/graph-B"), graph=g2)

# ============================================
# Iterate over the entire Dataset returning triples
# ============================================

for triple in d.triples():
    print(triple)
# Output:
# (rdflib.term.URIRef('http://example.com/subject-a'), rdflib.term.URIRef('http://example.com/predicate-a'), rdflib.term.Literal('Triple A'))
# (rdflib.term.URIRef('http://example.com/subject-b'), rdflib.term.URIRef('http://example.com/predicate-b'), rdflib.term.Literal('Triple B'))

# ============================================
# Iterate over the entire Dataset returning quads
# ============================================

for quad in d.quads():
    print(quad)
# Output:
# (rdflib.term.URIRef('http://example.com/subject-a'), rdflib.term.URIRef('http://example.com/predicate-a'), rdflib.term.Literal('Triple A'), None)
# (rdflib.term.URIRef('http://example.com/subject-b'), rdflib.term.URIRef('http://example.com/predicate-b'), rdflib.term.Literal('Triple B'), rdflib.term.URIRef('http://example.com/graph-B'))

# ============================================
# Get the Default graph
# ============================================

dg = d.default_graph  # same as current default_context

# ============================================
# Iterate on triples in the Default Graph only
# ============================================

for triple in d.triples(graph="default"):
    # Review comment: I question the usefulness of this. Why not simply:
    # d.default_graph.triples() ?
    # Review comment: Providing "default_graph" as a convenience necessarily means
    # there will be more than one way to iterate over the triples. There's no
    # functional change from the current classes here, just name changes: you can
    # already call Dataset.triples(context=) and you can also call
    # Dataset.default_context.triples().
    print(triple)
# Output:
# (rdflib.term.URIRef('http://example.com/subject-a'), rdflib.term.URIRef('http://example.com/predicate-a'), rdflib.term.Literal('Triple A'))

# ============================================
# Access quads in Named Graphs only
# ============================================

for quad in d.quads(graph="named"):
    # Review comment: Shouldn't this be equivalent to simply
    # Review comment: Or is the graph element of the default graph
    # Review comment: Yes, the proposal is to have the "graph" of triples in the
    # default graph set to None.
    print(quad)
# Output:
# (rdflib.term.URIRef('http://example.com/subject-b'), rdflib.term.URIRef('http://example.com/predicate-b'), rdflib.term.Literal('Triple B'), rdflib.term.URIRef('http://example.com/graph-B'))

# ============================================
# Equivalent to the above, iterating over graphs()
# ============================================

for ng_name, ng_object in d.graphs().items():
    for quad in d.quads(graph=ng_name):
        print(quad)
# Output:
# (rdflib.term.URIRef('http://example.com/subject-b'), rdflib.term.URIRef('http://example.com/predicate-b'), rdflib.term.Literal('Triple B'), rdflib.term.URIRef('http://example.com/graph-B'))

# ============================================
# Access triples in the Default Graph and specified Named Graphs.
# ============================================

for triple in d.triples(graph=["default", URIRef("http://example.com/graph-B")]):
    # Review comment: I'm comfortable with it - SPARQL queries in triplestores where
    # named graphs are used frequently omit the graph, only having basic graph
    # patterns, and we understand this to be across all graphs?
    # Review comment: Union graph is an extension feature though, not a feature of
    # an RDF dataset.
    print(triple)
# Output:
# (rdflib.term.URIRef('http://example.com/subject-a'), rdflib.term.URIRef('http://example.com/predicate-a'), rdflib.term.Literal('Triple A'))
# (rdflib.term.URIRef('http://example.com/subject-b'), rdflib.term.URIRef('http://example.com/predicate-b'), rdflib.term.Literal('Triple B'))

# ============================================
# Access quads in the Default Graph and specified Named Graphs.
# ============================================

for quad in d.quads(graph=["default", URIRef("http://example.com/graph-B")]):
    # Review comment:
    # for quad in (q for q in d.quads() if q[3] in (None, URIRef("http://example.com/graph-B"))):
    # not much longer really.
    # Review comment: Yes I think this is the point to get a broader consensus on.
    # The way I see it, if including the graph parameter:
    # Cons:
    # Review comment: Maybe I'm too used to Jena where
    print(quad)
# Output:
# (rdflib.term.URIRef('http://example.com/subject-a'), rdflib.term.URIRef('http://example.com/predicate-a'), rdflib.term.Literal('Triple A'), None)
# (rdflib.term.URIRef('http://example.com/subject-b'), rdflib.term.URIRef('http://example.com/predicate-b'), rdflib.term.Literal('Triple B'), rdflib.term.URIRef('http://example.com/graph-B'))

# ============================================
# "Slice" the dataset on specified predicates. Same can be done on subjects, objects, graphs
# ============================================

filter_preds = [URIRef("http://example.com/predicate-a"), RDFS.label]
for quad in d.quads((None, filter_preds, None, None)):
    print(quad)
# Output:
# (rdflib.term.URIRef('http://example.com/subject-a'), rdflib.term.URIRef('http://example.com/predicate-a'), rdflib.term.Literal('Triple A'), None)

# ============================================
# Serialize the Dataset in a quads format.
# ============================================

print(d.serialize(format="nquads"))
# Output:
# <http://example.com/subject-a> <http://example.com/predicate-a> "Triple A" .
# <http://example.com/subject-b> <http://example.com/predicate-b> "Triple B" <http://example.com/graph-B> .
```
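
The `graph=` selector used in the examples above can be sketched in pure Python as follows. This is a toy model under stated assumptions, not rdflib code: the standalone `quads` function, the string graph names, and the treatment of the strings `"default"` and `"named"` as reserved selectors are all hypothetical, chosen to reproduce the behaviour the examples show (default-graph quads carry `None` as their graph element).

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set, Tuple, Union

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, Optional[str]]


def quads(
    default_graph: Set[Triple],
    named_graphs: Dict[str, Set[Triple]],
    graph: Union[None, str, Iterable[str]] = None,
) -> Iterator[Quad]:
    """Yield quads from the selected graphs; None selects every graph."""
    if graph is None:
        selectors: List[str] = ["default", "named"]
    elif isinstance(graph, str):
        selectors = [graph]
    else:
        selectors = list(graph)

    # Expand the selectors into concrete graph names (None = default graph).
    names: List[Optional[str]] = []
    for selector in selectors:
        if selector == "default":
            names.append(None)
        elif selector == "named":
            names.extend(sorted(named_graphs))
        else:
            names.append(selector)

    seen = set()
    for name in names:
        if name in seen:
            continue  # do not emit a graph twice if it was selected twice
        seen.add(name)
        source = default_graph if name is None else named_graphs.get(name, set())
        for s, p, o in sorted(source):
            yield (s, p, o, name)
```

One consequence of this design is that a graph literally named `"default"` or `"named"` could not be selected by name, which is part of the trade-off raised in the review comments.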
@@ -1,3 +1,3 @@
# Fixing this here as readthedocs can't use the compiled requirements-poetry.txt
# due to conflicts.
poetry==1.8.4
poetry==1.8.5
@@ -1,6 +1,4 @@
# This file is used for building a docker image of the latest rdflib release. It
# will be updated by dependabot when new releases are made.
rdflib==7.1.0
rdflib==7.1.3
html5rdf==1.2.0
# html5lib-modern is required to allow the Dockerfile to build on with pre-RDFLib-7.1.1 releases.
html5lib-modern==1.2.0
@@ -1,16 +1,14 @@
#
# This file is autogenerated by pip-compile with Python 3.12
# This file is autogenerated by pip-compile with Python 3.8
# by the following command:
#
#    pip-compile docker/latest/requirements.in
#
html5rdf==1.2
    # via
    #   -r docker/latest/requirements.in
    #   rdflib
html5lib-modern==1.2
    # via -r docker/latest/requirements.in
isodate==0.7.2
    # via rdflib
pyparsing==3.0.9
    # via rdflib
rdflib==7.1.0
rdflib==7.1.3
    # via -r docker/latest/requirements.in