Commit 78e9bda

committed
Update README.md
1 parent 47200d2 commit 78e9bda

File tree

10 files changed

+335
-52
lines changed

README.md

Lines changed: 185 additions & 42 deletions
Original file line numberDiff line numberDiff line change
@@ -5,26 +5,28 @@ Work with RDF-related concepts, datasets, and files in Go.
55
* Decode TriG, N-Quads, XML, JSON-LD, HTML, and other RDF-based sources.
66
* Reference common IRI constants generated from vocabularies.
77
* Canonicalize datasets with RDFC-1.0.
8-
* Track data lineage for RDF properties within source files.
9-
* Build higher-level abstractions based on RDF primitives.
108

119
## Usage
1210

13-
Import the module and refer to the code's documentation ([pkg.go.dev](https://pkg.go.dev/github.com/dpb587/rdfkit-go)).
11+
Refer to the code's documentation ([pkg.go.dev](https://pkg.go.dev/github.com/dpb587/rdfkit-go)). Below are a few packages to get started with...
1412

1513
```go
14+
// rdf contains primitives for statements and value types
1615
import "github.com/dpb587/rdfkit-go/rdf"
17-
```
1816

19-
Some sample use cases and starter snippets can be found in the [`examples` directory](examples).
17+
// encoding subpackages are well-known formats with encoders/decoders
18+
import "github.com/dpb587/rdfkit-go/encoding/nquads"
2019

21-
<details><summary><code>examples$ go run <strong>./rdf-to-dot -i https://www.w3.org/2000/01/rdf-schema.ttl</strong> | dot -Tsvg</code></summary>
20+
// rdfio offers simplified access to files and encodings
21+
import "github.com/dpb587/rdfkit-go/rdfio"
2222

23-
![rdf-schema](dev/artifacts/readme-rdf-ontology.svg)
23+
// rdfcanon implements the dataset canonicalization algorithm
24+
import "github.com/dpb587/rdfkit-go/rdfcanon"
25+
```
2426

25-
</details>
27+
The [`examples` submodules](examples) demonstrate some common use cases and starter snippets.
2628

27-
<details><summary><code>examples$ go run <strong>./html-extract https://microsoft.com</strong></code></summary>
29+
<details><summary><code>html-extract$ <strong>go run . https://microsoft.com</strong></code></summary>
2830

2931
```turtle
3032
@base <https://www.microsoft.com/en-us/> .
@@ -56,6 +58,155 @@ _:b0
5658

5759
</details>
5860

61+
The [`cmd/rdfkit` submodule](cmd/rdfkit) offers command line access to some common tasks. Refer to its subpackages to learn more about their implementations.
62+
63+
<details><summary><code>rdfkit$ <strong>go run . --help</strong></code></summary>
64+
65+
```
66+
Usage:
67+
rdfkit [command]
68+
69+
Available Commands:
70+
canonicalize Convert a dataset into canonical blank nodes and ordering
71+
completion Generate the autocompletion script for the specified shell
72+
export-dot Generate a Graphviz DOT visualization from an ontology
73+
export-go-iri Generate a Go file of IRI constants from an ontology
74+
help Help about any command
75+
pipe Decode and re-encode using supported encoding formats
76+
77+
Flags:
78+
-h, --help help for rdfkit
79+
80+
Use "rdfkit [command] --help" for more information about a command.
81+
```
82+
83+
</details>
84+
85+
<details><summary><code>rdfkit$ <strong>go run . pipe --help</strong></code></summary>
86+
87+
```
88+
Decode and re-encode using supported encoding formats
89+
90+
Usage:
91+
rdfkit pipe [flags]
92+
93+
Flags:
94+
-h, --help help for pipe
95+
-i, --in string path or IRI for reading (default stdin)
96+
--in-base string override the base IRI of the resource
97+
--in-param stringArray extra decode configuration parameters (syntax "KEY[=VALUE]")
98+
--in-param-io stringArray extra read configuration parameters (syntax "KEY[=VALUE]")
99+
--in-type string name or alias for the decoder (default detect)
100+
-o, --out string path or IRI for writing (default stdout)
101+
--out-base string override the base IRI of the resource
102+
--out-param stringArray extra encode configuration parameters (syntax "KEY[=VALUE]")
103+
--out-param-io stringArray extra write configuration parameters (syntax "KEY[=VALUE]")
104+
--out-type string name or alias for the encoder (default detect or nquads)
105+
106+
Encodings:
107+
108+
org.json-ld.document (decode)
109+
110+
Aliases: jsonld
111+
File Extensions: .jsonld
112+
Media Types: application/ld+json
113+
114+
--in-param captureTextOffsets[=bool]
115+
Capture the line+column offsets for statement properties
116+
117+
--in-param tokenizer.lax[=bool]
118+
Accept and recover common syntax errors
119+
120+
org.w3.n-quads (decode, encode)
121+
122+
Aliases: n-quads, nq, nquads
123+
File Extensions: .nq
124+
Media Types: application/n-quads
125+
126+
--in-param captureTextOffsets[=bool]
127+
Capture the line+column offsets for statement properties
128+
129+
--out-param ascii[=bool]
130+
Use escape sequences for non-ASCII characters
131+
132+
org.w3.n-triples (decode, encode)
133+
134+
Aliases: n-triples, nt, ntriples
135+
File Extensions: .nt
136+
Media Types: application/n-triples
137+
138+
--in-param captureTextOffsets[=bool]
139+
Capture the line+column offsets for statement properties
140+
141+
--out-param ascii[=bool]
142+
Use escape sequences for non-ASCII characters
143+
144+
org.w3.rdf-json (decode, encode)
145+
146+
Aliases: rdf-json, rdfjson, rj
147+
File Extensions: .rj
148+
Media Types: application/rdf+json
149+
150+
--in-param captureTextOffsets[=bool]
151+
Capture the line+column offsets for statement properties
152+
153+
org.w3.rdf-xml (decode)
154+
155+
Aliases: rdf-xml, rdfxml, xml
156+
File Extensions: .rdf
157+
Media Types: application/rdf+xml
158+
159+
--in-param captureTextOffsets[=bool]
160+
Capture the line+column offsets for statement properties
161+
162+
org.w3.trig (decode)
163+
164+
Aliases: trig
165+
File Extensions: .trig
166+
Media Types: application/trig
167+
168+
--in-param captureTextOffsets[=bool]
169+
Capture the line+column offsets for statement properties
170+
171+
org.w3.turtle (decode, encode)
172+
173+
Aliases: ttl, turtle
174+
File Extensions: .ttl
175+
Media Types: text/turtle
176+
177+
--in-param captureTextOffsets[=bool]
178+
Capture the line+column offsets for statement properties
179+
180+
--out-param buffered[=bool]
181+
Load all statements into memory before writing any output
182+
183+
--out-param iris.useBase[=bool]
184+
Prefer IRIs relative to the resource IRI
185+
186+
--out-param iris.usePrefix=string...
187+
Prefer IRIs using a prefix. Use the syntax of "{prefix}:{iri}", "rdfa-context", or "none"
188+
189+
--out-param resources[=bool]
190+
Write nested statements and resource descriptions (implies buffered=true)
191+
192+
public.html (decode)
193+
194+
Aliases: htm, html, xhtml
195+
File Extensions: .htm, .html, .xhtml
196+
Media Types: application/xhtml+xml, text/html, text/xhtml+xml
197+
198+
--in-param captureTextOffsets[=bool]
199+
Capture the line+column offsets for statement properties
200+
```
201+
202+
</details>
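
The `KEY[=VALUE]` syntax accepted by the `--in-param` and `--out-param` flags can be sketched as follows. This is a minimal, hypothetical parser and not the one used by `cmd/rdfkit`: the `parseParam` name is made up for illustration, and treating a bare `KEY` as `true` is an assumption based on the `[=bool]` hints in the help output.

```go
package main

import (
	"fmt"
	"strings"
)

// parseParam splits a "KEY[=VALUE]" argument into its key and value.
// Hypothetical helper: a missing "=VALUE" is assumed to mean "true".
func parseParam(arg string) (string, string) {
	// Only the first "=" separates the key, so values may contain "=" or ":".
	key, value, found := strings.Cut(arg, "=")
	if !found {
		return key, "true"
	}
	return key, value
}

func main() {
	args := []string{
		"captureTextOffsets",
		"ascii=false",
		"iris.usePrefix=ex:http://example.com/",
	}
	for _, arg := range args {
		key, value := parseParam(arg)
		fmt.Printf("%s -> %q\n", key, value)
	}
}
```

Splitting on only the first `=` keeps values such as `{prefix}:{iri}` intact.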
203+
204+
<details><summary><code>rdfkit$ <strong>go run . export-dot -i https://www.w3.org/2000/01/rdf-schema.ttl</strong> | dot -Tsvg</code></summary>
205+
206+
![rdf-schema](dev/artifacts/readme-rdf-ontology.svg)
207+
208+
</details>
209+
59210
## Primitives
60211

61212
Based on the [Resource Description Framework](https://www.w3.org/TR/rdf11-concepts/) (RDF), there are three primitive value types, aka *terms*, that are used to represent data: *IRIs*, *literals*, and *blank nodes*. The primitive value types are the basis of *triples* and other assertions about information.
@@ -105,9 +256,10 @@ A *blank node* represents an anonymous resource and are always created with a un
105256

106257
```go
107258
bnode := rdf.NewBlankNode()
259+
bnode.Identifier != rdf.NewBlankNode().Identifier
108260
```
109261

110-
The [`blanknodeutil` package](rdf/blanknodeutil) provides additional support for using string-based identifiers (e.g. `b0`), mapping blank nodes from implementations, and scoped factories.
262+
The [`blanknodes` package](rdf/blanknodes) provides additional support for using string-based identifiers (e.g. `b0`) and other utilities.
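
To illustrate what mapping string-based identifiers to blank nodes can look like, here is a self-contained sketch. The `BlankNode`, `NewBlankNode`, and `StringMapper` definitions below are hypothetical stand-ins written for this example; they are not the types or signatures of the `rdf` or `blanknodes` packages.

```go
package main

import "fmt"

// BlankNode is a hypothetical stand-in for a blank node with a unique identifier.
type BlankNode struct{ Identifier uint64 }

var nextIdentifier uint64

// NewBlankNode returns a blank node that differs from every earlier one.
// (Not safe for concurrent use; this is only a sketch.)
func NewBlankNode() BlankNode {
	nextIdentifier++
	return BlankNode{Identifier: nextIdentifier}
}

// StringMapper returns the same blank node whenever a label such as "b0" repeats.
type StringMapper struct{ byLabel map[string]BlankNode }

func NewStringMapper() *StringMapper {
	return &StringMapper{byLabel: map[string]BlankNode{}}
}

// Get looks up the blank node for a label, creating one on first use.
func (m *StringMapper) Get(label string) BlankNode {
	if b, ok := m.byLabel[label]; ok {
		return b
	}
	b := NewBlankNode()
	m.byLabel[label] = b
	return b
}

func main() {
	m := NewStringMapper()
	fmt.Println(m.Get("b0") == m.Get("b0")) // same label, same node
	fmt.Println(m.Get("b0") == m.Get("b1")) // different labels, different nodes
}
```

Scoping the mapper to a single document keeps a label like `b0` in one file from colliding with `b0` in another.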
111263

112264
### Triple
113265

@@ -121,19 +273,11 @@ nameTriple := rdf.Triple{
121273
}
122274
```
123275

124-
The fields of a triple are restricted to the normative value types they support, described by the table below.
125-
126-
| Field | IRI | Literal | Blank Node |
127-
| ----- |:---:|:-------:|:----------:|
128-
| Subject | Valid | Invalid | Valid |
129-
| Predicate | Valid | Invalid | Invalid |
130-
| Object | Valid | Valid | Valid |
131-
132-
The `rdf` package includes other supporting types (e.g. `TripleList`, `TripleIterator`, and `TripleMatcher`), and the [`triples` package](rdf/triples) offers additional interfaces and utilities for working with triple types.
276+
The `rdf` package includes other supporting types (e.g. `TripleList`, `TripleIterator`, and `TripleMatcher`), and the [`triples` package](rdf/triples) offers additional interfaces and utilities for working with triples.
133277

134278
### Quad
135279

136-
A *quad* is used to describe a triple with an optional graph name. A graph name may be an IRI, Blank Node, or `nil` which indicates the default graph.
280+
A *quad* is used to describe a triple with an optional graph name.
137281

138282
```go
139283
nameQuad := rdf.Quad{
@@ -142,7 +286,18 @@ nameQuad := rdf.Quad{
142286
}
143287
```
144288

145-
Similar to triples, the `rdf` and [`quads` package](rdf/quads) offers additional interfaces and utilities.
289+
The `rdf` and [`quads` packages](rdf/quads) offer additional interfaces and utilities for working with quads.
290+
291+
### Property Values
292+
293+
The fields of triples and quads are restricted (with interfaces) to the normative value types they support, described by the table below. [Generalized RDF](https://www.w3.org/TR/rdf11-concepts/#section-generalized-rdf) values are not currently supported.
294+
295+
| Field | IRI | Literal | Blank Node | `nil` |
296+
| ----- |:---:|:-------:|:----------:|:-----:|
297+
| Subject | Valid | Invalid | Valid | Invalid |
298+
| Predicate | Valid | Invalid | Invalid | Invalid |
299+
| Object | Valid | Valid | Valid | Invalid |
300+
| GraphName | Valid | Invalid | Valid | Valid |
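
One way to restrict fields with interfaces, as the table describes, is to give each field its own interface with an unexported marker method. The sketch below uses hypothetical types and method names (`SubjectValue`, `isSubject`, and so on) chosen for illustration; the actual `rdf` package may be organized differently.

```go
package main

import "fmt"

// Hypothetical term types; the real rdf package defines its own.
type IRI string
type Literal struct{ LexicalForm string }
type BlankNode struct{ ID int }

// One interface per field; only types with the marker method can be assigned.
type SubjectValue interface{ isSubject() }
type PredicateValue interface{ isPredicate() }
type ObjectValue interface{ isObject() }
type GraphNameValue interface{ isGraphName() }

func (IRI) isSubject()         {}
func (IRI) isPredicate()       {}
func (IRI) isObject()          {}
func (IRI) isGraphName()       {}
func (BlankNode) isSubject()   {}
func (BlankNode) isObject()    {}
func (BlankNode) isGraphName() {}
func (Literal) isObject()      {}

type Triple struct {
	Subject   SubjectValue
	Predicate PredicateValue
	Object    ObjectValue
}

type Quad struct {
	Triple    Triple
	GraphName GraphNameValue // nil is taken here to mean the default graph
}

func main() {
	q := Quad{
		Triple: Triple{
			Subject:   BlankNode{ID: 0},
			Predicate: IRI("http://schema.org/name"),
			Object:    Literal{LexicalForm: "Example"},
		},
	}
	fmt.Println(q.GraphName == nil)
	// Triple{Subject: Literal{}} would fail to compile: Literal lacks isSubject.
}
```

With this pattern the "Invalid" cells in the table become compile-time errors rather than runtime checks, although a `nil` subject, predicate, or object would still need a runtime check.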
146301

147302
## Graphs
148303

@@ -172,29 +327,25 @@ The [`inmemory` experimental package](x/storage/inmemory) offers a dataset imple
172327
storage := inmemory.NewDataset()
173328
```
174329

175-
Better-supported storage or alternative, remote service clients will likely be a focus on the future.
330+
Better-supported storage or alternative, remote service clients will likely be a focus in the future.
176331

177332
## Encodings
178333

179334
An *encoding* (or *file format*) is used to decode and encode RDF data. The following encodings are available under the [`encoding` package](encoding).
180335

181336
| Package | Decode | Encode |
182337
|:------- |:------:|:------:|
183-
| [`htmljsonld`](encoding/htmljsonld) | Quad | n/a |
184-
| [`htmlmicrodata`](encoding/htmlmicrodata) | Triple | n/a |
338+
| [`htmljsonld`](encoding/htmljsonld) | Quad | - |
339+
| [`htmlmicrodata`](encoding/htmlmicrodata) | Triple | - |
185340
| [`jsonld`](encoding/jsonld) | Quad | Quad |
186341
| [`nquads`](encoding/nquads) | Quad | Quad |
187342
| [`ntriples`](encoding/ntriples) | Triple | Triple |
188-
| [`rdfa`](encoding/rdfa) | Triple | n/a |
343+
| [`rdfa`](encoding/rdfa) | Triple | - |
189344
| [`rdfjson`](encoding/rdfjson) | Triple | Triple |
190-
| [`rdfxml`](encoding/rdfxml) | Triple | n/a |
191-
| [`trig`](encoding/trig) | Quad | n/a |
345+
| [`rdfxml`](encoding/rdfxml) | Triple | - |
346+
| [`trig`](encoding/trig) | Quad | - |
192347
| [`turtle`](encoding/turtle) | Triple | Triple, Description |
193348

194-
Some encodings do not yet support all syntactic features defined by their official specification, though they should cover common practices. Most are tested against some sort of test suite (such as the ones published by W3C), and the latest results can be found in their `testsuites/*/testresults` directory.
195-
196-
Broader support for encoders will likely be added in the future.
197-
198349
### Decoder
199350

200351
Encodings provide a `NewDecoder` function that requires an `io.Reader` and accepts optional `DecoderConfig` options. It can be used as an iterator for all statements found in the encoding. Depending on the capabilities of the encoding format, the decoder fulfills either the `encoding.TripleDecoder` or `encoding.QuadDecoder` interface.
@@ -238,7 +389,7 @@ for decoder.Next() {
238389

239390
When working with offsets, consider the following caveats.
240391

241-
* Capturing and processing text offsets comes with a slight impact to performance and memory.
392+
* Capturing and processing text offsets impacts performance and memory usage.
242393
* Offsets for some properties may not always be available due to decoding limitations.
243394
* Offsets for some properties may be "incomplete" due to stream processing. For example, `turtle` may only refer to the opening `[` token of an anonymous resource when the closing `]` token has not yet been read.
244395

@@ -330,7 +481,7 @@ Once canonicalized, the encoded N-Quads form can be directly written to an `io.W
330481
_, err := canonicalized.WriteTo(os.Stdout)
331482
```
332483

333-
Alternatively, use `NewIterator` to manually iterate over the results containing its encoded form. If the `BuildCanonicalQuad` option was enabled, use `NewQuadIterator` for a standard `rdf.QuadIterator` of quads including the canonicalized blank nodes.
484+
Alternatively, use `NewIterator` to manually iterate over the results containing its encoded form. If the `BuildCanonicalQuad` option was enabled, use `NewQuadIterator` for a standard `rdf.QuadIterator` of quads with the canonical blank nodes.
334485

335486
```go
336487
canonicalized, err := rdfcanon.Canonicalize(quadIterator, rdfcanon.CanonicalizeConfig{}.
@@ -423,17 +574,9 @@ ok && rIRI == "http://example.com/resource"
423574

424575
The [`rdfacontext` package](rdf/iriutil/rdfacontext/) provides a list of prefix mappings defined by the W3C at [RDFa Core Initial Context](https://www.w3.org/2011/rdfa-context/rdfa-1.1). This includes prefixes such as `owl:`, `rdfa:`, and `xsd:`. The list of widely-used prefixes, such as `dc:` and `schema:`, is included as well.
425576

426-
## Command Line
427-
428-
The `cmd/rdfkit` package offers a command line interface with a few utilities. Most notably:
429-
430-
* `irigen` - generate Go constants from an RDF vocabulary. Used internally for most of the `*iri` packages.
431-
* `pipe` - decode local files or remote URLs, and then re-encode using any of the supported RDF formats.
432-
433577
## Notes
434578

435-
* **RDF 1.2** (i.e. RDF-star) - not currently supported; waiting for more stability in the draft specification and definitions.
436-
* **Generalized RDF** - not currently supported; may be introduced in the future as a breaking change or via build tag.
579+
* **RDF 1.2** (i.e. RDF-star) - not currently supported; likely to add primitive type support soon, encodings later.
437580
* This is a periodically updated fork based on private usage. There may still be some breaking changes before starting to version this module.
438581

439582
## License

cmd/rdfkit/canonicalizecmd/command.go

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,9 @@ func New(app *cmdutil.App) *cobra.Command {
1515
fIn := &cmdflags.EncodingInput{}
1616

1717
cmd := &cobra.Command{
18-
Use: "canonicalize",
18+
Use: "canonicalize",
19+
Short: "Convert a dataset into canonical blank nodes and ordering",
20+
Args: cobra.ExactArgs(0),
1921
RunE: func(cmd *cobra.Command, args []string) error {
2022
ctx := cmd.Context()
2123

cmd/rdfkit/exportdotcmd/command.go

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -24,8 +24,9 @@ func New(app *cmdutil.App) *cobra.Command {
2424
fOut := &cmdflags.EncodingOutput{}
2525

2626
cmd := &cobra.Command{
27-
Use: "export-dot",
28-
Args: cobra.ExactArgs(0),
27+
Use: "export-dot",
28+
Short: "Generate a Graphviz DOT visualization from an ontology",
29+
Args: cobra.ExactArgs(0),
2930
RunE: func(cmd *cobra.Command, args []string) error {
3031
ctx := cmd.Context()
3132

cmd/rdfkit/exportgoiricmd/command.go

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -26,8 +26,9 @@ func New(app *cmdutil.App) *cobra.Command {
2626
fIn := &cmdflags.EncodingInput{}
2727

2828
cmd := &cobra.Command{
29-
Use: "export-go-iri",
30-
Args: cobra.ExactArgs(0),
29+
Use: "export-go-iri",
30+
Short: "Generate a Go file of IRI constants from an ontology",
31+
Args: cobra.ExactArgs(0),
3132
RunE: func(cmd *cobra.Command, args []string) error {
3233
ctx := cmd.Context()
3334

cmd/rdfkit/pipecmd/command.go

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,9 @@ func New(app *cmdutil.App) *cobra.Command {
1919
fOut := &cmdflags.EncodingOutput{}
2020

2121
cmd := &cobra.Command{
22-
Use: "pipe",
22+
Use: "pipe",
23+
Short: "Decode and re-encode using supported encoding formats",
24+
Args: cobra.ExactArgs(0),
2325
RunE: func(cmd *cobra.Command, args []string) error {
2426
ctx := cmd.Context()
2527

dev/artifacts/readme-rdf-ontology.dot

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ digraph {
77
<tr><td align="left" colspan="2">The RDF Schema vocabulary (RDFS)</td></tr>
88
<tr><td align="right">Datatype</td><td align="left" href="http://www.w3.org/2001/XMLSchema#string">xsd:string</td></tr>
99
</table>>,shape=plain];
10-
r0 -> lit0 [href="http://purl.org/dc/elements/1.1/title",label="dc:title"]
10+
r0 -> lit0 [href="http://purl.org/dc/elements/1.1/title",label="dc11:title"]
1111
r2 [fillcolor="lavender",href="http://www.w3.org/2000/01/rdf-schema#Resource",label="rdfs:Resource",shape=box,style="filled,rounded,setlinewidth(2)"]
1212
r3 [fillcolor="lavender",href="http://www.w3.org/2000/01/rdf-schema#Class",label="rdfs:Class",shape=box,style="filled,rounded,setlinewidth(2)"]
1313
r2 -> r3 [href="http://www.w3.org/1999/02/22-rdf-syntax-ns#type",label="rdf:type"]

dev/artifacts/readme-rdf-ontology.svg

Lines changed: 3 additions & 3 deletions
