Skip to content

Commit 9d0983f

Browse files
committed
add flow diagrams to orchestrator section
1 parent e73094a commit 9d0983f

File tree

4 files changed

+173
-16
lines changed

4 files changed

+173
-16
lines changed

index.bs

Lines changed: 107 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -9,15 +9,33 @@ Repository: https://github.com/your-org/rdf-connect
99
Abstract: Some abstract
1010
</pre>
1111

12+
<link rel=stylesheet href="https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.min.css">
13+
1214
# Introduction
1315

1416
RDF Connect is a modular framework for building and executing multilingual data processing pipelines using RDF as the configuration and orchestration layer.
1517

1618
It enables fine-grained, reusable processor components that exchange streaming data, allowing workflows to be described declaratively across programming languages and environments.
1719
RDF Connect is especially well suited for data transformation, integration, and linked data publication.
1820

21+
# Usage Paths
22+
23+
<div class=note>
24+
This section is complete in terms of content, but may be reorganized or rewritten for clarity during editorial review.
25+
</div>
26+
27+
Depending on your use case, you may only need a subset of this specification:
28+
29+
* **Pipeline Authors**: Start with [[#getting-started]] and focus on [[#pipeline]].
30+
* **Processor Developers**: Read [[#processor]] and [[#runner]].
31+
* **Platform Maintainers**: Read all sections, including implementation notes.
32+
1933
# Design Goals
2034

35+
<div class=note>
36+
This section is complete in terms of content, but may be reorganized or rewritten for clarity during editorial review.
37+
</div>
38+
2139
* **Language Agnosticism**: Integrate processor components written in different languages (e.g., JavaScript, Python, shell).
2240
* **Streaming by Default**: Pass data between processors, via an orchestrator, to support large-scale and real-time data.
2341
* **Semantic Configuration**: Use RDF to define processors, pipelines, inputs, and outputs.
@@ -43,11 +61,38 @@ Processors can be written in any language and executed with a runner.
4361

4462
A runner is an execution strategy for processors — for example, a processor in Javascript is a class that is started using a `NodeRunner`.
4563

64+
## Orchestrator
65+
66+
The orchestrator is the core component responsible for executing a pipeline.
67+
It reads the configuration, initializes runners, dispatches processor instantiations, and coordinates data flow between them.
68+
It acts as the runtime conductor that interprets RDF Connect’s declarative configuration.
69+
4670
## Reader / Writer
4771

4872
Readers and Writers are components that define how data is streamed into and out of a processor.
4973
These provide an idiomatic way to transport streaming data between processors.
5074

75+
76+
# Getting Started
77+
78+
<div class=issue>
79+
🚧 This section is a work in progress and will be expanded soon.
80+
</div>
81+
82+
This section provides a high-level overview of how to define and run a pipeline in RDF Connect. The rest of the specification provides detail on how each part works.
83+
84+
Here's a simple example:
85+
86+
```turtle
87+
ex:pipeline a rdfc:Pipeline ;
88+
rdfc:instantiates ex:myRunner ;
89+
rdfc:processor ex:myProcessor .
90+
```
91+
92+
Once a pipeline is fully described using RDF, it is handed off to the orchestrator.
93+
The orchestrator parses the configuration, resolves all runner and processor definitions, and initiates execution.
94+
95+
5196
# SHACL as Configuration Schema
5297

5398
RDF Connect uses SHACL [[shacl]] not only as a data validation mechanism but also as a schema language for defining the configuration interface of components such as processors and runners.
@@ -248,19 +293,63 @@ This results in the following JSON object:
248293
```
249294

250295

251-
# Getting Started
252296

253-
This section provides a high-level overview of how to define and run a pipeline in RDF Connect. The rest of the specification provides detail on how each part works.
297+
# RDF Connect by Layer
254298

255-
Here's a simple example:
256299

257-
```turtle
258-
ex:pipeline a rdfc:Pipeline ;
259-
rdfc:instantiates ex:myRunner ;
260-
rdfc:processor ex:myProcessor .
261-
```
300+
## Orchestrator
301+
302+
The orchestrator is the central runtime entity in RDF Connect.
303+
It reads the pipeline configuration, sets up the runners, initiates processors, and routes messages between them.
304+
It ensures the dataflow graph described by the pipeline is brought to life across isolated runtimes.
305+
The orchestrator acts as a coordinator, not an executor. Each runner is responsible for running the actual processor code, but the orchestrator ensures the pipeline as a whole behaves as intended.
306+
307+
Responsibilities:
308+
309+
* Parse the pipeline RDF.
310+
* Load SHACL shapes for processors and runners.
311+
* Validate and coerce configuration to structured JSON.
312+
* Instantiate runners.
313+
* Start processors.
314+
* Mediate messages (data and control).
315+
* Handle retries, streaming, and backpressure.
316+
317+
<div class=note>
318+
The remained of this section is intended for developers building custom runners or integrating RDF Connect into infrastructure.
319+
</div>
320+
321+
### Protobuf Messaging Protocol
322+
323+
Communication between the orchestrator and the runners happens using a strongly typed protocol defined in Protocol Buffers (protobuf).
324+
This enables language-independent and efficient communication across processes and machines.
325+
326+
### Startup Flow
327+
328+
The following diagram describes the startup sequence handled by the orchestrator. This includes validating pipeline structure, instantiating runners, and initializing processor instances.
329+
330+
<pre class=include>
331+
path: ./startup.mdd
332+
</pre>
333+
334+
### Message Handling Flow
335+
336+
Once processors are running, the orchestrator handles incoming messages and forwards them to the appropriate reader instances, based on their declared channels.
337+
338+
<pre class=include>
339+
path: ./message.mdd
340+
</pre>
341+
342+
343+
### Streaming Messages
344+
345+
For large messages or real-time input, RDF Connect supports a streaming model.
346+
Instead of sending entire payloads as a single message, the message can be broken into chunks and sends them over time.
347+
This is handled by the StreamChunk message type.
348+
349+
<pre class=include>
350+
path: ./streamMessage.mdd
351+
</pre>
262352

263-
# RDF Connect by Layer
264353

265354
## Runner
266355

@@ -289,20 +378,22 @@ Pipelines connect processors using runners to form a processing graph.
289378

290379
This is the main unit most users interact with when defining workflows.
291380

381+
292382
# Ontology Reference
293383

294384
The RDF Connect ontology provides the terms used in RDF pipeline definitions. See the full [RDF Connect Ontology](https://w3id.org/rdf-connect/ontology.ttl) for details.
295385

296-
# Usage Paths
297-
298-
Depending on your use case, you may only need a subset of this specification:
299386

300-
* **Pipeline Authors**: Start with [[#getting-started]] and focus on [[#pipeline]].
301-
* **Processor Developers**: Read [[#processor]] and [[#runner]].
302-
* **Platform Maintainers**: Read all sections, including implementation notes.
387+
# Putting It All Together: Example Flow and Use Case
303388

304-
## Putting It All Together: Example Flow and Use Case
389+
<div class=issue>
390+
🚧 This section is a work in progress and will be expanded soon.
391+
</div>
305392

306393

307394

308395

396+
<script type=module>
397+
import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
398+
mermaid.initialize({ startOnLoad: true });
399+
</script>

message.mdd

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
2+
<div class="mermaid">
3+
sequenceDiagram
4+
participant P1 as Processor 1
5+
participant R1 as Runner 1
6+
participant O as Orchestrator
7+
participant R2 as Runner 2
8+
participant P2 as Processor 2
9+
P1-->>R1: Msg to Channel A
10+
R1-->>O: Msg to Channel A
11+
Note over O: Channel A is connected<br>to processor of Runner 2
12+
O -->> R2: Msg to Channel A
13+
R2-->>P2: Msg to Channel A
14+
</div>

startup.mdd

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
2+
<div class="mermaid">
3+
sequenceDiagram
4+
autonumber
5+
participant O as Orchestrator
6+
participant R as Runner
7+
participant P as Processor
8+
Note over O: Discovered Runners<br>and processors
9+
loop For every runner
10+
Note over R: Runner is started with cli
11+
O-->>R: Startup with address and uri
12+
R-->>O: RPC.Identify: with uri
13+
end
14+
loop For every processor
15+
O-->>R: RPC.Processor: Start processor
16+
Note over P: Load module and class
17+
R-->>P: Load processor
18+
R-->>P: Start processor with args
19+
R-->>O: RPC.ProcessorInit: processor started
20+
end
21+
loop For every runner
22+
O-->>R: RPC.Start: Start
23+
loop For every Processor
24+
R-->>P: Start
25+
end
26+
end
27+
</div>

streamMessage.mdd

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
2+
<div class="mermaid">
3+
sequenceDiagram
4+
autonumber
5+
participant P1 as Processor 1
6+
participant R1 as Runner 1
7+
participant O as Orchestrator
8+
participant R2 as Runner 2
9+
participant P2 as Processor 2
10+
P1 -->> R1: Send streaming<br>message
11+
critical Start stream message
12+
R1 ->> O: rpc.sendStreamMessage<br>(bidirectional stream)
13+
O -->> R1: sends generated ID<br>of stream message
14+
R1 -->> O: announce StreamMessage<br>with ID over normal stream
15+
O -->> R2: announce StreamMessage<br>with ID over normal stream
16+
R2 ->> O: rpc.receiveMessage with Id<br>starts receiving stream
17+
R2 -->> P2: incoming stream message
18+
end
19+
loop Streams data
20+
P1 -->> R1: Data chunks
21+
R1 -->> O: Data chunks over stream
22+
O -->> R2: Data chunks over stream
23+
R2 -->>P2: Data chnuks
24+
end
25+
</div>

0 commit comments

Comments
 (0)