Commit 0e8eb18

committed
content assembled for publication!
1 parent 0caacc2 commit 0e8eb18

28 files changed: 4093 additions, 0 deletions

.github/workflows/ci.yml

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install package and dev dependencies
        run: pip install -e ".[dev]"

      - name: Run tests
        run: python -m pytest tests/ -v --tb=short

.github/workflows/docs.yml

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
name: docs

on:
  push:
    branches: [main]

permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: pip install -e ".[docs]"

      - name: Deploy docs
        run: mkdocs gh-deploy --force

.github/workflows/publish.yml

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
name: Publish to PyPI

on:
  push:
    tags:
      - "v*.*.*"

jobs:
  publish:
    runs-on: ubuntu-latest
    environment: pypi
    permissions:
      id-token: write  # required for trusted publishing

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install build tools
        run: pip install hatchling build

      - name: Build distribution
        run: python -m build

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1

.gitignore

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
__pycache__/
*.py[cod]
*.egg-info/
dist/
build/
.venv/
venv/
.pytest_cache/
.mypy_cache/
.ruff_cache/
*.egg
MANIFEST
htmlcov/
.coverage

ARCHITECTURE.md

Lines changed: 166 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,166 @@
1+
# Architecture: knowledgecomplex
2+
3+
## The 2×2 Responsibility Map
4+
5+
This is the central architectural constraint. Every rule in the system belongs to exactly one cell. The Python package exists to hide this table from the user.
6+
7+
| | **OWL** | **SHACL** |
8+
|---|---|---|
9+
| **Topological** | `kc:Element` base class; `kc:Vertex`, `kc:Edge`, `kc:Face` as subclasses. `kc:Edge` has exactly 2 `boundedBy` (Vertex); `kc:Face` has exactly 3 `boundedBy` (Edge). `kc:Complex` as collection of elements via `kc:hasElement`. | Boundary vertices are distinct; boundary edges of a face form a closed triangle; boundary-closure of a complex (all instance-level; require `sh:sparql`) |
10+
| **Ontological** | Concrete subclasses and their allowed attributes; property domain/range declarations | Controlled vocabulary enforcement (e.g. `status ∈ {passing, failing, pending}`); attribute presence rules; co-occurrence constraints |
11+
12+
### Why Both OWL and SHACL at Each Layer
13+
14+
**Topological layer:** OWL cardinality axioms express structural counts at the class level (reasoning over schema). SHACL is required for the closed-triangle constraint because OWL cannot express a constraint that references the *co-values* of three different property assertions on the same individual; this is a known expressivity boundary of OWL-DL. The `sh:sparql` constraint in `kc_core_shapes.ttl` is the explicit test of this boundary.
15+
16+
**Ontological layer:** OWL defines what attributes a concrete type *has* (property declarations, domain, range, subclass hierarchy). SHACL defines what values those attributes *must have* at the instance level (vocabulary constraints, cardinality on the concrete shape, required/optional). OWL cannot enforce controlled vocabulary on data properties at the instance level without enumerating individuals, which is inappropriate for string-valued attributes.
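
The closed-triangle rule described for the topological layer can be illustrated outside RDF. The following is a minimal pure-Python sketch, not the `sh:sparql` constraint shipped in `kc_core_shapes.ttl`. The function name `is_closed_triangle` and the representation of an edge as a pair of vertex ids are assumptions made for illustration.

```python
from collections import Counter


def is_closed_triangle(edges):
    """Return True if three edges (pairs of vertex ids) form a closed triangle.

    Hypothetical helper for illustration; not part of knowledgecomplex.
    """
    if len(edges) != 3:
        return False
    # Each edge must have two distinct endpoints.
    if any(len(set(edge)) != 2 for edge in edges):
        return False
    # The three edges must be pairwise distinct as unordered pairs.
    if len({frozenset(edge) for edge in edges}) != 3:
        return False
    # Closed triangle: exactly three vertices, each shared by exactly two edges.
    counts = Counter(vertex for edge in edges for vertex in edge)
    return len(counts) == 3 and all(n == 2 for n in counts.values())
```

The check has to compare the endpoints of all three boundary edges at once, which is the kind of co-value comparison the paragraph above assigns to SHACL rather than OWL.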
17+
18+
---
19+
20+
## Component Layers
21+
22+
```
┌─────────────────────────────────────────────────────┐
│  Application / Demo (user code)                     │
│  build_my_instance()  |  domain-specific queries    │
│  Concrete elements: vertices, edges, faces          │
├─────────────────────────────────────────────────────┤
│  Domain Model (user code)                           │
│  build_my_schema()  |  domain SPARQL templates      │
│  MyVertex, MyEdge, MyFace type definitions          │
├─────────────────────────────────────────────────────┤
│  knowledgecomplex Python Package                    │
│  SchemaBuilder DSL   |  KnowledgeComplex I/O        │
│  (OWL + SHACL emit)  |  (rdflib graph + SPARQL)     │
├──────────────────────┬──────────────────────────────┤
│  kc_core.ttl         │  kc_core_shapes.ttl          │
│  (abstract OWL)      │  (abstract SHACL)            │
├──────────────────────┴──────────────────────────────┤
│  rdflib  |  pyshacl  |  owlrl                       │
└─────────────────────────────────────────────────────┘
```
42+
43+
The **domain model** layer sits between the core framework and the application. It defines domain-specific types and queries using the core's `SchemaBuilder` DSL. The application layer then instantiates that model with concrete data.
44+
45+
The static resources (`kc_core.ttl`, `kc_core_shapes.ttl`) are loaded once at `SchemaBuilder.__init__`. Model schema and shapes are merged into the same rdflib `Graph` objects at runtime.
46+
47+
---
48+
49+
## Abstraction Boundary: Core vs. Domain Models
50+
51+
The layers above are separated by a key abstraction boundary: the **core framework** (`knowledgecomplex/`) vs. **domain models** (user code). Everything inside the package boundary (core, static resources, libraries) is framework-owned and invariant. Everything outside (model definitions, instances) is user-authored.
52+
53+
### Core Framework (`knowledgecomplex/`, prefixes `kc:` and `kcs:`)
54+
55+
- **Topological rule enforcement.** The Element/Vertex/Edge/Face hierarchy, cardinality axioms, distinctness, closed-triangle, and boundary-closure constraints. Static OWL and SHACL shipped with the package. Users cannot modify them.
56+
- **Superstructure attributes.** `kc:uri` (optional, at-most-one) allows any element to reference a source file. Enforced by `kcs:ElementShape`.
57+
- **Ontological rule authoring.** `SchemaBuilder` provides the DSL for declaring types, attributes, and vocabularies. It *generates* OWL classes and SHACL shapes on behalf of the domain model but does not itself define any domain types.
58+
- **Instance management.** `KnowledgeComplex` loads the merged schema, manages the RDF graph, validates on every write, and executes named SPARQL queries.
59+
- **Framework queries.** Generic SPARQL templates (`vertices`, `coboundary`) that work for any domain model.
60+
61+
### Domain Models (user code, prefix `{namespace}:`)
62+
63+
- **Ontological rule enforcement.** The concrete OWL types and SHACL shapes generated by calling `SchemaBuilder.add_*_type()`.
64+
- **Concrete complex authoring.** Instance data constructed via `KnowledgeComplex.add_*()` calls.
65+
- **Domain queries.** Model-specific SPARQL templates.
66+
67+
### The Type Inheritance Chain Crosses the Boundary
68+
69+
```
kc:Element → kc:Vertex → aaa:spec → (instance "spec-001")
   core        core       model          application
```
73+
74+
The core owns `Element → Vertex`; the model owns `Vertex → spec`; the application owns the instance `spec-001`. The boundary is at the subclass declaration — `add_vertex_type("spec")` is the model calling the core's authoring API to extend the core's type hierarchy.
75+
76+
### Layer Ownership of the 2×2 Map
77+
78+
| | **OWL** | **SHACL** |
79+
|---|---|---|
80+
| **Topological** | Core owns (static `kc_core.ttl`) | Core owns (static `kc_core_shapes.ttl`) |
81+
| **Ontological** | Domain model authors via `SchemaBuilder` → core generates | Domain model authors via `vocab()`/attributes → core generates |
82+
83+
Both ontological cells are *authored* by the domain model but *generated and managed* by the core. The domain model never touches OWL or SHACL directly.
84+
85+
---
86+
87+
## Key Design Decisions
88+
89+
### DD1: Attributes over Subclasses (for Simple Domains)
90+
91+
For simple domain models, a single concrete type with a controlled-vocabulary attribute (e.g. `verification` with `status ∈ {passing, failing, pending}`) is preferred over two subclasses (`PassingVerification`, `FailingVerification`). The framework supports both patterns.
92+
93+
**Rationale:** The single-type-with-attribute pattern makes the data more inspectable before schema-level concerns are promoted. `promote_to_attribute()` supports the transition path from untyped patterns to typed attributes.
94+
95+
### DD2: SPARQL Templates, Not Free Queries
96+
97+
All SPARQL is encapsulated as named template files. `KnowledgeComplex.query()` accepts only registered template names.
98+
99+
**Rationale:** Maintains API opacity, prevents arbitrary SPARQL from bypassing validation invariants, and makes the query surface explicit and testable.
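
The named-template restriction in DD2 can be pictured as a lookup table that refuses anything it has not registered. The sketch below is an assumption about the shape of such a registry, not the package's implementation: `TemplateRegistry` is a hypothetical class, the local `UnknownQueryError` only stands in for the exception listed in `knowledgecomplex/exceptions.py`, and the SPARQL text is a placeholder.

```python
class UnknownQueryError(KeyError):
    """Stand-in for the package's exception of the same name."""


class TemplateRegistry:
    """Hypothetical registry mapping template names to SPARQL text."""

    def __init__(self):
        self._templates = {}

    def register(self, name, sparql):
        self._templates[name] = sparql

    def get(self, name):
        # Only registered names resolve; raw SPARQL strings are rejected
        # because they are not keys in the registry.
        try:
            return self._templates[name]
        except KeyError:
            raise UnknownQueryError(name) from None


registry = TemplateRegistry()
# Placeholder query text; the real vertices.sparql is not shown in this commit view.
registry.register("vertices", "SELECT ?v ?type WHERE { ?v a ?type }")
```

Because callers pass a name rather than query text, the set of queries that can run is exactly the set of registered templates, which is what makes the query surface testable.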
100+
101+
### DD3: Validation on Write
102+
103+
`add_vertex()`, `add_edge()`, and `add_face()` each trigger SHACL validation immediately. If validation fails, the call rolls back every triple it added and raises `ValidationError`.
104+
105+
**Rationale:** Fail fast; keep the graph in a valid state at all times. Verification is not a batch post-processing step — it is enforced at assertion time.
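
A minimal sketch of the validate-on-write pattern in DD3, assuming a plain set of triples in place of an rdflib graph and a Python callable in place of SHACL validation. `TinyGraph` and `edge_has_two_vertices` are hypothetical names, and the local `ValidationError` only stands in for the package's exception.

```python
class ValidationError(Exception):
    """Stand-in for the package's exception of the same name."""


class TinyGraph:
    """Hypothetical triple store that validates on every write."""

    def __init__(self, validator):
        self.triples = set()
        self._validator = validator  # callable: set of triples -> list of messages

    def add(self, new_triples):
        # Track only the triples this call introduces, so rollback
        # never removes data that was already present.
        added = set(new_triples) - self.triples
        self.triples |= added
        errors = self._validator(self.triples)
        if errors:
            self.triples -= added  # roll back before raising
            raise ValidationError("; ".join(errors))


def edge_has_two_vertices(triples):
    """Toy rule: every Edge must have exactly two boundedBy values."""
    errors = []
    edges = {s for s, p, o in triples if p == "type" and o == "Edge"}
    for edge in sorted(edges):
        boundary = {o for s, p, o in triples if s == edge and p == "boundedBy"}
        if len(boundary) != 2:
            errors.append(f"{edge} has {len(boundary)} boundary vertices, expected 2")
    return errors
```

A failed `add` leaves `triples` exactly as it was before the call, so the graph is valid after every operation whether or not the operation succeeded.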
106+
107+
### DD4: Static Core Resources
108+
109+
`kc_core.ttl` and `kc_core_shapes.ttl` are static files shipped with the package, not generated at runtime.
110+
111+
**Rationale:** The topological rules are framework invariants, not user-configurable. Separating them from user schema makes the 2×2 boundary visible in the file system.
112+
113+
### DD5: `dump_owl()` and `dump_shacl()` Merge Core and User Schema
114+
115+
Both dump methods return the full merged graph (core + user-defined), serialized as Turtle.
116+
117+
**Rationale:** The merged graph is what `pyshacl` and `owlrl` operate on. Showing the full graph makes the system inspectable and demonstrates that user types genuinely extend (not replace) the core ontology.
118+
119+
### DD6: Shared-Domain Removal (`_set_owl_domain`)
120+
121+
When the same property name appears on multiple types, the OWL `rdfs:domain` assertion is removed (leaving no domain) rather than adding multiple domain values. SHACL shapes still enforce per-type constraints correctly via each type's `NodeShape`.
122+
123+
**Rationale:** Multiple `rdfs:domain` values trigger RDFS inference to classify any individual with that property as a member of *all* domain types — violating the type hierarchy. Removing domain resolves the conflict; SHACL handles the per-type enforcement.
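
The inference problem behind DD6 can be reproduced with a few lines of plain Python. This sketch applies only the RDFS domain rule (if `p rdfs:domain C` and `s p o`, then `s rdf:type C`); `infer_types` is a hypothetical helper, and the dictionaries stand in for graph data.

```python
def infer_types(domains, assertions):
    """Apply the RDFS domain rule to (subject, property, value) assertions.

    domains maps a property name to the set of classes declared as its
    rdfs:domain. Every declared domain class is inferred for the subject.
    """
    inferred = {}
    for subject, prop, _value in assertions:
        for cls in domains.get(prop, ()):
            inferred.setdefault(subject, set()).add(cls)
    return inferred


assertions = [("spec-001", "title", "Spec for Verification")]

# `title` declared on both types: the vertex is inferred to be both.
shared_domain = infer_types({"title": {"spec", "guidance"}}, assertions)

# Domain removed, as DD6 describes: nothing is inferred from `title`.
no_domain = infer_types({}, assertions)
```

With the domain removed, the type of `spec-001` comes only from its explicit type assertion, and the per-type attribute rules are left to each type's SHACL shape.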
124+
125+
---
126+
127+
## Known OWL Expressivity Limits (Design Seams)
128+
129+
| Constraint | OWL can express? | Resolution |
130+
|---|---|---|
131+
| Edge has exactly 2 boundary vertices | Yes (cardinality on `boundedBy`) | OWL cardinality axiom |
132+
| Face has exactly 3 boundary edges | Yes (cardinality on `boundedBy`) | OWL cardinality axiom |
133+
| Boundary vertices are distinct individuals | No (OWL has no unique-name assumption, so two names may denote the same individual unless `owl:differentFrom` is asserted for the pair) | SHACL `sh:sparql` (COUNT DISTINCT) |
134+
| Boundary edges of a face form a closed triangle | No (requires co-reference across 3 property values) | SHACL `sh:sparql` constraint |
135+
| Boundary-closure of a complex | No (requires co-reference across `hasElement` and `boundedBy` on different individuals) | SHACL `sh:sparql` constraint |
136+
| Controlled vocabulary on data property | No (without `owl:oneOf` on individuals, impractical for strings) | SHACL `sh:in` |
137+
| At-most-one `kc:uri` per element | Not enforced practically (open-world) | SHACL `sh:maxCount 1` in `ElementShape` |
138+
139+
These seams are documented as comments in the relevant `.ttl` files.
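
As an illustration of the boundary-closure row in the table above, the sketch below assumes the usual reading of the rule: if a complex contains an element, it must also contain everything that element is bounded by. The function name and the dictionary representation are assumptions; the shipped rule is a `sh:sparql` constraint that is not reproduced here.

```python
def boundary_closure_violations(members, bounded_by):
    """Return (element, missing_boundary) pairs for one complex.

    members: set of element ids the complex has via hasElement.
    bounded_by: dict mapping an element id to the set of ids bounding it.
    Hypothetical helper for illustration; not part of knowledgecomplex.
    """
    violations = []
    for element in sorted(members):
        for boundary in sorted(bounded_by.get(element, ())):
            if boundary not in members:
                violations.append((element, boundary))
    return violations
```

The check joins membership of the complex with the boundary of each member, so it spans several individuals at once, which matches the table's explanation of why OWL alone does not cover it.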
140+
141+
---
142+
143+
## Namespace Conventions
144+
145+
```turtle
@prefix kc:   <https://example.org/kc#> .         # core framework
@prefix kcs:  <https://example.org/kc/shape#> .   # core shapes
@prefix aaa:  <https://example.org/aaa#> .        # user namespace (example)
@prefix aaas: <https://example.org/aaa/shape#> .  # user shapes (example)
```
151+
152+
User namespaces are set via `SchemaBuilder(namespace="aaa")`. The URI base `https://example.org/` is a placeholder for local development; a real deployment would use a dereferenceable IRI.
153+
154+
---
155+
156+
## File Inventory
157+
158+
| File | Layer | Purpose |
159+
|---|---|---|
160+
| `knowledgecomplex/resources/kc_core.ttl` | Abstract OWL | Topological backbone: classes, properties, cardinality axioms, `kc:uri` |
161+
| `knowledgecomplex/resources/kc_core_shapes.ttl` | Abstract SHACL | Topological constraints: distinctness, closed-triangle, boundary-closure, `kc:uri` at-most-one |
162+
| `knowledgecomplex/schema.py` | Python API — schema authoring | `SchemaBuilder` DSL: `add_*_type`, `dump_owl`, `dump_shacl`, `export`, `load` |
163+
| `knowledgecomplex/graph.py` | Python API — instance I/O | `KnowledgeComplex`: `add_vertex`, `add_edge`, `add_face`, `query`, `dump_graph`, `export`, `load` |
164+
| `knowledgecomplex/exceptions.py` | Public exceptions | `ValidationError`, `SchemaError`, `UnknownQueryError` |
165+
| `knowledgecomplex/queries/vertices.sparql` | Framework SPARQL | Return all vertices and their types |
166+
| `knowledgecomplex/queries/coboundary.sparql` | Framework SPARQL | Inverse boundary operator |

README.md

Lines changed: 94 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,94 @@
1+
# knowledgecomplex
2+
3+
A Python library for defining and instantiating **typed simplicial complexes** backed by OWL, SHACL, and SPARQL.
4+
5+
## What it is
6+
7+
A **knowledge complex** is a simplicial complex (vertices, edges, faces) where each element has a type governed by a formal ontology. The library provides:
8+
9+
- **`SchemaBuilder`** — a DSL for declaring vertex/edge/face types, attributes, and vocabularies. Generates OWL and SHACL automatically.
10+
- **`KnowledgeComplex`** — an instance manager that adds elements, validates them against SHACL on every write, and executes named SPARQL queries.
11+
- **Core OWL + SHACL** — a static topological backbone: the `Element → Vertex/Edge/Face` hierarchy, boundary-cardinality axioms, and closed-triangle/boundary-closure constraints.
12+
13+
All semantic web machinery (rdflib, pyshacl, owlrl) stays internal. The public API is pure Python.
14+
15+
## Install
16+
17+
```bash
pip install knowledgecomplex
```
20+
21+
Or from source:
22+
23+
```bash
git clone https://github.com/BlockScience/knowledgecomplex.git
cd knowledgecomplex
pip install -e ".[dev]"
```
28+
29+
## Quick start
30+
31+
```python
from knowledgecomplex import SchemaBuilder, KnowledgeComplex, vocab, text

# 1. Define a schema
sb = SchemaBuilder(namespace="aaa")
sb.add_vertex_type("spec", attributes={"title": text(), "domain": text()})
sb.add_vertex_type("guidance", attributes={"title": text(), "domain": text()})
sb.add_edge_type("verification",
                 attributes={"status": vocab("passing", "failing", "pending")})
sb.add_face_type("assurance")

# 2. Build an instance
kc = KnowledgeComplex(schema=sb)
kc.add_vertex("spec-001", type="spec", uri="file:///docs/spec-001.md",
              title="Spec for Verification", domain="aaa")
kc.add_vertex("guidance-001", type="guidance", uri="file:///docs/guidance-001.md",
              title="Guidance for Verification", domain="aaa")
kc.add_edge("ver-001", type="verification",
            vertices={"spec-001", "guidance-001"}, status="passing")

# 3. Query
df = kc.query("vertices")  # built-in SPARQL template
print(df)

# 4. Inspect the RDF
print(kc.dump_graph())  # Turtle string
```
58+
59+
## The `kc:uri` attribute
60+
61+
Every element (vertex, edge, or face) can carry an optional `kc:uri` property pointing to its source file:
62+
63+
```python
kc.add_vertex("doc-001", type="spec", uri="file:///path/to/doc-001.md")
kc.add_edge("ver-001", type="verification", vertices={"doc-001", "doc-002"},
            uri="file:///edges/ver-001.md", status="passing")
```
68+
69+
SHACL enforces at-most-one `kc:uri` per element. This is particularly useful for domain applications such as assurances-audits-accountability (AAA), where each element corresponds to an actual document file.
70+
71+
## Architecture
72+
73+
The library is organised around a 2×2 responsibility map. Every rule belongs to exactly one cell:
74+
75+
| | **OWL** | **SHACL** |
76+
|-----------------|-----------------------------------------------------------|-------------------------------------------------------------------|
77+
| **Topological** | `kc:Element`, `kc:Vertex`, `kc:Edge`, `kc:Face` hierarchy; cardinality axioms on `kc:boundedBy`; `kc:Complex` via `kc:hasElement` | Boundary vertices are distinct; boundary edges form a closed triangle; boundary-closure of a complex (all require `sh:sparql`) |
78+
| **Ontological** | Concrete subclasses and their properties; domain/range declarations | Controlled vocabulary (`sh:in`); attribute presence rules; co-occurrence constraints |
79+
80+
### Why both OWL and SHACL at each layer
81+
82+
**Topological layer:** OWL cardinality axioms express structural counts at the schema level. SHACL is required for the closed-triangle constraint because OWL cannot express a constraint over the co-values of three property assertions on the same individual, a known expressivity boundary of OWL-DL.
83+
84+
**Ontological layer:** OWL defines what attributes a type *has* (property declarations, subclass hierarchy). SHACL defines what values those attributes *must have* at the instance level. OWL cannot enforce controlled vocabularies on string-valued data properties.
85+
86+
See [ARCHITECTURE.md](ARCHITECTURE.md) for the full design rationale.
87+
88+
## Domain model example
89+
90+
This package is used by [mtg-kc](https://github.com/BlockScience/mtg-kc) as a demonstration application, and by [assurances-audits-accountability](https://github.com/BlockScience/assurances-audits-accountability) as a domain-specific knowledge complex for typed document assurance.
91+
92+
## License
93+
94+
Apache 2.0 — see [LICENSE](LICENSE).

docs/api/exceptions.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
# Exceptions

::: knowledgecomplex.exceptions

docs/api/graph.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
# Graph

::: knowledgecomplex.graph

docs/api/schema.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
# Schema

::: knowledgecomplex.schema
