Commit 0fea456

Merge pull request #4 from BlockScience/dev
fix load + add another example
2 parents abaa779 + 6335492 commit 0fea456

24 files changed (+2942, -159 lines changed)

README.md

Lines changed: 18 additions & 1 deletion
@@ -68,7 +68,7 @@ print(df)
 print(kc.dump_graph()) # Turtle string
 ```
 
-See [`examples/`](examples/) for 10 runnable examples covering all features below.
+See [`examples/`](examples/) for 11 runnable examples covering all features below.
 
 ## Topological queries
 
@@ -132,6 +132,23 @@ chi = euler_characteristic(kc) # V - E + F
 pr = edge_pagerank(kc, "e1") # personalized edge PageRank vector
 ```
 
+## Local partitioning
+
+Find clusters via diffusion — spread probability from a seed and sweep to find natural bottlenecks:
+
+```python
+from knowledgecomplex.analysis import local_partition, edge_local_partition
+
+# Vertex clusters via PageRank or heat kernel diffusion
+cut = local_partition(kc, seed="alice", method="pagerank")
+cut.vertices # vertex IDs on the small side
+cut.conductance # lower = cleaner partition
+
+# Edge clusters via Hodge Laplacian diffusion
+edge_cut = edge_local_partition(kc, seed_edge="e1", method="hodge_pagerank")
+edge_cut.edges # relationship cluster around e1
+```
+
 ## Filtrations and time-varying complexes
 
 Filtrations model strictly growing subcomplexes. Diffs model arbitrary add/remove sequences:

docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ from knowledgecomplex.ontologies import operations, brand, research
 sb = brand.schema() # audience/theme with resonance, interplay, overlap
 ```
 
-See the [examples/](https://github.com/blockscience/knowledgecomplex/tree/main/examples) directory for 10 runnable examples.
+See the [examples/](https://github.com/blockscience/knowledgecomplex/tree/main/examples) directory for 11 runnable examples.
 
 ## API Reference

docs/tutorial.md

Lines changed: 341 additions & 0 deletions
@@ -0,0 +1,341 @@
# Tutorial

A progressive walkthrough of knowledgecomplex, from schema definition through algebraic topology.

## 1. Define a schema

A schema declares vertex, edge, and face types with attributes. The `SchemaBuilder` generates OWL and SHACL automatically.

```python
from knowledgecomplex import SchemaBuilder, vocab, text

sb = SchemaBuilder(namespace="vv")

# Vertex types (subclass of kc:Vertex)
sb.add_vertex_type("requirement", attributes={"title": text()})
sb.add_vertex_type("test_case", attributes={"title": text()})

# Edge type with controlled vocabulary (enforced via sh:in)
sb.add_edge_type("verifies", attributes={
    "status": vocab("passing", "failing", "pending"),
})

# Face type
sb.add_face_type("coverage")
```

### Attribute descriptors

| Descriptor | What it generates | Example |
|---|---|---|
| `text()` | `xsd:string`, required, single-valued | `title: text()` |
| `text(required=False)` | `xsd:string`, optional | `notes: text(required=False)` |
| `text(multiple=True)` | `xsd:string`, required, multi-valued | `tags: text(multiple=True)` |
| `vocab("a", "b")` | `sh:in ("a" "b")`, required, single-valued | `status: vocab("pass", "fail")` |

### Type inheritance and binding

Types can inherit from other user-defined types. Child types can bind inherited attributes to fixed values:

```python
sb.add_vertex_type("document", attributes={"title": text(), "category": text()})
sb.add_vertex_type("specification", parent="document",
                   attributes={"format": text()},
                   bind={"category": "structural"})
```

### Introspection

```python
sb.describe_type("specification")
# {'name': 'specification', 'kind': 'vertex', 'parent': 'document',
# 'own_attributes': {'format': text()},
# 'inherited_attributes': {'title': text(), 'category': text()},
# 'all_attributes': {'title': text(), 'category': text(), 'format': text()},
# 'bound': {'category': 'structural'}}

sb.type_names(kind="vertex") # ['document', 'specification']
```

## 2. Build a complex

A `KnowledgeComplex` manages instances. Every write triggers SHACL verification — the graph is always in a valid state.

```python
from knowledgecomplex import KnowledgeComplex

kc = KnowledgeComplex(schema=sb)

# Vertices have no boundary — always valid
kc.add_vertex("req-001", type="requirement", title="Boot time < 5s")
kc.add_vertex("tc-001", type="test_case", title="Boot smoke test")
kc.add_vertex("tc-002", type="test_case", title="Boot regression")

# Edges need their boundary vertices to already exist (slice rule)
kc.add_edge("ver-001", type="verifies",
            vertices={"req-001", "tc-001"}, status="passing")
kc.add_edge("ver-002", type="verifies",
            vertices={"req-001", "tc-002"}, status="pending")
kc.add_edge("ver-003", type="verifies",
            vertices={"tc-001", "tc-002"}, status="passing")

# Faces need 3 boundary edges forming a closed triangle
kc.add_face("cov-001", type="coverage",
            boundary=["ver-001", "ver-002", "ver-003"])
```

### What gets enforced

| Constraint | When | What happens |
|---|---|---|
| Type must be registered | Before RDF assertions | `ValidationError` |
| Boundary cardinality (2 for edges, 3 for faces) | Before SHACL | `ValueError` |
| Boundary elements must exist in complex (slice rule) | SHACL on write | `ValidationError` + rollback |
| Vocab values must be in allowed set | SHACL on write | `ValidationError` + rollback |
| Face boundary edges must form closed triangle | SHACL on write | `ValidationError` + rollback |
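
To make the closed-triangle constraint concrete, here is a minimal pure-Python sketch of the same check, written independently of knowledgecomplex. The function name `is_closed_triangle` and the representation of each edge as a pair of vertex IDs are assumptions made for illustration; the library itself enforces this rule through SHACL shapes, not through this code.

```python
from collections import Counter


def is_closed_triangle(edges):
    """Return True if three edges (pairs of vertex IDs) form a closed triangle.

    Hypothetical helper for illustration; not part of knowledgecomplex.
    """
    if len(edges) != 3:
        return False
    pairs = [frozenset(e) for e in edges]
    # Each edge must join two distinct vertices, and no edge may repeat.
    if any(len(p) != 2 for p in pairs) or len(set(pairs)) != 3:
        return False
    # A closed triangle spans exactly 3 vertices, each touched by exactly 2 edges.
    counts = Counter(v for p in pairs for v in p)
    return len(counts) == 3 and all(c == 2 for c in counts.values())


# The three "verifies" edges from the example above close up into a triangle.
triangle = [("req-001", "tc-001"), ("req-001", "tc-002"), ("tc-001", "tc-002")]
# An open fan shares one vertex but spans four vertices in total.
fan = [("a", "b"), ("a", "c"), ("a", "d")]
# A 4-vertex path never returns to its start.
path = [("a", "b"), ("b", "c"), ("c", "d")]
```

Under these assumptions, `is_closed_triangle(triangle)` holds while the fan and the path are rejected, matching the failure cases listed under Gotchas.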

### Element handles

```python
elem = kc.element("req-001")
elem.id # "req-001"
elem.type # "requirement"
elem.attrs # {"title": "Boot time < 5s"}

kc.element_ids(type="test_case") # ["tc-001", "tc-002"]
kc.elements(type="test_case") # [Element('tc-001', ...), Element('tc-002', ...)]
```

## 3. Topological queries

Every query returns `set[str]` for natural set algebra. All accept an optional `type=` filter.

```python
# Boundary operator ∂
kc.boundary("ver-001") # {'req-001', 'tc-001'} (edge → vertices)
kc.boundary("cov-001") # {'ver-001', 'ver-002', 'ver-003'} (face → edges)
kc.boundary("req-001") # set() (vertex → empty)

# Coboundary (inverse boundary)
kc.coboundary("req-001") # {'ver-001', 'ver-002'} (vertex → incident edges)

# Star: all simplices containing σ as a face
kc.star("req-001") # req-001 + incident edges + incident faces

# Closure: smallest subcomplex containing σ
kc.closure("cov-001") # cov-001 + 3 edges + 3 vertices

# Link: Cl(St(σ)) \ St(σ)
kc.link("req-001")

# Skeleton: elements up to dimension k
kc.skeleton(0) # vertices only
kc.skeleton(1) # vertices + edges

# Degree
kc.degree("req-001") # 2

# Subcomplex check
kc.is_subcomplex({"req-001", "tc-001", "ver-001"}) # True
kc.is_subcomplex({"ver-001"}) # False (missing vertices)

# Set algebra composes naturally
shared = kc.star("req-001") & kc.star("tc-001")
```
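
The operators above can be sketched on a plain dictionary to show how they relate to one another. This is an illustrative model only, not the library's implementation: the `BOUNDARY` table and the helper names are hypothetical and simply mirror the complex built in section 2. The link here uses the standard definition Cl(St(σ)) \ St(Cl(σ)), which reduces to the formula quoted above when σ is a vertex.

```python
# Toy complex: each element ID maps to its boundary (hypothetical data).
BOUNDARY = {
    "req-001": set(), "tc-001": set(), "tc-002": set(),
    "ver-001": {"req-001", "tc-001"},
    "ver-002": {"req-001", "tc-002"},
    "ver-003": {"tc-001", "tc-002"},
    "cov-001": {"ver-001", "ver-002", "ver-003"},
}


def closure(ids):
    """Smallest subcomplex containing ids: follow boundaries downward."""
    result, stack = set(), list(ids)
    while stack:
        s = stack.pop()
        if s not in result:
            result.add(s)
            stack.extend(BOUNDARY[s])
    return result


def star(sid):
    """All elements that have sid as a face, including sid itself."""
    return {s for s in BOUNDARY if sid in closure({s})}


def link(sid):
    """Closure of the star, minus the stars of every face of sid."""
    blocked = set()
    for face in closure({sid}):
        blocked |= star(face)
    return closure(star(sid)) - blocked


def is_subcomplex(ids):
    """A set is a subcomplex when it already contains all of its boundaries."""
    return closure(ids) == set(ids)
```

In this model the link of `req-001` is the opposite edge `ver-003` together with its two endpoints, and the link of the edge `ver-001` is the single opposite vertex `tc-002`.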

## 4. Local partitioning

The topological queries above use combinatorial adjacency — boundary, star, and closure walk the simplicial structure directly. Local partitioning uses **diffusion** instead: spread probability from a seed and sweep the result to find a natural cluster boundary. This finds structure that combinatorial queries miss.

Requires `pip install knowledgecomplex[analysis]`.

### Graph partitioning (vertex clusters)

Diffuse from a seed vertex using personalized PageRank or the heat kernel, then sweep the resulting distribution to find a cut with low conductance:

```python
from knowledgecomplex.analysis import (
    approximate_pagerank, heat_kernel_pagerank,
    sweep_cut, local_partition,
)

# Approximate PageRank: push-based diffusion (Andersen-Chung-Lang)
p, r = approximate_pagerank(kc, seed="req-001", alpha=0.15)
# p is a sparse dict of vertex → probability; more mass near seed

# Heat kernel PageRank: exponential diffusion (Fan Chung)
rho = heat_kernel_pagerank(kc, seed="req-001", t=5.0)
# t controls locality: small t = tight cluster, large t = broad spread

# Sweep either distribution to find a low-conductance cut
cut = sweep_cut(kc, p)
cut.vertices # set of vertex IDs on the small side
cut.conductance # Cheeger ratio — lower means cleaner partition

# Or use local_partition for the full pipeline in one call
cut = local_partition(kc, seed="req-001", method="pagerank")
cut = local_partition(kc, seed="req-001", method="heat_kernel")
```
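
The sweep step can be illustrated without the library. The sketch below is a plain-Python version of a degree-normalized sweep over a hand-picked distribution; the function names `conductance` and `sweep`, the toy graph, and the values in `p` are all assumptions for illustration, and the library's `sweep_cut` may differ in its details.

```python
def conductance(adj, subset):
    """Cut edges divided by the smaller side's volume (sum of degrees)."""
    subset = set(subset)
    cut = sum(1 for u in subset for v in adj[u] if v not in subset)
    vol_in = sum(len(adj[u]) for u in subset)
    vol_out = sum(len(adj[u]) for u in adj) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")


def sweep(adj, p):
    """Order vertices by p[v] / deg(v); return the prefix with lowest conductance."""
    order = sorted(
        (v for v in p if adj[v]),
        key=lambda v: p[v] / len(adj[v]),
        reverse=True,
    )
    best, best_phi, prefix = set(), float("inf"), set()
    for v in order:
        prefix.add(v)
        phi = conductance(adj, prefix)
        if phi < best_phi:
            best, best_phi = set(prefix), phi
    return best, best_phi


# Two triangles joined by a single bridge c-d (hypothetical toy graph).
adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
# Hand-picked mass concentrated near "a", standing in for a diffusion result.
p = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.05}
cluster, phi = sweep(adj, p)
```

On this graph the sweep stops at the first triangle: cutting the single bridge edge gives conductance 1/7, lower than any other prefix.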

### Edge partitioning (simplicial clusters)

The simplicial version replaces the graph Laplacian with the **Hodge Laplacian** on edges. Instead of partitioning vertices, it partitions edges — finding clusters of relationships:

```python
from knowledgecomplex.analysis import edge_local_partition

# Hodge PageRank: (βI + L₁)⁻¹ χ_e — diffusion on the edge space
cut = edge_local_partition(kc, seed_edge="ver-001", method="hodge_pagerank")

# Hodge heat kernel: e^{-tL₁} χ_e — exponential diffusion on edges
cut = edge_local_partition(kc, seed_edge="ver-001", method="hodge_heat", t=5.0)

cut.edges # set of edge IDs in the cluster
cut.conductance # edge conductance
```

The key difference: graph partitioning asks "which vertices are near this vertex?" while edge partitioning asks "which relationships are near this relationship?" — a question that only makes sense in a simplicial complex, not in a plain graph.

## 5. Algebraic topology

Requires `pip install knowledgecomplex[analysis]`.

```python
from knowledgecomplex.analysis import (
    boundary_matrices, betti_numbers, euler_characteristic,
    hodge_laplacian, edge_pagerank, hodge_decomposition, hodge_analysis,
)

# Boundary matrices (sparse)
bm = boundary_matrices(kc)
# bm.B1: (n_vertices × n_edges), bm.B2: (n_edges × n_faces)
# Invariant: B1 @ B2 = 0 (∂₁ ∘ ∂₂ = 0)

# Betti numbers
betti = betti_numbers(kc) # [β₀, β₁, β₂]
chi = euler_characteristic(kc) # V - E + F = β₀ - β₁ + β₂

# Hodge Laplacian
L1 = hodge_laplacian(kc) # B1ᵀB1 + B2B2ᵀ (symmetric PSD)
# dim(ker L₁) = β₁

# Edge PageRank
pr = edge_pagerank(kc, "ver-001", beta=0.1) # (βI + L₁)⁻¹ χ_e

# Hodge decomposition: flow = gradient + curl + harmonic
decomp = hodge_decomposition(kc, pr)
# decomp.gradient — im(B1ᵀ), vertex-driven flow
# decomp.curl — im(B2), face-driven circulation
# decomp.harmonic — ker(L₁), topological cycles

# Full analysis in one call
results = hodge_analysis(kc, beta=0.1)
```

All analysis functions accept an optional `weights` dict mapping element IDs to scalar weights, which factor into the Laplacian as diagonal weight matrices.
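
The identities quoted in the comments above (B1 @ B2 = 0, the Betti numbers, and dim ker L₁ = β₁) can be checked by hand on a single triangle. The following is an unweighted, dense, pure-Python sketch with exact arithmetic; the helper names and the chosen orientation are assumptions for illustration and say nothing about how the library stores or orients its sparse matrices.

```python
from fractions import Fraction


def rank(matrix):
    """Matrix rank by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            if factor:
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def betti(B1, B2):
    """[β0, β1, β2] from ranks of the two boundary matrices."""
    n_v, n_e = len(B1), len(B1[0])
    n_f = len(B2[0]) if B2 else 0
    r1, r2 = rank(B1), rank(B2)
    return [n_v - r1, n_e - r1 - r2, n_f - r2]


def hodge_l1(B1, B2):
    """Unweighted edge Laplacian B1ᵀB1 + B2B2ᵀ."""
    down = matmul(transpose(B1), B1)
    if not B2 or not B2[0]:
        return down
    up = matmul(B2, transpose(B2))
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(down, up)]


# Triangle on v0, v1, v2 with edges e01, e02, e12, each oriented low -> high.
B1 = [[-1, -1, 0],
      [1, 0, -1],
      [0, 1, 1]]
# Boundary of the face (v0, v1, v2) is e01 - e02 + e12.
B2_filled = [[1], [-1], [1]]
# Same edges with no face: a hollow triangle.
B2_hollow = [[], [], []]
```

For the filled triangle this gives Betti numbers [1, 0, 0] and a full-rank L₁; removing the face leaves one independent cycle, so β₁ becomes 1 and L₁ gains a one-dimensional kernel.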

## 6. Filtrations

A filtration is a nested sequence of valid subcomplexes: C₀ ⊆ C₁ ⊆ ... ⊆ Cₘ.

```python
from knowledgecomplex import Filtration

filt = Filtration(kc)
filt.append({"req-001"}) # must be valid subcomplex
filt.append_closure({"ver-001"}) # auto-closes + unions with previous
filt.append_closure({"cov-001"}) # adds face + all boundary

filt.birth("cov-001") # index where element first appears
filt.new_at(2) # elements added at step 2 (Cₚ \ Cₚ₋₁)
filt[1] # set of element IDs at step 1

# Build from a scoring function
filt2 = Filtration.from_function(kc, lambda eid: some_score(eid))
```
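
One plausible way to read "build from a scoring function" is as a sublevel-set construction: at each score threshold, take every element scoring at or below it and close the result so that each step is a valid subcomplex. The sketch below shows that reading in plain Python. The names `filtration_from_scores`, `BOUNDARY`, and `SCORE` are hypothetical, and the actual semantics of `Filtration.from_function` may differ.

```python
# Toy complex and scores (hypothetical data mirroring the tutorial's elements).
BOUNDARY = {
    "req-001": set(), "tc-001": set(), "tc-002": set(),
    "ver-001": {"req-001", "tc-001"},
    "ver-002": {"req-001", "tc-002"},
    "ver-003": {"tc-001", "tc-002"},
    "cov-001": {"ver-001", "ver-002", "ver-003"},
}
SCORE = {
    "req-001": 0, "tc-001": 0, "tc-002": 1,
    "ver-001": 1, "ver-002": 2, "ver-003": 2,
    "cov-001": 3,
}


def closure(ids):
    """Smallest subcomplex containing ids."""
    result, stack = set(), list(ids)
    while stack:
        s = stack.pop()
        if s not in result:
            result.add(s)
            stack.extend(BOUNDARY[s])
    return result


def filtration_from_scores(score):
    """One step per distinct score: closure of all elements at or below it."""
    steps, current = [], set()
    for threshold in sorted(set(score.values())):
        current |= closure({e for e in score if score[e] <= threshold})
        steps.append(set(current))
    return steps


def birth(steps, eid):
    """Index of the first step that contains eid."""
    return next(i for i, step in enumerate(steps) if eid in step)


steps = filtration_from_scores(SCORE)
```

Taking the closure at every step is what keeps the sequence nested and valid even when a high-dimensional element scores lower than part of its boundary.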

## 7. Clique inference

Discover higher-order structure hiding in the edge graph:

```python
from knowledgecomplex import find_cliques, infer_faces

# Pure query — what triangles exist?
triangles = find_cliques(kc, k=3)

# Fill in all triangles as typed faces
added = infer_faces(kc, "coverage")

# Preview without modifying
preview = infer_faces(kc, "coverage", dry_run=True)
```
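
For intuition, finding 3-cliques in an edge list takes only a few lines. This is a generic sketch rather than the library's algorithm; `find_triangles` and the sample edge list are hypothetical names and data.

```python
def find_triangles(edges):
    """Return every vertex triple whose three connecting edges all exist."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    triangles = set()
    for u, v in edges:
        # Any common neighbour of u and v closes a triangle with edge (u, v).
        for w in adj[u] & adj[v]:
            triangles.add(frozenset((u, v, w)))
    return triangles


edges = [
    ("req-001", "tc-001"), ("req-001", "tc-002"), ("tc-001", "tc-002"),
    ("tc-002", "tc-003"),  # pendant edge, part of no triangle
]
found = find_triangles(edges)
```

Each triangle is discovered once per edge but stored as a `frozenset`, so duplicates collapse automatically.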

## 8. Export and load

```python
# Export schema + instance to a directory
kc.export("output/my_complex")
# Creates: ontology.ttl, shapes.ttl, instance.ttl, queries/*.sparql

# Reconstruct from exported files
kc2 = KnowledgeComplex.load("output/my_complex")
kc2.audit().conforms # True
```

Multi-format serialization:

```python
from knowledgecomplex import save_graph, load_graph

save_graph(kc, "data.jsonld", format="json-ld")
load_graph(kc, "data.ttl") # additive loading
```

## 9. Verification and audit

```python
# Throwing verification
kc.verify() # raises ValidationError on failure

# Non-throwing audit
report = kc.audit()
report.conforms # bool
report.violations # list[AuditViolation]
print(report) # human-readable summary

# Deferred verification for bulk construction
with kc.deferred_verification():
    for item in big_dataset:
        kc.add_vertex(item.id, type=item.type, **item.attrs)
    # ... add edges, faces ...
# Single SHACL pass runs on exit

# Static file verification (no Python objects needed)
from knowledgecomplex import audit_file
report = audit_file("data/instance.ttl", shapes="data/shapes.ttl",
                    ontology="data/ontology.ttl")
```

## 10. Pre-built ontologies

Three ontologies ship with the package:

```python
from knowledgecomplex.ontologies import operations, brand, research

sb = operations.schema() # actor, activity, resource
sb = brand.schema() # audience, theme
sb = research.schema() # paper, concept, note
```

## Gotchas

| Issue | Detail |
|---|---|
| **Slice rule** | Boundary elements must exist before the element that references them. Add vertices → edges → faces. |
| **Closed triangle** | A face's 3 edges must span exactly 3 vertices in a cycle. An open fan or 4-vertex path will fail. |
| **`remove_element`** | No post-removal verification. Remove faces before their edges, edges before their vertices. |
| **Schema after `load()`** | `load()` recovers type names, kinds, attributes, and parent relationships from OWL + SHACL. Full `describe_type()` introspection works after loading. |
| **Deferred verification** | Inside the context manager, intermediate states need not be valid. Verification runs once on exit. |
| **Face orientation** | Boundary matrix signs are computed internally to guarantee ∂₁∘∂₂ = 0. The orientation is consistent but not guaranteed to match external conventions. |
