| title | Ontology Storage Architecture | ||||
|---|---|---|---|---|---|
| description | VisionFlow's ontology storage architecture provides a production-ready system for managing OWL (Web Ontology Language) definitions, reasoning inferences, and semantic constraints at scale. The syst... | ||||
| category | explanation | ||||
| tags |
|
||||
| updated-date | 2026-02-11 | ||||
| difficulty-level | advanced |
VisionFlow's ontology storage architecture provides a production-ready system for managing OWL (Web Ontology Language) definitions, reasoning inferences, and semantic constraints at scale. The system integrates GitHub-sourced OWL files, automated reasoning with Whelk-rs, and database-backed persistence with intelligent caching.
graph TB
subgraph "External Sources"
GitHub["GitHub Repositories<br/>(900+ OWL Classes)"]
OWLFiles["OWL/RDF Files<br/>(.ttl, .owl, .xml)"]
end
subgraph "Parsing Layer"
Parser["Horned-OWL Parser<br/>(Concurrent)"]
Validator["Ontology Validator<br/>(OWL 2 Profile Check)"]
end
subgraph "Reasoning Layer"
Whelk["Whelk-rs Reasoner<br/>(OWL 2 EL)"]
InferenceCache["Inference Cache<br/>(LRU 90x speedup)"]
end
subgraph "Storage Layer"
PrimaryDB["Neo4j + In-Memory Store<br/>(OntologyRepository)"]
Cache["Redis Cache<br/>(hot axioms)"]
Archive["Archive Storage<br/>(historical versions)"]
end
subgraph "Query Layer"
OntologyRepo["OntologyRepository<br/>(Domain Interface)"]
QueryEngine["Query Engine<br/>(SPARQL subsetting)"]
end
GitHub --> OWLFiles
OWLFiles --> Parser
Parser --> Validator
Validator --> Whelk
Whelk --> InferenceCache
Whelk --> PrimaryDB
PrimaryDB --> QueryEngine
Cache --> QueryEngine
QueryEngine --> OntologyRepo
style GitHub fill:#e1f5ff
style Whelk fill:#fff3e0
style PrimaryDB fill:#f0e1ff
style Cache fill:#e8f5e9
CREATE TABLE owl-classes (
id UUID PRIMARY KEY,
uri VARCHAR(512) NOT NULL UNIQUE,
local-name VARCHAR(128) NOT NULL,
namespace VARCHAR(256),
is-inferred BOOLEAN DEFAULT false,
source-file VARCHAR(256),
created-at TIMESTAMP DEFAULT CURRENT-TIMESTAMP,
modified-at TIMESTAMP,
INDEX idx-uri(uri),
INDEX idx-namespace(namespace),
INDEX idx-inferred(is-inferred)
);CREATE TABLE owl-axioms (
id UUID PRIMARY KEY,
axiom-type VARCHAR(32) NOT NULL,
-- SubClassOf, DisjointWith, EquivalentClasses, ObjectPropertyDomain, etc.
source-class-id UUID NOT NULL REFERENCES owl-classes(id),
target-class-id UUID REFERENCES owl-classes(id),
source-property-id UUID REFERENCES owl-properties(id),
target-property-id UUID REFERENCES owl-properties(id),
is-inferred BOOLEAN DEFAULT false,
confidence-score DECIMAL(3,2),
-- 1.0 for asserted, 0.3-0.9 for inferred via reasoning
reasoning-rule VARCHAR(128),
-- Which rule produced this inference
created-at TIMESTAMP DEFAULT CURRENT-TIMESTAMP,
INDEX idx-type(axiom-type),
INDEX idx-source(source-class-id),
INDEX idx-target(target-class-id),
INDEX idx-inferred(is-inferred),
FOREIGN KEY (source-class-id) REFERENCES owl-classes(id),
FOREIGN KEY (target-class-id) REFERENCES owl-classes(id)
);CREATE TABLE owl-properties (
id UUID PRIMARY KEY,
uri VARCHAR(512) NOT NULL UNIQUE,
local-name VARCHAR(128) NOT NULL,
property-type VARCHAR(32),
-- ObjectProperty, DataProperty, AnnotationProperty
domain-class-id UUID REFERENCES owl-classes(id),
range-class-id UUID REFERENCES owl-classes(id),
is-functional BOOLEAN DEFAULT false,
is-transitive BOOLEAN DEFAULT false,
is-symmetric BOOLEAN DEFAULT false,
created-at TIMESTAMP DEFAULT CURRENT-TIMESTAMP,
INDEX idx-uri(uri),
INDEX idx-type(property-type),
FOREIGN KEY (domain-class-id) REFERENCES owl-classes(id),
FOREIGN KEY (range-class-id) REFERENCES owl-classes(id)
);CREATE TABLE semantic-constraints (
id UUID PRIMARY KEY,
constraint-type VARCHAR(32) NOT NULL,
-- attraction, repulsion, alignment, spring, etc.
axiom-id UUID NOT NULL REFERENCES owl-axioms(id),
magnitude DECIMAL(6,3),
direction VECTOR(3), -- normalized direction (if applicable)
radius DECIMAL(6,3), -- effective interaction radius
decay-exponent DECIMAL(3,1), -- falloff rate
active BOOLEAN DEFAULT true,
created-at TIMESTAMP DEFAULT CURRENT-TIMESTAMP,
INDEX idx-type(constraint-type),
INDEX idx-axiom(axiom-id),
FOREIGN KEY (axiom-id) REFERENCES owl-axioms(id)
);CREATE TABLE ontology-versions (
id UUID PRIMARY KEY,
version-hash VARCHAR(64) NOT NULL UNIQUE,
commit-sha VARCHAR(40), -- GitHub commit if applicable
num-classes INT,
num-axioms INT,
num-properties INT,
snapshot-at TIMESTAMP NOT NULL,
changelog TEXT,
created-at TIMESTAMP DEFAULT CURRENT-TIMESTAMP,
INDEX idx-hash(version-hash),
INDEX idx-commit(commit-sha)
);sequenceDiagram
participant GitHub
participant Fetcher as File Fetcher
participant Parser as OWL Parser
participant Reasoner as Whelk Reasoner
participant DB as OntologyRepository
participant Cache as Redis Cache
GitHub->>Fetcher: Webhook (new OWL)
Fetcher->>Parser: Raw OWL content
Parser->>DB: Insert asserted axioms<br/>(is-inferred=false)
DB->>Reasoner: Load ontology
Reasoner->>Reasoner: Compute inferences<br/>(OWL 2 EL rules)
Reasoner->>DB: Insert inferred axioms<br/>(is-inferred=true, confidence)
DB->>DB: Generate constraints<br/>from new axioms
DB->>Cache: Populate hot axioms<br/>(LRU policy)
Note over GitHub,Cache: Ontology is ready for reasoning
sequenceDiagram
participant Client
participant OntologyRepo as OntologyRepository
participant QueryEngine as Query Engine
participant Cache
participant DB
Client->>OntologyRepo: getClassesByProperty(property)
OntologyRepo->>QueryEngine: Execute query
QueryEngine->>Cache: Check hot axioms
Cache-->>QueryEngine: Cache hit (90% typical)
QueryEngine-->>OntologyRepo: Results
OntologyRepo-->>Client: Typed results
alt Cache miss
QueryEngine->>DB: Query asserted + inferred
DB-->>QueryEngine: Full results
QueryEngine->>Cache: Update LRU
QueryEngine-->>OntologyRepo: Results
end
Problem: 900+ classes × 10+ axioms each = 10k+ axioms consuming memory
Solution: Store only unique axioms, compute on-demand
class AxiomCompression {
// Store unique axiom patterns with references
axiomPatterns = new Map<string, AxiomPattern>();
classReferences = new Map<UUID, UUID[]>();
getAxiomsFor(classId: UUID): Axiom[] {
const patternIds = this.classReferences.get(classId);
return patternIds.flatMap(id => this.expandPattern(id));
}
}Cache Size: 10,000 axioms (typical VisionFlow workload) Hit Rate: 90%+ on production loads Speedup: 90-150x vs fresh reasoning
class InferenceCache {
private cache: LRUCache<string, Axiom[]>;
private stats = { hits: 0, misses: 0 };
getInferences(classUri: string): Axiom[] {
const key = `infer:${classUri}`;
if (this.cache.has(key)) {
this.stats.hits++;
return this.cache.get(key);
}
this.stats.misses++;
const inferences = this.reasoner.infer(classUri);
this.cache.set(key, inferences);
return inferences;
}
hitRate(): number {
const total = this.stats.hits + this.stats.misses;
return total > 0 ? (this.stats.hits / total) * 100 : 0;
}
}Original: Store 3D direction + magnitude for each constraint Compressed: Store axiom ID + constraint type, compute at render time
-- Original (inefficient)
SELECT axiom-id, direction-x, direction-y, direction-z, magnitude
FROM semantic-constraints
WHERE active = true AND axiom-id IN (...);
-- Compressed (efficient)
SELECT oa.axiom-type, oa.source-class-id, oa.target-class-id, sc.constraint-type
FROM owl-axioms oa
JOIN semantic-constraints sc ON oa.id = sc.axiom-id
WHERE sc.active = true;
-- Physics engine computes direction from class positionsOntology Profile: OWL 2 EL (supports 90% of real-world ontologies) Reasoning Time: 100-500 ms for 900-class ontology Memory: ~50 MB for full reasoning state
use whelk::Reasoner;
use horned-owl::ontology::set::SetOntology;
pub struct VisionFlowReasoner {
reasoner: Reasoner,
ontology: SetOntology,
}
impl VisionFlowReasoner {
pub async fn reason-ontology(&mut self) -> Result<ReasoningResults> {
// Load ontology
self.reasoner.insert(&self.ontology)?;
// Perform reasoning
let class-hierarchy = self.reasoner.get-class-hierarchy()?;
let inferences = self.reasoner.get-inferred-axioms()?;
Ok(ReasoningResults {
class-hierarchy,
inferences,
timestamp: chrono::Utc::now(),
})
}
pub fn get-inferred-for(&self, class-uri: &str) -> Result<Vec<Axiom>> {
// Retrieve inferred axioms for specific class
self.reasoner.get-related-classes(class-uri, &QueryType::AllRelations)
}
}Approach: Re-reason only affected classes when ontology changes
class IncrementalReasoner {
async updateOntology(changes: OWLChange[]): Promise<void> {
// Extract affected classes
const affectedClasses = this.extractAffectedClasses(changes);
// Store pre-reasoning state
const preState = this.getClassState(affectedClasses);
// Apply changes
for (const change of changes) {
await this.applyChange(change);
}
// Re-reason only affected classes
const affectedSet = new Set(affectedClasses);
const reasoningWork = this.reasoner.reasonAbout(affectedSet);
// Minimal database updates
const inferenceDiff = this.computeDifference(preState, await reasoningWork);
await this.applyInferenceDiff(inferenceDiff);
}
}- Class Hierarchy:
getParentClasses(),getSubClasses() - Property Queries:
getPropertiesByDomain(),getRangeOf() - Relationship Queries:
getRelatedClasses(),getAxiomsBetween() - Inferred Queries:
getInferredRelations()(includes reasoning results) - Constraint Queries:
getConstraintsForAxiom(),getActiveConstraints()
// Get all subclasses of 'Animal' (including inferred)
const animals = await ontologyRepo.getSubClasses('Animal', {
includeInferred: true,
transitiveReduction: true // Remove redundant intermediate classes
});
// Get all properties with domain 'PhysicalObject'
const properties = await ontologyRepo.getPropertiesByDomain('PhysicalObject');
// Get constraint forces for an axiom
const forces = await ontologyRepo.getConstraints('axiom-id-123', {
excludeInactive: true,
computeMagnitude: true
});
// Complex query: Get related classes with inferred axioms only
const relatedViaInference = await ontologyRepo.findClasses({
relationshipType: 'ANY',
includeAsserted: false,
includeInferred: true,
maxPathLength: 3,
targetUri: 'http://example.org/Event'
});class OntologyValidator {
async validateBeforeStorage(ontology: OWL): Promise<ValidationResult> {
const errors = [];
// Check OWL 2 profile compliance
if (!this.isOWL2EL(ontology)) {
errors.push('Not OWL 2 EL compliant');
}
// Check for undefined references
const undefined = this.findUndefinedReferences(ontology);
if (undefined.length > 0) {
errors.push(`Undefined references: ${undefined.join(', ')}`);
}
// Check for circular definitions
const cycles = this.detectCycles(ontology);
if (cycles.length > 0) {
errors.push(`Circular definitions detected`);
}
return {
isValid: errors.length === 0,
errors,
warnings: this.generateWarnings(ontology)
};
}
}- 900 Classes: 0.5 MB storage
- 10,000 Axioms: 2 MB storage
- Inference Cache: 10,000 hot axioms = 5 MB
- Constraints: 5,000 active = 1 MB
- Total: ~10 MB for complete ontology system
- Class lookup: 1-10 ms (cache hit) / 50-200 ms (cache miss)
- Axiom retrieval: 5-20 ms (cached)
- Inferred query: 100-500 ms (fresh reasoning if needed)
- Constraint query: 2-10 ms
- Concurrent ontology clients: 100+ simultaneous
- Queries per second: 1,000+ (with caching)
- Incremental updates: 10-50 ms for typical change
-- Create new version on change
INSERT INTO ontology-versions (version-hash, commit-sha, num-classes, num-axioms)
SELECT
MD5(CONCAT-WS(',', uri, axiom-type, is-inferred))::varchar(64),
NULL,
(SELECT COUNT(*) FROM owl-classes),
(SELECT COUNT(*) FROM owl-axioms),
NOW()
FROM owl-classes;
-- Archive previous version
UPDATE ontology-versions
SET archived-at = NOW()
WHERE version-hash != current-version-hash;class OntologyVersionManager {
async rollbackToVersion(versionHash: string): Promise<void> {
// Restore from archived version
const archived = await this.getArchivedVersion(versionHash);
// Clear current state
await this.db.truncate('owl-classes');
await this.db.truncate('owl-axioms');
// Restore archived state
await this.db.bulkInsert('owl-classes', archived.classes);
await this.db.bulkInsert('owl-axioms', archived.axioms);
// Invalidate cache
this.cache.clear();
}
}- Hexagonal/CQRS Architecture Design
- Pipeline Integration Architecture
- KnowledgeGraphRepository Port
- Hexagonal Architecture Ports - Overview
- VisionFlow Distributed Systems & Collaboration Assessment
- OWL File Validation: Verify RDF/XML/Turtle syntax
- URI Validation: Ensure URIs conform to RFC 3986
- Size Limits: Reject classes with >1000 axioms
- Circular Definition Prevention: Detect and reject
class OntologyAccessControl {
async checkAccess(userId: string, action: string, resource: string): Promise<boolean> {
// Public ontologies: read-only access
if (resource.startsWith('public:')) {
return ['read', 'query'].includes(action);
}
// Private ontologies: owner+collaborators
const owner = await this.getResourceOwner(resource);
if (userId === owner) {
return true;
}
const collaborators = await this.getCollaborators(resource);
return collaborators.includes(userId);
}
}