Commit 050c205

v4
1 parent 253e85e commit 050c205

823 files changed: +19999 / -3599 lines changed

README.md

Lines changed: 65 additions & 133 deletions
@@ -1,174 +1,106 @@
-omphalOS
+# omphalOS
 
-Deterministic workbench that ingests a trade feed and an entity registry, resolves entity matches, produces scored outputs, and packages artifacts for verification and release.
+omphalOS is a deterministic analysis workbench for trade and technology-transfer oversight. It builds a run directory that contains: inputs, normalized datasets, a warehouse, scored entities, review tables, exports, and a machine-checkable integrity index.
 
-Directory map
-- src/omphalos: Python implementation (CLI, pipeline, rules, artifacts, verification)
-- config: run configurations
-- warehouse: SQLite schema plus dbt project for modeling
-- sql: curated SQL catalog and playbooks targeting a run warehouse
-- orchestration: Airflow DAGs and scheduler surfaces
-- spark: PySpark and Scala Spark transforms that mirror warehouse rollups
-- infra: Terraform and Kubernetes manifests
-- policies: OPA policies for publishability and quality gates
-- agents: Go and Rust verifiers for run bundles
-- ui: run_manifest inspector (React)
+The repository ships a reference implementation with synthetic data. It is structured to support the same workflow across workstation runs, scheduled runs, and deployed runs, while preserving a stable artifact contract.
 
-<<<<<<< Updated upstream
-The purpose, here, is practical: **to make that posture routine.**
+## Scope
 
-This public release is a sanitized reference implementation, and all example data is synthetic.
+omphalOS covers four tasks:
 
-## What the system asserts
+1. Ingest: load a trade feed and a registry (lists, watchlists, or reference entities).
+2. Normalize: canonicalize fields, enforce schemas, and derive deterministic features.
+3. Score and assemble: match trade records to registry entities, compute entity exposure summaries, and write review-ready tables.
+4. Package: fingerprint all outputs, emit a run manifest, and (optionally) assemble a release bundle for distribution.
 
-A run is treated as an evidentiary package: it yields deliverables for a reader and, inseparably, a record adequate to explain what was done, reproduce it when feasible, and detect post-hoc alteration without argument.
+## Run directory contract
 
-The claims, then, are intentionally narrow:
+A run is an immutable directory rooted at:
 
-1. Integrity: a completed run directory can be checked against its manifest. (To wit, if the fingerprints do not match, the package has changed.)
-2. Comparability: two runs can be compared at the level of declared outputs, so disagreement can be located rather than narrated.
-3. Controlled distribution: a publishability scan surfaces common disclosure hazards before a package leaves its originating context.
+artifacts/runs/<run_id>/
 
-No stronger guarantee is implied. Correctness remains a matter of method, inputs, and judgment.
+The directory is treated as write-once: outputs are written under stable paths, then indexed and fingerprinted. The manifest contains:
 
-## What a reader can expect from the record
+- metadata: tool version, run_id, timestamps, environment identifiers
+- declared artifacts: relative paths, sizes, sha256
+- merkle root of the artifact set
+- structured reports: dataset validation, matching statistics, scoring summaries
+- release metadata when a bundle is assembled
 
-A run produces a directory intended to travel as a unit. The directory is structured so a reviewer can answer, from the artifacts alone, the questions that reliably matter once work leaves its originating workspace:
+This contract is the unit of comparison and verification.
 
-- What inputs were admitted, and what boundaries were enforced?
-- What rules governed transformations, and where are those rules stated?
-- Which outputs are intended for consumption, which are intermediate, and which require human review?
-- What may be shared, with whom, and with what risk of inadvertent disclosure?
-- When two executions disagree, is the disagreement substantive or procedural?
+## Data model
 
-If a package cannot answer these questions, it is incomplete work.
+The reference warehouse is a SQLite database written to:
 
-## Minimal use
+artifacts/runs/<run_id>/warehouse/warehouse.sqlite
 
-From a fresh clone, either install the package:
+Base tables:
 
-```bash
-python -m pip install -e .
-```
+- trade_feed: one row per shipment
+- registry: one row per entity
+- entity_matches: one row per shipment-entity candidate match
+- entity_scores: one row per entity summary
 
-or run directly from source by setting:
+The maximal pipeline extends trade_feed with exporter_country and importer_country while preserving the legacy country field.
 
-```bash
-export PYTHONPATH="$(pwd)/src"
-```
+## Warehouse and SQL surfaces
 
+The repository contains two SQL surfaces:
 
+1. Warehouse transforms: a dbt project under warehouse/ that defines staging, intermediate, and mart models. It is written to run against SQLite, DuckDB, or Postgres using profiles shipped under warehouse/profiles/.
+2. Analyst catalog: a curated query library under sql/ organized by briefing, review, audit, and investigations. Catalog execution records the query text, parameters, and output fingerprints into the run directory.
 
-One may verify the included sample run:
+Both surfaces are designed to be executable and to emit artifacts that the manifest can index.
 
-```bash
-python -m omphalos verify --run-dir examples/sample_run
-```
+## Orchestration and deployment
 
-Execute the synthetic reference pipeline:
+The repository includes:
 
-```bash
-python -m omphalos run --config config/runs/example_run.yaml
-```
+- scripts/ as the canonical operator interface (run, verify, certify, backfill, release-build, release-verify)
+- orchestration/airflow/ with DAGs that call the same runner interfaces
+- infra/k8s with base manifests and overlays for scheduled jobs
+- infra/terraform with modules and cloud examples for storage, identity, and logging
+- spark/scala as an optional scaling path for ingestion and coarse aggregations
 
-Verify a generated run directory:
+## Policy
 
-```bash
-python -m omphalos verify --run-dir artifacts/runs/<run_id>
-```
+policies/opa contains Rego policies that can evaluate:
 
-Compare two runs for payload-level equivalence:
+- run manifests and release bundles
+- publishability constraints
+- infrastructure constraints for Terraform plans and Kubernetes manifests
 
-```bash
-python -m omphalos certify --run-a artifacts/runs/<runA> --run-b artifacts/runs/<runB>
-```
+Policy evaluation produces structured reports under the run directory.
 
-## Optional extras (SQL/dbt, Airflow, Spark)
+## User interface
 
-The core runtime stays lightweight. Extra surfaces are available as optional dependencies:
+ui/ provides a local run browser that renders:
 
-```bash
-# Development tools
-python -m pip install -e ".[dev]"
+- run manifests
+- reports and diffs between runs
+- review tables and export artifacts
 
-# SQL/dbt surface (DuckDB + Postgres connectivity)
-python -m pip install -e ".[warehouse]"
+The UI reads from a small API server under src/omphalos/api.
 
-# Orchestration surface
-python -m pip install -e ".[orchestration]"
+## Independent verifiers
 
-# Spark surface
-python -m pip install -e ".[spark]"
-```
+agents/ contains small verifiers that can validate a run directory without importing the Python package:
 
-## Distribution
+- agents/go/omphalos-verify
+- agents/rust/omphalos-verify
 
-When a run must be transmitted as a single object:
+## Command line
 
-```bash
-python -m omphalos release build --run-dir artifacts/runs/<run_id> --out artifacts/releases/<run_id>.tar.gz
-python -m omphalos release verify --bundle artifacts/releases/<run_id>.tar.gz
-```
+The CLI exposes:
 
-Before distributing outputs outside the environment in which they were generated:
+- omphalos run: reference pipeline on synthetic data
+- omphalos verify: recompute fingerprints and validate the manifest
+- omphalos compare: compare declared artifacts between runs
+- omphalos release: build and verify release bundles
 
-```bash
-python -m omphalos publishability scan --path . --out artifacts/reports/publishability.json
-```
+Maximal pipelines and additional surfaces are available under src/omphalos/maximal and are invoked through explicit commands and job specs.
 
-The scan ought to be treated as a pre-flight gate, whereupon a clean report reduces common failure modes; it does not constitute a blanket safety determination.
+## Files kept for provenance
 
-## Configuration and declared rules
-
-Runs are configured in `config/runs/`. Schemas and rule packs live in `contracts/`.
-
-The governing posture is explicitness. Shapes worth consuming should be declared. Rules worth relying on should be written down. Failures should be inspectable.
-
-## Appendix A: run directory layout
-
-A typical run directory includes:
-
-- `run_manifest.json`
-Inventory of outputs with integrity fingerprints.
-
-- `exports/`
-Reader-facing products (tables, narrative, packet-style records).
-
-- `reports/`
-Structured checks and summaries (quality, determinism comparison, publishability scan, dependency inventory).
-
-- `lineage/`
-Append-only event record of execution.
-
-- `warehouse/`
-Local SQLite artifact used by the reference pipeline.
-
-## Appendix B: operating expectations
-
-omphalOS assumed two expectations throughout itself:
-
-Firstly, the run directory is treated as an immutable package once the run completes. Editing outputs “for presentation” after completion is a change in evidence. If edits are required, the disciplined move is to rerun under a revised configuration and allow the record to reflect the revision.
-
-Secondly, comparisons are only as meaningful as the boundaries you enforce. If the run’s inputs depend on ambient state—untracked files, implicit credentials, external services whose responses are not recorded—then replay will converge on approximation rather than identity. The system will still produce a record; it cannot supply missing constraints.
-
-## Documentation
-
-I recommend that you start with:
-
-- `docs/overview.md`
-- `docs/architecture.md`
-- `docs/artifacts.md`
-- `docs/cli.md`
-- `docs/open_source_readiness.md`
-- `docs/threat_model.md`
-
-## License
-
-Apache-2.0; see `LICENSE` and `NOTICE`; citation metadata is in `CITATION.cff`.
-=======
-Common commands
-- omphalos run --config config/runs/example_run.yaml
-- omphalos verify --run-dir <run_dir>
-- omphalos release build --run-dir <run_dir> --out <bundle.tar.gz>
-- omphalos sql run --run-dir <run_dir> --manifest sql/manifests/briefing_pack.yaml
->>>>>>> Stashed changes
+Original repository files are preserved. Where a file is materially upgraded, the prior content is copied into .legacy_snapshots/ with the same relative path before modification.
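
Note on the run directory contract: the new README says the manifest lists declared artifacts (relative paths, sizes, sha256) plus a merkle root, and that `omphalos verify` recomputes fingerprints and validates the manifest. The diff does not show the manifest schema or the tree construction, so the following is only a minimal sketch of what such a check might look like. The key names (`artifacts`, `path`, `size`, `sha256`, `merkle_root`), the helper names, and the merkle convention (sorted leaves, pairwise SHA-256, odd node promoted) are all assumptions made for illustration, not the project's actual format; the file name `run_manifest.json` is taken from the old README's appendix.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def merkle_root(leaf_hexes: list[str]) -> str:
    """Assumed convention: sort leaves, hash adjacent pairs, promote an odd node."""
    if not leaf_hexes:
        return hashlib.sha256(b"").hexdigest()
    level = [bytes.fromhex(h) for h in sorted(leaf_hexes)]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                nxt.append(hashlib.sha256(pair[0] + pair[1]).digest())
            else:
                nxt.append(pair[0])  # odd node carried up unchanged
        level = nxt
    return level[0].hex()


def verify_run(run_dir: Path, manifest_name: str = "run_manifest.json") -> list[str]:
    """Return a list of problems; an empty list means the directory matches."""
    # Hypothetical schema: {"artifacts": [{"path", "size", "sha256"}], "merkle_root"}
    manifest = json.loads((run_dir / manifest_name).read_text())
    problems: list[str] = []
    leaves: list[str] = []
    for art in manifest["artifacts"]:
        p = run_dir / art["path"]
        if not p.is_file():
            problems.append(f"missing: {art['path']}")
            continue
        if p.stat().st_size != art["size"]:
            problems.append(f"size mismatch: {art['path']}")
        digest = sha256_file(p)
        if digest != art["sha256"]:
            problems.append(f"sha256 mismatch: {art['path']}")
        leaves.append(digest)
    if merkle_root(leaves) != manifest["merkle_root"]:
        problems.append("merkle root mismatch")
    return problems
```

Under these assumptions, `verify_run(Path("artifacts/runs/<run_id>"))` would return an empty list for an untouched run and one entry per detected discrepancy otherwise. Returning a list rather than raising on the first failure keeps every mismatch visible in a single pass.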

agents/go/cmd/verify/main.go

Lines changed: 0 additions & 90 deletions
This file was deleted.

agents/go/go.mod

Lines changed: 0 additions & 3 deletions
This file was deleted.

agents/go/omphalos-verify/go.mod

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+module omphalos-verify
+
+go 1.22
