omphalOS as Research Contribution

Positioning

omphalOS sits at the intersection of several research domains:

  1. Intelligence Analysis Methodology: Computational approaches to structured analytic techniques
  2. Algorithmic Accountability: Governance of automated decision systems in high-stakes contexts
  3. Software Verification: Polycentric architectures for trust in sensitive systems
  4. Export Control & Sanctions: Computational methods for trade-based threat detection

This document positions omphalOS within existing literature and articulates its novel contributions.


Literature Context

Intelligence Analysis & Structured Techniques

Prior Work:

  • Heuer (1999): Psychology of Intelligence Analysis — Documents cognitive biases, proposes structured analytic techniques (SAT)
  • NRC (2011): Intelligence Analysis: Behavioral and Social Scientific Foundations — Recommends decision aids over training alone
  • Johnston (2005): Analytic Culture in the US Intelligence Community — Cultural norms reward confident assessments over hedged uncertainties

omphalOS Contribution: Computational implementation of epistemic humility. While prior work identifies the problem (overconfidence) and proposes solutions (SATs), omphalOS makes epistemic restraint architectural. The Canon/Margin repetition is not training material—it is infrastructure that makes restraint procedurally unavoidable.

Novel Claim: Repeated environmental cues embedded in analytical tools can durably shift judgment patterns without requiring analyst training or organizational culture change.


Algorithmic Accountability

Prior Work:

  • O'Neil (2016): Weapons of Math Destruction — Critique of opaque scoring systems in criminal justice, lending, education
  • Pasquale (2015): The Black Box Society — Argues for transparency in algorithmic decision-making
  • Barocas & Selbst (2016): "Big Data's Disparate Impact" — Legal framework for algorithmic discrimination claims

omphalOS Contribution: Existence proof that high-stakes algorithmic systems can be transparent. Unlike black-box ML systems, omphalOS investigations are human-readable SQL. Every claim is traceable to specific queries. This design anticipates adversarial legal review—not as a threat but as a requirement.

Novel Claim: For analytical systems used in enforcement contexts (export control, sanctions, law enforcement), transparency is not just ethical—it is legally necessary. omphalOS shows that transparency does not preclude sophistication.


Trade-Based Money Laundering & Sanctions Evasion

Prior Work:

  • Zdanowicz (2009): "Detecting Money Laundering and Terrorist Financing via Data Mining" — Statistical anomaly detection in trade pricing
  • Ferwerda et al. (2020): "Gravity Models of Trade-Based Money Laundering" — Economic modeling of illicit flows
  • Early & Cilke (2020): "Sanctions and Evasion" — Case studies of sanctions circumvention techniques

omphalOS Contribution: Scalable, transparent, reproducible implementation of detection patterns. Prior work describes techniques (price anomalies, payment fragmentation) but does not provide executable, verifiable implementations. omphalOS makes these techniques operational at national scale.

Novel Claim: Hypothesis-generating pattern detection (not classification) is the appropriate computational approach for rare-event detection in adversarial settings. The investigation catalog is not an ML model—it is a library of parameterized heuristics that analysts interpret.
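To make the heuristic-vs-classifier distinction concrete, a parameterized pattern might look like the following sketch (Python; the field names `id`, `commodity`, and `unit_price` are hypothetical, and the actual catalog is expressed as SQL per the design described above). Note that it returns candidate leads for human review, not violation labels:

```python
from statistics import median

def price_anomaly_leads(records, threshold=3.0):
    """Hypothesis-generating heuristic (not a classifier): surface trades
    whose unit price deviates from the peer median for the same commodity
    by more than `threshold`x, for analyst review."""
    by_commodity = {}
    for rec in records:
        by_commodity.setdefault(rec["commodity"], []).append(rec)

    leads = []
    for commodity, rows in by_commodity.items():
        ref = median(row["unit_price"] for row in rows)
        if ref <= 0:
            continue  # degenerate peer group; nothing to compare against
        for row in rows:
            ratio = row["unit_price"] / ref
            if ratio > threshold or ratio < 1 / threshold:
                leads.append({
                    "record_id": row["id"],
                    "commodity": commodity,
                    "price_ratio": round(ratio, 2),
                    "note": "price deviates from peer median; hypothesis only",
                })
    return leads
```

The `threshold` parameter is what makes this a tunable heuristic rather than a trained model: an analyst can inspect, justify, and adjust it case by case.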


Polycentric Governance & Verification

Prior Work:

  • Ostrom (1990): Governing the Commons — Polycentric governance structures for resource management
  • Lessig (1999): Code and Other Laws of Cyberspace — "Code is law"—software architecture embeds governance
  • CompCert Project (Leroy, 2006+): Formally verified C compiler with machine-checked proofs

omphalOS Contribution: Polycentricism as trust architecture for intelligence systems. Multiple independent verifiers (Python, Rust, Go) must agree on run integrity. Trust is distributed—no single implementation is authoritative.

Novel Claim: For systems operating in environments with insider threats, supply chain risks, and adversarial scrutiny, polycentric verification provides defense-in-depth that single-implementation checksumming cannot.
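The agreement rule can be sketched as follows (a minimal Python illustration, assuming each independent implementation reports a SHA-256 digest of the run artifact; the real verifiers and their wire format are not specified here). The key design choice is unanimity rather than majority vote, so a disagreement always triggers root-cause analysis:

```python
import hashlib
from collections import Counter

def sha256_hex(data: bytes) -> str:
    """Digest an artifact; each verifier computes this independently."""
    return hashlib.sha256(data).hexdigest()

def polycentric_verdict(reported: dict) -> dict:
    """Combine digests reported by independent verifiers (e.g. the Python,
    Rust, and Go implementations). Integrity passes only on unanimity;
    minority verifiers are named for root-cause analysis (bug, tampering,
    non-determinism) rather than being silently outvoted."""
    counts = Counter(reported.values())
    majority_digest, _ = counts.most_common(1)[0]
    dissenters = sorted(
        name for name, digest in reported.items() if digest != majority_digest
    )
    return {"verified": len(counts) == 1, "dissenters": dissenters}
```

Because no single implementation is authoritative, a compromised or buggy verifier can at worst block attestation, never forge it alone.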


Research Questions omphalOS Addresses

RQ1: Can Epistemic Humility Be Architecturally Enforced?

Traditional Approach: Train analysts on biases, encourage hedged language, incentivize reporting of disconfirming evidence.

omphalOS Approach: Embed procedural safeguards in tools. The Canon forces analysts to scroll past 105 lines of epistemic admonition before reaching queries. Packet format requires "unknowns" field—analysts must document what they don't know.

Empirical Test: Compare false positive rates and analytical confidence levels between analysts using omphalOS vs. traditional tools. Hypothesis: omphalOS users produce fewer overconfident assessments.


RQ2: Does Transparency Reduce Analytical Effectiveness?

Concern: If detection logic is disclosed (FOIA, litigation discovery, GitHub), adversaries will evade. Therefore, secrecy is necessary for effectiveness.

omphalOS Counter: Secrecy provides temporary advantage that erodes upon disclosure. Transparency enables community improvement (researchers propose better patterns), continuous evolution (catalog updated as evasion techniques emerge), and legal defensibility (disclosed methods survive adversarial challenge).

Empirical Test: Track evasion rates over time. Does publication of investigation catalog lead to detectable changes in adversary behavior? If so, does catalog evolution restore detection capability?


RQ3: Can Analytical Reproducibility Survive Organizational Turnover?

Problem: Analyst leaves agency, takes institutional knowledge. Five years later, enforcement action is challenged in court. Can original analysis be reconstructed?

omphalOS Approach: Run manifests record exact code/data versions. Checksums enable bit-identical re-execution. Investigations are version-controlled with commit messages explaining reasoning.
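The manifest idea can be sketched in a few lines (Python; the field names and structure here are illustrative assumptions, not the project's actual manifest schema):

```python
import hashlib

def build_manifest(code_commit, data_files, packet):
    """Hypothetical run manifest: pin the exact code version and data
    checksums so a closed case can be re-executed bit-identically years
    later, even after the original analyst has left."""
    return {
        "code_commit": code_commit,
        "data_checksums": {
            path: hashlib.sha256(blob).hexdigest()
            for path, blob in data_files.items()
        },
        "packet_checksum": hashlib.sha256(packet).hexdigest(),
    }

def reexecution_matches(manifest, new_packet):
    """A re-run is reproducible iff its output digest matches the record."""
    return hashlib.sha256(new_packet).hexdigest() == manifest["packet_checksum"]
```

In court, the manifest plus a successful re-execution is the evidence that the analysis presented is the analysis that was actually run.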

Empirical Test: Select random closed cases from 3+ years ago. Attempt to re-execute using archived manifests. Success rate measures reproducibility.


RQ4: What Is the Cost/Benefit of Polycentric Verification?

Cost: Additional implementation effort (maintaining Rust and Go verifiers alongside Python) and computational overhead (roughly 3x the verification time of a single-verifier design).

Benefit: Defense against implementation bugs, supply chain attacks, insider tampering.

Empirical Test: Measure verification failure rates. How often do verifiers disagree? When they disagree, what is the root cause (bug, tampering, non-determinism)? Does polycentric verification catch errors that single-verifier would miss?


Methodological Innovations

1. The Canon as Cognitive Architecture

Innovation: Transform normative statements into structural elements. Rather than saying "be restrained" once, enforce it by making restraint procedurally unavoidable.

Broader Application: Could be applied to other high-stakes analytical domains (medical diagnosis, financial auditing, engineering safety analysis). Embed epistemic warnings in CAD software, electronic health records, financial models.


2. Hypothesis-Generating Detectors vs. Classifiers

Innovation: Explicitly reject classification paradigm (label entities as violators/non-violators) in favor of hypothesis-generation (surface patterns for human review).

Rationale: Base rates are too low, adversaries adapt too quickly, and legal requirements demand interpretability. Classification optimizes for metrics (precision, recall) that are misleading in this context.

Broader Application: Rare-event detection in other adversarial settings (fraud, insider threats, APT detection). Replace risk scores with pattern libraries.


3. Packets as Auditable Analytical Artifacts

Innovation: Structure intelligence products as machine-readable JSON with explicit unknowns, provenance chains, and checksum-verified integrity.

Broader Application: Any analytical domain requiring auditability (clinical decision support, credit underwriting, recidivism prediction). Make analytical reasoning transparent and contestable.
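A minimal sketch of the sealing step, assuming hypothetical field names (`memo`, `evidence_queries`, `unknowns`, `provenance`); the actual packet schema is not reproduced here. The point is that the "unknowns" field is enforced at seal time, and the checksum makes later tampering detectable:

```python
import hashlib
import json

REQUIRED_FIELDS = {"memo", "evidence_queries", "unknowns", "provenance"}

def seal_packet(body):
    """Refuse to seal a packet unless every required field is present,
    including 'unknowns'; then attach a checksum over the canonical JSON
    so downstream readers can verify the artifact was not altered."""
    missing = REQUIRED_FIELDS - body.keys()
    if missing:
        raise ValueError(f"packet missing required fields: {sorted(missing)}")
    canonical = json.dumps(body, sort_keys=True).encode()
    return {**body, "checksum": hashlib.sha256(canonical).hexdigest()}
```

Because the checksum is computed over sorted-key JSON, any two parties serializing the same body arrive at the same digest.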


Empirical Research Agenda

Study 1: False Positive Analysis

Method: Apply omphalOS to ground-truth dataset (historical export control cases with known outcomes). Measure:

  • False positive rate (flagged but legitimate)
  • False negative rate (evaders missed)
  • Investigation diversity (how many patterns surface each true positive?)

Expected Findings: Redundancy reduces false negatives (multiple investigations catch what one misses). Canon reduces false positives (analysts interpret patterns more cautiously).
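The two headline metrics are straightforward to compute once ground truth is available; a small sketch (Python, with set-valued inputs as an illustrative assumption):

```python
def confusion_rates(flagged, violators, universe):
    """Study 1 metrics over a ground-truth dataset: false positive rate
    (legitimate entities flagged) and false negative rate (evaders missed)."""
    negatives = universe - violators
    false_positives = flagged & negatives
    false_negatives = violators - flagged
    return {
        "false_positive_rate":
            len(false_positives) / len(negatives) if negatives else 0.0,
        "false_negative_rate":
            len(false_negatives) / len(violators) if violators else 0.0,
    }
```

Investigation diversity would additionally require, per true positive, a count of how many distinct catalog patterns surfaced it.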


Study 2: Cognitive Load & Analyst Performance

Method: A/B test with two analyst cohorts:

  • Control group: Traditional SQL tools with single-warning disclaimers
  • Treatment group: omphalOS with Canon/Margin repetition

Metrics:

  • Time to case completion
  • Confidence levels in packet memos
  • Incidence of documented "unknowns"
  • Supervisor assessments of analytical quality

Hypothesis: omphalOS users take slightly longer (due to Canon scrolling) but produce higher-quality, more defensible analysis.


Study 3: Adversarial Red Teaming

Method: Disclose investigation catalog to red team (adversarial role-players). Task them with structuring transactions to evade detection. Then:

  1. Measure evasion success rate
  2. Update catalog with new patterns targeting evasion techniques
  3. Re-test evasion (arms race simulation)

Expected Findings: Catalog evolution imposes costs on adversaries. Evading thousands of patterns is harder than evading one.


Study 4: Reproducibility Audit

Method: Select random sample of closed cases (50 cases, 1-5 years old). Attempt to:

  1. Locate original run manifests
  2. Retrieve code/data versions specified in manifests
  3. Re-execute cases
  4. Verify checksums match original packets

Metrics:

  • Reproducibility rate (% of cases successfully re-executed)
  • Time required per case
  • Root causes of failures (missing data, lost commits, software dependencies unavailable)

Hypothesis: High reproducibility (>90%) demonstrates that omphalOS's versioning/checksumming design works in practice.


Publication Venues

Tier 1: Flagship Interdisciplinary Conferences

  • ACM Conference on Fairness, Accountability, and Transparency (FAccT): Governance of algorithmic systems in high-stakes settings
  • IEEE Security & Privacy: Secure systems for sensitive data analysis
  • CHI (ACM Conference on Human Factors in Computing Systems): Human-centered design of analytical tools

Pitch: "omphalOS: A Polycentric Architecture for Epistemically Humble Intelligence Analysis"

Target Audience: CS researchers, policymakers, accountability advocates


Tier 2: Domain-Specific Journals

  • Intelligence and National Security (peer-reviewed journal): Intelligence methodology
  • Journal of Money Laundering Control: Trade-based money laundering detection
  • International Studies Quarterly / Journal of Conflict Resolution: Sanctions effectiveness, export control

Pitch: "Computational Methods for Sanctions Enforcement: Design and Evaluation of omphalOS"

Target Audience: Intelligence practitioners, policy scholars, international relations


Tier 3: Technical Implementation Papers

  • VLDB / SIGMOD: Scalable analytical SQL for trade databases
  • Software: Practice and Experience: Polycentric verification architectures
  • Journal of Open Source Software (JOSS): Software archival with lightweight peer review

Pitch: "omphalOS: An Open-Source Suite for Export Control Casework"

Target Audience: Database researchers, software engineers, practitioners


Open Research Questions

1. Can LLMs Improve Investigation Design?

Question: Can LLMs propose novel investigations given natural-language hypotheses?

Approach: Fine-tune LLM on existing catalog. Prompt with "Detect entities using multiple shell companies to obscure ownership." LLM generates candidate SQL. Analyst reviews/refines.

Risk: LLM-generated SQL may hallucinate non-existent tables, produce syntactically invalid queries, or embed biases from training data.
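One cheap mitigation for the hallucination risk is to compile every candidate query against the known schema before an analyst sees it. A sketch using SQLite as a stand-in (the table definitions below are hypothetical; the real trade database schema would be substituted):

```python
import sqlite3

# Hypothetical stand-in schema for illustration only.
SCHEMA = """
CREATE TABLE shipments (id INTEGER, exporter TEXT, consignee TEXT, value REAL);
CREATE TABLE entities  (id INTEGER, name TEXT, owner_id INTEGER);
"""

def validate_candidate_sql(query):
    """Guard against LLM hallucination: compile (but do not run) the
    candidate query against an in-memory copy of the schema. Nonexistent
    tables or columns and syntax errors are caught before review."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.execute(f"EXPLAIN {query}")  # compiles the statement only
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()
```

This catches syntactic and schema-level hallucinations; semantic errors (a valid query that tests the wrong hypothesis) still require the human review step.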


2. What Are the Limits of Transparency?

Question: At what point does disclosure harm effectiveness more than secrecy harms legitimacy?

Approach: Game-theoretic modeling of disclosure scenarios. Under what conditions do adversaries adapt faster than defenders can evolve?

Tension: omphalOS embraces transparency, but some intelligence methods (human sources, signals intelligence) cannot be disclosed without operational harm. Where is the boundary?


3. How to Formalize "Sufficient" Verification?

Question: How many independent verifiers are "enough"? Is 3 (Python, Rust, Go) meaningfully better than 2? Is 10 overkill?

Approach: Fault injection testing. Deliberately introduce bugs in one verifier. Measure how often polycentric verification catches them. Diminishing returns analysis.
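A first-order model of the diminishing-returns question, under a strong (and likely optimistic) assumption that verifiers fail independently with the same miss probability:

```python
def detection_probability(miss_prob, n_verifiers):
    """An injected fault escapes only if every verifier misses it,
    so detection probability is 1 - miss_prob**n under independence."""
    return 1.0 - miss_prob ** n_verifiers

def marginal_gain(miss_prob, n):
    """Detection probability gained by adding the n-th verifier."""
    return (detection_probability(miss_prob, n)
            - detection_probability(miss_prob, n - 1))
```

With a 10% per-verifier miss rate, the second verifier adds 9 points of detection probability and the third adds under 1 point; correlated failure modes (a shared spec misreading, a common upstream library) would shrink these gains further, which is exactly what the fault-injection study should measure.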


4. Can Differential Privacy Enable Cross-Classification Collaboration?

Question: Can omphalOS be extended to enable analysis across classification boundaries (e.g., US + allies pooling data without revealing individual records)?

Approach: Differentially-private aggregations in packets. Each party computes noisy counts/sums, results are combined. Trade-off: privacy vs. analytical utility.

Challenge: Differential privacy requires careful parameter tuning. Too much noise → useless results. Too little noise → privacy leaks.
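The basic mechanism for a counting query is simple to sketch (Python, stdlib only; this is the standard Laplace mechanism for sensitivity-1 queries, not an omphalOS feature):

```python
import math
import random

def dp_count(true_count, epsilon, rng):
    """Laplace mechanism for a counting query (sensitivity 1): add
    Laplace(0, 1/epsilon) noise via inverse-transform sampling.
    Smaller epsilon means stronger privacy but noisier counts."""
    scale = 1.0 / epsilon
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise
```

The tuning tension in the text is visible directly in `scale = 1/epsilon`: halving epsilon doubles the noise scale, and when noisy counts from multiple parties are summed, their variances add, so cross-party utility degrades faster than single-party utility.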


Intellectual Property & Licensing

Current Status

omphalOS is U.S. Government work (developed by/for USG). Under 17 USC § 105, works produced by federal employees within scope of employment are not subject to copyright within the United States.

Implications:

  • No copyright restrictions on use, modification, distribution within US
  • International copyright may apply (US government works can be copyrighted abroad)
  • Patents possible (government can patent inventions even if work itself is not copyrighted)

Recommended Licensing

For public release, apply CC0 (Creative Commons Zero) or MIT License:

CC0: Explicitly places work in public domain worldwide. Clearest possible dedication.

MIT License: Permissive license with attribution requirement. Compatible with most open-source projects.

Why not GPL? GPL's copyleft provisions may deter adoption by intelligence community (some agencies have policies against GPL due to disclosure obligations).


Export Control Notice

Because omphalOS relates to export control, the public release should state explicitly that the software itself is not controlled:

"This software is publicly released and not subject to export controls under the Export Administration Regulations (EAR). It contains no controlled technical data. The synthetic datasets included are invented and do not represent real trade transactions."

This prevents confusion about whether the software itself requires a license for export.


Engagement Strategy

Phase 1: Academic Publication (Months 1-6)

  • Draft conference paper for FAccT 2026
  • Present at intelligence community conferences (ODNI Xpo, DHS S&T Summit)
  • Host workshops for interested agencies (BIS, OFAC, FBI, CBP)

Phase 2: Community Building (Months 6-12)

  • Launch public GitHub repo with documentation
  • Create mailing list / Slack channel for adopters
  • Develop tutorial materials (videos, Jupyter notebooks)
  • Host hackathon: "Design novel investigations for omphalOS"

Phase 3: Institutionalization (Months 12-24)

  • Integrate into existing systems (CBP's ACE, BIS's ECASS)
  • Train interagency analysts on omphalOS
  • Establish governance structure (investigation catalog review board)
  • Develop certification program (omphalOS practitioner credential)

Conclusion

omphalOS is not just a tool—it is a methodological intervention. It demonstrates that intelligence analysis can be algorithmically mediated without becoming algorithmically determined, transparent without sacrificing sophistication, and reproducible without ossifying into inflexibility.

The research contribution is architectural: showing how to embed epistemic humility, auditability, and polycentrism into computational systems for high-stakes decision-making. These principles are domain-agnostic—they apply beyond export control to any field where analytical integrity matters more than optimization metrics.

By open-sourcing omphalOS, we enable the research community to build on these foundations, test the claims empirically, and improve the design. This is intelligence analysis as open science—not contradiction, but complement.