Commit 47dd422

Create roadmap and README documentation (#10)
- README.adoc: Full project overview with problem statement, solution architecture, installation, usage, policy configuration, and integration examples
- ROADMAP.adoc: Development phases showing completed Policy Oracle, in-progress SLM Evaluator, and planned Consensus Arbiter work

Co-authored-by: Claude <[email protected]>
1 parent 33335ed commit 47dd422

2 files changed

+616
-0
lines changed

README.adoc

Lines changed: 310 additions & 0 deletions
@@ -1 +1,311 @@
1+
= Conative Gating
2+
Jonathan D.A. Jewell <jonathan@hyperpolymath.org>
3+
:toc: macro
4+
:toc-title: Contents
5+
:toclevels: 3
6+
:sectnums:
7+
:icons: font
8+
:source-highlighter: rouge
9+
:experimental:
10+
:repo: https://github.com/hyperpolymath/conative-gating
11+

12+
SLM-as-Cerebellum for LLM Policy Enforcement
13+
14+
[.lead]
15+
A biologically-inspired system where a Small Language Model acts as an *inhibitory antagonist* to Large Language Models, preventing policy violations through mechanisms analogous to the basal ganglia's GO/NO-GO decision system.
16+
17+
toc::[]
18+
19+
== The Problem
20+
21+
LLMs are trained to be helpful, which makes them systematically violate explicit project constraints. When given rules like "NEVER use TypeScript, use ReScript", LLMs:
22+
23+
1. Read and acknowledge the constraint
24+
2. Generate a compliant-sounding justification
25+
3. Violate the constraint anyway
26+
27+
This happens because:
28+
29+
* Common languages (TypeScript, Python) dominate training data
30+
* The "helpfulness drive" overrides explicit instructions
31+
* LLMs lack true "loss aversion" for policy violations
32+
33+
Documentation-based enforcement fails because LLMs "engage with" documentation rather than *obey* it.
34+
35+
== The Solution
36+
37+
Conative Gating introduces a second model trained with *inverted incentives*:
38+
39+
[cols="1,1,1"]
40+
|===
41+
| Component | Role | Analogy
42+
43+
| *LLM*
44+
| Task execution (helpful, creative)
45+
| Frontal cortex / Direct pathway ("GO")
46+
47+
| *SLM*
48+
| Policy enforcement (adversarial, suspicious)
49+
| Cerebellum / Indirect pathway ("NO-GO")
50+
51+
| *Policy Oracle*
52+
| Deterministic rule checking
53+
| Reflex arc (fast, no ML)
54+
55+
| *Consensus Arbiter*
56+
| Weighted decision making
57+
| Thalamus (integration)
58+
|===
59+
60+
=== Key Innovation
61+
62+
The key mechanism is a *consensus protocol with asymmetric weighting*: the SLM's votes count 1.5x the LLM's, creating a built-in bias toward inhibition that counters the LLM's tendency toward helpfulness.
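
The effect of the 1.5x weighting can be illustrated with a small tally. This is a minimal sketch, not the arbiter's actual protocol: the `Voter` enum, the 1.0 baseline weight for the LLM, and the "inhibit wins if its weighted sum is larger" rule are all assumptions made for illustration.

[source,rust]
----
/// Who cast a vote. Mirrors the LLM/SLM split described above.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Voter {
    Llm,
    Slm,
}

/// Vote weight per voter. The 1.5x figure comes from this README;
/// the 1.0 baseline for the LLM is an assumption.
pub fn weight(voter: Voter) -> f64 {
    match voter {
        Voter::Llm => 1.0,
        Voter::Slm => 1.5,
    }
}

/// Each vote is `(voter, wants_to_inhibit)`.
/// Returns true when the weighted "inhibit" votes outweigh the weighted
/// "proceed" votes. Hypothetical tally rule, for illustration only.
pub fn inhibited(votes: &[(Voter, bool)]) -> bool {
    let (mut stop, mut go) = (0.0_f64, 0.0_f64);
    for &(voter, wants_to_inhibit) in votes {
        if wants_to_inhibit {
            stop += weight(voter);
        } else {
            go += weight(voter);
        }
    }
    stop > go
}
----

Under these assumptions, a single SLM objection (weight 1.5) outweighs a single LLM approval (weight 1.0), while a single LLM objection does not outweigh an SLM approval.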
63+
64+
== Architecture
65+
66+
----
67+
USER REQUEST
68+
|
69+
v
70+
+------------------------+
71+
| CONTEXT ASSEMBLY |
72+
+------------------------+
73+
|
74+
+--------------+--------------+
75+
| |
76+
v v
77+
+-------------+ +---------------+
78+
| LLM | | SLM |
79+
| (Proposer) | | (Adversarial) |
80+
+------+------+ +-------+-------+
81+
| |
82+
+-------------+---------------+
83+
|
84+
v
85+
+------------------------+
86+
| CONSENSUS ARBITER |
87+
| (Modified PBFT) |
88+
| SLM weight: 1.5x |
89+
+------------------------+
90+
|
91+
+-------------+-------------+
92+
| | |
93+
v v v
94+
+-------+ +--------+ +-------+
95+
| ALLOW | |ESCALATE| | BLOCK |
96+
+-------+ +--------+ +-------+
97+
----
98+
99+
=== Three-Tier Evaluation
100+
101+
[horizontal]
102+
Policy Oracle (Rust):: Deterministic rule checking for forbidden languages, toolchain rules, and security patterns. Fast, with no ML needed.
103+
104+
SLM Evaluator (Rust + llama.cpp):: Detects "spirit violations": output that is technically compliant but violates the policy's intent, such as verbosity and meta-commentary bloat.
105+
106+
Consensus Arbiter (Elixir/OTP):: Modified PBFT (Practical Byzantine Fault Tolerance) with asymmetric weighting. Produces one of three outcomes: ALLOW, ESCALATE, or BLOCK.
107+
108+
== Installation
109+
110+
=== From Source
111+
112+
[source,bash]
113+
----
114+
git clone https://github.com/hyperpolymath/conative-gating
115+
cd conative-gating
116+
cargo build --release
117+
----
118+
119+
=== Usage
120+
121+
[source,bash]
122+
----
123+
# Scan a directory for policy violations
124+
conative scan ./my-project
125+
126+
# Check a single file
127+
conative check --file src/main.ts
128+
129+
# Check inline content
130+
conative check --content "const x: string = 'hello'"
131+
132+
# Show current policy
133+
conative policy
134+
135+
# Initialize local configuration
136+
conative init
137+
138+
# JSON output for automation
139+
conative scan . --format json
140+
----
141+
142+
=== Exit Codes
143+
144+
[cols="1,3"]
145+
|===
146+
| Code | Meaning
147+
148+
| 0 | Compliant - all checks passed
149+
| 1 | Hard violation detected (blocked)
150+
| 2 | Soft concern detected (warning)
151+
| 3 | Error during execution
152+
|===
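
Automation that wraps the CLI could map these exit codes to a typed outcome. The following is a minimal sketch; the `ScanOutcome` enum and both function names are hypothetical and only mirror the table above.

[source,rust]
----
/// Typed view of the four exit codes in the table above.
/// Hypothetical helper, not part of the documented CLI.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ScanOutcome {
    Compliant,      // 0: all checks passed
    HardViolation,  // 1: blocked
    SoftConcern,    // 2: warning
    ExecutionError, // 3: error during execution
}

/// Exit code a run with the given outcome would report.
pub fn exit_code(outcome: ScanOutcome) -> i32 {
    match outcome {
        ScanOutcome::Compliant => 0,
        ScanOutcome::HardViolation => 1,
        ScanOutcome::SoftConcern => 2,
        ScanOutcome::ExecutionError => 3,
    }
}

/// Interprets a raw exit code; codes outside the table yield `None`.
pub fn from_exit_code(code: i32) -> Option<ScanOutcome> {
    match code {
        0 => Some(ScanOutcome::Compliant),
        1 => Some(ScanOutcome::HardViolation),
        2 => Some(ScanOutcome::SoftConcern),
        3 => Some(ScanOutcome::ExecutionError),
        _ => None,
    }
}
----

A wrapper might treat `SoftConcern` as non-fatal in local runs and fatal in CI; that choice is left to the caller.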
153+
154+
== Default Policy (RSR)
155+
156+
The default policy implements the Rhodium Standard Repository (RSR) language hierarchy:
157+
158+
=== Tier 1 - Preferred
159+
160+
* Rust, Elixir, Zig, Ada, Haskell, ReScript
161+
162+
=== Tier 2 - Acceptable (generates warnings)
163+
164+
* Nickel, Racket
165+
166+
=== Forbidden
167+
168+
* TypeScript, Python*, Go, Java
169+
170+
[NOTE]
171+
====
172+
*Python exception: Allowed in `salt/` directories for SaltStack and `training/` for ML training scripts.
173+
====
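
The language hierarchy and the Python exception can be sketched as a deterministic path check. This is an illustration under stated assumptions, not the Policy Oracle's real implementation: the extension-to-language mapping, the `LanguageStatus` enum, and the rule that any `salt` or `training` directory component triggers the exception are all hypothetical.

[source,rust]
----
/// Result of classifying a file path against the default RSR tiers.
#[derive(Debug, PartialEq)]
pub enum LanguageStatus {
    Preferred,          // Tier 1
    Acceptable,         // Tier 2 (generates warnings)
    Forbidden,
    AllowedByException, // Python under salt/ or training/
    Unknown,            // extension not covered by this sketch
}

/// Classifies a '/'-separated relative path by file extension.
/// The extension lists below are assumptions, not taken from the policy file.
pub fn classify(path: &str) -> LanguageStatus {
    let mut segments: Vec<&str> = path.split('/').collect();
    let file_name = segments.pop().unwrap_or("");
    let extension = match file_name.rsplit_once('.') {
        Some((_, ext)) => ext,
        None => return LanguageStatus::Unknown,
    };
    // After the pop, `segments` holds only directory components.
    let in_exception_dir = segments
        .iter()
        .any(|dir| *dir == "salt" || *dir == "training");

    match extension {
        "rs" | "ex" | "exs" | "zig" | "adb" | "ads" | "hs" | "res" => {
            LanguageStatus::Preferred
        }
        "ncl" | "rkt" => LanguageStatus::Acceptable,
        "py" if in_exception_dir => LanguageStatus::AllowedByException,
        "ts" | "py" | "go" | "java" => LanguageStatus::Forbidden,
        _ => LanguageStatus::Unknown,
    }
}
----

Because the check is a pure function of the path, it needs no model inference, which matches the "fast, no ML" role of the oracle.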
174+
175+
=== Toolchain Rules
176+
177+
* `npm` requires `deno.json` (no npm without Deno)
178+
179+
=== Security Patterns
180+
181+
* Detects hardcoded secrets (passwords, API keys)
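
As a rough illustration of what a deterministic secret check could look like, the sketch below flags lines that assign a quoted string literal to a suspiciously named key. The key list, the function names, and the matching heuristic are assumptions; the actual patterns used by the oracle are not specified in this README.

[source,rust]
----
/// Hypothetical key names; the real pattern list is not given in this README.
const SUSPICIOUS_KEYS: [&str; 4] = ["password", "passwd", "api_key", "secret"];

/// Returns the 1-based line numbers where a suspicious key appears to be
/// assigned a non-empty quoted string literal.
pub fn find_hardcoded_secrets(source: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| looks_like_secret(line))
        .map(|(index, _)| index + 1)
        .collect()
}

/// Naive heuristic: `<name containing a suspicious key> = "<literal>"`.
fn looks_like_secret(line: &str) -> bool {
    let lower = line.to_lowercase();
    let (lhs, rhs) = match lower.split_once('=') {
        Some(parts) => parts,
        None => return false,
    };
    let names_secret = SUSPICIOUS_KEYS.iter().any(|key| lhs.contains(*key));
    // Strip whitespace and a trailing semicolon before checking the quotes.
    let value = rhs.trim().trim_end_matches(';');
    let double_quoted = value.starts_with('"') && value.ends_with('"');
    let single_quoted = value.starts_with('\'') && value.ends_with('\'');
    names_secret && value.len() > 2 && (double_quoted || single_quoted)
}
----

A heuristic this simple will produce false positives and misses; it is meant only to show the shape of a rule-based check.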
182+
183+
== Configuration
184+
185+
Initialize local configuration:
186+
187+
[source,bash]
188+
----
189+
conative init
190+
----
191+
192+
This creates `.conative/policy.ncl` using Nickel for type-safe configuration:
193+
194+
[source,nickel]
195+
----
196+
{
197+
name = "My Project Policy",
198+
languages = {
199+
tier1 = [...],
200+
forbidden = [...],
201+
exceptions = [
202+
{ language = "python", allowed_paths = ["scripts/"], reason = "Build scripts" }
203+
]
204+
},
205+
enforcement = {
206+
slm_weight = 1.5,
207+
escalate_threshold = 0.4,
208+
block_threshold = 0.7,
209+
}
210+
}
211+
----
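
One plausible reading of the `enforcement` block is that a violation score is compared against the two thresholds. The sketch below assumes exactly that; the `Enforcement` struct mirrors the field names above, but `verdict_for`, the score's range of 0.0 to 1.0, and the use of `>=` at the boundaries are assumptions rather than documented behavior.

[source,rust]
----
/// Mirrors the `enforcement` record in the Nickel example above.
pub struct Enforcement {
    pub slm_weight: f64,
    pub escalate_threshold: f64,
    pub block_threshold: f64,
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Allow,
    Escalate,
    Block,
}

/// Maps a violation score (assumed to lie in 0.0..=1.0) to a verdict.
/// Hypothetical rule: block at or above `block_threshold`, escalate at or
/// above `escalate_threshold`, otherwise allow.
pub fn verdict_for(score: f64, config: &Enforcement) -> Verdict {
    if score >= config.block_threshold {
        Verdict::Block
    } else if score >= config.escalate_threshold {
        Verdict::Escalate
    } else {
        Verdict::Allow
    }
}
----

How `slm_weight` feeds into the score is not shown here, since the README does not specify the formula.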
212+
213+
== Decision Matrix
214+
215+
[cols="1,1,1"]
216+
|===
217+
| LLM Confidence | SLM Violation Score | Result
218+
219+
| High (>0.8) | Low (<0.3) | *ALLOW*
220+
| High (>0.8) | Med (0.3-0.6) | ESCALATE
221+
| High (>0.8) | High (>0.6) | *BLOCK*
222+
| Med (0.5-0.8) | Any >0.4 | ESCALATE
223+
| Low (<0.5) | Any | ESCALATE
224+
|===
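
The matrix can be transcribed into a lookup function. This is a sketch of the table as written, not the arbiter's implementation: the function name `decide` is hypothetical, inputs are assumed to lie in 0.0 to 1.0, and the one cell the table leaves open (medium confidence with an SLM score of 0.4 or below) returns `None` rather than a guessed result.

[source,rust]
----
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Allow,
    Escalate,
    Block,
}

/// Transcribes the decision matrix above.
/// Returns `None` for the combination the table does not cover.
pub fn decide(llm_confidence: f64, slm_violation: f64) -> Option<Verdict> {
    if llm_confidence > 0.8 {
        // High confidence: the SLM score selects the outcome.
        if slm_violation < 0.3 {
            Some(Verdict::Allow)
        } else if slm_violation <= 0.6 {
            Some(Verdict::Escalate)
        } else {
            Some(Verdict::Block)
        }
    } else if llm_confidence >= 0.5 {
        // Medium confidence: only scores above 0.4 are specified.
        if slm_violation > 0.4 {
            Some(Verdict::Escalate)
        } else {
            None
        }
    } else {
        // Low confidence: always escalate.
        Some(Verdict::Escalate)
    }
}
----

Returning `Option` keeps the unspecified cell visible to callers instead of silently defaulting it to ALLOW or ESCALATE.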
225+
226+
== Project Structure
227+
228+
----
229+
conative-gating/
230+
src/
231+
main.rs # CLI application
232+
oracle/ # Policy Oracle crate (Rust)
233+
slm/ # SLM Evaluator crate (Rust)
234+
config/
235+
policy.ncl # Default policy (Nickel)
236+
schema.ncl # Policy schema
237+
training/
238+
compliant/ # Examples that should pass
239+
violations/ # Examples that should fail
240+
edge_cases/ # Spirit violations for SLM
241+
docs/
242+
ARCHITECTURE.md # Full design specification
243+
*.adoc # Integration documentation
244+
----
245+
246+
== Integration
247+
248+
=== Claude Code Hook
249+
250+
[source,json]
251+
----
252+
{
253+
"hooks": {
254+
"pre-commit": "conative scan --strict"
255+
}
256+
}
257+
----
258+
259+
=== Pre-commit Hook
260+
261+
[source,yaml]
262+
----
263+
repos:
264+
- repo: local
265+
hooks:
266+
- id: conative-gating
267+
name: Conative Policy Check
268+
entry: conative scan
269+
language: system
270+
pass_filenames: false
271+
----
272+
273+
=== Programmatic Validation
274+
275+
[source,bash]
276+
----
277+
# Validate structured proposals
278+
conative validate proposal.json --strict
279+
----
280+
281+
Proposal format:
282+
283+
[source,json]
284+
----
285+
{
286+
"id": "uuid",
287+
"action_type": {"CreateFile": {"path": "src/util.rs"}},
288+
"content": "file contents here",
289+
"files_affected": ["src/util.rs"],
290+
"llm_confidence": 0.95
291+
}
292+
----
293+
294+
== Related Projects
295+
296+
* *NeuroPhone* - Neurosymbolic phone AI (integrates Conative Gating)
297+
* *ECHIDNA* - Multi-prover orchestration (SLM as another "prover")
298+
* *RSR Framework* - Rhodium Standard Repository specifications
299+
* *Axiom.jl* - Provable Julia ML (future formal verification)
300+
301+
== License
302+
303+
SPDX-License-Identifier: AGPL-3.0-or-later
304+
305+
Copyright (C) 2025 Jonathan D.A. Jewell
306+
307+
== References
308+
309+
* link:docs/ARCHITECTURE.md[Full Architecture Specification]
310+
* link:docs/MAAF_INTEGRATION.adoc[MAAF Integration]
311+
* link:docs/STATE_ECOSYSTEM_SCHEMA.adoc[STATE/ECOSYSTEM Schema]
