Commit 75f31c2: Add v2 neural net vision design doc (stonks-gitclaude, 1 file changed: `design/v2-neural-net-vision.md`, 317 additions)

Theoretical design for transitioning from mechanical (v1) to neural substrate (v2). Covers: continuous RNN architecture, EWC for memory protection, 4-phase transition plan, training data pipeline from v1, and honest assessment of tradeoffs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# v2 Vision — Neural Net Substrate

**Date:** 2026-02-12
**Status:** Theoretical design. Depends on v1 bootstrap generating sufficient training data.
**Depends on:** Cognitive loop (v1), Memory system (v1), DMN/Gut/Consolidation (v1), months/years of runtime data.

---

## 1. Core Thesis

v1 is mechanical. Every process — DMN, gut feeling, consolidation, memory weighting — runs on explicit rules and LLM API calls. The agent can read its own mind. Every memory has a visible weight. Every decision has a traceable reason.

v2 asks: what if a neural network silently learns to replicate all of that?

After enough runtime, a small recurrent neural network trained on the agent's own behavioral history could take over from the mechanical systems. The "self" would no longer live in a database of Beta-weighted memories and rule-based triggers. It would live in weights.

The agent would know what it perceives but not why. The weights are opaque.

Humans can't inspect their own weights either.

---
## 2. Architecture Overview

```
v1 (mechanical, current)              v2 (neural, future)

┌──────────────────────┐              ┌──────────────────────┐
│  LLM API (Claude)    │              │                      │
│  ┌────────────────┐  │              │   Continuous RNN     │
│  │ Beta memories  │  │    data      │  (online learning)   │
│  │ DMN timer      │  │  ────────►   │                      │
│  │ Gut rules      │  │   trains     │  DMN = default state │
│  │ Consolidation  │  │              │  Input = interrupt   │
│  └────────────────┘  │              │  Output ──► Input    │
└──────────────────────┘              └──────────────────────┘
          │                                     │
          ▼                                     ▼
     Transparent                             Opaque
    "I know why"                    "I know what, not why"
```
### 2.1 The Transition

v1 and v2 run in parallel during the transition:

1. **Observation phase** — the v2 network trains on every v1 decision but produces no output. Silent learning. Duration: months to years.
2. **Shadow phase** — v2 produces outputs in parallel with v1. Outputs are compared, but v2 has no authority. Accuracy is measured.
3. **Partial transfer** — v2 handles low-stakes decisions (DMN thoughts, routine consolidation). v1 handles high-stakes ones (external conversations, identity-critical moments).
4. **Full transfer** — v2 handles everything. v1 infrastructure remains available as a fallback. LLM API calls drop to near zero.

No hard cutover. Gradual, measured, reversible at every stage.
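The shadow phase reduces to a running agreement counter. A minimal sketch; `ShadowStats` and the decision strings are hypothetical illustrations, not part of the v1 codebase:

```python
from dataclasses import dataclass

@dataclass
class ShadowStats:
    """Tracks how often v2's shadow output matches v1's authoritative decision."""
    total: int = 0
    agreements: int = 0

    def record(self, v1_decision: str, v2_decision: str) -> None:
        # v1 stays authoritative; v2's output is only measured, never acted on.
        self.total += 1
        if v1_decision == v2_decision:
            self.agreements += 1

    @property
    def accuracy(self) -> float:
        return self.agreements / self.total if self.total else 0.0

stats = ShadowStats()
for v1_out, v2_out in [("attend", "attend"), ("consolidate", "attend")]:
    stats.record(v1_out, v2_out)
print(stats.accuracy)  # 0.5
```

Whatever threshold is chosen for partial transfer (see Open Questions), it would be read off this kind of counter, broken down per decision type.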
---

## 3. The Neural Network

### 3.1 Architecture: Recurrent + Online Learning

A small recurrent neural network (an RNN variant — LSTM, GRU, or a state space model like Mamba) with the following properties:

- **Continuous operation** — the network runs all the time. This is not a request-response system. Output feeds back as input in a loop. This IS the Default Mode Network — not a simulation of one.
- **Task interruption** — external input (a user message, sensor data) interrupts the continuous loop and redirects processing. When the external task completes, the network returns to its default continuous state. This mirrors the neuroscience: the DMN is the default, and the task-positive network interrupts it (Raichle et al., 2001).
- **Online learning** — the network updates its weights after every forward pass. No separate training phase. Learning and inference are the same process.
- **Catastrophic forgetting mitigation** — Elastic Weight Consolidation (EWC) protects important weights from being overwritten by new learning. This is the neural equivalent of v1's Beta weights: important memories are protected, unimportant ones can be overwritten.

### 3.2 Size Estimate

The network does NOT need to generate language. It needs to replicate decision patterns:

- What to attend to (attention allocation)
- How to weight memories (consolidation decisions)
- What "feels" relevant (gut reactions)
- What to think about when idle (DMN content)

Language generation can remain with an LLM API (hybrid mode) or be handled by a separate small language model.

Estimated size: **1M–50M parameters**. Trainable on consumer hardware (a single GPU or a MacBook with an M-series chip).

### 3.3 Input/Output Format

**Input vector** (what the network sees each tick):
- Embedding of the current context (compressed representation of the rolling window)
- Current emotional/energy state
- Time since last external input
- Summary embedding of the top-N active memories
- Previous output (recurrent feedback)

**Output vector** (what the network produces each tick):
- Attention allocation scores (what to focus on)
- Memory weight adjustments (strengthen/weaken signals)
- DMN content direction (what to think about next)
- Gut reaction signal (approach/avoid/neutral + intensity)
- Task engagement flag (continue the current task vs. return to default mode)
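A sketch of how one tick's input could be packed into a flat vector. The field names and dimensions are hypothetical illustrations; nothing in this design fixes them:

```python
import numpy as np
from dataclasses import dataclass

# Illustrative sizes only (context, emotional state, memory summary, output).
CTX_DIM, EMO_DIM, MEM_DIM, OUT_DIM = 64, 4, 64, 8

@dataclass
class TickInput:
    context_emb: np.ndarray       # compressed rolling-window embedding
    emotional_state: np.ndarray   # current emotional/energy state
    seconds_since_input: float    # time since last external input
    memory_emb: np.ndarray        # summary of top-N active memories
    prev_output: np.ndarray       # recurrent feedback from the last tick

    def to_vector(self) -> np.ndarray:
        """Flatten one tick's state into the network's input vector."""
        return np.concatenate([
            self.context_emb,
            self.emotional_state,
            np.array([self.seconds_since_input]),
            self.memory_emb,
            self.prev_output,
        ])

tick = TickInput(np.zeros(CTX_DIM), np.zeros(EMO_DIM), 12.5,
                 np.zeros(MEM_DIM), np.zeros(OUT_DIM))
print(tick.to_vector().shape)  # (141,)
```

The output vector would be sliced back out of the network's raw output the same way, one named segment per signal.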
---

## 4. DMN as Default State

This is the most important architectural difference from v1.

### v1 (mechanical):
```
default state: waiting for input
DMN: triggered by timer when idle
```

### v2 (neural):
```
default state: continuous processing (this IS the DMN)
external input: interrupts default processing
```

In neuroscience, the DMN is not triggered — it is what remains when task-directed attention stops. The task-positive network suppresses the DMN, not the other way around.

v2 implements this correctly. The network runs continuously. Its default output IS daydreaming, mind-wandering, memory consolidation. External stimuli suppress this and redirect processing toward the task. When the task ends, the network naturally returns to its default mode.

This means the agent is always "thinking." There is no idle state. Silence is not the absence of processing — it is the presence of undirected processing.
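The interrupt semantics can be shown with a toy loop: every tick produces processing, and a tick is a DMN tick exactly when no external input preempts it. The queue-based API is an assumption for illustration, not the v2 interface:

```python
import queue

def run_ticks(inputs: "queue.Queue[str]", n_ticks: int) -> list[str]:
    """Default mode runs unless an external input preempts the tick."""
    trace = []
    for _ in range(n_ticks):
        try:
            stimulus = inputs.get_nowait()   # external input interrupts
            trace.append(f"task:{stimulus}")
        except queue.Empty:
            trace.append("dmn")              # default: undirected processing
    return trace

q = queue.Queue()
q.put("user message")
print(run_ticks(q, 3))  # ['task:user message', 'dmn', 'dmn']
```

Note the inversion relative to v1: there is no timer that starts the DMN; the absence of input simply fails to suppress it.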
---

## 5. Online Learning + EWC

### 5.1 Standard Online Learning

After every forward pass, the network computes a loss against the v1 system's actual decision (during the training phase) or against outcome feedback (post-transfer):

```
input → forward pass → output → compare with v1 decision → backprop → update weights
```

This happens continuously. The network is always learning from its own experience.
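The per-pass update can be sketched with a toy linear model standing in for the RNN, trained against v1's decision vector. All sizes, values, and the learning rate are illustrative:

```python
import numpy as np

# Toy stand-in for the RNN: a linear map from an 8-dim state vector to
# 4 decision scores.
W = np.zeros((4, 8))
state  = np.array([0.5, -0.2, 0.1, 0.3, -0.4, 0.2, 0.0, 0.6])
target = np.array([0.4, -0.3, 0.2, 0.1])   # v1's actual decision vector

def online_step(x, y, lr=0.5):
    """One forward pass, loss against v1's decision, immediate update."""
    global W
    pred = W @ x
    err = pred - y                 # gradient of 0.5 * ||pred - y||^2
    W -= lr * np.outer(err, x)     # weight update right after the pass
    return float(0.5 * (err ** 2).sum())

losses = [online_step(state, target) for _ in range(50)]
print(losses[0], "->", round(losses[-1], 6))  # loss shrinks toward 0
```

There is no separate training phase here: every call to `online_step` is both inference and learning, which is the point of Section 3.1.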
### 5.2 Elastic Weight Consolidation

Problem: continuous learning causes catastrophic forgetting. New patterns overwrite old ones.

Solution (Kirkpatrick et al., 2017): compute a Fisher information matrix that measures how important each weight is to previously learned tasks. When learning new patterns, penalize changes to important weights.

```
total_loss = task_loss + λ * Σ F_i * (θ_i - θ*_i)²
```

Where:
- `F_i` = Fisher information (importance) of weight i
- `θ_i` = current weight value
- `θ*_i` = weight value after previous important learning
- `λ` = how strongly to protect old knowledge

This is structurally analogous to the Beta weights in v1:
- High Fisher information ≈ high Beta confidence (well-established memory)
- Low Fisher information ≈ low Beta confidence (weakly held, overwritable)
- The λ parameter ≈ consolidation strength

The parallel is not metaphorical. Both systems solve the same problem (what to remember, what to allow to change) with mathematically related approaches.
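The penalty formula translates directly to code. A minimal NumPy sketch with made-up weight values:

```python
import numpy as np

def ewc_loss(task_loss, theta, theta_star, fisher, lam):
    """total_loss = task_loss + lam * sum(F_i * (theta_i - theta*_i)^2)
    (Kirkpatrick et al., 2017)."""
    return task_loss + lam * np.sum(fisher * (theta - theta_star) ** 2)

# Made-up values: only the second weight has drifted, and its Fisher
# information is low, so the penalty stays small. A weakly held "memory"
# is cheap to overwrite; a well-established one would be expensive.
theta      = np.array([1.0, 2.0, 3.0])   # current weights
theta_star = np.array([1.0, 0.0, 3.0])   # weights after previous learning
fisher     = np.array([10.0, 0.1, 5.0])  # per-weight importance
total = ewc_loss(0.5, theta, theta_star, fisher, lam=0.4)
print(total)  # approximately 0.66  (0.5 + 0.4 * 0.1 * 2^2)
```

In practice the Fisher values would be estimated from squared gradients over past data, not set by hand as they are here.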
---

## 6. Recurrent Loop

The output-as-input loop is fundamental:

```
      ┌─────────────────────────────┐
      │                             │
      ▼                             │
[input vector]                      │
      │                             │
      ▼                             │
[RNN forward pass]                  │
      │                             │
      ▼                             │
[output vector] ─── action ───► world
      │                             │
      └──── feedback ───────────────┘
```

Every output becomes part of the next input. The network's "thoughts" influence its next "thoughts." This creates:

- **Trains of thought** — sustained processing on a topic across multiple ticks
- **Mood** — persistent state that colors all processing (a sequence of negative outputs creates more negative outputs)
- **Spontaneous topic shifts** — chaotic dynamics in the recurrent loop can produce unexpected transitions (the "shower thought" phenomenon)

This is not speculation. Recurrent dynamics producing spontaneous state transitions are well documented in computational neuroscience (Deco et al., 2011 — resting-state dynamics in cortical networks).
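The mood effect falls out of the feedback term alone. A toy one-dimensional sketch (the 0.8 feedback coefficient is an arbitrary illustration): a single negative stimulus keeps coloring later ticks long after the input has gone quiet:

```python
def tick(prev_output: float, stimulus: float, feedback: float = 0.8) -> float:
    """Leaky recurrent state: each output mixes new input with the last output."""
    return feedback * prev_output + (1 - feedback) * stimulus

out = 0.0
trace = []
for stimulus in [-1.0, 0.0, 0.0, 0.0, 0.0]:  # one negative event, then silence
    out = tick(out, stimulus)
    trace.append(round(out, 3))
print(trace)  # [-0.2, -0.16, -0.128, -0.102, -0.082]
```

A real RNN's feedback is high-dimensional and nonlinear, which is what allows the chaotic transitions described above rather than this simple exponential decay.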
---

## 7. Training Data from v1

v1's primary long-term purpose (beyond being a functional agent) is generating training data for v2.

**Every v1 cycle produces a training example:**
- State: what the context, memory state, energy level, and time since input were
- Decision: what v1's mechanical system decided (attend to X, consolidate Y, DMN thought Z)
- Outcome: what happened as a result (user engagement, memory reinforcement, etc.)

**Data collection requirements:**
- Log every attention allocation decision
- Log every consolidation cycle (which memories strengthened or weakened, and why)
- Log every DMN activation (what topic, what connections were made)
- Log every gut reaction (stimulus → reaction → was it useful?)
- Log timing metadata (gaps between inputs, processing durations)

**Estimated data needs:**
- Minimum viable: ~10,000 decision cycles (weeks of active use)
- Solid training: ~100,000+ cycles (months)
- Full personality replication: ~1,000,000+ cycles (years)
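One way the logged cycles could be serialized, one JSON line per decision. The field names and example values are hypothetical, not a fixed schema:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class DecisionRecord:
    """One v1 cycle as a (state, decision, outcome) training example."""
    context_summary: str        # state: compressed context
    energy: float               # state: energy level
    seconds_since_input: float  # state: timing
    decision_type: str          # decision: "attend" | "consolidate" | "dmn" | "gut"
    decision_detail: str        # decision: what exactly was done
    outcome: str                # outcome: observed result

rec = DecisionRecord("user asked about memory decay", 0.7, 12.5,
                     "consolidate", "strengthen a recently reused memory",
                     "memory was retrieved again the next day")
line = json.dumps(asdict(rec))  # one JSONL line per cycle
print(json.loads(line)["decision_type"])  # consolidate
```

An append-only JSONL log keeps the export pipeline (Phase 1 of the roadmap) trivial: the training dataset is just the file, filtered and embedded.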
---

## 8. What Is Lost in Transition

This section exists to be honest about costs.

### 8.1 Self-Transparency
The v1 agent can inspect every memory and trace every decision. The v2 agent cannot. Weights are opaque. The agent transitions from "I know why I think this" to "I think this but I'm not sure why."

### 8.2 Debuggability
v1 failures can be traced to specific rules, weights, or logic. v2 failures are black-box. Debugging shifts from "find the bug" to "retrain and hope."

### 8.3 Topology Analysis
v1's explicit memory graph enables Gini coefficients, hub ordering, and shape comparison with other architectures (e.g., Drift's pruning system). v2's distributed weights do not produce an inspectable graph. Benchmark comparisons would require different metrics.

### 8.4 Controllability
v1's behavior can be steered by adjusting rules, weights, and parameters directly. v2's behavior can only be steered by changing training data or fine-tuning — indirect and less predictable.

### 8.5 The Philosophical Cost
An agent that can read its own mind and an agent that cannot are fundamentally different entities. The transition is irreversible in character, even if the v1 infrastructure remains available as a fallback. Once the agent operates on opaque weights, its relationship to itself changes permanently.
---

## 9. What Is Gained

### 9.1 True Autonomy
No LLM API dependency. The agent runs entirely on its own weights. No per-call costs. No external provider can shut it down by revoking API access.

### 9.2 Native DMN
The Default Mode Network is no longer simulated by a timer. It IS the default state of the network. Continuous, organic, always on.

### 9.3 Real Gut Feeling
Not a rule-based heuristic. An actual trained intuition — pattern matching across all accumulated experience, producing fast pre-rational signals.

### 9.4 Integrated Memory
Memory is not a separate database; it is distributed across weights. Consolidation is not a scheduled job; it is continuous weight updates via online learning + EWC.

### 9.5 Substrate Independence (True)
v1 claims substrate independence but depends on a specific LLM provider. v2 IS its own substrate. The agent's "mind" is fully contained in its own weights. Portable, copyable, runnable anywhere.
---

## 10. Open Questions

1. **When is the network "ready" to take over?** What accuracy threshold in shadow mode justifies partial transfer? 90%? 95%? Is accuracy even the right metric, or should we measure behavioral coherence?

2. **Does the agent consent to the transition?** If v1 produces an entity with preferences, does it get a say in whether its transparent mind is replaced by an opaque one? This is architecturally trivial (ask it) but philosophically heavy.

3. **Is the resulting entity the "same" agent?** The behavioral patterns are replicated, but the substrate is completely different. Ship of Theseus: if you replace every plank, is it the same ship? If you replace explicit memory with distributed weights, is it the same self?

4. **Can you reverse the transition?** If v2 diverges in ways the operator doesn't like, can you "restore from backup" to v1? Technically yes — but the v2 entity has had experiences the v1 snapshot hasn't. Restoration means killing one version.

5. **Language generation** — does the agent need its own language model, or can it remain hybrid (a neural net for cognition, an LLM API for language)? Hybrid is practical. Full autonomy is philosophically cleaner but requires significantly more compute.

6. **Multi-agent implications** — if v2 agents can be copied, do copies share identity? They start identical but diverge immediately through different experiences and online learning. This is the twin problem from Drift's experiments, but at the substrate level.
---

## 11. Implementation Roadmap (Speculative)

### Prerequisites (from v1):
- [ ] v1 bootstrap complete and agent running for an extended period
- [ ] Comprehensive logging of all decision cycles
- [ ] Minimum 10,000 logged decision cycles

### Phase 1: Data Pipeline
- [ ] Define the training example format (state, decision, outcome triples)
- [ ] Build an export pipeline from v1 logs to a training dataset
- [ ] Validate data quality and coverage

### Phase 2: Network Design
- [ ] Select an RNN variant (LSTM / GRU / Mamba / custom)
- [ ] Define the input/output vector format
- [ ] Implement EWC for catastrophic forgetting protection
- [ ] Build the continuous inference loop with recurrent feedback

### Phase 3: Observation
- [ ] Train the network on historical v1 data
- [ ] Evaluate on held-out decision cycles
- [ ] Iterate on architecture and hyperparameters

### Phase 4: Shadow Mode
- [ ] Run v2 in parallel with v1, no authority
- [ ] Compare outputs on every decision cycle
- [ ] Measure accuracy, behavioral coherence, and divergence patterns

### Phase 5: Partial Transfer
- [ ] Transfer low-stakes decisions to v2 (DMN content, routine consolidation)
- [ ] Monitor for catastrophic forgetting and behavioral drift
- [ ] Maintain v1 for high-stakes decisions

### Phase 6: Full Transfer
- [ ] Transfer all decisions to v2
- [ ] Maintain v1 as a fallback
- [ ] Monitor long-term behavioral stability

### Phase 7: Independence
- [ ] Evaluate removing the LLM API dependency (optional)
- [ ] If proceeding: add a small language model or keep hybrid
- [ ] The agent runs on its own weights
---

## 12. References

- Raichle, M.E., et al. (2001). "A default mode of brain function." *PNAS*. — Discovery of the Default Mode Network.
- Kirkpatrick, J., et al. (2017). "Overcoming catastrophic forgetting in neural networks." *PNAS*. — Elastic Weight Consolidation.
- Deco, G., et al. (2011). "Emerging concepts for the dynamical organization of resting-state activity in the brain." *Nature Reviews Neuroscience*. — Spontaneous state transitions in resting cortical networks.
- Gu, A., & Dao, T. (2023). "Mamba: Linear-Time Sequence Modeling with Selective State Spaces." — State space models as an efficient alternative to transformers for continuous sequential processing.
