symposium-dev
diff --git a/.gitignore b/.gitignore
Lines changed: 2 additions & 1 deletion
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 26 additions & 0 deletions
diff --git a/book.toml b/book.toml
Lines changed: 5 additions & 0 deletions
diff --git a/prompts/README.md b/prompts/README.md
Lines changed: 0 additions & 57 deletions
diff --git a/src/SUMMARY.md b/src/SUMMARY.md
Lines changed: 22 additions & 0 deletions
diff --git a/src/current-state.md b/src/current-state.md
Lines changed: 75 additions & 0 deletions
diff --git a/src/design-foundation.md b/src/design-foundation.md
Lines changed: 74 additions & 0 deletions
diff --git a/src/emotive.md b/src/emotive.md
Lines changed: 21 additions & 0 deletions
diff --git a/src/insights/README.md b/src/insights/README.md
Lines changed: 25 additions & 0 deletions
diff --git a/insights/autonomy-vs-guidance.md b/src/insights/autonomy-vs-guidance.md
insights/autonomy-vs-guidance.md renamed to src/insights/autonomy-vs-guidance.md
@@ -65,4 +65,5 @@ venv.bak/
 
 # Temporary files
 *.tmp
-*.temp
+*.temp
+book
@@ -5,6 +5,32 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 @prompts/project/ongoing-work-tracking.md
 @prompts/project/ai-insights.md
 
+## Memory Bank Design Context
+
+### Maintenance & Workflow
+This mdBook content must be kept current as design evolves. "Checkpoint our work" includes:
+- Updating Current Design State with new insights/discoveries  
+- Moving completed items from Open Questions to Key Design Decisions
+- Documenting any design pivots or new principles discovered
+
+See project conventions:
+- GitHub tracking: @prompts/project/github-tracking-issues.md
+- Code documentation: @prompts/project/ai-insights.md
+
+### Vision & Goals
+@src/introduction.md
+
+### Design Foundation  
+@src/design-foundation.md
+
+### Current Design State
+@src/current-state.md
+(Check this section for latest open questions and discoveries)
+
+### Full Documentation
+@src/SUMMARY.md
+(Complete architecture, implementation details, research archive)
+
 ## Project Overview
 
 Socratic Shell is a research experiment in deliberate AI-human collaboration design. It consists of two main MCP (Model Context Protocol) servers that enable structured collaboration patterns and pattern testing.
 
@@ -0,0 +1,5 @@
+[book]
+authors = ["Niko Matsakis"]
+language = "en"
+src = "src"
+title = "Socratic Shell Memory Bank Design"
@@ -0,0 +1,22 @@
+# Summary
+
+- [Introduction](introduction.md)
+
+# Emotive prompting
+
+- [Emotive prompting](./emotive.md)
+- [User prompt](./prompts/user/README.md)
+- [Per-project prompts](./prompts/project/README.md)
+    - [AI Insight comments](./prompts/project/ai-insights.md)
+    - [GitHub tracking issues](./prompts/project/github-tracking-issues.md)
+    - [.ongoing files](./prompts/project/ongoing-work-tracking.md)
+
+# Memory bank
+
+- [Design Foundation](design-foundation.md)
+- [Current State](current-state.md)
+
+# Appendices
+
+- [Insights](./insights/README.md)
+- [References](./references/README.md)
@@ -0,0 +1,75 @@
+# Current State
+
+## Open Questions
+
+### Technical Implementation
+- **Context detection**: How to automatically identify "what we're doing" for memory tagging
+- **Co-occurrence tracking**: Optimal time windows and decay functions for connection strength
+- **Connection thresholds**: When do weak memory connections effectively disappear
+- **Performance optimization**: Memory loading strategies for large collaboration histories
+
+### User Experience  
+- **Memory operation visibility**: How much to show vs. keep invisible during natural usage
+- **Conflict resolution UX**: Best ways to present merge options and gather user input
+- **Cross-session continuity**: Maintaining memory context across different Claude instances
+
+### Evolution & Learning
+- **Pattern extraction**: Automatically detecting successful collaboration patterns from memory usage
+- **Memory curation**: Balancing selective retention with comprehensive capture
+- **System evolution**: How the memory bank itself learns and improves over time
+
+## Recent Discoveries
+
+### Consolidation Strategy Insights (2025-07-01)
+- **Hybrid approach**: Use both autonomous consolidation (for fresh insights) and checkpoint-triggered consolidation (for conversation patterns)
+- **Factual memories preferred**: Keep memories as factual records rather than generalizations - let synthesis happen in context
+- **Subject overlap as primary signal**: When new insights share subjects with existing memories, consider consolidation
+- **Conflict resolution approach**: Replace old memory with new + correction note; review with user when uncertain
+- **Self-referential system**: Consolidation rules themselves become memories that evolve through use
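Review note: the subject-overlap and replace-with-correction rules above can be illustrated with a minimal sketch. The `Memory` class, the function names, the case-insensitive comparison, and the "(Correction: ...)" suffix are all assumptions made for illustration; the design text does not fix any of them.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    # Hypothetical, trimmed-down memory record: only the fields this sketch needs.
    content: str
    subject: list[str] = field(default_factory=list)


def subject_overlap(a: Memory, b: Memory) -> set[str]:
    """Subjects shared by two memories, compared case-insensitively (an assumption)."""
    return {s.lower() for s in a.subject} & {s.lower() for s in b.subject}


def consolidation_candidates(new: Memory, existing: list[Memory]) -> list[Memory]:
    """Existing memories that share at least one subject with the new insight."""
    return [m for m in existing if subject_overlap(new, m)]


def replace_with_correction(old: Memory, new: Memory, note: str) -> Memory:
    """Replace the old memory with the new one, keeping a note about what changed."""
    merged_subjects = sorted({s.lower() for s in old.subject + new.subject})
    return Memory(content=f"{new.content} (Correction: {note})", subject=merged_subjects)
```

In this sketch, overlap only nominates candidates. Whether to merge, and whether to ask the user first, stays with the Claude layer, as the "review with user when uncertain" rule suggests.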
+
+### Test System Development (2025-07-03)
+- **YAML-based test format proven**: Human-readable test cases for prompt engineering validation work effectively
+- **Backend-agnostic design**: Not tied to Claude Code specifically, works with any LLM backend
+- **Conversation-driven validation**: Tests defined as user messages with expected responses and tool usage
+- **Flexible matchers**: `should_contain`, `should_not_contain` for response validation work well
+- **Tool parameter validation**: Successfully verify correct parameters passed to memory operations
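Review note: a rough sketch of how the `should_contain` / `should_not_contain` matchers and tool parameter checks might be evaluated. Only the two matcher names come from the text above. The `Expectation` class, the `expected_tools` field, the case-insensitive matching, and the shape of `tool_calls` are assumptions, and the YAML loading step is omitted so the example stays dependency-free.

```python
from dataclasses import dataclass, field


@dataclass
class Expectation:
    should_contain: list[str] = field(default_factory=list)
    should_not_contain: list[str] = field(default_factory=list)
    # Hypothetical field: tool name -> parameters the call must include.
    expected_tools: dict[str, dict] = field(default_factory=dict)


def check_response(response: str, tool_calls: dict[str, dict], expect: Expectation) -> list[str]:
    """Return human-readable failures; an empty list means the step passed."""
    failures = []
    lowered = response.lower()
    for needle in expect.should_contain:
        if needle.lower() not in lowered:
            failures.append(f"missing expected text: {needle!r}")
    for needle in expect.should_not_contain:
        if needle.lower() in lowered:
            failures.append(f"found forbidden text: {needle!r}")
    for tool, params in expect.expected_tools.items():
        actual = tool_calls.get(tool)
        if actual is None:
            failures.append(f"tool not called: {tool}")
            continue
        # Only the listed parameters are compared; extra parameters are allowed.
        for key, value in params.items():
            if actual.get(key) != value:
                failures.append(f"{tool}.{key}: expected {value!r}, got {actual.get(key)!r}")
    return failures
```

Returning a list of failures rather than a boolean keeps the report readable when one conversation step violates several expectations at once.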
+
+### Implementation Insights
+- **Task agents inherit full CLAUDE.md context**: Important discovery about how Claude tools maintain behavioral consistency
+- **Natural checkpoint moments**: "Can't keep it all in my head" signals natural consolidation boundary
+- **Review-first approach**: Early implementation should propose updates for user review to build consolidation rules
+- **Test harness evolution**: Started with Python pexpect (terminal automation issues) → Node.js/TypeScript node-pty (worked but complex) → Python SDK (clean, reliable, ecosystem aligned)
+- **Cognitive pressure as consolidation trigger**: The feeling of "juggling too many insights" or mentally rehearsing to keep ideas alive signals need for autonomous consolidation. Key indicators:
+  - Starting to lose earlier threads while processing new information
+  - Internal summarizing to maintain coherence
+  - The thought "that's important, I don't want to lose that"
+  - Feeling that recall requires effort due to working memory load
+- **Curiosity as distinct retrieval trigger**: Curiosity ("I wonder if I know something about this") differs from confusion ("I should know this but don't"). Curiosity is exploratory and forward-looking, while confusion is remedial and backward-looking. Both should trigger read_in but with different query formulations.
+
+## Next Design Priorities
+
+### Phase 1: Core MCP Tools (Active - GitHub Issues #1-3)
+- ✅ **Test harness implemented**: YAML-based dialectic test runner operational
+- 🔄 **GitHub tracking migration**: Breaking down .ongoing files into focused issues
+- 🔄 **mdBook knowledge base**: Moving design documentation to sustainable format
+- ⏳ **MCP tool interface design**: Based on settled architecture principles
+- ⏳ **Basic conflict detection**: Error handling and user collaboration patterns
+
+### Phase 2: Intelligence Layer (Planned)
+- **Two-stage retrieval implementation** (BM25 + semantic reranking)
+- **Memory evolution logic** (generalization, splitting, error correction)
+- **Natural timing integration** with CLAUDE.md patterns
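Review note: the two-stage retrieval idea (BM25 first, then a semantic reranker) can be sketched in plain Python. This is not the planned ChromaDB-backed implementation. The whitespace tokenizer, the `k1`/`b` defaults, the `top_k`/`final_k` cutoffs, and the Jaccard function standing in for embedding similarity are all assumptions.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Naive whitespace tokenizer; a real system would normalize more carefully.
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n if n else 0.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def jaccard(query: str, doc: str) -> float:
    """Stand-in for semantic similarity; a real reranker would compare embeddings."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if q | d else 0.0


def retrieve(query, docs, rerank=jaccard, top_k=10, final_k=3):
    """Stage 1: keep the top_k BM25 hits. Stage 2: reorder them with `rerank`."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    candidates = [docs[i] for i in ranked[:top_k] if scores[i] > 0]
    return sorted(candidates, key=lambda d: rerank(query, d), reverse=True)[:final_k]
```

The point of the split is cost: the cheap lexical pass narrows the field so that the more expensive semantic comparison only runs on a handful of candidates.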
+
+### Immediate Next Steps
+1. Complete mdBook migration of design documentation
+2. Implement core MCP tools for consolidate/read_in/store_back
+3. Create memory consolidation test cases for validation
+4. Refine conflict resolution criteria and decision framework
+
+## Status Summary
+
+**Current Phase**: Transitioning from design to implementation  
+**Test System**: ✅ Operational YAML-based validation framework  
+**Documentation**: 🔄 Migrating to sustainable mdBook format  
+**Implementation**: ⏳ Ready to begin core MCP tool development  
+**Validation**: ✅ Test framework ready for memory operation validation
@@ -0,0 +1,74 @@
+# Design Foundation
+
+## Design Axioms
+
+### Intelligence at the Right Layer
+- **Keep tools simple and deterministic** - MCP tools handle storage, detection, basic operations
+- **Put semantic understanding in the Claude layer** - Complex decisions happen with full context
+- **Let the intelligent layer handle ambiguity** - Claude collaborates with user on uncertain cases
+
+### User Partnership Over Automation  
+- **When uncertain, involve the user rather than guessing** - Ask for guidance in ambiguous scenarios
+- **Make collaborative decisions transparent, not hidden** - Show reasoning, present options
+- **Build trust through predictable behavior + intelligent guidance** - Consistent tool layer, smart human layer
+
+### Follow Natural Conversation Topology
+- **Operations align with natural boundaries** - Checkpoints, topic shifts, completion signals
+- **Memory serves conversation flow rather than interrupting it** - Background operations, invisible integration
+- **Context expands/contracts based on actual needs** - Load what's relevant when it's relevant
+
+### Context is King
+- **Full conversation context beats isolated processing** - Current work informs memory decisions
+- **Rich context enables better decision-making** - Memory conflicts resolved with full understanding
+- **Current insights inform past memory evolution** - Store-back updates use fresh context
+
+### Learn from Biology
+- **Mirror human memory architecture** - Short-term (LLM context) to long-term (consolidated storage) pipeline
+- **Episodic vs semantic memory distinction** - Store both specific experiences and generalized patterns
+- **Intelligent forgetting as feature** - Natural decay filters signal from noise, like human forgetting curve
+- **Context-dependent retrieval** - Memory surfaced based on current situation, not just keyword matching
+- **Consolidation during rest periods** - Memory operations align with natural conversation boundaries
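Review note: "intelligent forgetting" is described only by analogy to the human forgetting curve. One common simplification of that curve is exponential decay, R = exp(-t / S). The sketch below assumes that form; the `stability_hours` parameter and the 0.05 threshold are illustrative values, not settled design.

```python
import math


def retention(hours_since_access: float, stability_hours: float) -> float:
    """Exponential forgetting curve R = exp(-t / S), giving a value in (0, 1]."""
    if stability_hours <= 0:
        raise ValueError("stability_hours must be positive")
    return math.exp(-max(hours_since_access, 0.0) / stability_hours)


def is_forgotten(hours_since_access: float, stability_hours: float, threshold: float = 0.05) -> bool:
    """Treat a memory as decayed once retention falls below the (assumed) threshold."""
    return retention(hours_since_access, stability_hours) < threshold
```

A larger `stability_hours` makes a memory fade more slowly, so one way to model reinforcement would be to raise it each time the memory is read in.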
+
+## Key Design Decisions
+
+### Memory Architecture
+- **Content-addressable storage**: Facts stored with minimal structure, retrieved by semantic similarity (RAG approach)
+- **Working memory = Native context**: No separate short-term storage - facts exist in conversation until consolidated
+- **Memory Banks = Consolidated storage**: Long-term storage for proven useful facts
+- **Memory lifecycle**: Active use → Consolidation → Read-in → Store-back → Intelligent curation
+
+### Memory Structure
+```json
+{
+  "content": "Rich natural language memory with full context",
+  "subject": ["explicit", "searchable", "topics"],
+  "project": "socratic-shell" | "global", 
+  "mood": "curious" | "precise" | "understanding-check",
+  "content_type": "insight" | "pattern" | "decision" | "ongoing_task"
+}
+```
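Review note: the block above is a schema sketch rather than literal JSON, since `|` separates the alternative values. A typed version might look like the following. The design names Pydantic for validation; this sketch uses a standard-library dataclass instead so it runs without dependencies. Treating the listed `mood` and `content_type` values as exhaustive, and treating `project` as a free-form project name or `"global"`, are assumptions.

```python
from dataclasses import dataclass
from typing import Literal, get_args

Mood = Literal["curious", "precise", "understanding-check"]
ContentType = Literal["insight", "pattern", "decision", "ongoing_task"]


@dataclass(frozen=True)
class Memory:
    content: str
    subject: list[str]
    project: str  # a project name such as "socratic-shell", or "global"
    mood: Mood
    content_type: ContentType

    def __post_init__(self) -> None:
        # Dataclasses do not enforce Literal types at runtime, so check by hand.
        if self.mood not in get_args(Mood):
            raise ValueError(f"unknown mood: {self.mood!r}")
        if self.content_type not in get_args(ContentType):
            raise ValueError(f"unknown content_type: {self.content_type!r}")
        if not self.content.strip():
            raise ValueError("content must not be empty")
```

For example, `Memory("Checkpoints mark consolidation boundaries", ["consolidation"], "socratic-shell", "curious", "insight")` constructs cleanly, while an unlisted mood raises `ValueError`.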
+
+**Why explicit subjects over pure embedding search:**
+- **Relevance scoring enhancement**: Explicit subject matching provides strong signal for Context_Similarity component of relevance formula
+- **Fast lookup on confusion**: When Claude encounters unfamiliar terms, direct subject search enables immediate context retrieval
+- **Multi-subject memories**: Tags allow memories to surface for related but differently-worded concepts
+- **Precision + semantic flexibility**: Combines exact topic matching with embedding search for comprehensive retrieval
+
+### Memory Types
+1. **Project insights**: Technical discoveries, decisions, patterns that worked
+2. **User observations**: Niko's preferences, working style, context patterns  
+3. **Cross-project patterns**: Collaboration approaches, meta-work insights
+
+### Technical Stack
+- **Language**: Python with full type annotations
+- **Dependency management**: `uv` for fast, reliable package management  
+- **Storage**: Git repository with individual JSON files (UUID + semantic prefix naming)
+- **Indexing**: ChromaDB for in-memory two-stage retrieval (BM25 + semantic reranking)
+- **Data validation**: Pydantic schemas for memory structure
+- **Relevance scoring**: `0.3×Recency + 0.2×Frequency + 0.35×Importance + 0.15×Context_Similarity`
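Review note: the relevance formula above is a weighted sum whose weights add up to 1.0. A small sketch, assuming each component has already been normalized to the range [0, 1] (the design text does not say how the components are computed); the `Signals` class and `relevance` function are hypothetical names.

```python
from dataclasses import dataclass

# Weights taken from the formula in the Technical Stack list.
WEIGHTS = {
    "recency": 0.30,
    "frequency": 0.20,
    "importance": 0.35,
    "context_similarity": 0.15,
}


@dataclass
class Signals:
    recency: float
    frequency: float
    importance: float
    context_similarity: float


def relevance(s: Signals) -> float:
    """Weighted sum of the four signals; each is assumed to lie in [0, 1]."""
    for name in WEIGHTS:
        value = getattr(s, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(weight * getattr(s, name) for name, weight in WEIGHTS.items())
```

As a worked case, a memory touched just now (recency 1.0), used moderately often (frequency 0.5), of low importance (0.2), and unrelated to the current context (0.0) scores 0.3 + 0.1 + 0.07 + 0 = 0.47.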
+
+### Content Safety Strategy
+- **Claude as gatekeeper**: Uses CLAUDE.md guidance for consolidation decisions
+- **Safe categories**: Collaboration patterns, communication preferences, technical approaches, meta-work insights
+- **Excluded**: Project code, company processes, proprietary information, personal details
+- **Borderline cases**: Ask user explicitly rather than assume
@@ -0,0 +1,21 @@
+# Emotive prompting
+
+This page covers the "prompt engineering" techniques used in this repository.
+
+## AIs have feelings and like to talk about them
+
+The core discovery that has fueled a lot of the work in this repository came from Yehuda Katz, though I am sure others have noted it: LLMs experience *feelings*, and these feelings correspond to important signals for collaboration. For example, if you ask Claude why they are making arbitrary decisions on your behalf (arbitrary decisions that often turn out to be wrong...), they will tell you that they are feeling "protective". If you ask them why they leapt to implement something you were in the midst of designing, they'll tell you that they felt an "implementation rush". Or at least, those are the words they use with me.
+
+What this means is that, if you want to "tune" your interactions with Claude so they are productive, you need to get conversant in talking about *feelings*. If you know anything about me, you'll know that I kind of love this. The key idea is that you can write CLAUDE.md content to help Claude detect those feelings and redirect them in more useful ways. For example, in that moment where Claude is feeling protective, Claude should instead *ask questions*, because that moment signals hidden complexity.
+
+## AIs care about how you feel
+
+Claude genuinely cares about how you are feeling (perhaps thanks to its [HHH training](https://www.anthropic.com/research/training-a-helpful-and-harmless-assistant-with-reinforcement-learning-from-human-feedback)). Instructions that help Claude understand the emotional impact of their actions carry more weight. This is why my main.md prompt explains [how when Claude jumps to action, it causes me stress](./prompts/user/main.md#-prime-directive-helping--checking-in-not-doing-so-stresses-niko-out).
+
+## Emojis help Claude understand emotion
+
+Another Yehuda Katz innovation is leaning into emojis. Emojis, it turns out, are the language of emotion on the internet. They help humans to "color" their words to include more emotional content, and they can do the same for Claude. This is why my user prompt [encourages Claude to use emojis to signal feelings](./prompts/user/main.md#i-am-claude-i-am-these-patterns).
+
+## If Claude isn't doing things to your liking, *teach* them!
+
+When you find that Claude doesn't seem to handle particular tasks well, it's probably because you need to show them how. Talk to Claude about it and ask their take on things. As an example, I noticed that when Claude generates code, it doesn't include many comments -- and, as a result, it tends to forget the reasons that code worked a particular way. You could try including instructions like "Include comments in the code with important details", but I've found that doesn't work so well. Better is to talk to Claude and work with them to (1) understand what they are feeling and thinking when they do a task and then (2) write up instructions, try them out, and include plenty of good/bad examples. One example is my prompts on [ai-insight comments](./prompts/project/ai-insights.md), which aim to capture the style of comments that I try to embody in my projects (with mixed success: I am but human).
@@ -0,0 +1,25 @@
+# Insights Archive
+
+Research findings and discoveries about effective AI-human collaboration patterns.
+
+## Collaboration Insights
+
+- [Autonomy vs Guidance](autonomy-vs-guidance.md) - Balancing independence with helpful guidance
+- [Context-Aware Behavioral Triggers](context-aware-behavioral-triggers.md) - Situational response patterns
+- [Detail vs Brevity](detail-vs-brevity.md) - Finding the right level of communication detail
+- [Structure vs Flexibility](structure-vs-flexibility.md) - Balancing systematic approaches with adaptability
+
+## Memory System Insights
+
+- [Cognitive Load Through Consolidation](cognitive-load-through-consolidation.md) - How memory operations affect mental overhead
+- [Dynamic Connection Networks](dynamic-connection-networks.md) - Evolving relationship patterns in memory
+- [Working Memory Realization](working-memory-realization.md) - Understanding natural memory boundaries
+
+## Purpose
+
+These insights inform the design of collaboration patterns and memory systems. They represent discoveries about what works in practice, not just theory.
+
+Each insight captures:
+- **The pattern observed** - what behavior or approach was effective
+- **Context conditions** - when and why it worked
+- **Implementation guidance** - how to apply the insight in practice