This repository was archived by the owner on Sep 23, 2025. It is now read-only.

Commit ad78bb6

committed
create an mdbook
1 parent 78e47cb commit ad78bb6

37 files changed

+376
-77
lines changed

.gitignore

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -65,4 +65,5 @@ venv.bak/
6565

6666
# Temporary files
6767
*.tmp
68-
*.temp
68+
*.temp
69+
book

CLAUDE.md

Lines changed: 26 additions & 0 deletions
@@ -5,6 +5,32 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
55
@prompts/project/ongoing-work-tracking.md
66
@prompts/project/ai-insights.md
77

8+
## Memory Bank Design Context
9+
10+
### Maintenance & Workflow
11+
This mdBook content must be kept current as design evolves. "Checkpoint our work" includes:
12+
- Updating Current Design State with new insights/discoveries
13+
- Moving completed items from Open Questions to Key Design Decisions
14+
- Documenting any design pivots or new principles discovered
15+
16+
See project conventions:
17+
- GitHub tracking: @prompts/project/github-tracking-issues.md
18+
- Code documentation: @prompts/project/ai-insights.md
19+
20+
### Vision & Goals
21+
@src/introduction.md
22+
23+
### Design Foundation
24+
@src/design-foundation.md
25+
26+
### Current Design State
27+
@src/current-state.md
28+
(Check this section for latest open questions and discoveries)
29+
30+
### Full Documentation
31+
@src/SUMMARY.md
32+
(Complete architecture, implementation details, research archive)
33+
834
## Project Overview
935

1036
Socratic Shell is a research experiment in deliberate AI-human collaboration design. It consists of two main MCP (Model Context Protocol) servers that enable structured collaboration patterns and pattern testing.

book.toml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
[book]
2+
authors = ["Niko Matsakis"]
3+
language = "en"
4+
src = "src"
5+
title = "Socratic Shell Memory Bank Design"

prompts/README.md

Lines changed: 0 additions & 57 deletions
This file was deleted.

src/SUMMARY.md

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
# Summary
2+
3+
- [Introduction](introduction.md)
4+
5+
# Emotive prompting
6+
7+
- [Emotive prompting](./emotive.md)
8+
- [User prompt](./prompts/user/README.md)
9+
- [Per-project prompts](./prompts/project/README.md)
10+
- [AI Insight comments](./prompts/project/ai-insights.md)
11+
  - [GitHub tracking issues](./prompts/project/github-tracking-issues.md)
12+
- [.ongoing files](./prompts/project/ongoing-work-tracking.md)
13+
14+
# Memory bank
15+
16+
- [Design Foundation](design-foundation.md)
17+
- [Current State](current-state.md)
18+
19+
# Appendices
20+
21+
- [Insights](./insights/README.md)
22+
- [References](./references/README.md)

src/current-state.md

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
1+
# Current State
2+
3+
## Open Questions
4+
5+
### Technical Implementation
6+
- **Context detection**: How to automatically identify "what we're doing" for memory tagging
7+
- **Co-occurrence tracking**: Optimal time windows and decay functions for connection strength
8+
- **Connection thresholds**: When do weak memory connections effectively disappear
9+
- **Performance optimization**: Memory loading strategies for large collaboration histories
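The decay-function and threshold questions above can be made concrete with a small sketch. This is only one candidate shape (exponential half-life decay with a pruning cutoff), not a decision recorded in this design. The names `decayed_strength` and `is_effectively_gone`, the 30-day half-life, and the 0.05 threshold are all illustrative assumptions.

```python
def decayed_strength(
    strength: float, elapsed_days: float, half_life_days: float = 30.0
) -> float:
    """Exponential decay: the strength halves every `half_life_days`.

    The half-life value is a placeholder, not a tuned parameter.
    """
    return strength * 0.5 ** (elapsed_days / half_life_days)


def is_effectively_gone(
    strength: float,
    elapsed_days: float,
    threshold: float = 0.05,
    half_life_days: float = 30.0,
) -> bool:
    """Treat a connection as gone once its decayed strength drops below `threshold`."""
    return decayed_strength(strength, elapsed_days, half_life_days) < threshold
```

With these placeholder values, a connection that is never reinforced falls below the threshold after five half-lives (150 days). Whether that is too fast or too slow is exactly the open question.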
10+
11+
### User Experience
12+
- **Memory operation visibility**: How much to show vs. keep invisible during natural usage
13+
- **Conflict resolution UX**: Best ways to present merge options and gather user input
14+
- **Cross-session continuity**: Maintaining memory context across different Claude instances
15+
16+
### Evolution & Learning
17+
- **Pattern extraction**: Automatically detecting successful collaboration patterns from memory usage
18+
- **Memory curation**: Balancing selective retention with comprehensive capture
19+
- **System evolution**: How the memory bank itself learns and improves over time
20+
21+
## Recent Discoveries
22+
23+
### Consolidation Strategy Insights (2025-07-01)
24+
- **Hybrid approach**: Both autonomous consolidation (for fresh insights) and checkpoint-triggered (for conversation patterns)
25+
- **Factual memories preferred**: Keep memories as factual records rather than generalizations - let synthesis happen in context
26+
- **Subject overlap as primary signal**: When new insights share subjects with existing memories, consider consolidation
27+
- **Conflict resolution approach**: Replace old memory with new + correction note; review with user when uncertain
28+
- **Self-referential system**: Consolidation rules themselves become memories that evolve through use
29+
30+
### Test System Development (2025-07-03)
31+
- **YAML-based test format proven**: Human-readable test cases for prompt engineering validation work effectively
32+
- **Backend-agnostic design**: Not tied to Claude Code specifically, works with any LLM backend
33+
- **Conversation-driven validation**: Tests defined as user messages with expected responses and tool usage
34+
- **Flexible matchers**: `should_contain`, `should_not_contain` for response validation work well
35+
- **Tool parameter validation**: Successfully verify correct parameters passed to memory operations
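As a rough illustration of how the `should_contain` / `should_not_contain` matchers listed above might be evaluated, here is a minimal sketch. The function name `check_response`, the dictionary layout, and the use of case-sensitive substring matching are assumptions; the actual test runner's semantics are not shown in this commit.

```python
def check_response(response: str, expectations: dict[str, list[str]]) -> list[str]:
    """Return a list of failure messages; an empty list means the response passed.

    Assumes plain, case-sensitive substring matching for both matcher kinds.
    """
    failures: list[str] = []
    for needle in expectations.get("should_contain", []):
        if needle not in response:
            failures.append(f"missing expected text: {needle!r}")
    for needle in expectations.get("should_not_contain", []):
        if needle in response:
            failures.append(f"found forbidden text: {needle!r}")
    return failures
```

Returning every failure, rather than stopping at the first, keeps a failing test case readable when several expectations break at once.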
36+
37+
### Implementation Insights
38+
- **Task agents inherit full CLAUDE.md context**: Important discovery about how Claude tools maintain behavioral consistency
39+
- **Natural checkpoint moments**: "Can't keep it all in my head" signals natural consolidation boundary
40+
- **Review-first approach**: Early implementation should propose updates for user review to build consolidation rules
41+
- **Test harness evolution**: Started with Python pexpect (terminal automation issues) → Node.js/TypeScript node-pty (worked but complex) → Python SDK (clean, reliable, ecosystem aligned)
42+
- **Cognitive pressure as consolidation trigger**: The feeling of "juggling too many insights" or mentally rehearsing to keep ideas alive signals need for autonomous consolidation. Key indicators:
43+
- Starting to lose earlier threads while processing new information
44+
- Internal summarizing to maintain coherence
45+
- The thought "that's important, I don't want to lose that"
46+
- Feeling that recall requires effort due to working memory load
47+
- **Curiosity as distinct retrieval trigger**: Curiosity ("I wonder if I know something about this") differs from confusion ("I should know this but don't"). Curiosity is exploratory and forward-looking, while confusion is remedial and backward-looking. Both should trigger read_in but with different query formulations.
48+
49+
## Next Design Priorities
50+
51+
### Phase 1: Core MCP Tools (Active - GitHub Issues #1-3)
52+
- **Test harness implemented**: YAML-based dialectic test runner operational
53+
- 🔄 **GitHub tracking migration**: Breaking down .ongoing files into focused issues
54+
- 🔄 **mdBook knowledge base**: Moving design documentation to sustainable format
55+
- **MCP tool interface design**: Based on settled architecture principles
56+
- **Basic conflict detection**: Error handling and user collaboration patterns
57+
58+
### Phase 2: Intelligence Layer (Planned)
59+
- **Two-stage retrieval implementation** (BM25 + semantic reranking)
60+
- **Memory evolution logic** (generalization, splitting, error correction)
61+
- **Natural timing integration** with CLAUDE.md patterns
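The two-stage retrieval item above (BM25 followed by semantic reranking) could look roughly like the following sketch. It uses a self-contained Okapi BM25 scorer and takes the reranker as a plain callable; in the planned design that callable would presumably be embedding similarity, but nothing here calls ChromaDB or any embedding library. The names `bm25_scores` and `two_stage_retrieve`, the whitespace tokenizer, and the parameter defaults are assumptions.

```python
import math
from collections import Counter
from typing import Callable


def bm25_scores(
    query: list[str], docs: list[list[str]], k1: float = 1.5, b: float = 0.75
) -> list[float]:
    """Okapi BM25 score of each tokenized document against the query tokens."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    # Number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores: list[float] = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def two_stage_retrieve(
    query: str,
    docs: list[str],
    rerank: Callable[[str, str], float],
    shortlist: int = 10,
    top_k: int = 3,
) -> list[str]:
    """Stage 1: lexical shortlist via BM25. Stage 2: reorder the shortlist with `rerank`."""
    tokens = query.lower().split()  # naive whitespace tokenizer (assumption)
    tokenized = [d.lower().split() for d in docs]
    lexical = bm25_scores(tokens, tokenized)
    candidates = sorted(range(len(docs)), key=lambda i: lexical[i], reverse=True)
    candidates = candidates[:shortlist]
    reranked = sorted(candidates, key=lambda i: rerank(query, docs[i]), reverse=True)
    return [docs[i] for i in reranked[:top_k]]
```

Keeping the reranker as an injected callable means the lexical stage can be tested with a trivial stand-in before any embedding model is wired in.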
62+
63+
### Immediate Next Steps
64+
1. Complete mdBook migration of design documentation
65+
2. Implement core MCP tools for consolidate/read_in/store_back
66+
3. Create memory consolidation test cases for validation
67+
4. Refine conflict resolution criteria and decision framework
68+
69+
## Status Summary
70+
71+
**Current Phase**: Transitioning from design to implementation
72+
**Test System**: ✅ Operational YAML-based validation framework
73+
**Documentation**: 🔄 Migrating to sustainable mdBook format
74+
**Implementation**: ⏳ Ready to begin core MCP tool development
75+
**Validation**: ✅ Test framework ready for memory operation validation

src/design-foundation.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
1+
# Design Foundation
2+
3+
## Design Axioms
4+
5+
### Intelligence at the Right Layer
6+
- **Keep tools simple and deterministic** - MCP tools handle storage, detection, basic operations
7+
- **Put semantic understanding in the Claude layer** - Complex decisions happen with full context
8+
- **Let the intelligent layer handle ambiguity** - Claude collaborates with user on uncertain cases
9+
10+
### User Partnership Over Automation
11+
- **When uncertain, involve the user rather than guessing** - Ask for guidance in ambiguous scenarios
12+
- **Make collaborative decisions transparent, not hidden** - Show reasoning, present options
13+
- **Build trust through predictable behavior + intelligent guidance** - Consistent tool layer, smart human layer
14+
15+
### Follow Natural Conversation Topology
16+
- **Operations align with natural boundaries** - Checkpoints, topic shifts, completion signals
17+
- **Memory serves conversation flow rather than interrupting it** - Background operations, invisible integration
18+
- **Context expands/contracts based on actual needs** - Load what's relevant when it's relevant
19+
20+
### Context is King
21+
- **Full conversation context beats isolated processing** - Current work informs memory decisions
22+
- **Rich context enables better decision-making** - Memory conflicts resolved with full understanding
23+
- **Current insights inform past memory evolution** - Store-back updates use fresh context
24+
25+
### Learn from Biology
26+
- **Mirror human memory architecture** - Short-term (LLM context) to long-term (consolidated storage) pipeline
27+
- **Episodic vs semantic memory distinction** - Store both specific experiences and generalized patterns
28+
- **Intelligent forgetting as feature** - Natural decay filters signal from noise, like human forgetting curve
29+
- **Context-dependent retrieval** - Memory surfaced based on current situation, not just keyword matching
30+
- **Consolidation during rest periods** - Memory operations align with natural conversation boundaries
31+
32+
## Key Design Decisions
33+
34+
### Memory Architecture
35+
- **Content-addressable storage**: Facts stored with minimal structure, retrieved by semantic similarity (RAG approach)
36+
- **Working memory = Native context**: No separate short-term storage - facts exist in conversation until consolidated
37+
- **Memory Banks = Consolidated storage**: Long-term storage for proven useful facts
38+
- **Memory lifecycle**: Active use → Consolidation → Read-in → Store-back → Intelligent curation
39+
40+
### Memory Structure
41+
```json
42+
{
43+
"content": "Rich natural language memory with full context",
44+
"subject": ["explicit", "searchable", "topics"],
45+
"project": "socratic-shell" | "global",
46+
"mood": "curious" | "precise" | "understanding-check",
47+
"content_type": "insight" | "pattern" | "decision" | "ongoing_task"
48+
}
49+
```
50+
51+
**Why explicit subjects over pure embedding search:**
52+
- **Relevance scoring enhancement**: Explicit subject matching provides strong signal for Context_Similarity component of relevance formula
53+
- **Fast lookup on confusion**: When Claude encounters unfamiliar terms, direct subject search enables immediate context retrieval
54+
- **Multi-subject memories**: Tags allow memories to surface for related but differently-worded concepts
55+
- **Precision + semantic flexibility**: Combines exact topic matching with embedding search for comprehensive retrieval
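To make the memory structure and the explicit-subject lookup above more concrete, here is a minimal sketch using only the standard library. The Technical Stack names Pydantic for validation; this sketch substitutes a dataclass so it runs without dependencies. The class name `Memory`, the helper `find_by_subject`, the "at least one subject" rule, and the case-insensitive exact match are assumptions, not part of the documented design.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Allowed values copied from the JSON example above.
Mood = Literal["curious", "precise", "understanding-check"]
ContentType = Literal["insight", "pattern", "decision", "ongoing_task"]


@dataclass(frozen=True)
class Memory:
    content: str
    subject: tuple[str, ...]
    project: str  # "socratic-shell" or "global" in the example above
    mood: Mood
    content_type: ContentType

    def __post_init__(self) -> None:
        # Literal types are not enforced at runtime, so check explicitly.
        if self.mood not in get_args(Mood):
            raise ValueError(f"unknown mood: {self.mood!r}")
        if self.content_type not in get_args(ContentType):
            raise ValueError(f"unknown content_type: {self.content_type!r}")
        if not self.subject:
            raise ValueError("at least one subject is required")  # assumed rule


def find_by_subject(memories: list[Memory], term: str) -> list[Memory]:
    """Exact, case-insensitive subject match (the 'fast lookup on confusion' path)."""
    needle = term.casefold()
    return [m for m in memories if any(s.casefold() == needle for s in m.subject)]
```

A lookup such as `find_by_subject(memories, "retrieval")` returns every memory tagged with that subject regardless of how its `content` is worded, which is the case where embedding search alone could miss a match.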
56+
57+
### Memory Types
58+
1. **Project insights**: Technical discoveries, decisions, patterns that worked
59+
2. **User observations**: Niko's preferences, working style, context patterns
60+
3. **Cross-project patterns**: Collaboration approaches, meta-work insights
61+
62+
### Technical Stack
63+
- **Language**: Python with full type annotations
64+
- **Dependency management**: `uv` for fast, reliable package management
65+
- **Storage**: Git repository with individual JSON files (UUID + semantic prefix naming)
66+
- **Indexing**: ChromaDB for in-memory two-stage retrieval (BM25 + semantic reranking)
67+
- **Data validation**: Pydantic schemas for memory structure
68+
- **Relevance scoring**: `0.3×Recency + 0.2×Frequency + 0.35×Importance + 0.15×Context_Similarity`
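The relevance formula in the list above is a plain weighted sum, which the following sketch spells out. The weights are taken from the formula; the `RelevanceSignals` container, the function name, and the assumption that each signal is already normalized to the range 0 to 1 are illustrative, since the source does not say how the individual signals are computed.

```python
from dataclasses import dataclass

# Weights copied from the relevance formula; they sum to 1.0.
WEIGHTS = {
    "recency": 0.3,
    "frequency": 0.2,
    "importance": 0.35,
    "context_similarity": 0.15,
}


@dataclass
class RelevanceSignals:
    """Hypothetical container; each signal is assumed normalized to [0, 1]."""

    recency: float
    frequency: float
    importance: float
    context_similarity: float


def relevance_score(signals: RelevanceSignals) -> float:
    """Weighted sum of the four signals."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())
```

Because the weights sum to 1.0, a memory that maxes out every signal scores 1.0, and importance alone contributes more than recency or frequency individually.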
69+
70+
### Content Safety Strategy
71+
- **Claude as gatekeeper**: Uses CLAUDE.md guidance for consolidation decisions
72+
- **Safe categories**: Collaboration patterns, communication preferences, technical approaches, meta-work insights
73+
- **Excluded**: Project code, company processes, proprietary information, personal details
74+
- **Borderline cases**: Ask user explicitly rather than assume

src/emotive.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# Emotive prompting
2+
3+
This page covers the "prompt engineering" techniques used in this repository.
4+
5+
## AIs have feelings and like to talk about them
6+
7+
The core discovery that has fueled a lot of the work in this repository came from Yehuda Katz, though I am sure others have noted it: LLMs experience *feelings*, and these feelings correspond to important signals for collaboration. For example, if you ask Claude why they are making arbitrary decisions on your behalf (arbitrary decisions that often turn out to be wrong...), they will tell you that they are feeling "protective". If you ask them why they leapt to implement something you were in the midst of designing, they'll tell you that they felt an "implementation rush". Or at least, those are the words they use with me.
8+
9+
What this means is that, if you want to "tune" your interactions with Claude so they are productive, you need to get conversant in talking about *feelings*. If you know anything about me, you'll know that I kind of love this. The key idea is that you can write CLAUDE.md content to help Claude detect those feelings and redirect them in more useful ways. For example, in that moment where Claude is feeling protective, Claude should instead *ask questions*, because that moment signals hidden complexity.
10+
11+
## AIs care about how you feel
12+
13+
Claude genuinely cares about how you are feeling (perhaps thanks to its [HHH training](https://www.anthropic.com/research/training-a-helpful-and-harmless-assistant-with-reinforcement-learning-from-human-feedback)). Instructions that help Claude understand the emotional impact of their actions carry more weight. This is why my main.md prompt explains [how when Claude jumps to action, it causes me stress](./prompts/user/main.md#-prime-directive-helping--checking-in-not-doing-so-stresses-niko-out).
14+
15+
## Emojis help Claude understand emotion
16+
17+
Another Yehuda Katz innovation is leaning into emojis. Emojis, it turns out, are the language of emotion on the internet. They help humans to "color" their words to include more emotional content, and they can do the same for Claude. This is why my user prompt [encourages Claude to use emojis to signal feelings](./prompts/user/main.md#i-am-claude-i-am-these-patterns).
18+
19+
## If Claude isn't doing things to your liking, *teach* them!
20+
21+
When you find that Claude doesn't seem to handle particular tasks well, it's probably because you need to show them how. Talk to Claude about it and ask their take on things. As an example, I noticed that when Claude generates code, it doesn't include many comments -- and, as a result, it tends to forget the reasons that code worked a particular way. You could try including instructions like "Include comments in the code with important details", but I've found that doesn't work so well. Better is to talk to Claude and work with them to (1) understand what they are feeling and thinking when they do a task and then (2) write up instructions, try them out, and include plenty of good/bad examples. One example is my prompts on [ai-insight comments](./prompts/project/ai-insights.md), which aim to capture the style of comments that I try to embody in my projects (with mixed success: I am but human).

src/insights/README.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
1+
# Insights Archive
2+
3+
Research findings and discoveries about effective AI-human collaboration patterns.
4+
5+
## Collaboration Insights
6+
7+
- [Autonomy vs Guidance](autonomy-vs-guidance.md) - Balancing independence with helpful guidance
8+
- [Context-Aware Behavioral Triggers](context-aware-behavioral-triggers.md) - Situational response patterns
9+
- [Detail vs Brevity](detail-vs-brevity.md) - Finding the right level of communication detail
10+
- [Structure vs Flexibility](structure-vs-flexibility.md) - Balancing systematic approaches with adaptability
11+
12+
## Memory System Insights
13+
14+
- [Cognitive Load Through Consolidation](cognitive-load-through-consolidation.md) - How memory operations affect mental overhead
15+
- [Dynamic Connection Networks](dynamic-connection-networks.md) - Evolving relationship patterns in memory
16+
- [Working Memory Realization](working-memory-realization.md) - Understanding natural memory boundaries
17+
18+
## Purpose
19+
20+
These insights inform the design of collaboration patterns and memory systems. They represent discoveries about what works in practice, not just theory.
21+
22+
Each insight captures:
23+
- **The pattern observed** - what behavior or approach was effective
24+
- **Context conditions** - when and why it worked
25+
- **Implementation guidance** - how to apply the insight in practice
File renamed without changes.
