This repository was archived by the owner on Sep 23, 2025. It is now read-only.

Commit 1305e67

Add Hippo MVP design document
- AI-generated insights with reinforcement learning approach
- Consolidation-moment generation for better quality
- Importance-weighted scoring with temporal decay
- Clean data model with separate content/score timestamps
- Ready for implementation phase
1 parent 03993f1 commit 1305e67

1 file changed: +199 -0 lines changed
src/hippo/design-doc.md

Lines changed: 199 additions & 0 deletions
@@ -0,0 +1,199 @@
# Hippo MVP Design Document

*AI-Generated Salient Insights - Minimal Viable Prototype*

## Core Hypothesis

**Can AI-generated insights + reinforcement learning actually surface more valuable knowledge than traditional memory systems?**

The key insight: generate insights cheaply and frequently, then let natural selection through reinforcement determine what survives.

## MVP Scope

### What It Does
1. **Automatic Insight Generation**: AI generates insights at consolidation moments in the conversation ("make it so" moments, explicit checkpoints, "ah-ha!" moments, pattern recognition)
2. **Simple Storage**: Single JSON file with configurable path
3. **Natural Decay**: Insights lose relevance over time unless reinforced
4. **Reinforcement**: During consolidation moments, the user can upvote/downvote insights
5. **Context-Aware Search**: Retrieval considers both content and situational context with fuzzy matching

### What It Doesn't Do (Yet)
- Graph connections between insights
- Complex reinforcement algorithms
- Cross-session learning
- Memory hierarchy (generic vs project-specific)
- Automatic insight detection triggers

## Data Model

```json
{
  "insights": [
    {
      "uuid": "abc123-def456-789",
      "content": "User prefers dialogue format over instruction lists for collaboration prompts",
      "context": "design discussion about hippo",
      "importance": 0.7,
      "created_at": "2025-07-23T17:00:00Z",
      "content_last_modified_at": "2025-07-23T17:00:00Z",
      "score_at_last_change": 1.0,
      "score_last_modified_at": "2025-07-23T17:00:00Z"
    }
  ]
}
```

### Field Semantics

- **created_at**: When the insight was first generated (never changes)
- **content_last_modified_at**: When the content or context was last edited
- **importance**: AI-generated 0-1 rating of insight significance (set at creation)
- **score_at_last_change**: The score when it was last modified (starts at 1.0)
- **score_last_modified_at**: When the score was last explicitly changed (upvote/downvote)

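The record above could be loaded into a typed structure along these lines. This is a minimal sketch, not the project's implementation: the names `Insight`, `parse_ts`, and `load_insights` are hypothetical, and treating a trailing `Z` as UTC is an assumption about how timestamps are written.

```python
import json
from dataclasses import dataclass
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC (assumption)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Insight:
    uuid: str
    content: str
    context: str
    importance: float                   # 0-1 rating, set at creation
    created_at: datetime                # never changes
    content_last_modified_at: datetime  # last edit of content or context
    score_at_last_change: float         # starts at 1.0
    score_last_modified_at: datetime    # last upvote/downvote

    @classmethod
    def from_dict(cls, raw: dict) -> "Insight":
        return cls(
            uuid=raw["uuid"],
            content=raw["content"],
            context=raw["context"],
            importance=float(raw["importance"]),
            created_at=parse_ts(raw["created_at"]),
            content_last_modified_at=parse_ts(raw["content_last_modified_at"]),
            score_at_last_change=float(raw["score_at_last_change"]),
            score_last_modified_at=parse_ts(raw["score_last_modified_at"]),
        )


def load_insights(text: str) -> list[Insight]:
    """Parse the top-level {"insights": [...]} document into Insight objects."""
    return [Insight.from_dict(item) for item in json.loads(text)["insights"]]
```

Keeping the two timestamps as separate fields means a content edit never resets decay, and a vote never looks like a content change.
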
### Score Computation

The current score is computed on demand: `(score_at_last_change * importance) * (0.9 ^ days_since_score_last_modified)`

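As a rough sketch, the on-demand formula translates to a pure function over the stored fields. The names `current_score` and `DAILY_DECAY` are illustrative, and counting fractional days (rather than whole days) is an assumption the document does not settle.

```python
from datetime import datetime, timedelta, timezone

DAILY_DECAY = 0.9  # the document's 10%-per-day decay factor


def current_score(score_at_last_change: float, importance: float,
                  score_last_modified_at: datetime, now: datetime) -> float:
    """Score at `now`, derived from stored fields without mutating them."""
    days = (now - score_last_modified_at).total_seconds() / 86400.0
    days = max(days, 0.0)  # guard against clock skew (assumption)
    return score_at_last_change * importance * DAILY_DECAY ** days


# Usage: the sample insight (importance 0.7) three days after creation.
created = datetime(2025, 7, 23, 17, 0, tzinfo=timezone.utc)
score = current_score(1.0, 0.7, created, created + timedelta(days=3))
print(round(score, 4))  # 0.5103
```

Because nothing is written back, reading a score never changes the stored data; only an upvote or downvote does.
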
#### Score Evolution Examples

The examples below read most naturally with `importance` = 1.0, so that only decay and reinforcement are visible.

```
Day 0: Insight created → score_at_last_change = 1.0, score_last_modified_at = today
Day 3: Current score = 1.0 * 0.9³ = 0.729 (computed on-demand)
Day 3: User upvotes → score_at_last_change = 0.729 * 2.0 = 1.458, score_last_modified_at = today
Day 7: Current score = 1.458 * 0.9⁴ = 0.957 (computed on-demand)
Day 7: User downvotes → score_at_last_change = 0.957 * 0.1 = 0.096, score_last_modified_at = today
```

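The upvote and downvote steps can be replayed with a small helper. This is a sketch under stated assumptions: the multipliers 2.0 and 0.1 are read off the example, the function name `reinforce` is hypothetical, and importance is deliberately left out of the stored value because the score formula applies it at read time.

```python
UPVOTE_MULTIPLIER = 2.0    # read off the Day 3 example
DOWNVOTE_MULTIPLIER = 0.1  # read off the Day 7 example
DAILY_DECAY = 0.9


def reinforce(score_at_last_change: float, days_since_change: float,
              feedback: str) -> float:
    """Return the new score_at_last_change after an upvote or downvote.

    The stored score is first decayed to the present, then multiplied.
    The caller is expected to reset score_last_modified_at to now.
    """
    if feedback == "upvote":
        multiplier = UPVOTE_MULTIPLIER
    elif feedback == "downvote":
        multiplier = DOWNVOTE_MULTIPLIER
    else:
        raise ValueError(f"unknown feedback: {feedback!r}")
    decayed = score_at_last_change * DAILY_DECAY ** days_since_change
    return decayed * multiplier


after_upvote = reinforce(1.0, 3, "upvote")               # Day 3
after_downvote = reinforce(after_upvote, 4, "downvote")  # Day 7
print(round(after_upvote, 3), round(after_downvote, 3))  # 1.458 0.096
```
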
#### Score Interpretation

- **> 1.0**: Reinforced insights that have proven valuable
- **0.5 - 1.0**: Recent insights or those aging naturally
- **0.1 - 0.5**: Old insights that haven't been reinforced
- **< 0.1**: Effectively irrelevant, candidates for cleanup

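For cleanup or display, the bands could be encoded as a tiny classifier. The label strings and the handling of the exact boundaries (0.5 and 0.1 fall into the higher band) are assumptions for illustration only.

```python
def interpret_score(score: float) -> str:
    """Map a current score onto the interpretation bands listed above."""
    if score > 1.0:
        return "reinforced"
    if score >= 0.5:
        return "recent-or-aging"
    if score >= 0.1:
        return "stale"
    return "cleanup-candidate"
```
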
#### Search Ranking

Current score (computed on-demand) is a primary factor in search results:
- Higher scores surface first
- Combined with content/context match quality
- Provides natural filtering of stale insights

## Key Design Decisions

### Insight Generation Triggers
- **Consolidation moments only** - not continuous during conversation
- **Specific triggers**: "make it so" moments, explicit checkpointing, end of substantial conversations
- **Reflective approach** - generate with full session context for better importance assessment

### Context Design
- **Situational context** rather than thematic categories
- Examples: "design discussion about hippo", "debugging React performance issues", "code review of authentication system"
- **Fuzzy matching** - "debugging Rust performance" should surface insights from "debugging React performance"

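The document does not name a fuzzy-matching technique. One simple option, shown here purely as an illustration, is word-set (Jaccard) overlap between the query context and each stored context; the function name `context_similarity` is hypothetical.

```python
def context_similarity(query: str, context: str) -> float:
    """Jaccard overlap of lowercase word sets, in [0, 1]."""
    a, b = set(query.lower().split()), set(context.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


query = "debugging Rust performance"
for stored in ("debugging React performance issues",
               "design discussion about hippo"):
    print(stored, "->", context_similarity(query, stored))
```

With this measure the React debugging context shares two of five distinct words with the Rust query (0.4), while the design discussion shares none (0.0), which matches the intended "partial match" behaviour.
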
### Reinforcement Mechanism
- **Consolidation moments** are primary reinforcement opportunities
- **Simple feedback**: upvote (boost score and refresh the score timestamp) or downvote (cut the score so the insight fades sooner)
- **Ignore** = natural aging continues

### Storage
- **Single file**: `hippo.json` with `--path` command line argument
- **MCP tool interface** - AI uses automatically, no manual commands needed
- **JSON format** for simplicity in MVP

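A minimal sketch of the single-file storage, assuming `hippo.json` is the default when `--path` is omitted. The helper names and the write-then-rename step are illustrative choices, not something the document prescribes.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Hippo insight store (sketch)")
    parser.add_argument("--path", type=Path, default=Path("hippo.json"),
                        help="location of the JSON storage file")
    return parser.parse_args(argv)


def load_store(path: Path) -> dict:
    """Return the stored document, or an empty store if the file is missing."""
    if not path.exists():
        return {"insights": []}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def save_store(path: Path, store: dict) -> None:
    """Write to a temporary sibling file, then rename it over the target."""
    tmp = path.with_name(path.name + ".tmp")
    with tmp.open("w", encoding="utf-8") as handle:
        json.dump(store, handle, indent=2)
    tmp.replace(path)
```

Writing to a temporary file first means an interrupted save leaves the previous `hippo.json` intact rather than a half-written one.
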
## Technical Architecture

### Core Operations
```
record_insight(content, context) → uuid
search_insights(query, context_filter?) → List[InsightResult]
reinforce_insight(uuid, feedback: upvote|downvote)
decay_insights() → updates all scores
```

### Decay Function (Simple)
```
current_score = score_at_last_change * importance * (0.9 ^ days_since_score_last_modified)
```

### Search Algorithm
1. **Content matching** - substring/similarity on insight content
2. **Context matching** - fuzzy matching on situational context
3. **Relevance scoring** - combine content match + context match + current score
4. **Partial context bonus** - "debugging X" matches "debugging Y" with medium relevance

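The four steps above could be combined roughly as follows. Everything here is a sketch under assumptions: the weights are untuned placeholders, `difflib.SequenceMatcher` stands in for "similarity", word overlap stands in for fuzzy context matching, and each insight is a plain dict carrying a precomputed `current_score`.

```python
from difflib import SequenceMatcher


def content_match(query: str, content: str) -> float:
    """Substring hit counts as a full match; otherwise fall back to similarity."""
    query, content = query.lower(), content.lower()
    if query and query in content:
        return 1.0
    return SequenceMatcher(None, query, content).ratio()


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets, used for partial context matches."""
    x, y = set(a.lower().split()), set(b.lower().split())
    return len(x & y) / len(x | y) if x and y else 0.0


def rank(insights, query, context=None,
         w_content=0.5, w_context=0.3, w_score=0.2):
    """Return (relevance, insight) pairs, best first."""
    results = []
    for insight in insights:
        c_match = content_match(query, insight["content"])
        x_match = word_overlap(context, insight["context"]) if context else 0.0
        # Cap the score so heavy reinforcement cannot swamp match quality.
        score = min(insight["current_score"], 1.0)
        relevance = w_content * c_match + w_context * x_match + w_score * score
        results.append((relevance, insight))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results
```

A query of "bug" in the context "debugging Rust performance" would rank an insight recorded under "debugging React performance issues" above one from an unrelated design discussion, since the shared words contribute a partial context bonus.
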
## Integration with Collaborative Patterns

### Insight Generation Moments
- **"Make it so" moments** - decisions and consolidations
- **Problem solving** - when we figure something out
- **Pattern recognition** - when AI notices recurring themes
- **Contradictions** - when new information challenges previous insights
- **Meta moments** - observations about our collaboration itself

### Consolidation Workflow
1. AI surfaces recent insights from current session
2. User provides upvote/downvote feedback
3. AI applies reinforcement and continues
4. No explicit commands needed - part of natural flow

## Success Metrics

### Validation Questions
- Do reinforced insights get referenced in future conversations?
- Do reinforced insights feel more relevant than random historical ones?
- Does the system surface useful knowledge that would otherwise be forgotten?
- Is the insight generation frequency appropriate (not too noisy, not too sparse)?

### Measurable Outcomes
- **Reference rate**: How often do we actually use surfaced insights?
- **Reinforcement patterns**: Which types of insights get consistently upvoted?
- **Search effectiveness**: Do context-based searches return relevant results?

## Implementation Plan

### Phase 1: Basic Infrastructure
- JSON storage with decay function
- MCP tool for record/search/reinforce operations
- Command line interface for testing

### Phase 2: AI Integration
- Automatic insight generation during conversations
- Integration with consolidation moments
- Real-time storage via MCP

### Phase 3: Validation Period
- 2-3 weeks of actual usage in collaboration
- Collect metrics on insight utility
- Refine generation triggers and reinforcement

## Future Extensions (Post-MVP)

### Memory Hierarchy
```
hippo-generic.json          # User collaboration patterns
hippo-socratic-shell.json   # Project-specific insights
hippo-rust-blog.json        # Domain-specific insights
```

### Graph Connections
- Insights that appear together in consolidation
- Causal relationships (A led to B)
- Contradictory relationships (A replaced by B)

### Advanced Reinforcement
- Weak reinforcement from search/reference
- Cross-session learning
- Predictive surfacing based on current context

## Open Questions

1. **Generation frequency**: How many insights per conversation is optimal?
2. **Context granularity**: How specific should contexts be?
3. **Decay rate**: Is 10% per day the right decay rate?
4. **Reinforcement scaling**: How much should upvotes boost scores?
5. **Search ranking**: How to balance content vs context vs recency in results?

---

*The goal is to validate whether AI-generated insights with reinforcement learning can create a more useful memory system than traditional human-curated approaches. The MVP focuses on the core feedback loop: generate → decay → reinforce → surface.*
