diff --git a/PLAN.md b/PLAN.md
Lines changed: 223 additions & 1 deletion
diff --git a/packages/core/src/indexer.ts b/packages/core/src/indexer.ts
Lines changed: 8 additions & 0 deletions
@@ -240,6 +240,40 @@ Git history is valuable context that LLMs can't easily access. We add intelligen
 
 ---
 
+## Current: Performance & Reliability (v0.6.x - v0.7.x)
+
+> Critical high-impact improvements for production readiness and user experience.
+
+**Epic:** #104 (Progress: 6/9 complete)
+
+### Completed Improvements ✅
+
+| Feature | Status | Version | Impact |
+|---------|--------|---------|--------|
+| Index size reporting | ✅ Done | v0.4.3 | Track disk usage growth |
+| Adaptive concurrency | ✅ Done | v0.6.0 | Auto-detect optimal batch size by CPU/memory |
+| Incremental indexing | ✅ Done | v0.5.1 | <30s updates for single file changes (#122) |
+| Progress indicators | ✅ Done | v0.1.0 | Real-time feedback for long operations |
+| Error handling | ✅ Done | v0.3.0 | Graceful degradation |
+| Basic validation | ✅ Done | v0.2.0 | Git repo and path checks |
+
+### Remaining Work 🔄
+
+| Issue | Priority | Impact | Status |
+|-------|----------|--------|--------|
+| #152 - MCP lazy initialization | P0 | Reduce startup from 2-5s to <500ms | 🔲 Todo |
+| #153 - GitHub history in planner | P0 | Add commit context to AI plans | 🔲 Todo |
+| #154 - Memory monitoring | P1 | Prevent leaks, maintain <500MB usage | 🔲 Todo |
+
+**Success Metrics:**
+- ✅ Large repo indexing: <5min for 50k files
+- ✅ Incremental updates: <30s for single file changes
+- 🔲 MCP server startup: <500ms (currently 2-5s)
+- 🔲 Memory usage: <500MB steady state
+- 🔲 Planner quality: Include git history context
+
+---
+
 ## Next: Extended Git Intelligence (v0.5.0)
 
 > Building on git history with deeper insights.
@@ -277,7 +311,195 @@ Git history is valuable context that LLMs can't easily access. We add intelligen
 
 ---
 
-## Future: Extended Intelligence (v0.6+)
+## Next: Dashboard & Visualization (v0.7.1)
+
+> Making codebase insights visible and accessible.
+
+**Epic:** #145
+
+### Philosophy
+
+Dev-agent provides rich context about codebases, but it's currently text-only. A dashboard makes insights:
+- **Visible** - See language breakdown, component types, health status at a glance
+- **Interactive** - Explore relationships, drill into packages
+- **Actionable** - Identify areas needing attention
+
+### Goals
+
+1. **Enhanced CLI** (`dev dashboard`) - Terminal-based stats with rich formatting
+2. **Web Dashboard** - Next.js app with real-time insights
+3. **Data Infrastructure** - Aggregate stats during indexing for efficient display
+
+### Components
+
+| Component | Status | Priority |
+|-----------|--------|----------|
+| **CLI Enhancements** | | |
+| Language breakdown display | 🔲 Todo | 🔴 High |
+| Component type statistics | 🔲 Todo | 🔴 High |
+| Package-level stats (monorepo) | 🔲 Todo | 🔴 High |
+| Rich formatting (tables, colors) | 🔲 Todo | 🔴 High |
+| **Core Data Collection** | | |
+| Track language metrics in indexer | 🔲 Todo | 🔴 High |
+| Aggregate component type counts | 🔲 Todo | 🔴 High |
+| Package-level aggregation | 🔲 Todo | 🟡 Medium |
+| Change frequency tracking | 🔲 Todo | 🟡 Medium |
+| **Web Dashboard** | | |
+| Next.js app setup (`apps/dashboard/`) | 🔲 Todo | 🔴 High |
+| Tremor component library | 🔲 Todo | 🔴 High |
+| API routes (stats, health) | 🔲 Todo | 🔴 High |
+| Real-time stats display | 🔲 Todo | 🔴 High |
+| Language distribution charts | 🔲 Todo | 🟡 Medium |
+| Component type visualizations | 🔲 Todo | 🟡 Medium |
+| Health status indicators | 🔲 Todo | 🟡 Medium |
+| Vector index metrics (simple) | 🔲 Todo | 🟡 Medium |
+| Basic package list (monorepo) | 🔲 Todo | 🟡 Medium |
+
+### Architecture
+
+```
+apps/
+└── dashboard/          # Next.js 16 + React 19 + Tremor
+    ├── app/
+    │   ├── page.tsx    # Main dashboard
+    │   └── api/
+    │       └── stats/  # Next.js API routes
+    └── components/
+        └── tremor/     # Tremor dashboard components
+
+packages/core/
+└── src/
+    └── indexer/
+        └── stats-aggregator.ts  # New: Collect detailed stats
+```
+
+### Implementation Plan
+
+**Implementation Phases:**
+
+**Phase 1: Data Foundation**
+- Enhance IndexStats with language/component breakdowns
+- Aggregate stats during indexing (minimal overhead)
+- Foundation for all visualizations
+
+**Phase 2: CLI Enhancements**
+- Rich terminal output with tables and colors
+- Package-level breakdown for monorepos
+- Immediate user value
+
+**Phase 3: Web Dashboard**
+- Next.js 16 app in `apps/dashboard/`
+- Tremor component setup
+- Basic stats display with charts
+
+**Phase 4: Advanced Features**
+- Interactive exploration
+- Package explorer (monorepo support)
+- Real-time updates
+
+---
+
+## Next: Advanced LanceDB Visualizations (v0.7.2)
+
+> Making vector embeddings visible and explorable.
+
+### Philosophy
+
+LanceDB stores 384-dimensional embeddings for semantic search, but these are invisible to users. Advanced visualizations reveal:
+- **Where code lives** in semantic space (2D projections)
+- **What's related** beyond imports (similarity networks)
+- **How embeddings evolve** over time (drift tracking)
+- **Search quality** insights (what works, what doesn't)
+
+### Goals
+
+1. **Semantic Code Map** - 2D/3D projection of vector space
+2. **Similarity Explorer** - Interactive component relationship graph
+3. **Search Quality Dashboard** - Analyze search performance
+4. **Embedding Health** - Coverage and quality metrics per directory
+
+### Components
+
+| Component | Description | Priority |
+|-----------|-------------|----------|
+| **Semantic Code Map** | | |
+| t-SNE/UMAP projection to 2D | Visualize embedding space | 🔴 High |
+| Interactive scatter plot | Click to see code snippet | 🔴 High |
+| Color by language/type | Visual code categorization | 🟡 Medium |
+| Cluster detection | Auto-identify code groups | 🟡 Medium |
+| **Similarity Network** | | |
+| Component relationship graph | Force-directed layout | 🔴 High |
+| Semantic similarity edges | Show hidden relationships | 🔴 High |
+| Interactive exploration | Zoom, pan, filter | 🟡 Medium |
+| Duplication detection | High similarity alerts | 🟡 Medium |
+| **Search Quality** | | |
+| Search metrics dashboard | Track performance over time | 🔴 High |
+| Query similarity heatmap | Understand search patterns | 🟡 Medium |
+| "Dead zone" detection | Queries with poor results | 🟡 Medium |
+| Recommendation engine | Suggest better queries | 🟢 Low |
+| **Embedding Health** | | |
+| Coverage heatmap by directory | Identify blind spots | 🔴 High |
+| Quality scoring per file | Flag low-quality embeddings | 🟡 Medium |
+| Drift tracking over time | Monitor embedding changes | 🟡 Medium |
+| Re-index recommendations | Suggest what needs updating | 🟢 Low |
+
+### Architecture
+
+```
+Dashboard UI
+    ↓
+Advanced Viz Components (D3.js, Plotly, or similar)
+    ↓
+New API Routes
+    ├─ GET /api/embeddings/projection (t-SNE/UMAP data)
+    ├─ GET /api/embeddings/similarity (network graph)
+    ├─ GET /api/embeddings/quality (coverage metrics)
+    └─ GET /api/embeddings/search-history (query analysis)
+    ↓
+LanceDB + Vector Analysis
+    └─ Dimensionality reduction, similarity queries, metrics
+```
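
The similarity-network and duplication-detection items in this section both come down to comparing embedding vectors pairwise. The sketch below shows one way that could work. It assumes cosine similarity as the metric and a 0.95 alert threshold, neither of which the plan specifies. `EmbeddedComponent`, `SimilarityEdge`, `cosineSimilarity`, and `findNearDuplicates` are illustrative names, not part of dev-agent or LanceDB.

```typescript
// Hypothetical shape: one indexed component plus its embedding vector.
interface EmbeddedComponent {
  id: string;
  vector: number[];
}

// Hypothetical edge type for a similarity graph.
interface SimilarityEdge {
  source: string;
  target: string;
  score: number;
}

// Cosine similarity of two equal-length vectors; returns 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force pairwise comparison; the threshold default is an assumption.
function findNearDuplicates(
  components: EmbeddedComponent[],
  threshold = 0.95,
): SimilarityEdge[] {
  const edges: SimilarityEdge[] = [];
  for (let i = 0; i < components.length; i++) {
    for (let j = i + 1; j < components.length; j++) {
      const score = cosineSimilarity(components[i].vector, components[j].vector);
      if (score >= threshold) {
        edges.push({ source: components[i].id, target: components[j].id, score });
      }
    }
  }
  // Highest-similarity pairs first, so likely duplicates surface at the top.
  return edges.sort((x, y) => y.score - x.score);
}
```

The pairwise loop is quadratic in the number of components, so it is only suitable for small indexes. A real implementation would more likely lean on the vector store's own nearest-neighbor search.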
+
+### Dependencies
+
+**New:**
+- `umap-js` or `tsne-js` - Dimensionality reduction
+- `d3` or `@visx/visx` - Advanced visualizations
+- `react-force-graph` - Network graphs (or `sigma.js`)
+- `@tensorflow/tfjs` (optional) - Advanced vector operations
+
+### Implementation Phases
+
+**Phase 1: Semantic Code Map**
+- Implement t-SNE/UMAP projection
+- Create 2D scatter plot visualization
+- Add basic interactivity (hover, click)
+
+**Phase 2: Similarity Network**
+- Build component similarity graph
+- Implement force-directed layout
+- Add filtering and exploration
+
+**Phase 3: Search Quality**
+- Track search queries and results
+- Build metrics dashboard
+- Implement quality scoring
+
+**Phase 4: Embedding Health**
+- Coverage analysis by directory
+- Quality scoring per file
+- Drift detection system
+
+### Success Metrics
+
+- Developers can visually explore codebase semantics
+- Identify code duplication without running analysis tools
+- Understand which areas need re-indexing
+- Improve search query formulation based on insights
+
+---
+
+## Future: Extended Intelligence (v0.8+)
 
 ### Multi-Language Support
 
 
@@ -0,0 +1,8 @@
+/**
+ * Repository Indexer module exports
+ */
+
+export { RepositoryIndexer } from './indexer/index';
+export { StatsAggregator } from './indexer/stats-aggregator';
+export * from './indexer/types';
+export * from './indexer/utils';
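
The diff exports `StatsAggregator` without showing its body. As a rough illustration of the "aggregate stats during indexing" idea from PLAN.md, a minimal sketch might look like the following. Everything in it is assumed for illustration: the `IndexedDocument` shape, the `DetailedStats` fields, and the `add`/`snapshot` methods are not taken from the actual `stats-aggregator.ts`. The class is named `SketchStatsAggregator` to keep it distinct from the real export.

```typescript
// Hypothetical input: one document produced by the indexer.
interface IndexedDocument {
  language: string; // e.g. "typescript"
  componentType: string; // e.g. "function", "class"
  packageName?: string; // present only for monorepo packages
}

// Hypothetical output shape for CLI or dashboard display.
interface DetailedStats {
  total: number;
  byLanguage: Record<string, number>;
  byComponentType: Record<string, number>;
  byPackage: Record<string, number>;
}

// Increment the counter stored under `key`, starting from zero.
function increment(counts: Map<string, number>, key: string): void {
  counts.set(key, (counts.get(key) ?? 0) + 1);
}

// Copy a Map of counters into a plain object for serialization.
function toRecord(counts: Map<string, number>): Record<string, number> {
  const record: Record<string, number> = {};
  counts.forEach((value, key) => {
    record[key] = value;
  });
  return record;
}

class SketchStatsAggregator {
  private total = 0;
  private readonly byLanguage = new Map<string, number>();
  private readonly byComponentType = new Map<string, number>();
  private readonly byPackage = new Map<string, number>();

  // Called once per document while indexing; constant work per call.
  add(doc: IndexedDocument): void {
    this.total += 1;
    increment(this.byLanguage, doc.language);
    increment(this.byComponentType, doc.componentType);
    if (doc.packageName !== undefined) {
      increment(this.byPackage, doc.packageName);
    }
  }

  // Returns a copy, so callers cannot mutate the running counters.
  snapshot(): DetailedStats {
    return {
      total: this.total,
      byLanguage: toRecord(this.byLanguage),
      byComponentType: toRecord(this.byComponentType),
      byPackage: toRecord(this.byPackage),
    };
  }
}
```

Counting as documents are added means no second pass over the index is needed, which is one way to meet the plan's "minimal overhead" goal.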