Description
Port Hippo to Rust for improved startup performance
Status: Phase 2 Complete → Phase 3 Ready
Current Understanding
The Python version has a ~6 second startup time (3.8s import + 1.9s model load), which is problematic for CLI usage. Research suggests Rust alternatives can achieve 100-500ms cold starts, a 12-60x improvement.
Based on comprehensive research from both Claude AI and Gemini, FastEmbed-rs emerges as the clear primary choice:
- Direct all-MiniLM-L6-v2 support without conversion
- 100-500ms startup times vs current 6 seconds
- Production-proven (backed by Qdrant, used in production)
- Quantized models + ONNX Runtime for optimal performance
- Simple API matching current usage pattern
Architecture Decisions
MCP Framework: Use official Rust MCP SDK (https://github.com/modelcontextprotocol/rust-sdk) - available on crates.io as mcp-sdk
Storage Strategy: Port existing file-based storage to Rust with exact JSON compatibility to preserve existing memories
Migration Approach: Side-by-side development. The Python system continues running while we develop the Rust version, and we switch over once testing has validated functionality
Project Structure: Single crate in rs/ directory with both library API (src/lib.rs) and standalone server (src/main.rs)
- Library-first design for embedding into proxy MCP servers
- Standalone server capability for direct usage
- Standard Rust tests/ directory for integration tests
Testing Strategy: ✅ COMPLETE
- Unit tests for core embedding/similarity logic
- Integration tests with mock filesystem/timing (no Python comparison tests needed)
- End-to-end tests against library methods (not separate server process)
- 21/21 tests passing with innovative CI optimization approach
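One common way to get deterministic "mock timing" in integration tests is to route all time lookups through a small trait and substitute a hand-advanced clock in tests. The sketch below assumes that approach; `Clock`, `MockClock`, and `days_since` are hypothetical names for illustration and are not taken from the Hippo codebase.

```rust
use std::cell::Cell;
use std::time::Duration;

/// Source of "now", expressed as time elapsed since an arbitrary epoch.
trait Clock {
    fn now(&self) -> Duration;
}

/// Test clock that only moves when the test tells it to.
struct MockClock {
    elapsed: Cell<Duration>,
}

impl MockClock {
    fn new() -> Self {
        MockClock { elapsed: Cell::new(Duration::ZERO) }
    }

    /// Move the clock forward by `by`.
    fn advance(&self, by: Duration) {
        self.elapsed.set(self.elapsed.get() + by);
    }
}

impl Clock for MockClock {
    fn now(&self) -> Duration {
        self.elapsed.get()
    }
}

/// Hypothetical helper: whole days since an insight was last touched.
fn days_since(clock: &dyn Clock, last_touched: Duration) -> u64 {
    clock.now().saturating_sub(last_touched).as_secs() / 86_400
}

fn main() {
    let clock = MockClock::new();
    let recorded_at = clock.now();
    // Simulate three days and one minute passing without sleeping.
    clock.advance(Duration::from_secs(3 * 86_400 + 60));
    println!("days since record: {}", days_since(&clock, recorded_at));
}
```

Production code would implement the same trait on top of the system clock, so time-dependent logic can be exercised without real waiting.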
Implementation Phases
Phase 1: Core Prototype ✅ COMPLETE
- Set up rs/ project structure with Cargo.toml
- Implement minimal FastEmbed-rs integration in src/search.rs
- Port core data models (Insight, HippoStorage) to src/models.rs with serde
- Validate FastEmbed-rs performance claims and semantic similarity accuracy
- Basic unit tests for embedding and similarity logic
- Comprehensive testing strategy with 21/21 tests passing
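The similarity logic covered by these unit tests can be illustrated without the embedding model. The following is a minimal sketch assuming cosine similarity over embedding vectors, which is the usual choice for sentence embeddings; the issue does not state which metric Hippo uses. The function names are hypothetical, and the toy vectors stand in for real all-MiniLM-L6-v2 embeddings, which are 384-dimensional.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns None if the lengths differ, the input is empty, or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Score every candidate against the query and sort by descending similarity.
/// Candidates that cannot be scored are skipped.
fn rank(query: &[f32], candidates: &[Vec<f32>]) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = candidates
        .iter()
        .enumerate()
        .filter_map(|(i, c)| cosine_similarity(query, c).map(|s| (i, s)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}

fn main() {
    // Toy 3-dimensional "embeddings" for illustration only.
    let query = [0.9, 0.1, 0.0];
    let stored = vec![
        vec![0.0, 1.0, 0.0],
        vec![1.0, 0.0, 0.0],
        vec![0.5, 0.5, 0.0],
    ];
    for (index, score) in rank(&query, &stored) {
        println!("insight {index}: {score:.3}");
    }
}
```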
Phase 2: Storage & MCP Integration ✅ COMPLETE
- Add mcp-sdk dependency to Cargo.toml
- Implement MCP server using official Rust SDK in src/main.rs
- Port all 4 MCP tools: hippo_record_insight, hippo_search_insights, hippo_modify_insight, hippo_reinforce_insight
- Create library API in src/lib.rs for embedding into other servers
- Add async I/O with tokio and file watching with notify crate
- Integration tests with mock filesystem and timing
- MCP Integration Tests: 3/3 passing with full tool functionality validation
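To make the tool surface concrete, here is a sketch of how the four tool names listed above could be mapped to a typed enum before dispatch. This is an illustration only: the `HippoTool` type is hypothetical, and the code deliberately avoids the MCP SDK, whose registration API is not shown in this issue.

```rust
use std::str::FromStr;

/// The four tools named in this issue, as a closed set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HippoTool {
    RecordInsight,
    SearchInsights,
    ModifyInsight,
    ReinforceInsight,
}

impl HippoTool {
    const ALL: [HippoTool; 4] = [
        HippoTool::RecordInsight,
        HippoTool::SearchInsights,
        HippoTool::ModifyInsight,
        HippoTool::ReinforceInsight,
    ];

    /// Wire name of the tool, as listed in the issue.
    fn name(self) -> &'static str {
        match self {
            HippoTool::RecordInsight => "hippo_record_insight",
            HippoTool::SearchInsights => "hippo_search_insights",
            HippoTool::ModifyInsight => "hippo_modify_insight",
            HippoTool::ReinforceInsight => "hippo_reinforce_insight",
        }
    }
}

impl FromStr for HippoTool {
    type Err = String;

    /// Resolve a requested tool name, rejecting anything outside the known set.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        HippoTool::ALL
            .iter()
            .copied()
            .find(|tool| tool.name() == s)
            .ok_or_else(|| format!("unknown tool: {s}"))
    }
}

fn main() {
    for tool in HippoTool::ALL.iter() {
        println!("registered: {}", tool.name());
    }
    println!("{:?}", "hippo_search_insights".parse::<HippoTool>());
    println!("{:?}", "hippo_delete".parse::<HippoTool>());
}
```

Keeping the names in one `match` means an integration test can iterate `HippoTool::ALL` and check that every tool is registered and that unknown names produce an error.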
Phase 3: Feature Parity & Testing ← CURRENT PHASE
- Performance benchmarking vs Python version in real usage
- Cross-compilation setup for distribution
- Documentation and migration guide
- Structured logging with tracing ecosystem (if needed)
- CLI interface with clap (if needed for standalone usage)
Phase 4: Production Deployment
- Performance validation in real usage
- Switch Q CLI configuration from Python to Rust server
- Archive Python implementation
Next Steps
Phase 3 Implementation Plan:
- Performance Validation (1 session):
  - Benchmark startup time vs Python version
  - Validate memory usage and search accuracy
  - Test with real Q CLI integration
- Cross-compilation & Distribution (1 session):
  - Set up cross-compilation for multiple platforms
  - Create release binaries
  - Update setup scripts
- Documentation (1 session):
  - Update mdbook with Rust implementation details
  - Document migration path and performance improvements
  - Create deployment guide
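For the startup benchmark in the plan above, a small timing helper is often enough to get first numbers. The sketch below is an assumption about how that could look, not the project's benchmark harness: `time_once` and `speedup` are hypothetical helpers, the sleep is a stand-in for model loading, and a true cold-start measurement would time the whole server process from launch rather than a closure inside an already running one. The last loop reproduces the 12-60x range quoted earlier from the ~6 second Python baseline.

```rust
use std::time::{Duration, Instant};

/// Run `f` once and return its result together with the wall-clock time it took.
fn time_once<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Speedup of `candidate` relative to `baseline` (baseline time / candidate time).
fn speedup(baseline: Duration, candidate: Duration) -> f64 {
    baseline.as_secs_f64() / candidate.as_secs_f64()
}

fn main() {
    // Stand-in for loading the embedding model; replace with the real startup path.
    let (_, took) = time_once(|| std::thread::sleep(Duration::from_millis(20)));
    println!("stand-in startup took {:?}", took);

    // Figures from the issue: ~6 s for Python, 100-500 ms targeted for Rust.
    let python = Duration::from_millis(6000);
    for rust_ms in [100u64, 500] {
        let rust = Duration::from_millis(rust_ms);
        println!("{} ms => {:.0}x faster", rust_ms, speedup(python, rust));
    }
}
```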
Context
Primary motivation is startup performance for CLI usage, plus simplified integration with other Rust components in Socratic Shell project. Semantic search is critical functionality that must be preserved.
Phase 2 Achievement: Full MCP server implementation with all 4 tools working correctly. Integration tests validate tool registration, request handling, and error management. The Rust server is feature-complete and ready for Phase 3 validation ahead of production deployment.
Research reports: