Skip to content

Latest commit

 

History

History
188 lines (135 loc) · 5.46 KB

File metadata and controls

188 lines (135 loc) · 5.46 KB

Elasticsearch Migration Guide

🎯 Migration Overview

Your Zapfeed project has been successfully migrated from TiDB Serverless to Elasticsearch to participate in the AI Accelerate Hackathon (Elastic Challenge).

What Changed

BEFORE:

  • TiDB Serverless with vector search
  • Vector embeddings stored in MySQL tables
  • Direct SQL vector distance queries

AFTER:

  • Standard MySQL database (via Prisma)
  • Elasticsearch for vector search and hybrid search
  • Enhanced search capabilities with keyword + semantic search

🚀 Quick Setup

1. Install Dependencies

# Run the automated installer
node scripts/install-dependencies.js

# Or manually:
npm install @elastic/elasticsearch@^8.11.0
npm uninstall @tidbcloud/prisma-adapter @tidbcloud/serverless

2. Update Database Schema

# Generate new Prisma client (vector fields removed)
npx prisma generate

# Create migration (if needed)
npx prisma migrate dev --name remove-embedded-documents

3. Configure Elasticsearch

Update your .env file with Elasticsearch configuration:

# Option 1: Elastic Cloud (Recommended for hackathon)
ELASTICSEARCH_CLOUD_ID=your_cloud_id_here
ELASTICSEARCH_API_KEY=your_api_key_here

# Option 2: Self-hosted
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=your_password

4. Setup Elastic Cloud (Recommended)

  1. Go to https://cloud.elastic.co/
  2. Create a free account (14-day trial)
  3. Create a new deployment
  4. Copy your Cloud ID and create an API Key
  5. Add them to your .env file

🔧 Key Architecture Changes

Database Service (lib/database.ts)

  • Standard Prisma client connection
  • No more TiDB adapters

Elasticsearch Service (lib/elasticsearch.ts)

  • Vector storage and search
  • Hybrid search (keyword + semantic)
  • Auto-index creation
  • Team-based filtering

Updated API Routes

  • app/api/chat/route.ts - Uses hybrid search for RAG
  • app/api/feedback/collect/route.ts - Stores embeddings in Elasticsearch
  • All routes now use standard MySQL connection

🎨 Hackathon Features Unlocked

✅ Elastic Challenge Requirements Met:

  1. Hybrid Search: Combines keyword and vector search
  2. Google Cloud Integration: Works with Vertex AI/Gemini embeddings
  3. Conversational AI: Enhanced RAG with better search results
  4. Agent-based Solution: Smart feedback analysis system

🚀 Enhanced Capabilities:

  • Better Search Relevance: Hybrid scoring algorithms
  • Real-time Indexing: Immediate search availability
  • Scalable Architecture: Elasticsearch clustering support
  • Rich Analytics: Built-in aggregations and metrics

🧪 Testing the Migration

1. Start Your Application

npm run dev

2. Test Feedback Collection

  1. Go to your feedback collection page
  2. Submit some feedback
  3. Check if embeddings are being indexed in Elasticsearch

3. Test AI Chat

  1. Go to the chat interface
  2. Ask questions about feedback
  3. Verify hybrid search is working (should find relevant context)

🔍 Verification Commands

# Check Elasticsearch connection
curl -X GET "localhost:9200/_cluster/health?pretty"

# Check index creation
curl -X GET "localhost:9200/zapfeed-documents/_mapping?pretty"

# Search documents
curl -X GET "localhost:9200/zapfeed-documents/_search?pretty"

🚨 Common Issues & Solutions

Issue: "Cannot find module '@elastic/elasticsearch'"

Solution: Run npm install @elastic/elasticsearch@^8.11.0

Issue: "Elasticsearch connection failed"

Solutions:

  1. Check your ELASTICSEARCH_CLOUD_ID and ELASTICSEARCH_API_KEY
  2. Verify network connectivity
  3. Try the self-hosted option with Docker:
    docker run -d --name elasticsearch -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.11.0

Issue: Prisma client errors

Solution:

npx prisma generate --force
npx prisma migrate reset # ⚠️ This deletes data

🏆 Hackathon Submission Benefits

Demo Points to Highlight:

  1. Intelligent Search: Show how hybrid search finds relevant feedback better than basic keyword search
  2. Google Cloud AI Integration: Demonstrate Gemini embeddings working with Elasticsearch
  3. Real-time Analytics: Show immediate feedback processing and searchability
  4. Conversational Interface: Highlight the AI chat that understands context

Performance Improvements:

  • Faster Search: Elasticsearch optimized for search workloads
  • Better Relevance: Hybrid scoring combines semantic and keyword matching
  • Scalability: Can handle millions of documents
  • Cost Efficiency: No TiDB costs, Gemini free tier usage

📈 Next Steps

  1. Deploy to Production: Use Elastic Cloud for hosting
  2. Add More Features:
    • Aggregation queries for analytics
    • Real-time feedback monitoring
    • Advanced search filters
  3. Optimize Performance: Fine-tune search parameters
  4. Monitor Usage: Set up Elasticsearch monitoring

🎯 Hackathon Judging Criteria Alignment

  • Innovation: Novel hybrid search for feedback analysis
  • Technical Excellence: Proper Elasticsearch integration with Google Cloud AI
  • User Experience: Enhanced search and chat capabilities
  • Business Impact: Better feedback insights for businesses
  • Use of Partner Technology: Full Elasticsearch feature utilization

Good luck with your hackathon submission! 🚀