This document explains how ExecuTorch React Native has been integrated to replace the mock embedding service in the Turso database with real text embeddings.
The integration uses ExecuTorch's `useTextEmbeddings` hook with the `ALL_MINILM_L6_V2` model to generate text embeddings for semantic search in the medical database, replacing the previous mock embedding generation.
The embedding service module (`lib/embedding-service`) provides React hooks and a queued service for generating text embeddings:
- `useEmbeddingModel()`: React hook that initializes the ExecuTorch embedding model
- `useQueuedEmbeddingService()`: React hook that provides a queued embedding service
- `queuedEmbeddingService`: Global queued service that handles concurrent requests
- `generateEmbedding()`: Utility function that generates an embedding for a single text
- `generateEmbeddings()`: Utility function that generates embeddings for multiple texts
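The queuing idea can be illustrated with a promise chain that runs one model call at a time. This is a minimal sketch and not the project's actual `queuedEmbeddingService`: the names `EmbedFn` and `SerialEmbeddingQueue` are hypothetical, and the model is assumed to expose a single async function that maps text to a vector.

```typescript
// Hypothetical model interface: one async call that must not run concurrently.
type EmbedFn = (text: string) => Promise<number[]>;

// Hypothetical queue: serializes calls so the model sees one request at a time.
class SerialEmbeddingQueue {
  // Tail of the promise chain; each new request waits for the previous one.
  private tail: Promise<unknown> = Promise.resolve();
  private readonly embed: EmbedFn;

  constructor(embed: EmbedFn) {
    this.embed = embed;
  }

  enqueue(text: string): Promise<number[]> {
    // Start this request only after the previous one has settled.
    const result = this.tail.then(() => this.embed(text));
    // Swallow the failure on the chain only, so later requests still run.
    // The caller still receives the rejection through `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

Callers can fire several `enqueue()` calls at once (for example with `Promise.all`); the requests still reach the model strictly in order.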
The database service has been enhanced to:
- Accept an embedding model via `setEmbeddingModel()`
- Use real embeddings via the queued embedding service in `generateEmbedding()`
- Generate embeddings for search queries and document storage
- Handle concurrent requests safely through the queued service
A React component demonstrates the integration:
- Shows model and database status
- Provides a search interface for medical documents and Q&A
- Displays search results with similarity scores
- Handles loading states and errors
- Model Initialization: The `useEmbeddingModel()` hook initializes the ALL-MiniLM-L6-v2 model
- Queued Service: The `queuedEmbeddingService` handles concurrent requests by queuing them
- Database Setup: The TursoDBService uses the queued embedding service
- Document Storage: When documents are added, real embeddings are generated from their content
- Search: Search queries are converted to embeddings and compared with stored embeddings
- Results: Similarity scores are calculated using cosine similarity
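The last two steps rest on cosine similarity, cos(a, b) = (a · b) / (‖a‖ ‖b‖). The sketch below shows that computation and a simple ranking pass with a result limit and a score threshold. The names `cosineSimilarity`, `rankBySimilarity`, `StoredDoc` and `ScoredDoc` are illustrative, and the assumption that scoring happens in application code rather than inside the database is not confirmed by this document.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; treat it as dissimilar to everything.
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical shapes for stored documents and scored results.
interface StoredDoc { id: string; embedding: number[]; }
interface ScoredDoc { id: string; similarity: number; }

// Score every document, drop those below the threshold, return the best `limit`.
function rankBySimilarity(
  query: number[],
  docs: StoredDoc[],
  limit: number,
  threshold: number
): ScoredDoc[] {
  return docs
    .map((d) => ({ id: d.id, similarity: cosineSimilarity(query, d.embedding) }))
    .filter((r) => r.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

With 384-dimensional vectors the loop is cheap per document, but a linear scan like this only suits small collections; larger ones would normally rely on an index.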
import { useEffect } from 'react';
import { useQueuedEmbeddingService } from '../lib/embedding-service';
import { getTursoDBService } from '../lib/turso-db-service';

function MyComponent() {
  const embeddingService = useQueuedEmbeddingService();
  const dbService = getTursoDBService();

  useEffect(() => {
    if (embeddingService.isReady) {
      // The queued service is automatically set up
      console.log('Embedding service is ready');
    }
  }, [embeddingService.isReady]);

  // Semantic search over the stored medical documents
  const performSearch = async (query: string) => {
    const results = await dbService.searchMedicalDocuments({
      query,
      limit: 5,
      threshold: 0.7
    });
    return results;
  };
}

- Model: ALL-MiniLM-L6-v2
- Dimensions: 384
- Max Tokens: 256
- Language: English
- Use Case: General-purpose semantic similarity
- Model Size: ~91MB (XNNPACK)
- Memory Usage: ~150MB (Android), ~190MB (iOS)
- Inference Time: ~53-78ms on modern devices
- Concurrent Safety: Queued requests prevent model conflicts
The implementation includes comprehensive error handling:
- Model loading failures throw clear error messages
- Concurrent requests are queued to prevent conflicts
- Database errors are caught and reported
- Search failures show user-friendly error messages
- Network issues are handled gracefully
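One way to get both a clear internal error and a user-friendly message is to tag each failure with the stage that produced it and translate the tag at the UI boundary. The sketch below is an illustration only: `Stage`, `SearchPipelineError`, `withStage` and `toUserMessage` are hypothetical names, and the message texts are placeholders rather than the app's real strings.

```typescript
// Hypothetical stages of the search pipeline.
type Stage = "model" | "database" | "search";

// Hypothetical error type that records which stage failed.
class SearchPipelineError extends Error {
  readonly stage: Stage;

  constructor(stage: Stage, message: string) {
    super(message);
    // Keep instanceof working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "SearchPipelineError";
    this.stage = stage;
  }
}

// Run one step and rethrow any failure tagged with its stage.
async function withStage<T>(stage: Stage, step: () => Promise<T>): Promise<T> {
  try {
    return await step();
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    throw new SearchPipelineError(stage, `${stage} step failed: ${detail}`);
  }
}

// Map a caught error to a message suitable for display.
function toUserMessage(err: unknown): string {
  if (err instanceof SearchPipelineError) {
    switch (err.stage) {
      case "model":
        return "The embedding model could not be loaded. Please try again.";
      case "database":
        return "The database is unavailable right now.";
      case "search":
        return "Search failed. Please try a different query.";
    }
  }
  return "Something went wrong.";
}
```

The detailed message stays available for logging, while the component only ever renders the output of `toUserMessage()`.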
To test the integration:
- Navigate to the Search tab in the app
- Wait for the embedding model to load (status will show "Ready")
- Enter a medical search query (e.g., "heart disease", "diabetes treatment")
- View the search results with similarity scores
- Compare results with the previous mock implementation
- Real Semantic Search: Actual AI-powered embeddings provide meaningful similarity
- Better Accuracy: Real embeddings capture semantic meaning, unlike the random vectors of the mock implementation
- Concurrent Safety: Queued service prevents model conflicts
- Scalable: Can handle various medical topics and queries
- Maintainable: Clean separation between embedding logic and database operations
- No Fallbacks: Pure AI embeddings without mock fallbacks
- Support for multiple embedding models
- Batch embedding generation for better performance
- Caching of frequently used embeddings
- Support for other languages
- Fine-tuning for medical domain specificity
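As an illustration of the caching item above, a small least-recently-used cache keyed by input text could sit in front of the embedding call. This is a sketch of one possible design, not something the current implementation contains; `EmbeddingCache` and its methods are hypothetical names.

```typescript
// Hypothetical LRU cache mapping input text to its embedding.
class EmbeddingCache {
  private readonly entries = new Map<string, number[]>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(text: string): number[] | undefined {
    const hit = this.entries.get(text);
    if (hit !== undefined) {
      // Re-insert to mark the entry as most recently used.
      this.entries.delete(text);
      this.entries.set(text, hit);
    }
    return hit;
  }

  set(text: string, embedding: number[]): void {
    this.entries.delete(text);
    this.entries.set(text, embedding);
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) {
        this.entries.delete(oldest);
      }
    }
  }
}
```

Repeated queries such as common search terms would then skip inference entirely; at 384 numbers per vector, even a few hundred cached entries stay small.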