Semantic Recipe Finder is a full-stack application that finds recipes through natural language semantic search. Built with a FastAPI backend and a Streamlit frontend, it uses sentence-transformers and ChromaDB for recipe discovery.
- Live Demo: HuggingFace Spaces
- Documentation: GitHub Pages
- Repository: GitHub
- Semantic Search: Natural language queries powered by sentence-transformers (all-MiniLM-L6-v2)
- Fast Vector Search: ChromaDB with 384-dimensional embeddings for efficient similarity search
- RESTful API: FastAPI backend with comprehensive endpoints and OpenAPI documentation
- Modern UI: Streamlit frontend with responsive recipe cards and detailed views
- Comprehensive Tests: 61 unit and integration tests with pytest
- Docker Ready: Multi-container setup with docker-compose for easy deployment
- Python 3.10+
- Clone the repository
git clone https://github.com/hanifekaptan/semantic-recipe-finder.git
cd semantic-recipe-finder
- Create and activate virtual environment
python -m venv .venv
.venv\Scripts\activate # Windows
source .venv/bin/activate # Linux/Mac
- Install dependencies
pip install -r requirements.txt
- Run the backend (FastAPI)
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000
The API will be available at http://localhost:8000 with interactive docs at /docs.
- Run the frontend (Streamlit) (in a new terminal)
streamlit run frontend/app.py --server.port 8501
The UI will be available at http://localhost:8501.
- API_BASE_URL: Backend URL for the Streamlit frontend (default: http://localhost:8000)
- LOG_LEVEL: Logging level (default: INFO)
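As a rough illustration of how these two variables might be consumed, the sketch below reads them with the documented defaults. The `Settings` class and `load_settings` function are hypothetical names used only for this example; the project's own configuration code may look different.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two documented variables."""
    api_base_url: str
    log_level: str


def load_settings(env=None) -> Settings:
    """Read API_BASE_URL and LOG_LEVEL, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        # Strip a trailing slash so paths such as "/search" can be appended safely.
        api_base_url=env.get("API_BASE_URL", "http://localhost:8000").rstrip("/"),
        # Normalize case so "debug" and "DEBUG" behave the same.
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


settings = load_settings({"API_BASE_URL": "http://backend:8000/", "LOG_LEVEL": "debug"})
print(settings.api_base_url, settings.log_level)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.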
semantic-recipe-finder/
├── app/                          # FastAPI backend application
│   ├── api/                      # API routes and endpoints
│   │   ├── health.py             # Health check endpoint
│   │   ├── routes.py             # Recipe detail endpoint
│   │   └── search.py             # Search endpoint
│   ├── core/                     # Core configuration and utilities
│   │   ├── config.py             # Global configuration and state
│   │   └── logging.py            # Logging setup
│   ├── models/                   # Pydantic data models
│   │   ├── recipe_card.py        # Recipe card model (search results)
│   │   ├── recipe_detail.py      # Full recipe detail model
│   │   ├── search_query.py       # Search request model
│   │   └── search_response.py    # Search response model
│   ├── services/                 # Business logic services
│   │   ├── detail_service.py     # Recipe detail retrieval
│   │   ├── loading_service.py    # Data and model loading
│   │   ├── search_service.py     # Semantic search logic
│   │   └── vectorstore.py        # ChromaDB operations
│   ├── utils/                    # Utility functions
│   │   ├── data_preprocessor.py  # Text cleaning
│   │   └── vectorizer.py         # Text vectorization
│   └── main.py                   # FastAPI app initialization
├── frontend/                     # Streamlit frontend application
│   ├── api/
│   │   └── client.py             # Backend API client
│   ├── components/               # Reusable UI components
│   │   ├── header.py             # App header
│   │   ├── recipe_card.py        # Recipe card display
│   │   ├── recipe_detail.py      # Detailed recipe view
│   │   └── search_bar.py         # Search input
│   ├── pages/                    # Streamlit pages
│   │   ├── detail.py             # Recipe detail page
│   │   └── search.py             # Search results page
│   ├── utils/
│   │   └── utility.py            # Helper functions
│   └── app.py                    # Streamlit app entrypoint
├── data/                         # Data storage
│   ├── raw/
│   │   └── recipes.csv           # Original recipe dataset
│   └── processed/
│       ├── ids_embs.npy          # Recipe IDs
│       ├── metadata_embs.npy     # Recipe metadata embeddings
│       └── persist/              # ChromaDB persistent storage
├── docker/                       # Docker configurations
│   ├── backend.Dockerfile
│   ├── frontend.Dockerfile
│   └── entrypoint.sh
├── tests/                        # Test suite
│   ├── integration/              # API integration tests
│   │   └── test_smoke_api.py
│   └── unit/                     # Unit tests
│       ├── services/             # Service layer tests
│       └── utils/                # Utility tests
├── docker-compose.yml            # Multi-container orchestration
├── Dockerfile                    # HuggingFace Space Dockerfile
├── requirements.txt              # Python dependencies
└── pytest.ini                    # Pytest configuration
- Semantic Search: Uses the all-MiniLM-L6-v2 sentence-transformer model for query encoding
- Vector Database: ChromaDB with DuckDB+Parquet backend for 100 recipe embeddings (384 dimensions)
- Service Layer: Clean separation between API routes, business logic, and data access
- Error Handling: Comprehensive exception handling with proper HTTP status codes
- API Documentation: Auto-generated OpenAPI (Swagger) docs at /docs
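To make the vector search idea concrete, here is a minimal pure-Python sketch of ranking recipes by similarity to a query embedding. It assumes cosine similarity and uses 3-dimensional toy vectors instead of the real 384-dimensional ones; the metric actually configured in ChromaDB for this project is not stated here, and `top_k` is a hypothetical helper, not part of the codebase.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, recipe_vecs, k=3):
    """Return the k (recipe_id, score) pairs most similar to the query, best first."""
    scored = [(rid, cosine_similarity(query_vec, vec)) for rid, vec in recipe_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: recipe_id -> embedding (illustrative values only).
toy_index = {
    101: [1.0, 0.0, 0.0],
    102: [0.0, 1.0, 0.0],
    103: [0.7, 0.7, 0.0],
}
ranked = top_k([1.0, 0.1, 0.0], toy_index, k=2)
print([rid for rid, _ in ranked])
```

A vector database does the same ranking with indexing structures that avoid comparing the query against every stored vector.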
- Component-Based: Modular UI components for search, cards, and detail views
- API Client: HTTP client with error handling for backend communication
- Session State: Manages search results and navigation state
- Responsive Design: Clean, user-friendly interface optimized for recipe browsing
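The API client idea can be sketched as follows. The project lists httpx for API calls; this example swaps in the standard library's `urllib` so it runs without extra dependencies. The `ApiError` class, both function names, and the use of POST are assumptions made for illustration, while the request fields (`query`, `offset`, `limit`) follow the API reference.

```python
import json
import urllib.error
import urllib.request


class ApiError(Exception):
    """Raised when the backend cannot be reached or returns an error status."""


def build_search_request(base_url, query, offset=0, limit=20):
    """Build (but do not send) a JSON request for the /search endpoint."""
    payload = json.dumps({"query": query, "offset": offset, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumed: the endpoint takes a request body
    )


def search(base_url, query, offset=0, limit=20, opener=urllib.request.urlopen, timeout=10.0):
    """Send the search request and return the decoded JSON response."""
    request = build_search_request(base_url, query, offset, limit)
    try:
        with opener(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:  # must precede URLError, its parent class
        raise ApiError(f"backend returned HTTP {exc.code}") from exc
    except urllib.error.URLError as exc:
        raise ApiError(f"backend unreachable: {exc.reason}") from exc
```

Translating transport failures into a single exception type lets the UI show one friendly error message instead of handling each low-level error separately.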
- User enters natural language query in Streamlit UI
- Frontend sends request to /search endpoint
- Backend cleans and vectorizes query text
- ChromaDB performs similarity search on recipe embeddings
- Top 100 results retrieved from DataFrame
- Results paginated and returned to frontend
- Frontend displays recipe cards with key information
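The cleaning and pagination steps of the flow above can be sketched in a few lines. The cleaning rule (lowercase, strip punctuation, collapse whitespace) is an assumption, and `clean_query` and `paginate` are hypothetical names; the response keys mirror the API reference, with the `card` object left out for brevity.

```python
import re


def clean_query(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def paginate(ranked, offset=0, limit=20):
    """Slice an already-ranked list of (recipe_id, score) pairs into one page."""
    page = ranked[offset:offset + limit]
    return {
        "search_results": [
            {"recipe_id": rid, "similarity_score": score} for rid, score in page
        ],
        "total_results": len(ranked),
        "offset": offset,
        "limit": limit,
    }


# Stand-in for the top 100 results of a similarity search (made-up scores).
ranked = [(i, round(1.0 - i / 100, 2)) for i in range(100)]
page = paginate(ranked, offset=20, limit=20)
print(clean_query("  Quick, PASTA dinner!! "), len(page["search_results"]))
```

Because the full ranked list is computed once and then sliced, `total_results` stays stable across pages and the frontend can request any offset.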
The project includes comprehensive test coverage with pytest:
# Run all tests
pytest
# Run with coverage
pytest --cov=app --cov=frontend
# Run specific test suite
pytest tests/unit/services/
pytest tests/integration/
Test Statistics:
- 61 total tests (28 service tests, 18 utility tests, 15 integration tests)
- Unit tests: Mock-based testing for services and utilities
- Integration tests: FastAPI TestClient for full API testing
- All tests passing with proper fixtures and parametrization
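A mock-based unit test in the spirit described above might look like the sketch below. The `run_search` function and its collaborators are hypothetical stand-ins, not the project's actual service code, and the example uses `unittest.TestCase` (which pytest can also collect) so that it runs with the standard library alone.

```python
import unittest
from unittest.mock import Mock


def run_search(query, vectorizer, store, top_k=100):
    """Hypothetical service function: encode the query, then ask the store."""
    if not query.strip():
        raise ValueError("query must not be empty")
    embedding = vectorizer.encode(query)
    return store.query(embedding, top_k)


class RunSearchTests(unittest.TestCase):
    def test_passes_embedding_to_store(self):
        # Replace the model and the vector store with mocks.
        vectorizer = Mock()
        vectorizer.encode.return_value = [0.1, 0.2, 0.3]
        store = Mock()
        store.query.return_value = [(123, 0.87)]

        result = run_search("quick pasta dinner", vectorizer, store, top_k=5)

        vectorizer.encode.assert_called_once_with("quick pasta dinner")
        store.query.assert_called_once_with([0.1, 0.2, 0.3], 5)
        self.assertEqual(result, [(123, 0.87)])

    def test_rejects_blank_query(self):
        with self.assertRaises(ValueError):
            run_search("   ", Mock(), Mock())
```

Mocking the encoder and the store keeps such tests fast, since neither the embedding model nor the database has to be loaded.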
docker-compose up --build
This starts both backend (port 8000) and frontend (port 8501) containers.
# Backend only
docker build -f docker/backend.Dockerfile -t recipe-finder-backend .
docker run -p 8000:8000 recipe-finder-backend
# Frontend only
docker build -f docker/frontend.Dockerfile -t recipe-finder-frontend .
docker run -p 8501:8501 recipe-finder-frontend
The root Dockerfile is configured for HuggingFace Spaces deployment with both services.
The application uses a subset of recipe data with:
- 100 recipes from Food.com dataset
- Metadata: Name, description, category, ingredients, nutrition facts, ratings
- Embeddings: Pre-computed 384-dimensional vectors from recipe metadata
- Storage: ChromaDB persistent storage at data/processed/persist/
Backend:
- FastAPI 0.115.6
- sentence-transformers (all-MiniLM-L6-v2)
- ChromaDB 0.5.23
- Pandas, NumPy
- Pydantic for data validation
Frontend:
- Streamlit 1.41.1
- httpx for API calls
- Python 3.10+
Testing:
- pytest 9.0.2
- pytest-asyncio
- unittest.mock
DevOps:
- Docker & docker-compose
- GitHub Actions (coming soon)
- HuggingFace Spaces deployment
Health check endpoint for monitoring.
Response: { "status": "ok", "ready": true }
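A caller could use the `ready` flag to wait for the backend before sending searches, for example while the model is still loading at startup. The sketch below is one possible approach; `wait_until_ready` and the injected `fetch_status` callable are hypothetical, and only the two response fields shown above are taken from the API.

```python
import time


def wait_until_ready(fetch_status, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Poll a status callable until it reports status "ok" and ready true.

    fetch_status is any zero-argument callable returning the decoded health
    response as a dict; network errors (OSError subclasses) count as not ready.
    """
    for attempt in range(attempts):
        try:
            payload = fetch_status()
        except OSError:
            payload = {}
        if payload.get("status") == "ok" and payload.get("ready") is True:
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next attempt
    return False


# Simulated backend that becomes ready on the second poll.
responses = iter([{"status": "ok", "ready": False}, {"status": "ok", "ready": True}])
print(wait_until_ready(lambda: next(responses), attempts=3, sleep=lambda seconds: None))
```

Injecting the fetch and sleep functions keeps the polling logic independent of any particular HTTP library.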
Semantic search for recipes.
Request Body:
{
  "query": "quick pasta dinner",
  "offset": 0,
  "limit": 20
}

Response:
{
  "search_results": [
    {
      "recipe_id": 123,
      "similarity_score": 0.87,
      "card": {
        "recipe_id": 123,
        "name": "Quick Pasta Carbonara",
        "description": "Creamy pasta dish...",
        "recipe_category": "Main Course",
        "keywords": ["pasta", "quick", "italian"],
        "n_ingredients": 5,
        "total_time_minutes": 20,
        "calories": 450.0,
        "aggregated_rating": 4.5
      }
    }
  ],
  "total_results": 42,
  "offset": 0,
  "limit": 20
}

Get full recipe details.
Response:
{
  "recipe_id": 123,
  "name": "Quick Pasta Carbonara",
  "description": "Creamy pasta dish...",
  "recipe_category": "Main Course",
  "keywords": ["pasta", "quick", "italian"],
  "ingredients": ["spaghetti", "eggs", "bacon", "parmesan", "pepper"],
  "instructions": ["Step 1...", "Step 2..."],
  "n_ingredients": 5,
  "total_time_minutes": 20,
  "calories": 450.0,
  "fat_content": 15.0,
  "protein_content": 25.0,
  "aggregated_rating": 4.5
}

This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Hanife Kaptan - hanifekaptan.dev@gmail.com
Project Link: https://github.com/hanifekaptan/semantic-recipe-finder
⭐ Star this repo if you find it helpful!