Skip to content

Full-stack hybrid book recommendation system combining Collaborative Filtering and Content-Based Filtering with weighted hybrid scoring, modular data pipelines, and model persistence. Deployed via Flask with responsive HTML/CSS UI and integrated CI/CD for production-ready, scalable, and interactive recommendations.

License

Notifications You must be signed in to change notification settings

Md-Emon-Hasan/BookSage-AI

Repository files navigation

BookSage AI: hybrid book recommendation system

BookSage AI is a hybrid book recommendation system combining Collaborative Filtering (KNN-based) and Content-Based (TF-IDF + Cosine Similarity) models, with a weighted hybrid approach for personalized results. The project ingests and preprocesses large-scale book datasets, applies active-user and popular-book filtering, and dynamically generates recommendations enriched with metadata (title, author, publisher, year, and cover image). I engineered a modular architecture with clear data pipelines, model persistence, and reusable functions, ensuring scalability and maintainability. The system was deployed as a Dockerized Flask web application with a responsive HTML/CSS front-end, integrated into a CI/CD pipeline for automated builds and deployments. This design demonstrates proficiency in ML model building, orchestration, backend API development, front-end integration, containerization, and production-grade deployment workflows.

BookSage-AI

demo.mp4

Live Demo

Try the Hybrid Book Recommendation System live: https://booksage-ai.onrender.com/


Core Technologies

Category Technology / Resource
Core Language Python 3.11
Backend Framework FastAPI
Data Processing Pandas (Data Cleaning & Merging), NumPy (Matrix Ops)
Recommendation Models Hybrid System: Collaborative Filtering + Content-Based Filtering
Collaborative Filtering SciPy (csr_matrix), scikit-learn (NearestNeighbors)
Content-Based Filtering scikit-learn (TfidfVectorizer, cosine_similarity)
Hybrid Fusion Logic Weighted average score combination
Data Sources Book-Crossing Dataset (BX-Books, BX-Users, BX-Ratings)
Feature Engineering TF-IDF on combined features (title, author, publisher, year)
Model Persistence Pickle (Model & Processed Data Serialization)
Memory System In-memory caching of processed data for faster responses
Evaluation Metrics Popularity-based filtering, Active user filtering
Orchestration Layer Modular service classes (DataLoader, DataPreprocessor, ModelManager, HybridModel)
Frontend HTML5, Jinja2 Templates, Static CSS & JavaScript
Deployment Docker (Python 3.11-slim base), requirements.txt dependency locking
Portability Pathlib-based cross-platform directory resolution
Error Handling Graceful fallbacks & empty results handling

Comparison with Standard Systems

Feature BookSage AI Typical Recommenders
Method Flexibility 3 modes + hybrid tuning Usually single-method
Cold Start Handling Popular books fallback Often fails
Explainability Shows scores + metadata Black-box results
UI Customization Adjustable weights/counts Fixed parameters

Project Structure

Project Structure

BookSage-AI/
├── .github/
│   └── workflows/
│       └── main.yml
├── app/
│   ├── core/
│   │   ├── config.py
│   │   ├── logger.py
│   │   ├── models.py
│   │   └── __init__.py
│   ├── services/
│   │   ├── collaborative_model.py
│   │   ├── content_model.py
│   │   ├── data_loader.py
│   │   ├── data_preprocessor.py
│   │   ├── hybrid_model.py
│   │   ├── model_manager.py
│   │   ├── recommendation_engine.py
│   │   └── __init__.py
│   ├── main.py
│   └── __init__.py
├── data/
│   ├── BX-Book-Ratings.csv
│   ├── BX-Books.csv
│   └── BX-Users.csv
├── models/
│   ├── book_pivot.pkl
│   ├── books_content.pkl
│   ├── books_data.pkl
│   ├── cb_model.pkl
│   ├── cf_model.pkl
│   ├── content_sim_matrix.pkl
│   ├── final_rating.pkl
│   ├── tfidf_vectorizer.pkl
│   └── title_to_idx.pkl
├── notebooks/
│   └── experiment.ipynb
├── static/
│   ├── css/
│   │   ├── style.css
│   │   └── style_recommendor.css
│   └── js/
│       └── script.js
├── templates/
│   ├── index.html
│   └── recommendations.html
├── tests/
│   ├── conftest.py
│   ├── test_collaborative_model.py
│   ├── test_config.py
│   ├── test_content_model.py
│   ├── test_data_loader.py
│   ├── test_data_preprocessor.py
│   ├── test_endpoints.py
│   ├── test_hybrid_model.py
│   ├── test_logger.py
│   ├── test_models.py
│   ├── test_model_manager.py
│   ├── test_recommendation_engine.py
│   └── __init__.py
├── .flake8
├── .gitignore
├── app.png
├── demo.mp4
├── docker-compose.yml
├── Dockerfile
├── LICENSE
├── pyproject.toml
├── README.md
├── render.yml
├── requirements.txt
├── run.py
├── setup.py
└── train_models.py

Architecture Diagram (Mermaid)

graph TD
    A[Raw Data: BX-Books, BX-Users, BX-Ratings] --> B[Data Preprocessing & Feature Engineering]
    B --> C[Collaborative Filtering Model]
    B --> D[Content-Based Model]
    
    C --> E[User-Item Matrix - csr_matrix]
    D --> F[TF-IDF Features - Title+Author+Publisher+Year]
    
    E --> G[Hybrid Recommender - Weighted Score Fusion]
    F --> G
    
    G --> H[Flask API - Endpoints for Recommendations]
    H --> I[Frontend: HTML + Jinja2 + CSS/JS]
    
    subgraph Deployment
        J[Docker Container]
        J --> H
        J --> I
    end
    
    G --> K[Model Persistence - Pickle Serialization]
Loading

Quick Start

Prerequisites

  • Python 3.10+
  • pip

Installation

# Clone the repository
git clone https://github.com/Md-Emon-Hasan/BookSage-AI.git
cd BookSage-AI

# Create virtual environment
python -m venv venv
venv\Scripts\activate  # Windows
# source venv/bin/activate  # Linux/Mac

# Install dependencies
pip install -r requirements.txt

Running the Application

# Simple start (recommended)
python run.py

# Or with custom options
python run.py --host 0.0.0.0 --port 8000

# Train models first, then start server
python run.py --train

# Production mode (no auto-reload)
python run.py --prod

# Alternative: using uvicorn directly
uvicorn app.main:app --reload

# Open http://localhost:8000 in your browser

API Endpoints

Method Endpoint Description
GET / Home page with popular books
POST /recommend Get book recommendations
GET /search_books Search books by title
GET /api/health Health check

Docker

# Build and run
docker-compose up -d

# View logs
docker-compose logs -f

# Stop
docker-compose down

Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ -v --cov=app --cov-report=term-missing

Prepared by:

Md Emon Hasan
Email: emon.mlengineer@gmail.com WhatsApp: +8801834363533
GitHub: Md-Emon-Hasan
LinkedIn: Md Emon Hasan
Facebook: Md Emon Hasan