An end-to-end modular pipeline for book recommendation using clustering techniques, built with best practices in MLOps, data engineering, and cloud deployment.
- Modular, maintainable pipeline components (data ingestion, validation, transformation, training, prediction)
- Data versioning and pipeline orchestration using DVC for reproducibility
- Secure management of credentials via AWS Secrets Manager
- Containerised with Docker for easy deployment on cloud platforms (e.g. AWS EC2)
- Interactive Streamlit app interface for exploring book recommendations
- Automated download of datasets from Kaggle and external sources
- Comprehensive logging and exception handling for robustness

- Git: https://git-scm.com/
- Data link: https://www.kaggle.com/datasets/ra4u12/bookrecommendation

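The logging and exception handling listed among the features could look roughly like the sketch below. It is not taken from the repository: `AppException`, `get_logger`, and `run_stage` are hypothetical names, and the log format is an assumption. The idea is that each pipeline stage logs its start and finish, and any failure is re-raised with the file and line where it originated.

```python
import logging
import sys


class AppException(Exception):
    """Hypothetical wrapper that records where the original error was raised."""

    def __init__(self, error: Exception):
        # Walk to the innermost traceback frame, where the error originated.
        tb = error.__traceback__
        while tb is not None and tb.tb_next is not None:
            tb = tb.tb_next
        if tb is not None:
            location = f"{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}"
        else:
            location = "unknown location"
        super().__init__(f"{type(error).__name__} at {location}: {error}")


def get_logger(name: str) -> logging.Logger:
    """Return a logger with a single stderr handler (assumed format)."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("[%(asctime)s] %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


def run_stage(name, func, logger):
    """Run one pipeline stage, logging start, finish, and any failure."""
    logger.info("Starting stage: %s", name)
    try:
        result = func()
    except Exception as exc:
        logger.error("Stage %s failed: %s", name, exc)
        # Chain the original error so the full traceback is preserved.
        raise AppException(exc) from exc
    logger.info("Finished stage: %s", name)
    return result
```

A stage would then be invoked as `run_stage("data_ingestion", some_callable, get_logger("pipeline"))`, where `some_callable` stands for whatever function performs that stage.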
- Config file (Constants)
- Config Entity (Return values)
- App Config (Read config file)
- Components (Pipeline code files)
- Pipeline (Run components)
- Main file (Run pipeline)
- App file (User interface)
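The workflow above can be sketched in a single file to show how the pieces relate. This is a minimal illustration under stated assumptions, not the repository's code: the class names, the `artifacts/raw` directory, and the use of an in-memory dict in place of a real config file are all hypothetical.

```python
from dataclasses import dataclass

# Config file (constants): a dict stands in for the real config file here.
# The "artifacts/raw" path is a made-up placeholder.
CONFIG = {
    "data_ingestion": {
        "source_url": "https://www.kaggle.com/datasets/ra4u12/bookrecommendation",
        "raw_dir": "artifacts/raw",
    }
}


@dataclass(frozen=True)
class DataIngestionConfig:
    """Config entity: the typed values a component receives."""
    source_url: str
    raw_dir: str


class AppConfiguration:
    """App config: reads the config and returns config entities."""

    def __init__(self, config: dict = CONFIG):
        self._config = config

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        section = self._config["data_ingestion"]
        return DataIngestionConfig(
            source_url=section["source_url"],
            raw_dir=section["raw_dir"],
        )


class DataIngestion:
    """Component: one pipeline step driven by its config entity."""

    def __init__(self, config: DataIngestionConfig):
        self.config = config

    def run(self) -> str:
        # A real component would download and extract the dataset here.
        return f"would download {self.config.source_url} into {self.config.raw_dir}"


class TrainingPipeline:
    """Pipeline: wires components together and runs them in order."""

    def __init__(self, app_config=None):
        self.app_config = app_config or AppConfiguration()

    def start(self) -> list:
        ingestion = DataIngestion(self.app_config.get_data_ingestion_config())
        return [ingestion.run()]


if __name__ == "__main__":
    # Main file: run the pipeline.
    for message in TrainingPipeline().start():
        print(message)
```

Keeping config entities as frozen dataclasses means a component cannot accidentally mutate its settings, and adding a new stage follows the same order as the list: constant, entity, config reader method, component, pipeline step.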
Clone the repository
git clone https://github.com/razyousuf/Book-Recommendation-Clustering-Pipeline.git
conda create -n book python=3.10 -y
conda activate book
pip install -r requirements.txt
Now run:
streamlit run app.py
Note: Map this port when deploying: 8080
sudo apt-get update -y
sudo apt-get upgrade
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
git clone "your-project-repository-url"
docker build -t raz/app:latest .
docker images -a
docker run -d -p 8501:8501 raz/app
docker ps
docker stop container_id
docker rm $(docker ps -a -q)
docker login
docker push raz/app:latest
docker rmi raz/app:latest
docker pull raz/app
git add .
git commit -m "Updated"
git push origin main