Production-grade MLOps system for detecting AI-generated product photos in e-commerce
Nolan Cacheux · GitHub · LinkedIn
Explore the full project walkthrough, architecture decisions, and results in the presentation slides:
View Presentation on Google Slides
An end-to-end machine learning system that classifies product photos as real or AI-generated using an EfficientNet-B0 model with Grad-CAM explainability. The project covers the full MLOps lifecycle: from DVC-managed data pipelines and GPU training (local, Colab, or Vertex AI) to a FastAPI serving layer with authentication, rate limiting, and Prometheus monitoring. Infrastructure is provisioned with Terraform, packaged with Docker, and deployed serverlessly through GitHub Actions CI/CD.
| Category | Feature | Description |
|---|---|---|
| ML Model | EfficientNet-B0 + Grad-CAM | Transfer learning with ImageNet weights via timm, visual heatmap explainability |
| API | FastAPI with auth & rate limiting | Single, batch (up to 10), and explain endpoints with Pydantic v2 schemas |
| Training | 3 training modes | Local (Docker/CPU), Google Colab (free T4 GPU), Vertex AI (production T4 GPU) |
| Monitoring | Prometheus + Grafana + drift | 18+ custom metrics, auto-provisioned dashboards, real-time drift detection |
| Infrastructure | Terraform + Docker + Cloud Run | Modular IaC, full Docker Compose stack, serverless deployment |
| CI/CD | GitHub Actions (5 workflows) | Lint, test, build, deploy with quality gates (accuracy ≥ 0.85, F1 ≥ 0.80) |
Local Training: Docker-based, for development
When to use: Development, debugging, quick iterations on CPU.
Prerequisites: Python 3.11+, Docker & Docker Compose, Make
```bash
# Download the CIFAKE dataset
make data

# Train with default config
make train

# Or run the full DVC pipeline: download → validate → train
make dvc-repro

# Start MLflow UI to view experiments
make mlflow   # → http://localhost:5000
```

Training takes ~1–2 hours on CPU. Edit `configs/train_config.yaml` to adjust hyperparameters.
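As a rough sketch, a config of this shape might live in `configs/train_config.yaml`. The field names and values below are illustrative guesses based on details mentioned elsewhere in this README (EfficientNet-B0 via timm, epochs, batch size), not the repository's actual schema:

```yaml
# Hypothetical shape of configs/train_config.yaml — keys are illustrative,
# not the repository's actual schema.
model:
  name: efficientnet_b0      # loaded via timm with ImageNet weights
training:
  epochs: 15
  batch_size: 64
  learning_rate: 0.001       # assumed default, adjust as needed
data:
  dir: data/processed
```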
Google Colab: Free T4 GPU, one-click notebook
When to use: Quick experiments with free GPU, no local setup needed.
Prerequisites: Google account
The notebook (notebooks/train_colab.ipynb) handles everything automatically:
- Installs dependencies and clones the repository
- Downloads the CIFAKE dataset from HuggingFace
- Trains EfficientNet-B0 with progress tracking
- Evaluates the model and exports the checkpoint
- Optionally uploads the trained model to GCS
Open in Colab → set runtime to T4 GPU → Run all cells. Training takes ~20 minutes.
Vertex AI Pipeline: Production training on GCP
When to use: Production retraining, CI/CD-triggered training, reproducible GPU runs.
Prerequisites: GCP project with Vertex AI enabled, gcloud CLI configured, GCS bucket with data
```bash
# Trigger via GitHub Actions
gh workflow run model-training.yml \
  -f epochs=15 \
  -f batch_size=64 \
  -f auto_deploy=true

# Or submit directly
python -m src.training.vertex_submit --epochs 15 --batch-size 64 --sync
```

Pipeline stages: Verify Data → Build Image → GPU Training (T4) → Evaluate → Quality Gate → Deploy
The pipeline DAG is defined using KFP (Kubeflow Pipelines SDK), which is the standard Python SDK for orchestrating workflows on Google Vertex AI Pipelines.
Training takes ~25 minutes. The quality gate blocks deployment if accuracy < 0.85 or F1 < 0.80.
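The quality gate boils down to a threshold check on the evaluation metrics. A minimal sketch (an illustration of the idea, not the pipeline's actual code; the metrics dict shape is an assumption):

```python
# Minimal sketch of the quality gate: deployment proceeds only if the
# evaluated model clears both release thresholds. Not the pipeline's
# actual code — the metrics dict shape is assumed.
ACCURACY_MIN = 0.85
F1_MIN = 0.80

def passes_quality_gate(metrics: dict) -> bool:
    """Return True only if accuracy and F1 both meet the release bar."""
    return metrics["accuracy"] >= ACCURACY_MIN and metrics["f1"] >= F1_MIN

print(passes_quality_gate({"accuracy": 0.91, "f1": 0.88}))  # True → deploy
print(passes_quality_gate({"accuracy": 0.91, "f1": 0.74}))  # False → blocked
```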
Prerequisites:
- Python 3.11+
- Docker & Docker Compose
- Make
- Google Cloud CLI (`gcloud`)
First-time GCP setup:
```bash
gcloud auth login
gcloud config set project ai-product-detector-487013
gcloud auth application-default login   # Required for DVC
```

Installation:
```bash
git clone https://github.com/nolancacheux/AI-Product-Photo-Detector.git
cd AI-Product-Photo-Detector
make dev   # Install dependencies + pre-commit hooks
```

Download the model (required before Docker):
The trained model weights (54 MB) are stored in GCS via DVC, not in the Git repository.
```bash
# Option 1: Using DVC (requires dvc-gs: pip install dvc-gs)
dvc pull models/checkpoints/best_model.pt.dvc

# Option 2: Using gcloud CLI
gcloud storage cp gs://ai-product-detector-487013-mlops-data/dvc/files/md5/0b/b8844b5c1b11d212a306590671a645 models/checkpoints/best_model.pt
```

First time? Train the model from scratch:
If no model exists in GCS yet, you need to train it first. This downloads the CIFAKE dataset (120k images) and trains EfficientNet-B0:
```bash
make dev    # Install dependencies
dvc repro   # Run full pipeline: download data → validate → train
dvc push    # Upload the trained model to GCS for the team
```

Alternatively, run each step manually:
```bash
python scripts/download_cifake.py                                 # Download dataset
python -m src.data.validate --data-dir data/processed             # Validate data
python -m src.training.train --config configs/train_config.yaml   # Train model
```

The trained model will be saved to `models/checkpoints/best_model.pt`.
Run locally:
```bash
make serve             # Local dev server → http://localhost:8000
docker compose up -d   # Full stack (production) → ports below
```

Note: `make serve` runs Uvicorn on port 8000 (local development). Docker Compose exposes the API on port 8080 (container/production).
| Service | URL |
|---|---|
| API (Docker) | http://localhost:8080 |
| Streamlit UI | http://localhost:8501 |
| MLflow | http://localhost:5000 |
| Prometheus | http://localhost:9090 |
| Grafana | http://localhost:3000 (default credentials: admin / admin) |
Test:
```bash
make test   # pytest with coverage
make lint   # ruff + mypy
```

The application is deployed on Google Cloud Run (serverless). Both the API and Web UI are publicly accessible:
Note: These services may be turned off to avoid unnecessary costs. This is a university project and keeping them running permanently is not required. If a link doesn't work, the service has simply been shut down.
See the full project walkthrough in our presentation slides.
| Service | URL |
|---|---|
| API (Production) | https://ai-product-detector-714127049161.europe-west1.run.app |
| Web UI (Production) | https://ai-product-detector-ui-714127049161.europe-west1.run.app |
| API Documentation | https://ai-product-detector-714127049161.europe-west1.run.app/docs |
| Health Check | https://ai-product-detector-714127049161.europe-west1.run.app/health |
| Metrics (Prometheus) | https://ai-product-detector-714127049161.europe-west1.run.app/metrics |
API Reference
| Method | Endpoint | Description | Rate Limit |
|---|---|---|---|
| `POST` | `/predict` | Single image classification | 30/min |
| `POST` | `/predict/batch` | Batch classification (up to 10 images) | 5/min |
| `POST` | `/predict/explain` | Prediction + Grad-CAM heatmap | 10/min |
| `GET` | `/health` | Readiness probe (model status, uptime, drift) | - |
| `GET` | `/metrics` | Prometheus metrics (text format) | - |
Authentication is optional in development and enforced in production via environment variables.
| Variable | Description |
|---|---|
| `API_KEYS` | Comma-separated list of valid API keys |
| `REQUIRE_AUTH` | Set to `true` to enforce authentication |
Pass the key via the header: `X-API-Key: YOUR_KEY`
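For illustration, an authenticated `/predict` call could be assembled like this with only the Python standard library. The base URL and key are placeholders, and the multipart field name `file` is an assumption about the endpoint's schema:

```python
# Sketch of building an authenticated /predict request with urllib.
# BASE_URL and the API key are placeholders; the multipart field name
# "file" is an assumption, not confirmed by the API schema.
import urllib.request

BASE_URL = "http://localhost:8080"

def build_predict_request(image_bytes: bytes, api_key: str) -> urllib.request.Request:
    boundary = "----aidetect"
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="file"; filename="photo.jpg"\r\n'
        "Content-Type: image/jpeg\r\n\r\n"
    ).encode() + image_bytes + f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        f"{BASE_URL}/predict",
        data=body,
        headers={
            "X-API-Key": api_key,  # auth header from the table above
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )

req = build_predict_request(b"\xff\xd8fake-jpeg-bytes", "YOUR_KEY")
# Send with urllib.request.urlopen(req) once the service is running.
```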
Example response:

```json
{
  "prediction": "ai_generated",
  "probability": 0.87,
  "confidence": "high",
  "inference_time_ms": 45.2,
  "model_version": "1.0.0"
}
```

Monitoring
/metrics (raw text) → Prometheus (collection & storage) → Grafana (dashboards & alerts)
The API exposes raw Prometheus metrics at /metrics. Prometheus scrapes this endpoint at regular intervals and stores the time-series data. Grafana connects to Prometheus as a datasource to render real-time dashboards and trigger alerts.
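The scrape side of that flow amounts to a few lines of Prometheus configuration. The sketch below is in the spirit of `configs/prometheus.yml` but is not its verbatim contents; the job name and interval are assumptions:

```yaml
# Illustrative Prometheus scrape config — job name and interval are
# assumptions, not the repository's actual configs/prometheus.yml.
scrape_configs:
  - job_name: aidetect-api           # hypothetical job name
    metrics_path: /metrics           # endpoint exposed by the FastAPI app
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8080"]  # Docker Compose API port
```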
All exposed at GET /metrics in Prometheus text format:
| Metric | Type | Description |
|---|---|---|
| `aidetect_predictions_total` | Counter | Total predictions by status, class, confidence |
| `aidetect_prediction_latency_seconds` | Histogram | Per-prediction latency distribution |
| `aidetect_prediction_probability` | Histogram | Probability score distribution |
| `aidetect_batch_predictions_total` | Counter | Batch request count |
| `aidetect_batch_size` | Histogram | Images per batch request |
| `aidetect_model_loaded` | Gauge | Model load status (0/1) |
| `http_request_duration_seconds` | Histogram | HTTP latency by endpoint |
| `http_requests_total` | Counter | HTTP requests by method, endpoint, status |
Pre-configured and auto-provisioned via configs/grafana/provisioning/:
- Request throughput - Requests/sec by endpoint
- Latency percentiles - p50, p90, p99 per endpoint
- Prediction distribution - Real vs AI-generated ratio over time
- Model health - Load status, drift alerts, error rates
Default credentials: admin / admin
Real-time monitoring of prediction distribution shifts using a sliding window over the last 1,000 predictions. Tracks mean probability, confidence distribution, and class ratios. Configurable alert thresholds with status available at GET /drift.
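The sliding-window idea can be sketched in a few lines. This is an illustration of the approach, not `src/monitoring/drift.py` itself; the baseline mean and alert threshold are made-up numbers:

```python
# Sketch of sliding-window drift detection: keep the last 1,000
# prediction probabilities and flag drift when the window mean moves
# too far from a reference baseline. Not the project's actual code.
from collections import deque

WINDOW = 1000

class DriftMonitor:
    def __init__(self, baseline_mean: float, threshold: float = 0.15):
        self.window = deque(maxlen=WINDOW)  # oldest predictions fall off
        self.baseline_mean = baseline_mean
        self.threshold = threshold

    def record(self, probability: float) -> None:
        self.window.append(probability)

    def drifting(self) -> bool:
        if not self.window:
            return False
        mean = sum(self.window) / len(self.window)
        return abs(mean - self.baseline_mean) > self.threshold

monitor = DriftMonitor(baseline_mean=0.5)
for _ in range(500):
    monitor.record(0.9)       # a burst of high "ai_generated" scores
print(monitor.drifting())     # → True: window mean 0.9 vs baseline 0.5
```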
| Layer | Technologies |
|---|---|
| ML | PyTorch 2.0+, torchvision, timm (EfficientNet-B0), Grad-CAM |
| API | FastAPI, Uvicorn, Pydantic v2, slowapi |
| MLOps | DVC (pipelines + versioning), MLflow (experiment tracking), HuggingFace Datasets |
| Monitoring | Prometheus, Grafana, structlog (JSON logging), custom drift detection |
| Infrastructure | Docker, Docker Compose, Terraform (modular), Cloud Run, Artifact Registry |
| CI/CD | GitHub Actions (CI, CD, Model Training, PR Preview, Request Quota) |
| Cloud | Google Cloud Platform (Vertex AI, Cloud Run, GCS, Artifact Registry, Secret Manager) |
Project Structure
```
AI-Product-Photo-Detector/
├── .github/workflows/
│   ├── ci.yml                    # Lint + type-check + test (3.11, 3.12) + security
│   ├── cd.yml                    # Build → push → deploy → smoke test
│   ├── model-training.yml        # Vertex AI GPU training pipeline
│   ├── pr-preview.yml            # PR preview deployments
│   └── request-quota.yml         # GCP quota increase requests
├── configs/
│   ├── grafana/                  # Dashboard definitions + provisioning
│   ├── prometheus/               # Alerting rules
│   ├── inference_config.yaml     # API server configuration
│   ├── pipeline_config.yaml      # Vertex AI pipeline parameters
│   ├── prometheus.yml            # Prometheus scrape targets
│   └── train_config.yaml         # Training hyperparameters
├── docker/
│   ├── Dockerfile                # Production API image (non-root)
│   ├── Dockerfile.training       # Vertex AI GPU training image
│   └── ui.Dockerfile             # Streamlit UI image
├── docs/
│   ├── architecture.svg          # System architecture diagram
│   ├── ARCHITECTURE.md           # Design decisions
│   ├── CICD.md                   # CI/CD pipeline docs
│   ├── CONTRIBUTING.md           # Contribution guidelines
│   ├── COSTS.md                  # Cloud cost analysis
│   ├── DEPLOYMENT.md             # Deployment guide
│   ├── INFRASTRUCTURE.md         # Infrastructure docs
│   ├── MONITORING.md             # Monitoring guide
│   └── TRAINING.md               # Training pipeline docs
├── notebooks/
│   └── train_colab.ipynb         # Colab notebook (free T4 GPU)
├── scripts/                      # Dataset download & sample data utilities
├── src/
│   ├── data/
│   │   └── validate.py           # Dataset validation & integrity checks
│   ├── inference/
│   │   ├── api.py                # FastAPI application & routes
│   │   ├── auth.py               # API key auth (HMAC, constant-time)
│   │   ├── explainer.py          # Grad-CAM heatmap generation
│   │   ├── predictor.py          # Model inference engine
│   │   ├── rate_limit.py         # Rate limiting configuration
│   │   ├── routes/
│   │   │   └── v1/               # Versioned API endpoints
│   │   ├── schemas.py            # Pydantic request/response models
│   │   ├── shadow.py             # Shadow model A/B testing
│   │   ├── state.py              # Application state management
│   │   └── validation.py         # Image validation utilities
│   ├── monitoring/
│   │   ├── drift.py              # Real-time drift detection
│   │   └── metrics.py            # Prometheus metric definitions
│   ├── pipelines/
│   │   ├── evaluate.py           # Model evaluation stage
│   │   └── training_pipeline.py  # End-to-end training orchestrator
│   ├── training/
│   │   ├── augmentation.py       # Data augmentation transforms
│   │   ├── dataset.py            # PyTorch Dataset implementation
│   │   ├── gcs.py                # GCS upload/download helpers
│   │   ├── model.py              # EfficientNet-B0 architecture
│   │   ├── train.py              # Training loop with MLflow tracking
│   │   └── vertex_submit.py      # Vertex AI job submission CLI
│   ├── ui/
│   │   └── app.py                # Streamlit web interface
│   └── utils/
│       ├── config.py             # Settings management (Pydantic Settings)
│       ├── logger.py             # Structured logging setup
│       └── model_loader.py       # Model loading utilities
├── terraform/
│   ├── environments/
│   │   ├── dev/                  # Development environment
│   │   └── prod/                 # Production environment
│   ├── modules/
│   │   ├── cloud-run/            # Cloud Run service module
│   │   ├── iam/                  # IAM bindings module
│   │   ├── monitoring/           # Monitoring module
│   │   ├── registry/             # Artifact Registry module
│   │   └── storage/              # GCS bucket module
│   ├── backend.tf                # Terraform state backend (GCS)
│   └── versions.tf               # Provider version constraints
├── tests/
│   ├── load/                     # Locust + k6 load tests
│   ├── conftest.py               # Shared test fixtures
│   └── test_*.py                 # 28+ test modules (API, auth, model, training, ...)
├── docker-compose.yml            # Full stack: API + UI + MLflow + Prometheus + Grafana
├── dvc.yaml                      # DVC pipeline: download → validate → train
├── Makefile                      # Development commands
├── pyproject.toml                # Dependencies & tool config
└── LICENSE                       # MIT License
```
| Document | Description |
|---|---|
| Architecture | System architecture and design decisions |
| Training Guide | Training pipeline documentation (all 3 modes) |
| Deployment | Deployment guide |
| Monitoring | Monitoring and observability guide |
| CI/CD | CI/CD pipeline documentation |
| Infrastructure | Infrastructure and Terraform documentation |
| Costs | Cloud cost analysis |
| Contributing | Contribution guidelines |
MIT License - see LICENSE for details.
Made by Nolan Cacheux