Fine-tuned Vision Transformer with AWS EKS Kubernetes orchestration, containerized deployment, and production-ready infrastructure
A production-ready REST API for brain tumor classification using a fine-tuned Vision Transformer (ViT) model. The application is containerized with Docker and deployed on AWS EKS (Kubernetes), with an automated GitHub Actions CI/CD pipeline that builds Docker images and pushes them to Amazon ECR.
- State-of-the-art ViT architecture (google/vit-base-patch16-224-in21k)
- Fine-tuned on medical imaging dataset (brain tumor classification)
- Self-attention over image patches for tumor classification
- Transfer learning from ImageNet-21k pre-trained weights
- Lightweight Python 3.10-slim base image
- Pre-cached model weights in the Docker image, so no model download is needed at startup
- Non-root user for security
- Built-in HEALTHCHECK for container monitoring
- Multi-stage build for reduced image size
- Complete deployment manifest with resource requests/limits
- LoadBalancer service for external access
- Memory allocation: 1Gi requests / 2Gi limits
- CPU allocation: 500m requests / 1000m limits
- GitHub Actions workflow for automated builds
- Automatic Docker image build on push to main/develop branches
- Push to Amazon ECR with SHA-based and latest tags
- AWS credential configuration via GitHub Secrets
- Machine Learning: PyTorch, Transformers, Vision Transformer (ViT)
- Backend: FastAPI
- Container & Orchestration: Docker, Kubernetes, AWS EKS
- Container Registry: Amazon ECR
- Infrastructure: AWS VPC, EC2
- DevOps & CI/CD: GitHub Actions, kubectl
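The resource figures and the LoadBalancer service listed above can be sketched as manifests. The repository's actual `deployment.yaml` is not reproduced in this README, so the following is only an illustration: the name `brain-tumor-api`, the ports, the replica count, and the image placeholder are all assumptions. It builds the manifests as plain Python dictionaries and prints JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json


def build_deployment(image: str, name: str = "brain-tumor-api", port: int = 8000) -> dict:
    """Return an apps/v1 Deployment manifest as a plain dict.

    `name` and `port` are hypothetical defaults, not values from the repository.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,  # assumption: the README does not state a replica count
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            # Requests and limits as listed in the README.
                            "resources": {
                                "requests": {"memory": "1Gi", "cpu": "500m"},
                                "limits": {"memory": "2Gi", "cpu": "1000m"},
                            },
                        }
                    ]
                },
            },
        },
    }


def build_service(name: str = "brain-tumor-api", port: int = 80, target_port: int = 8000) -> dict:
    """Return a LoadBalancer Service that routes to pods labelled app=<name>."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


if __name__ == "__main__":
    # Placeholder image URI; substitute the real ECR repository and tag.
    image_uri = "<account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:latest"
    print(json.dumps(build_deployment(image_uri), indent=2))
    print(json.dumps(build_service(), indent=2))
```

Setting the memory limit at twice the request leaves headroom for inference spikes while still letting the scheduler pack pods by the smaller request.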
brain-tumor-classification/
├── fastapi_app/
│   ├── app.py                              # FastAPI application
│   ├── requirements.txt                    # Python dependencies
│   ├── models/
│   │   └── vit-brain-tumor-classifier/
│   │       ├── config.json                 # ViT model config
│   │       ├── model.safetensors           # Fine-tuned weights
│   │       └── preprocessor_config.json    # Image preprocessor config
│   ├── scripts/
│   │   ├── __init__.py
│   │   ├── data_model.py                   # Pydantic response models
│   │   ├── logging.py                      # Logging configuration
│   │   └── utils.py                        # ViTBrainTumorClassifier class
│   ├── templates/
│   │   └── index.html                      # Web UI (Tailwind CSS)
│   └── logs/                               # Application logs
├── .github/workflows/
│   ├── build-and-push.yml                  # GitHub Actions CI/CD pipeline
│   └── deployment.yaml                     # Kubernetes deployment manifest
├── Dockerfile                              # Docker image definition
└── README.md
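The tree places the classifier in `scripts/utils.py` and the response models in `scripts/data_model.py`, but this README does not show how raw model outputs become the percentage scores returned by `POST /api/v1/classify`. A minimal sketch, assuming the scores are a softmax over four logits scaled to percentages; the class order and the function names here are hypothetical and not taken from the repository.

```python
import math

# Assumption: label order is illustrative; the real order comes from the model config.
CLASS_NAMES = ["Glioma", "Meningioma", "No Tumor", "Pituitary"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def build_response(logits, class_names=CLASS_NAMES):
    """Shape logits into the response structure documented for the classify endpoint."""
    if len(logits) != len(class_names):
        raise ValueError("expected one logit per class")
    probabilities = softmax(logits)
    # Percentages rounded to two decimals, matching the documented example.
    scores = {name: round(p * 100, 2) for name, p in zip(class_names, probabilities)}
    best = max(scores, key=scores.get)
    return {
        "success": True,
        "prediction": {
            "predicted_class": best,
            "confidence": scores[best],
            "all_predictions": scores,
        },
        "message": "",
    }


if __name__ == "__main__":
    print(build_response([4.0, 0.5, 0.0, -0.5]))
```

Because each score is rounded independently, the four percentages can differ from 100 by a few hundredths when summed.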
GET /health
Returns model status and application version.

GET /
Serves the interactive image classification UI.
POST /api/v1/classify
Content-Type: multipart/form-data
file: <binary image data>

Request: Multipart form with image file (jpg, jpeg, png, gif, bmp)
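A client for this endpoint can be sketched with the standard library alone. The form field name `file` and the accepted extensions come from the request description above; the base URL `http://localhost:8000` and the helper name `build_classify_request` are assumptions for illustration. The sketch only builds the request so it can be inspected without a running server.

```python
import mimetypes
import urllib.request
import uuid

# Extensions accepted by the endpoint, as listed in the request description.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif", "bmp"}


def build_classify_request(base_url: str, filename: str, data: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST for /api/v1/classify without sending it."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported image type: {filename!r}")

    boundary = uuid.uuid4().hex
    part_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"

    # One form part named "file", followed by the closing boundary.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {part_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/classify",
        data=head + data + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


if __name__ == "__main__":
    # Hypothetical local server address and placeholder image bytes.
    request = build_classify_request("http://localhost:8000", "scan.png", b"\x89PNG...")
    print(request.get_method(), request.full_url)
```

To send it, pass the request to `urllib.request.urlopen` and decode the body with `json.load`.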
Response:
{
  "success": true,
  "prediction": {
    "predicted_class": "Glioma",
    "confidence": 94.32,
    "all_predictions": {
      "Glioma": 94.32,
      "Meningioma": 3.21,
      "No Tumor": 1.45,
      "Pituitary": 1.02
    }
  },
  "message": ""
}

- Architecture: Vision Transformer (ViT)
- Base Model: google/vit-base-patch16-224-in21k
- Output Classes: 4 classes (Glioma, Meningioma, No Tumor, Pituitary)
- Input Size: 224×224 RGB images
- Model Source: Hugging Face Hub (codeby-hp/vit-brain-tumor-classifier)
- Vision Transformer (ViT) architecture and transfer learning
- FastAPI application development and REST API design
- Docker containerization with multi-stage builds
- Kubernetes deployment manifests and health probes
- AWS ECR for container image management
- GitHub Actions for CI/CD automation
- Container security best practices (non-root user, health checks)
- Production-ready API design patterns
- GPU Acceleration: Enable GPU support for faster and more efficient inference
- Batch Prediction: High-throughput batch inference endpoints
- Auto Scaling: Horizontal Pod Autoscaler (HPA) for dynamic workload scaling
- A/B Testing: Canary deployments for safe model comparison
- Advanced Monitoring: Prometheus metrics and Grafana dashboards
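For the planned Horizontal Pod Autoscaler, the scaling rule documented by Kubernetes is `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)`, skipped when the ratio is within a tolerance of 1 (0.1 by default). The sketch below works through that arithmetic only. The replica bounds and the 60% CPU target are hypothetical, since no HPA is configured in this project yet.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,   # hypothetical bound
    max_replicas: int = 10,  # hypothetical bound
    tolerance: float = 0.1,  # Kubernetes default tolerance
) -> int:
    """Apply the HPA scaling formula, then clamp to the replica bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 2 pods averaging 90% CPU utilization against a hypothetical 60% target.
    print(desired_replicas(2, current_metric=90, target_metric=60))
```

CPU utilization for an HPA is measured relative to the container's CPU request, so with the 500m request listed earlier, 90% utilization corresponds to roughly 450m of actual usage per pod.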
| Aspect | Traditional VM | Kubernetes (EKS) |
|---|---|---|
| Container Orchestration | Manual management | Automated scheduling & management |
| Scaling | Manual or basic auto-scaling | Supports HPA for auto-scaling (not yet configured) |
| High Availability | Configure load balancers manually | Built-in service discovery |
| Deployment | Manual rolling updates | Declarative, version-controlled rollouts |
| Self-Healing | Manual restart required | Automatic pod recovery |
| Resource Efficiency | Fixed resource allocation | Dynamic resource optimization |
| Cost Optimization | Always-on infrastructure | Fine-grained resource control |
Harsh Patel
📧 code.by.hp@gmail.com
GitHub • LinkedIn
⭐ If you find this project helpful, please star it!




