Commit 8d0c710: Add final project documentation
1 parent e65705d, 19 files changed: +256 -0

README.md (143 additions, 0 deletions)
# Production-Style ML Inference API with FastAPI, Docker, Prometheus, Grafana, GitHub Actions, Kubernetes, HPA, and Ingress

## Project Summary

This project demonstrates how to build a production-style machine learning inference service from the ground up. It starts with offline model training and artifact serialization, then exposes the model through a FastAPI application with health checks, Prometheus metrics, automated tests, containerization, CI/CD, Kubernetes deployment, horizontal pod autoscaling, and ingress-based routing.

The goal of this project is not just to train a model, but to show the engineering practices required to operate a machine learning model as a service.
## What This Project Includes

- offline model training and artifact saving
- FastAPI inference API
- request and response validation with Pydantic
- liveness and readiness endpoints
- Prometheus metrics exposure
- automated testing with pytest
- Docker containerization
- Docker Compose observability stack with Prometheus and Grafana
- load testing with Locust
- GitHub Actions CI/CD
- Docker image publishing to GitHub Container Registry (GHCR)
- Kubernetes deployment and service routing
- resource requests and limits
- Horizontal Pod Autoscaler (HPA)
- ingress-based access using ingress-nginx
## Tech Stack

- Python
- FastAPI
- scikit-learn
- NumPy
- joblib
- pytest
- Docker
- Docker Compose
- Prometheus
- Grafana
- Locust
- GitHub Actions
- GitHub Container Registry (GHCR)
- Kubernetes
- ingress-nginx
## Architecture

### Local and container architecture

```
Client
→ FastAPI API
→ Model artifact

/metrics
→ Prometheus
→ Grafana
```
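The Prometheus leg of this flow needs a scrape target pointing at the API's `/metrics` endpoint. A minimal sketch of such a configuration, assuming the API is reachable as `api:8000` on the Compose network (the job name, target, and interval are assumptions; the project's real config lives under `monitoring/`):

```yaml
# prometheus.yml (sketch): scrape the FastAPI /metrics endpoint.
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: "ml-inference-api"
    metrics_path: /metrics
    static_configs:
      - targets: ["api:8000"]  # Compose service name (assumed)
```

Grafana then uses Prometheus as a data source and dashboards query the exported metrics.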
### Kubernetes architecture

```
Client
→ Ingress
→ Kubernetes Service
→ FastAPI Pods
→ Model artifact
```
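The Kubernetes layer described above (Deployment with probes and resource requests/limits, a Service in front of the pods, and an HPA) can be sketched roughly as below. All names, the image path, and the numbers are assumptions for illustration; the project's real manifests live under `k8s/`:

```yaml
# k8s sketch (names, image path, and limits are assumptions).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-api
spec:
  replicas: 2
  selector:
    matchLabels: {app: ml-api}
  template:
    metadata:
      labels: {app: ml-api}
    spec:
      containers:
        - name: api
          image: ghcr.io/OWNER/ml_inference_api:latest  # assumed path
          ports: [{containerPort: 8000}]
          livenessProbe:
            httpGet: {path: /health/live, port: 8000}
          readinessProbe:
            httpGet: {path: /health/ready, port: 8000}
          resources:
            requests: {cpu: 100m, memory: 128Mi}
            limits: {cpu: 500m, memory: 256Mi}
---
apiVersion: v1
kind: Service
metadata:
  name: ml-api
spec:
  selector: {app: ml-api}
  ports: [{port: 80, targetPort: 8000}]
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ml-api
spec:
  scaleTargetRef: {apiVersion: apps/v1, kind: Deployment, name: ml-api}
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target: {type: Utilization, averageUtilization: 70}
```

The readiness probe maps directly to `/health/ready`, so pods only receive traffic once the model artifact has loaded, and the HPA scales the Deployment on CPU utilization.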
### Delivery architecture

```
GitHub Push
→ GitHub Actions
→ Run Tests
→ Build Docker Image
→ Push to GHCR
→ Kubernetes pulls image
```
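The delivery flow above can be sketched as a two-job GitHub Actions workflow: tests gate the image build, and the build job pushes to GHCR. This is an illustrative sketch, not the project's actual workflow; file name, branch, and tag are assumptions:

```yaml
# .github/workflows/ci.yml (sketch; names are assumptions)
name: ci
on: {push: {branches: [main]}}

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: {python-version: "3.11"}
      - run: pip install -r requirements.txt -r requirements-dev.txt
      - run: pytest

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    permissions: {packages: write, contents: read}
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:latest
```

`needs: test` enforces the "tests before build" ordering, and the automatic `GITHUB_TOKEN` is enough to authenticate against GHCR for the repository's own packages.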
## API Endpoints

### `GET /health/live`

Returns application liveness status.

Example response:

```json
{"status":"alive"}
```

### `GET /health/ready`

Returns readiness status after model loading.

Example response:

```json
{"status":"ready"}
```

### `POST /predict`

Runs inference using the trained model.

Example request:

```json
{
  "features": [5.1, 3.5, 1.4, 0.2]
}
```

Example response:

```json
{
  "prediction": 0
}
```

Endpoint summary:

```
GET /health/live
GET /health/ready
POST /predict
GET /metrics
GET /docs
```
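The request and response shapes above map naturally onto Pydantic models, which is what gives FastAPI its automatic validation. A minimal sketch in Pydantic v2 syntax; the class names and the four-feature constraint are assumptions inferred from the example payload:

```python
# Sketch of Pydantic schemas for the /predict request/response
# (class names and the length constraint are assumptions).
from pydantic import BaseModel, Field, ValidationError


class PredictRequest(BaseModel):
    # exactly four numeric features, matching the example payload
    features: list[float] = Field(min_length=4, max_length=4)


class PredictResponse(BaseModel):
    prediction: int


req = PredictRequest(features=[5.1, 3.5, 1.4, 0.2])
resp = PredictResponse(prediction=0)
```

With these models as endpoint annotations, FastAPI rejects malformed payloads with a 422 before the handler runs and documents both schemas in `/docs` automatically.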
## Project Structure

```
ml_inference_api/
├── .github/
├── app/
├── model/
├── tests/
├── monitoring/
├── k8s/
├── load_tests/
├── scripts/
├── docs/
│   ├── evidence/
│   ├── reports/
│   └── architecture/
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── requirements-dev.txt
├── pytest.ini
└── README.md
```
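The `Dockerfile` at the repository root could look roughly like the sketch below. The base image, module path `app.main:app`, and port are assumptions, and `uvicorn` is assumed to be listed in `requirements.txt`:

```dockerfile
# Sketch of a slim production image for the FastAPI service.
FROM python:3.11-slim

WORKDIR /srv
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app/ app/
COPY model/ model/

EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Copying `requirements.txt` and installing before copying the source keeps the dependency layer cached across code-only rebuilds.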
## Evidence

Project verification screenshots are stored in:

`docs/evidence/`

Evidence mapping:

`docs/evidence/evidence_index.md`
The remaining files in this commit are binary evidence screenshots under `docs/evidence/`, including `docs/evidence/37_hpa_working.png`.