
Commit 0598f07

AI Example model serving tensorflow (#563)
* Create AI Example model serving tensorflow
* ai/model-serving-tensorflow service.yaml
* ai/model-serving-tensorflow ingress.yaml
* ai/model-serving-tensorflow pv.yaml
* ai/model-serving-tensorflow pvc.yaml
* Create Readme.md
* Rename Readme.md to README.md
* Update with structure format for README.md
* Correct link for serving in ai/model-serving-tensorflow/README.md

  Co-authored-by: Janet Kuo <[email protected]>

* Fix kubectl README.md
* Update README.md
* Update as per comments README.md
* Update tensorflow/serving:2.19.0 deployment.yaml
* remove hostname ai/model-serving-tensorflow/ingress.yaml

---------

Co-authored-by: Janet Kuo <[email protected]>
1 parent 209452c commit 0598f07

File tree

6 files changed: +221 -0 lines changed

ai/model-serving-tensorflow/README.md: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
# TensorFlow Model Serving on Kubernetes

## 1 Purpose / What You'll Learn

This example demonstrates how to deploy a TensorFlow model for inference using [TensorFlow Serving](https://www.tensorflow.org/serving) on Kubernetes. You’ll learn how to:

- Set up TensorFlow Serving with a pre-trained model
- Use a PersistentVolume to mount your model directory
- Expose the inference endpoint using a Kubernetes `Service` and `Ingress`
- Send a sample prediction request to the model

---

## 📚 Table of Contents

- [Prerequisites](#prerequisites)
- [Quick Start / TL;DR](#quick-start--tldr)
- [Detailed Steps & Explanation](#detailed-steps--explanation)
- [Verification / Seeing it Work](#verification--seeing-it-work)
- [Configuration Customization](#configuration-customization)
- [Cleanup](#cleanup)
- [Further Reading / Next Steps](#further-reading--next-steps)

---

## ⚙️ Prerequisites

- Kubernetes cluster (tested with v1.29+)
- `kubectl` configured
- Optional: `ingress-nginx` for external access
- x86-based machine (for running the TensorFlow Serving image)
- Local hostPath support (for the demo) or a cloud-based PVC
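A quick sanity check before applying anything (a minimal sketch; it assumes `kubectl` already points at your cluster and that ingress-nginx, if installed, runs in its default `ingress-nginx` namespace):

```bash
# Confirm kubectl can reach the cluster and report client/server versions
kubectl version
# Optional: look for an ingress-nginx controller (namespace may differ in your setup)
kubectl get pods -n ingress-nginx
```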
---
## ⚡ Quick Start / TL;DR

```bash
# Apply manifests
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pv.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pvc.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/service.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/ingress.yaml # Optional
```

---

## 2 Detailed Steps & Explanation

### 1. PersistentVolume & PVC Setup

> ⚠️ Note: For local testing, `hostPath` is used to mount `/mnt/models/my_model`. In production, replace this with a cloud-native storage backend (e.g., AWS EBS, GCP PD, or NFS).

Model folder structure:

```
/mnt/models/my_model/
└── 1/
    ├── saved_model.pb
    └── variables/
```
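As a sketch of how the model directory might be staged for the `hostPath` demo (the export path is a placeholder; on a multi-node cluster the files must be placed on the node that actually backs the volume):

```bash
# Create the versioned directory layout TensorFlow Serving expects
sudo mkdir -p /mnt/models/my_model/1
# Copy your exported SavedModel (saved_model.pb + variables/) into version 1
sudo cp -r /path/to/exported_model/* /mnt/models/my_model/1/
```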
---
### 2. Expose the Service

- A `ClusterIP` service exposes gRPC (8500) and REST (8501).
- An optional `Ingress` exposes `/tf/v1/models/my_model:predict` to external clients.

The example `Ingress` defines no `host` rule, so it matches any hostname; add a `host` value in `ingress.yaml` if you want to restrict access to your own domain.
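If you skip the `Ingress`, a quick way to reach the REST port locally is a port-forward against the `ClusterIP` service (a testing sketch, not a production setup):

```bash
# Forward the Service's REST port to localhost
kubectl port-forward svc/tf-serving 8501:8501
# In another terminal, check that the model is loaded and reports AVAILABLE
curl http://localhost:8501/v1/models/my_model
```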
---
## 3 Verification / Seeing it Work

If using ingress:

```bash
curl -X POST http://<ingress-host>/tf/v1/models/my_model:predict \
  -H "Content-Type: application/json" \
  -d '{ "instances": [[1.0, 2.0, 5.0]] }'
```

Expected output:

```json
{
  "predictions": [...]
}
```

To verify the pod is running:

```bash
kubectl get pods
kubectl wait --for=condition=Available deployment/tf-serving --timeout=300s
kubectl logs deployment/tf-serving
```
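TensorFlow Serving's REST API also exposes a metadata endpoint, which is handy for confirming the model's input/output signature before sending predictions (shown here through the same `/tf` ingress prefix used above):

```bash
# Inspect the model's signature via the REST metadata endpoint
curl http://<ingress-host>/tf/v1/models/my_model/metadata
```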
---
## 🛠️ Configuration Customization

- Update `model_name` and `model_base_path` in the deployment
- Replace `hostPath` with a `PersistentVolumeClaim` bound to cloud storage
- Modify resource requests/limits for the TensorFlow Serving container (see the sketch below)
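For example, a `resources` stanza could be added to the `tensorflow-serving` container in `deployment.yaml`; the values below are illustrative placeholders rather than tuned recommendations:

```yaml
resources:
  requests:
    cpu: "500m"   # illustrative only; size to your model and traffic
    memory: "1Gi"
  limits:
    cpu: "2"
    memory: "4Gi"
```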
---
## 🧹 Cleanup

```bash
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/ingress.yaml # Optional
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/service.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/deployment.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pvc.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pv.yaml
```
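To confirm nothing from the example is left behind, a quick check (the `grep` filter is only a convenience and assumes the resource names used in this example):

```bash
# None of the example's resources should remain after cleanup
kubectl get deployment,service,ingress,pvc,pv | grep -E 'tf-serving|my-model' || echo "all cleaned up"
```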
---
## 4 Further Reading / Next Steps

- [TensorFlow Serving](https://www.tensorflow.org/tfx/serving)
- [TF Serving REST API Reference](https://www.tensorflow.org/tfx/serving/api_rest)
- [Kubernetes Ingress Controller](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/)
- [Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)
ai/model-serving-tensorflow/deployment.yaml: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tf-serving
  labels:
    app: tf-serving
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tf-serving
  template:
    metadata:
      labels:
        app: tf-serving
    spec:
      containers:
      - name: tensorflow-serving
        image: tensorflow/serving:2.19.0
        args:
        - "--model_name=my_model"
        - "--port=8500"
        - "--rest_api_port=8501"
        - "--model_base_path=/models/my_model"
        ports:
        - containerPort: 8500 # gRPC
        - containerPort: 8501 # REST
        volumeMounts:
        - name: model-volume
          mountPath: /models/my_model
      volumes:
      - name: model-volume
        persistentVolumeClaim:
          claimName: my-model-pvc
ai/model-serving-tensorflow/ingress.yaml: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tf-serving-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  rules:
  - http:
      paths:
      - path: /tf(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: tf-serving
            port:
              number: 8501
ai/model-serving-tensorflow/pv.yaml: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-model-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/models/my_model
ai/model-serving-tensorflow/pvc.yaml: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-model-pvc
spec:
  accessModes:
  - ReadOnlyMany
  resources:
    requests:
      storage: 1Gi
  volumeName: my-model-pv
ai/model-serving-tensorflow/service.yaml: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
  name: tf-serving
spec:
  selector:
    app: tf-serving
  ports:
  - name: grpc
    port: 8500
    targetPort: 8500
  - name: rest
    port: 8501
    targetPort: 8501
  type: ClusterIP
