# TensorFlow Model Serving on Kubernetes

## 1 Purpose / What You'll Learn

This example demonstrates how to deploy a TensorFlow model for inference using [TensorFlow Serving](https://www.tensorflow.org/serving) on Kubernetes. You'll learn how to:

- Set up TensorFlow Serving with a pre-trained model
- Use a PersistentVolume to mount your model directory
- Expose the inference endpoint using a Kubernetes `Service` and `Ingress`
- Send a sample prediction request to the model

---

## 📚 Table of Contents

- [Prerequisites](#prerequisites)
- [Quick Start / TL;DR](#quick-start--tldr)
- [Detailed Steps & Explanation](#detailed-steps--explanation)
- [Verification / Seeing it Work](#verification--seeing-it-work)
- [Configuration Customization](#configuration-customization)
- [Cleanup](#cleanup)
- [Further Reading / Next Steps](#further-reading--next-steps)

---

## ⚙️ Prerequisites

- Kubernetes cluster (tested with v1.29+)
- `kubectl` configured
- Optional: `ingress-nginx` for external access
- x86-based machine (the official TensorFlow Serving image targets x86_64)
- Local `hostPath` support (for the demo) or a cloud-backed PVC

---

## ⚡ Quick Start / TL;DR

```bash
# Apply manifests
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pv.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pvc.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/service.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/ingress.yaml  # Optional
```

---

## 2 Detailed Steps & Explanation

### 1. PersistentVolume & PVC Setup

> ⚠️ Note: For local testing, `hostPath` is used to mount `/mnt/models/my_model`. In production, replace this with a cloud-native storage backend (e.g., AWS EBS, GCP PD, or NFS).

Model folder structure:

```
/mnt/models/my_model/
└── 1/
    ├── saved_model.pb
    └── variables/
```
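
As a rough sketch, a minimal `hostPath`-backed PV/PVC pair could look like the following. The object names, size, and `storageClassName` here are assumptions for illustration; defer to the `pv.yaml` and `pvc.yaml` shipped with this example.

```yaml
# Illustrative sketch only -- names, size, and storage class are assumptions,
# not necessarily what pv.yaml / pvc.yaml in this example use.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: tf-model-pv                 # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: manual          # keeps the claim from binding to a default storage class
  hostPath:
    path: /mnt/models/my_model      # node-local directory containing the SavedModel
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tf-model-pvc                # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 1Gi
```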

---

### 2. Expose the Service

- A `ClusterIP` Service exposes gRPC (8500) and REST (8501).
- An optional `Ingress` exposes `/tf/v1/models/my_model:predict` to external clients.

Update the `host` value in `ingress.yaml` to match your domain.
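
For orientation, a Service/Ingress pair along these lines would expose the endpoints described above (this assumes `ingress-nginx` and uses placeholder names and a placeholder host; the real definitions live in `service.yaml` and `ingress.yaml`):

```yaml
# Illustrative sketch only -- check service.yaml / ingress.yaml for the real values.
apiVersion: v1
kind: Service
metadata:
  name: tf-serving
spec:
  type: ClusterIP
  selector:
    app: tf-serving                 # assumed to match the Deployment's pod labels
  ports:
    - name: grpc
      port: 8500
      targetPort: 8500
    - name: rest
      port: 8501
      targetPort: 8501
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tf-serving
  annotations:
    # Strips the /tf prefix so TensorFlow Serving sees /v1/models/my_model:predict
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  ingressClassName: nginx
  rules:
    - host: tf.example.com          # replace with your domain
      http:
        paths:
          - path: /tf(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: tf-serving
                port:
                  number: 8501
```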

---

## 3 Verification / Seeing it Work

If using ingress:

```bash
curl -X POST http://<ingress-host>/tf/v1/models/my_model:predict \
  -H "Content-Type: application/json" \
  -d '{ "instances": [[1.0, 2.0, 5.0]] }'
```

Expected output:

```json
{
  "predictions": [...]
}
```

To verify the pod is running:

```bash
kubectl get pods
kubectl wait --for=condition=Available deployment/tf-serving --timeout=300s
kubectl logs deployment/tf-serving
```
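
Without an Ingress, you can also reach the REST endpoint through a port-forward. The Service name `tf-serving` below is an assumption; use whatever name `service.yaml` actually defines. Note that the `/tf` prefix exists only on the Ingress route, so it is dropped here:

```bash
# Forward local port 8501 to the Service's REST port (Service name assumed)
kubectl port-forward service/tf-serving 8501:8501 &

# Call TensorFlow Serving's REST API directly (no /tf prefix without the Ingress rewrite)
curl -X POST http://localhost:8501/v1/models/my_model:predict \
  -H "Content-Type: application/json" \
  -d '{ "instances": [[1.0, 2.0, 5.0]] }'
```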

---

## 🛠️ Configuration Customization

- Update `model_name` and `model_base_path` in the deployment (see the sketch below)
- Replace `hostPath` with a `PersistentVolumeClaim` bound to cloud storage
- Modify resource requests/limits for the TensorFlow Serving container
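
A hedged sketch of the container section of `deployment.yaml` shows where those knobs live. The image tag, resource figures, and volume names are illustrative rather than the example's exact values; `--model_name` and `--model_base_path` are standard `tensorflow_model_server` flags:

```yaml
# Fragment of spec.template.spec -- values are illustrative, not the example's exact ones.
containers:
  - name: tf-serving
    image: tensorflow/serving:latest          # pin a specific version in practice
    args:
      - --model_name=my_model                 # must match the name used in the request URL
      - --model_base_path=/models/my_model    # directory containing the numbered version folders
    ports:
      - containerPort: 8500                   # gRPC
      - containerPort: 8501                   # REST
    resources:
      requests:
        cpu: "500m"
        memory: 1Gi
      limits:
        cpu: "1"
        memory: 2Gi
    volumeMounts:
      - name: model-volume
        mountPath: /models/my_model
volumes:
  - name: model-volume
    persistentVolumeClaim:
      claimName: tf-model-pvc                 # hypothetical PVC name from the sketch above
```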

---

## 🧹 Cleanup

```bash
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/ingress.yaml  # Optional
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/service.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/deployment.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pvc.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pv.yaml
```

---

## 4 Further Reading / Next Steps

- [TensorFlow Serving](https://www.tensorflow.org/tfx/serving)
- [TF Serving REST API Reference](https://www.tensorflow.org/tfx/serving/api_rest)
- [Kubernetes Ingress Controllers](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/)
- [Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)