This example demonstrates monitoring integration with the Inference Gateway using:
- Prometheus for metrics collection
- Grafana for visualization
- Helm chart for gateway deployment with monitoring enabled

How it works:
- Metrics collection: Prometheus scrapes the gateway's metrics
- Visualization: Grafana dashboards display the metrics
- Gateway: the Inference Gateway is deployed via Helm chart with monitoring enabled
- Local LLM: an Ollama provider is included for testing
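
The scrape path from Prometheus to the gateway can be sketched as a static scrape config. The job name, service name, namespace, and metrics port below are assumptions for illustration — in this example the equivalent configuration is generated from the ServiceMonitor rather than written by hand:

```yaml
# Hypothetical Prometheus scrape config for the gateway's metrics endpoint.
# Service name, namespace, and port are assumptions; the ServiceMonitor
# produces the equivalent scrape configuration automatically.
scrape_configs:
  - job_name: inference-gateway
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets:
          - inference-gateway.inference-gateway.svc.cluster.local:8080
```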

Prerequisites:
- Task
- kubectl
- helm
- ctlptl (for cluster management)
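
ctlptl manages the local cluster. A minimal cluster definition might look like the following — the backing product (kind) and registry name are assumptions, since the Taskfile may create the cluster differently:

```yaml
# Hypothetical ctlptl cluster definition; the product (kind) and the
# registry name are assumptions, not taken from this example's Taskfile.
apiVersion: ctlptl.dev/v1alpha1
kind: Cluster
product: kind
registry: ctlptl-registry
```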

Usage:
- Deploy the infrastructure:

  ```sh
  task deploy-infrastructure
  ```

- Deploy the Inference Gateway with monitoring:

  ```sh
  task deploy-inference-gateway
  ```

- Access the Grafana dashboards:

  ```sh
  kubectl -n monitoring port-forward svc/grafana-service 3000:3000
  ```

  Or use the deployed ingress: add a `grafana.inference-gateway.local` entry to your `/etc/hosts` and open http://grafana.inference-gateway.local/d/inference-gateway/inference-gateway-metrics
Login credentials:
- Username: `admin`
- Password: `admin`
- Deploy Ollama and simulate requests being sent to the gateway:

  ```sh
  task deploy-ollama
  task simulate-requests
  ```
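
For the dashboards to display data, Grafana needs a Prometheus datasource; in setups like this one it is typically provisioned from a file in the `grafana/` directory. A sketch, with the Prometheus service URL assumed:

```yaml
# Hypothetical Grafana datasource provisioning file; the Prometheus
# service name and namespace are assumptions.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-service.monitoring.svc.cluster.local:9090
    isDefault: true
```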

Configuration:
- Edit the YAMLs in the `prometheus/` and `grafana/` directories
- Adjust scrape intervals and dashboards as needed
- Monitoring settings are configured via Helm values in `Taskfile.yaml`
- A ServiceMonitor resource enables Prometheus scraping of the gateway
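
The ServiceMonitor mentioned above can be sketched as follows. The selector labels, port name, and namespaces are assumptions — the real resource is rendered by the Helm chart when monitoring is enabled:

```yaml
# Hypothetical ServiceMonitor for the gateway; selector labels, port
# name, and namespaces are assumptions (the Helm chart renders the real one).
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: inference-gateway
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: inference-gateway
  namespaceSelector:
    matchNames:
      - inference-gateway
  endpoints:
    - port: metrics
      path: /metrics
      interval: 15s
```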

Cleanup:

```sh
task clean
```