Platform & Observability Engineer based in Muenster, Germany.
I run production Kubernetes, observability (Prometheus, Grafana, OpenTelemetry), and data/ML platforms at fleet scale, and bring that same production rigor to LLM inference infrastructure.
- Production platform and observability: Kubernetes, Helm, ArgoCD, Prometheus, Grafana, OpenTelemetry
- Data/ML platforms at fleet scale: Dagster, dbt, MLflow, Python, Go
- LLM inference infrastructure: llm-d, readiness gates, latency/cost/reliability, benchmarking
- Sole developer of an end-to-end battery analytics platform for 100,000+ IoT devices: a tested capacity/runtime library, Dagster ingestion, FastAPI services, dashboards, and monitoring, across 8+ repositories.
- 10+ merged infrastructure PRs in llm-d (router, inference-sim), OpenTelemetry, and Feast, with active open contributions in llm-d-kv-cache and llm-d-benchmark.
- aipreflight: production readiness gate for LLM/RAG services (eval, cost, latency, and SLA checks).
- llmprobe: synthetic monitoring and CI smoke tests for LLM inference endpoints.
- tokentoll: LLM cost diffs in code review.
- llm-bench: live benchmark for OpenAI-compatible LLM APIs (TTFT, latency, throughput, errors).



