Jwrede/README.md

Jonathan Wrede

Platform & Observability Engineer based in Muenster, Germany.

I run production Kubernetes, observability (Prometheus, Grafana, OpenTelemetry), and data/ML platforms at fleet scale, and bring that same production rigor to LLM inference infrastructure.

What I do

  • Production platform and observability: Kubernetes, Helm, ArgoCD, Prometheus, Grafana, OpenTelemetry
  • Data/ML platforms at fleet scale: Dagster, dbt, MLflow, Python, Go
  • LLM inference infrastructure: llm-d, readiness gates, latency/cost/reliability, benchmarking

Selected proof

  • Sole developer of an end-to-end battery analytics platform for 100,000+ IoT devices: a tested capacity/runtime library, Dagster ingestion, FastAPI services, dashboards, and monitoring, across 8+ repositories.
  • 10+ merged infrastructure PRs in llm-d (router, inference-sim), OpenTelemetry, and Feast, with active open contributions in llm-d-kv-cache and llm-d-benchmark.

Open-source tooling

  • aipreflight: production readiness gate for LLM/RAG services (eval, cost, latency, SLA gates).
  • llmprobe: synthetic monitoring and CI smoke tests for LLM inference endpoints.
  • tokentoll: LLM cost diffs in code review.
  • llm-bench: live benchmark for OpenAI-compatible LLM APIs (TTFT, latency, throughput, errors).

Writing

Pinned

  1. aipreflight (Public)

    CI/CD readiness gate for AI apps and LLM endpoints: evals, RAG behavior, cost budgets, observability, and rollout checks.

    Python 1

  2. tokentoll (Public)

    Catch LLM cost changes in code review. Infracost for LLM spend.

    Python 4

  3. llmprobe (Public)

    Synthetic monitoring and CI smoke tests for LLM inference endpoints.

    Go 1

  4. llm-bench-data (Public)

    Open LLM API performance dataset. Hourly TTFT, latency, and throughput measurements across major providers.

    1

  5. llm-bench (Public)

    Continuous open LLM API benchmark. Live dashboard, public dataset, automatic model discovery.

    Go