- Inference pipelines: model formats (ONNX: Open Neural Network Exchange), runtimes (ONNX Runtime, TensorRT), servers (Triton), quantization/pruning/distillation for latency/cost.
- Data plane: zero‑copy, pinned memory, batching, dynamic shapes; autoscale on QPS (Queries Per Second)/latency SLOs (Service Level Objectives).
</details>
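The autoscaling bullet above can be illustrated with the ratio rule that the Kubernetes Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas × current metric / target metric). The sketch below is a simplification, not the controller's actual logic; it ignores the tolerance band and stabilization windows, and the function name, the per-replica QPS metric, and the example numbers are assumptions chosen for illustration.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int,
                     max_replicas: int) -> int:
    """Hypothetical helper: HPA-style ratio scaling, clamped to replica bounds.

    current_metric / target_metric could be per-replica QPS or a latency
    figure tied to an SLO; the units only need to match each other.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp so the result never leaves the configured [min, max] range.
    return max(min_replicas, min(max_replicas, raw))


# Assumed example: 4 replicas each seeing 150 QPS against a 100 QPS target.
scaled = desired_replicas(4, 150.0, 100.0, min_replicas=1, max_replicas=10)
```

With these assumed inputs the ratio is 1.5, so the replica count grows from 4 to 6; a lower `max_replicas` would cap the result instead.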
@@ -895,12 +895,12 @@ From [K8s cluster components](https://kubernetes.io/docs/concepts/architecture/)
-**Tools**: Grid carbon intensity APIs, Microsoft Sustainability Calculator
-**Standards**: ISO 14064, GHG Protocol, Carbon Disclosure Project
- Signals and objectives:
-
- Carbon intensity (grid gCO₂/kWh—grams of CO₂ per kilowatt‑hour): average vs marginal; real‑time + forecasts by region; combine with electricity price and datacenter PUE (Power Usage Effectiveness)/WUE (Water Usage Effectiveness).
+
- Carbon intensity (grid gCO₂/kWh: grams of CO₂ per kilowatt‑hour): average vs marginal; real‑time + forecasts by region; combine with electricity price and datacenter PUE (Power Usage Effectiveness)/WUE (Water Usage Effectiveness).
- Time shifting: run deferrable jobs in low‑carbon windows (cron + forecasts).
- Geo shifting: place in cleaner regions (multi‑region queues, policy‑aware schedulers).
-
- Power/perf: DVFS (Dynamic Voltage and Frequency Scaling) and power caps (e.g., RAPL—Running Average Power Limit), right‑size CPU/mem, consolidate to idle whole hosts, sleep states off‑peak.
+
- Power/perf: DVFS (Dynamic Voltage and Frequency Scaling) and power caps (e.g., RAPL: Running Average Power Limit), right‑size CPU/mem, consolidate to idle whole hosts, sleep states off‑peak.
- Kubernetes patterns:
- Scheduler plugins/extenders to weigh carbon score; KEDA (Kubernetes‑based Event‑Driven Autoscaling) for event‑driven pause/resume; PriorityClasses to preempt non‑critical work.
- Node labels for region/zone/carbon buckets; topology spread to pack/shed; carbon‑aware HPA (Horizontal Pod Autoscaler) inputs via external metrics.
@@ -909,7 +909,7 @@ From [K8s cluster components](https://kubernetes.io/docs/concepts/architecture/)
- Governance: budgets/quotas per team; dashboards and alerts on kgCO₂e (kilograms of CO₂ equivalent) per service.
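The time-shifting and governance bullets above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the forecast values are made up, the forecast is assumed to be hourly, and the names `best_window` and `job_kg_co2e` are hypothetical. It picks the contiguous window with the lowest mean grid intensity for a deferrable job, then estimates emissions as IT energy × PUE × grid intensity.

```python
from typing import Sequence, Tuple


def best_window(forecast: Sequence[float], duration: int) -> Tuple[int, float]:
    """Return (start_index, mean_intensity) of the lowest-carbon window.

    forecast: assumed hourly grid carbon intensities in gCO2/kWh.
    duration: job length in forecast slots (hours here).
    """
    if duration <= 0 or duration > len(forecast):
        raise ValueError("duration must be between 1 and len(forecast)")
    # Sliding-window sum: add the slot entering, subtract the slot leaving.
    window_sum = sum(forecast[:duration])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(forecast) - duration + 1):
        window_sum += forecast[start + duration - 1] - forecast[start - 1]
        if window_sum < best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / duration


def job_kg_co2e(it_energy_kwh: float, pue: float,
                intensity_g_per_kwh: float) -> float:
    """Estimate emissions in kgCO2e: IT energy scaled by PUE, times intensity."""
    return it_energy_kwh * pue * intensity_g_per_kwh / 1000.0


# Made-up 8-hour forecast (gCO2/kWh) and a 3-hour deferrable job.
forecast = [420.0, 390.0, 310.0, 180.0, 150.0, 210.0, 400.0, 450.0]
start, mean_intensity = best_window(forecast, duration=3)
estimate = job_kg_co2e(it_energy_kwh=12.0, pue=1.2,
                       intensity_g_per_kwh=mean_intensity)
```

With this sample forecast the cheapest window starts at index 3 with a mean of 180 gCO₂/kWh. A real scheduler would also need deadlines, and it would have to choose between average and marginal intensity as the signal, which this sketch leaves out.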