Commit 636223f

authored
4/5
1 parent 3ba6c42 commit 636223f

1 file changed

+34
-1
lines changed

README.md

Lines changed: 34 additions & 1 deletion
@@ -863,21 +863,54 @@ From [K8s cluster components](https://kubernetes.io/docs/concepts/architecture/)
863863
<details>
864864
<summary><b>Serverless, Edge Computing, AI Acceleration (2020s)</b></summary>
865865

866+
> Optimize the `bytes → flops → bytes` loop: minimize copies, keep tensors on‑device, and batch just enough to meet latency SLOs (Service Level Objectives).
867+
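As a rough illustration of "batch just enough", the sketch below picks the largest batch size that still fits a latency SLO. It assumes a simple linear cost model (a fixed per-batch overhead plus a per-item cost), which real accelerators only approximate; the function name and all parameters are hypothetical and would need to be measured for an actual model and device.

```python
def max_batch_size(slo_ms: float, fixed_ms: float, per_item_ms: float,
                   max_wait_ms: float, hard_cap: int = 64) -> int:
    """Largest batch whose worst-case latency stays within slo_ms.

    Assumed (hypothetical) latency model for the first request in a batch:
        latency = max_wait_ms + fixed_ms + per_item_ms * batch_size
    """
    if per_item_ms <= 0:
        raise ValueError("per_item_ms must be positive")
    # Time left for per-item work after queueing delay and fixed overhead.
    budget_ms = slo_ms - max_wait_ms - fixed_ms
    if budget_ms < per_item_ms:
        return 1  # no room to batch; serve requests one at a time
    return min(hard_cap, int(budget_ms // per_item_ms))
```

With a 50 ms SLO, 10 ms of allowed queueing, 5 ms of fixed overhead, and 2 ms per item, the budget is 35 ms, so the function returns 17. A tighter `hard_cap` (for example, one set by device memory) takes precedence.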
866868
- **Serverless computing**: Event-driven functions with automatic scaling (e.g., Azure Functions, AWS Lambda)
867869
- **Edge computing**: Processing data closer to sources to reduce latency
868870
- **AI acceleration**: Specialized hardware (GPUs, TPUs, NPUs) for machine learning workloads
869871
- **Key technologies**: Azure Functions, AWS Lambda, TensorFlow, PyTorch, CUDA
872+
- Serverless (FaaS—Function as a Service) internals:
873+
- Triggers: HTTP, queues, events, timers; scale‑to‑zero + cold starts; concurrency controls per instance.
874+
- Isolation: containers, sandboxed runtimes, or micro‑VMs (virtual machines; e.g., Firecracker). Operations: per‑request billing; idempotent handlers so that retries are safe, with DLQs (Dead‑Letter Queues) for events that keep failing.
875+
- Portable model: Knative Serving/Eventing (revisions, activator, autoscaler); CloudEvents for event metadata.
876+
- Edge computing:
877+
- Constraints: intermittent links, limited CPU/GPU (Graphics Processing Unit), data locality/privacy; patterns: streaming, windowed analytics, feature extraction at source.
878+
- Tooling: lightweight K8s (Kubernetes) distributions (k3s, AKS Edge Essentials), device plugins, OTA (Over‑The‑Air) updates, attestation via TPM (Trusted Platform Module) or TEE (Trusted Execution Environment).
879+
- Protocols: MQTT (Message Queuing Telemetry Transport), OPC‑UA (Open Platform Communications Unified Architecture), gRPC (gRPC Remote Procedure Calls); 5G MEC (Multi‑access Edge Computing) for low‑latency ingress; local caches and twin models.
880+
- AI acceleration:
881+
- Hardware: GPUs (Graphics Processing Units; CUDA cores and Tensor Cores; FP32/FP16/BF16 floating‑point and INT8 8‑bit integer formats), TPUs (Tensor Processing Units), NPUs (Neural Processing Units); memory bandwidth and interconnect (NVLink; PCIe, Peripheral Component Interconnect Express; InfiniBand with RDMA, Remote Direct Memory Access) often limit throughput before raw compute does.
882+
- Scheduling: K8s device plugins, GPU sharing (MPS—Multi‑Process Service), partitioning (MIG—Multi‑Instance GPU), NUMA (Non‑Uniform Memory Access) alignment; topology‑aware placement.
883+
- Inference pipelines: model formats (ONNX—Open Neural Network Exchange), runtimes (ONNX Runtime, TensorRT), servers (Triton), quantization/pruning/distillation for latency/cost.
884+
- Data plane: zero‑copy, pinned memory, batching, dynamic shapes; autoscale on QPS (Queries Per Second)/latency SLOs (Service Level Objectives).
870885
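
The idempotency, retry, and dead-letter pattern from the serverless bullets can be sketched in a few lines. This is a minimal in-memory illustration, not any platform's API: `Event`, `Worker`, and `deliver` are hypothetical names, and the set and list stand in for a durable dedupe store and a real DLQ.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    event_id: str  # unique ID, used as the idempotency key
    payload: dict


@dataclass
class Worker:
    handler: Callable[[Event], None]
    max_attempts: int = 3
    processed_ids: set = field(default_factory=set)   # stand-in for a durable dedupe store
    dead_letters: list = field(default_factory=list)  # stand-in for a real DLQ

    def deliver(self, event: Event) -> str:
        """Process one event; returns 'done', 'duplicate', or 'dead-lettered'."""
        if event.event_id in self.processed_ids:
            return "duplicate"  # redelivery of an already-handled event is a no-op
        for _attempt in range(self.max_attempts):
            try:
                self.handler(event)
            except Exception:
                continue  # a real system would back off before retrying
            self.processed_ids.add(event.event_id)
            return "done"
        # Every attempt failed: park the event instead of retrying forever.
        self.dead_letters.append(event)
        return "dead-lettered"
```

Recording the event ID only after the handler succeeds means a crash between the two steps can still cause a repeat, which is why the handler itself should also be safe to run more than once.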

871886
</details>
872887

873888
<details>
874889
<summary><b>Energy/Carbon-Aware Operations (2019-2025)</b></summary>
875890

891+
> Treat carbon like a first‑class SLO (Service Level Objective): define budgets, wire metrics, and route workloads by carbon score alongside latency and cost.
892+
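One way to read "route workloads by carbon score alongside latency and cost" is a weighted score over normalized metrics. The sketch below is an assumption-laden illustration: the `Region` fields, the default weights, and the min-max normalization are all hypothetical choices, not part of any scheduler's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Region:
    name: str
    carbon_g_per_kwh: float  # grid carbon intensity
    latency_ms: float        # latency from users to this region
    cost_per_hour: float     # price of the chosen instance type


def _normalize(values: Sequence[float]) -> list:
    """Min-max scale to [0, 1]; identical values all map to 0."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def pick_region(regions: Sequence[Region],
                weights: Tuple[float, float, float] = (0.6, 0.25, 0.15),
                max_latency_ms: Optional[float] = None) -> Region:
    """Return the region with the lowest weighted carbon/latency/cost score."""
    # Treat latency as a hard limit first, then trade off the rest.
    candidates = [r for r in regions
                  if max_latency_ms is None or r.latency_ms <= max_latency_ms]
    if not candidates:
        raise ValueError("no region satisfies the latency limit")
    carbon = _normalize([r.carbon_g_per_kwh for r in candidates])
    latency = _normalize([r.latency_ms for r in candidates])
    cost = _normalize([r.cost_per_hour for r in candidates])
    w_carbon, w_latency, w_cost = weights
    scores = [w_carbon * c + w_latency * l + w_cost * k
              for c, l, k in zip(carbon, latency, cost)]
    best = min(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]
```

Applying the latency limit as a filter rather than a weight keeps latency-critical services from being routed to a clean but distant region.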
876893
- **Carbon-aware scheduling**: Shifting workloads to times/regions with cleaner energy
877894
- **Technical approach**: Real-time carbon intensity signals, flexible workload policies
878895
- **Tools**: Grid carbon intensity APIs, Microsoft Sustainability Calculator
879896
- **Standards**: ISO 14064, GHG Protocol, Carbon Disclosure Project
880-
897+
- Signals and objectives:
898+
- Carbon intensity (grid gCO₂/kWh—grams of CO₂ per kilowatt‑hour): average vs marginal; real‑time + forecasts by region; combine with electricity price and datacenter PUE (Power Usage Effectiveness)/WUE (Water Usage Effectiveness).
899+
- Workload classes: deferrable (batch/ETL/ML training), movable (geo‑flex), latency‑critical (pin, optimize).
900+
- Scheduling and controls:
901+
- Time shifting: run deferrable jobs in low‑carbon windows (cron + forecasts).
902+
- Geo shifting: place in cleaner regions (multi‑region queues, policy‑aware schedulers).
903+
- Power/perf: DVFS (Dynamic Voltage and Frequency Scaling) and power caps (e.g., RAPL—Running Average Power Limit), right‑size CPU/mem, consolidate to idle whole hosts, sleep states off‑peak.
904+
- Kubernetes patterns:
905+
- Scheduler plugins/extenders to weigh carbon score; KEDA (Kubernetes‑based Event‑Driven Autoscaling) for event‑driven pause/resume; PriorityClasses to preempt non‑critical work.
906+
- Node labels for region/zone/carbon buckets; topology spread constraints and node affinity to control where pods pack or shed; carbon‑aware HPA (Horizontal Pod Autoscaler) inputs via external metrics.
907+
- Measurement and reporting:
908+
- Telemetry: per‑pod energy models, GPU/CPU utilization exporters, storage/network I/O; estimate embodied vs operational emissions.
909+
- Governance: budgets/quotas per team; dashboards and alerts on kgCO₂e (kilograms of CO₂ equivalent) per service.
910+
- Tooling and standards:
911+
- Data sources: grid carbon APIs (forecast + realtime); sustainability calculators/dashboards.
912+
- Frameworks: GHG (Greenhouse Gas) Protocol scopes 1–3; ISO (International Organization for Standardization) 14064; disclosures via CDP (Carbon Disclosure Project); internal SLOs (Service Level Objectives) covering energy use and SKU selection.
913+
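
Time shifting for a deferrable job can be sketched as a sliding-window search over an hourly carbon-intensity forecast. This is a minimal example under stated assumptions: the forecast is a plain list of gCO₂/kWh values (one per hour, starting now), and `best_start_hour` is a hypothetical helper, not a call into any carbon API.

```python
from typing import Sequence


def best_start_hour(forecast: Sequence[float], duration_h: int, deadline_h: int) -> int:
    """Return the start offset (hours from now) with the lowest total forecast intensity.

    forecast[i] is the forecast gCO2/kWh for hour i. The job runs for
    duration_h consecutive hours and must finish within deadline_h hours.
    Ties go to the earliest start.
    """
    latest_start = min(deadline_h, len(forecast)) - duration_h
    if duration_h <= 0 or latest_start < 0:
        raise ValueError("job does not fit inside the forecast/deadline window")
    # Slide a fixed-size window across the forecast, updating the sum in O(1).
    window_sum = sum(forecast[:duration_h])
    best_start, best_sum = 0, window_sum
    for start in range(1, latest_start + 1):
        window_sum += forecast[start + duration_h - 1] - forecast[start - 1]
        if window_sum < best_sum:
            best_start, best_sum = start, window_sum
    return best_start
```

For the forecast `[400, 380, 300, 200, 180, 220, 350, 420]`, a 2-hour job with an 8-hour deadline starts at hour 3 (hours 3 and 4 total 380). Tightening the deadline to 4 hours moves the start to hour 2, the cleanest window that still finishes in time.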
881914
</details>
882915

883916
### Energy & Sustainability Milestones (1992-present)
