Commit 8cc9fd3

Merge pull request #2237 from sthaha/doc-power-computation
docs: add comprehensive power attribution guide
2 parents b0c5960 + 3296f3c commit 8cc9fd3

4 files changed, +343 -0 lines changed


docs/developer/index.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
# Developer Documentation

This section contains documentation for developers working on Kepler.

## Architecture and Design

- [Power Attribution Guide](power-attribution-guide.md) - Comprehensive guide on how Kepler measures and attributes power consumption to processes, containers, VMs, and pods

## Development Workflow

- [Pre-commit Setup](pre-commit.md) - Setting up pre-commit hooks for code quality

## Contributing

For general contribution guidelines, please refer to the main project documentation.
Lines changed: 294 additions & 0 deletions
@@ -0,0 +1,294 @@
# Kepler Power Attribution Guide

This guide explains how Kepler measures and attributes power consumption to processes, containers, VMs, and pods.

## How Power Measurement Works

Kepler's power attribution follows a simple but effective approach: measure total system energy consumption from hardware, then distribute it fairly to individual workloads based on their resource usage.

### The Big Picture

Think of your computer as an apartment building with a single electricity meter. The meter shows total power consumption (e.g., 40W), but you need to know how much each apartment (process) is using. Kepler solves this by:

1. **Reading the main meter** - Hardware sensors (Intel RAPL) provide total energy consumption
2. **Understanding system activity** - Monitor CPU usage to determine how "busy" the system is
3. **Splitting costs fairly** - Divide energy between "active work" and "idle baseline"
4. **Allocating to tenants** - Give each process power proportional to its CPU usage

### Core Insight: Active vs Idle Power

The key insight is that system power has two components:

- **Active Power**: Power drawn while doing actual work (running processes)
- **Idle Power**: Baseline power for keeping the system running (even when idle)

If your system uses 25% of CPU capacity, Kepler treats 25% of total power as "active" and the remaining 75% as "idle."
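
The split described above can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not Kepler's code: `splitPower` and its parameters are hypothetical names, and power is expressed in plain watts for readability.

```go
package main

import "fmt"

// splitPower divides total power into an active and an idle part using the
// node CPU usage ratio (0.0 to 1.0). Hypothetical helper for illustration.
func splitPower(total, cpuUsageRatio float64) (active, idle float64) {
	active = total * cpuUsageRatio
	idle = total - active
	return active, idle
}

func main() {
	// 40 W measured while the node runs at 25% CPU usage.
	active, idle := splitPower(40, 0.25)
	fmt.Printf("active=%.0fW idle=%.0fW\n", active, idle) // active=10W idle=30W
}
```

Computing idle power as the remainder, rather than as `total × (1 - ratio)`, keeps the two parts summing exactly to the measured total.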

### Attribution Principle

Once Kepler knows the active power available, it distributes it proportionally:

```text
Process Power = (Process CPU Time / Total CPU Time) × Active Power
```

This ensures that processes consuming more CPU get more power attribution, while the total never exceeds what hardware actually measured.
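
As a rough sketch of the formula, the following Go function computes one process's share. The name `attributePower` and the use of seconds and watts are assumptions made for this example; Kepler's own implementation is described later in this guide.

```go
package main

import "fmt"

// attributePower returns the share of activePower (watts) owed to a process
// that consumed procCPUTime out of totalCPUTime (both in seconds).
// Hypothetical helper for illustration; not part of Kepler's API.
func attributePower(procCPUTime, totalCPUTime, activePower float64) float64 {
	if totalCPUTime <= 0 {
		return 0 // no CPU activity in the interval, nothing to attribute
	}
	return procCPUTime / totalCPUTime * activePower
}

func main() {
	// Two processes share 10 W of active power over 4 s of total CPU time.
	fmt.Printf("%.1f W\n", attributePower(3, 4, 10)) // 7.5 W
	fmt.Printf("%.1f W\n", attributePower(1, 4, 10)) // 2.5 W
}
```

The guard against a zero total avoids a division by zero during intervals in which no process used the CPU.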

## Overview

Kepler uses a hierarchical power attribution system that starts with hardware energy measurements and distributes power proportionally based on CPU utilization. The system ensures energy conservation while providing fair attribution across workloads.

![Power Attribution Diagram](assets/power-attribution.png)

*Figure 1: Power attribution flow showing how 40W total power is split between active (10W) and idle (30W) components, then distributed to workloads based on CPU usage ratios.*

### Real-World Example

Using the diagram above:

- **Hardware reports**: 40W total system power
- **System analysis**: 25% CPU usage ratio
- **Power split**: 40W × 25% = 10W active, 30W idle
- **VM attribution**: VM uses 100% of active CPU → gets all 10W active power
- **Container breakdown**: Within the VM, containers get proportional shares of the 10W

## Architecture Components

### 1. Hardware Energy Reading (`internal/device/`)

The device layer provides the foundation for all power measurements.

#### Energy Zones

- **Package**: CPU package-level energy consumption
- **Core**: Individual CPU core energy
- **DRAM**: Memory subsystem energy
- **Uncore**: Integrated graphics and other uncore components
- **PSys**: Platform-level energy (most comprehensive when available)

#### Key Interfaces

- `EnergyZone`: Interface for reading energy from hardware zones
- `CPUPowerMeter`: Main interface for accessing energy zones
- `AggregatedZone`: Combines multiple zones of the same type

#### Energy Types

- **Energy**: Measured in microjoules (μJ) as cumulative counters
- **Power**: Calculated as a rate in microwatts (μW) using `Power = ΔEnergy / Δtime`
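
To make the units concrete, here is a small sketch of the energy-to-power conversion. The `Energy` and `Power` type definitions and the `powerFromDelta` helper are assumptions written for this example; they mirror the units listed above but are not copied from Kepler.

```go
package main

import "fmt"

// Energy is a cumulative counter in microjoules (μJ).
// Power is a rate in microwatts (μW).
// Both definitions are assumptions for this sketch.
type Energy uint64
type Power float64

// powerFromDelta converts an energy delta over an interval (in seconds)
// into average power: μJ / s = μW.
func powerFromDelta(delta Energy, seconds float64) Power {
	if seconds <= 0 {
		return 0 // no elapsed time, power is undefined; report zero
	}
	return Power(float64(delta) / seconds)
}

func main() {
	// 200 J (200,000,000 μJ) consumed over 5 s averages 40 W.
	p := powerFromDelta(200_000_000, 5)
	fmt.Printf("%.0f μW (%.0f W)\n", float64(p), float64(p)/1e6)
}
```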

#### Wraparound Handling

Hardware energy counters have a finite maximum value and wrap around to zero once they reach it.
Kepler handles this in `calculateEnergyDelta()`:

```go
func calculateEnergyDelta(current, previous, maxJoules Energy) Energy {
	// Normal case: the counter only moved forward
	if current >= previous {
		return current - previous
	}
	// Handle counter wraparound: distance to the maximum plus the new reading
	if maxJoules > 0 {
		return (maxJoules - previous) + current
	}
	return 0 // Unable to calculate delta
}
```
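
A short, self-contained run of the same function shows the three cases. The `Energy` type definition and the counter maximum of 1000 are assumptions chosen to keep the arithmetic easy to follow.

```go
package main

import "fmt"

// Energy is assumed to be an unsigned counter type for this sketch.
type Energy uint64

// calculateEnergyDelta is reproduced from the snippet above.
func calculateEnergyDelta(current, previous, maxJoules Energy) Energy {
	if current >= previous {
		return current - previous
	}
	if maxJoules > 0 {
		return (maxJoules - previous) + current
	}
	return 0
}

func main() {
	const maxEnergy Energy = 1000 // illustrative maximum, not a real RAPL value

	fmt.Println(calculateEnergyDelta(700, 400, maxEnergy)) // 300: no wraparound
	fmt.Println(calculateEnergyDelta(50, 950, maxEnergy))  // 100: wrapped past the maximum
	fmt.Println(calculateEnergyDelta(50, 950, 0))          // 0: maximum unknown
}
```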

### 2. Node-Level Power Calculation (`internal/monitor/node.go`)

The node calculation is the first step in power attribution, splitting total hardware energy into active and idle components.

#### CPU Usage Calculation

```go
nodeCPUTimeDelta := pm.resources.Node().ProcessTotalCPUTimeDelta
nodeCPUUsageRatio := pm.resources.Node().CPUUsageRatio
```

#### Energy Split Algorithm

For each energy zone, Kepler calculates:

```go
// Energy consumed since the previous reading, with wraparound handled
deltaEnergy := calculateEnergyDelta(absEnergy, prevZone.EnergyTotal, zone.MaxEnergy())
// Active share scales with CPU usage; the remainder is idle
activeEnergy = Energy(float64(deltaEnergy) * nodeCPUUsageRatio)
idleEnergy := deltaEnergy - activeEnergy
```

**Key Principle**: Active energy represents the portion consumed by CPU-intensive work, while idle energy represents baseline system power consumption.

#### Power Calculation

```go
// Average power over the interval: energy delta divided by elapsed time
powerF64 := float64(deltaEnergy) / float64(timeDiff)
power = Power(powerF64)
activePower = Power(powerF64 * nodeCPUUsageRatio)
idlePower = power - activePower
```

### 3. Process Power Attribution (`internal/monitor/process.go`)

Individual processes receive power proportional to their CPU time usage relative to total system CPU time.

#### Attribution Formula

For each running process:

```go
// Fraction of the node's CPU time consumed by this process
cpuTimeRatio := proc.CPUTimeDelta / nodeCPUTimeDelta
activeEnergy := Energy(cpuTimeRatio * float64(nodeZoneUsage.activeEnergy))
```

#### Power Assignment

```go
process.Zones[zone] = Usage{
	Power:       Power(cpuTimeRatio * nodeZoneUsage.ActivePower.MicroWatts()),
	EnergyTotal: absoluteEnergy,
}
```

#### Cumulative Energy Tracking

Process energy accumulates over time:

```go
// Start from this interval's energy, then add the running total
// from the previous snapshot if the process and zone existed there.
absoluteEnergy := activeEnergy
if prev, exists := prev.Processes[pid]; exists {
	if prevUsage, hasZone := prev.Zones[zone]; hasZone {
		absoluteEnergy += prevUsage.EnergyTotal
	}
}
```

## Attribution Flow Example

Using the diagram as reference, here's how 40W total power gets attributed:

### Step 1: Hardware Measurement

- RAPL sensors report total energy consumption for the measurement interval
- Convert to power: `40W total power`

### Step 2: Node CPU Usage Analysis

- System reports 25% CPU usage ratio
- Split power: `40W × 25% = 10W active`, `40W - 10W = 30W idle`

### Step 3: Process Attribution

- VM process uses 100% of active CPU time
- VM gets: `10W × (100% CPU usage) = 10W`
- Container processes within VM get proportional shares of the 10W

### Step 4: Workload Aggregation

- **Container power** = sum of constituent process power
- **VM power** = sum of all processes in the VM
- **Pod power** = sum of container power (in Kubernetes)
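
The aggregation in Step 4 can be sketched as a sum over processes grouped by their owning container. The `process` struct, its fields, and `aggregateByContainer` are hypothetical names invented for this example and do not reflect Kepler's data model.

```go
package main

import "fmt"

// process is a hypothetical record of one process's attributed power.
type process struct {
	pid         int
	containerID string // empty when the process is not in a container
	powerWatts  float64
}

// aggregateByContainer sums process power per container ID.
func aggregateByContainer(procs []process) map[string]float64 {
	totals := make(map[string]float64)
	for _, p := range procs {
		if p.containerID == "" {
			continue // host process, belongs to no container
		}
		totals[p.containerID] += p.powerWatts
	}
	return totals
}

func main() {
	procs := []process{
		{pid: 101, containerID: "c1", powerWatts: 2.5},
		{pid: 102, containerID: "c1", powerWatts: 1.5},
		{pid: 103, containerID: "c2", powerWatts: 4},
		{pid: 104, powerWatts: 2}, // bare host process
	}
	totals := aggregateByContainer(procs)
	fmt.Println(totals["c1"], totals["c2"]) // 4 4
}
```

Pod power would follow the same pattern one level up, summing container totals grouped by pod.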

## Key Principles

### 1. Energy Conservation

The total attributed power equals the measured hardware power, provided the per-process CPU time deltas sum to the node's total CPU time delta:

```text
Σ(Process Power) + Idle Power = Total Hardware Power
```
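
One way to check this identity on a set of readings is sketched below. The function `conserved` and its tolerance parameter are assumptions for this example; the tolerance absorbs floating-point rounding rather than any behavior specific to Kepler.

```go
package main

import (
	"fmt"
	"math"
)

// conserved reports whether per-process power plus idle power adds up to
// the measured total, within tol. Hypothetical helper for illustration.
func conserved(processPower []float64, idle, total, tol float64) bool {
	sum := idle
	for _, p := range processPower {
		sum += p
	}
	return math.Abs(sum-total) <= tol
}

func main() {
	// 10 W of active power split 6/3/1, plus 30 W idle, against 40 W measured.
	fmt.Println(conserved([]float64{6, 3, 1}, 30, 40, 1e-9)) // true
	// Dropping one process leaves 1 W unaccounted for.
	fmt.Println(conserved([]float64{6, 3}, 30, 40, 1e-9)) // false
}
```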

### 2. Proportional Attribution

Power distribution is strictly proportional to CPU time usage:

```text
Process Power = (Process CPU Time / Total CPU Time) × Active Power
```

### 3. Hierarchical Aggregation

Higher-level workloads inherit power from their constituent processes:

- **Pods** = sum of container power
- **Containers** = sum of process power
- **VMs** = sum of process power

### 4. Idle Power Handling

Idle power represents baseline system consumption and is tracked separately but not attributed to individual workloads.

## Implementation Details

### Thread Safety

- **Device Layer**: Not required to be thread-safe (single monitor goroutine)
- **Monitor Layer**: All public methods except `Init()` must be thread-safe
- **Singleflight Pattern**: Prevents redundant power calculations during concurrent requests

### Data Freshness

- A configurable staleness threshold caps the maximum age of cached power data
- Atomic snapshots provide consistent power readings across all workloads
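
A staleness check of this kind can be sketched as a comparison between a snapshot's timestamp and the current time. The `snapshot` type, the `isFresh` method, and the 5-second threshold are all assumptions for illustration; they are not Kepler's types or defaults.

```go
package main

import (
	"fmt"
	"time"
)

// snapshot is a hypothetical stand-in for a cached set of power readings.
type snapshot struct {
	takenAt time.Time // when the readings were computed
}

// isFresh reports whether the snapshot is no older than maxStaleness at time now.
func (s snapshot) isFresh(now time.Time, maxStaleness time.Duration) bool {
	return now.Sub(s.takenAt) <= maxStaleness
}

func main() {
	taken := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	s := snapshot{takenAt: taken}
	threshold := 5 * time.Second // illustrative value, not a Kepler default

	fmt.Println(s.isFresh(taken.Add(2*time.Second), threshold)) // true
	fmt.Println(s.isFresh(taken.Add(6*time.Second), threshold)) // false
}
```

Passing `now` as a parameter instead of calling `time.Now()` inside the method keeps the check deterministic and easy to test.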

### Terminated Process Handling

- Terminated processes are tracked in a separate collection
- Power attribution continues until the next export cycle
- Priority-based retention manages memory usage

### Error Handling

- Individual zone read failures don't stop attribution
- Graceful degradation when hardware sensors are unavailable
- Comprehensive logging for debugging attribution issues

## Configuration

### Key Settings

- **Collection Interval**: How frequently to read hardware sensors
- **Staleness Threshold**: Maximum age of cached power data
- **Zone Filtering**: Which RAPL zones to use for attribution
- **Fake Meter**: Development mode for use when hardware sensors are unavailable

### Development Mode

```bash
# Use fake CPU meter for development
sudo ./bin/kepler --dev.fake-cpu-meter.enabled --config hack/config.yaml
```

## Monitoring and Debugging

### Metrics Access

- **Local**: `http://localhost:28282/metrics`
- **Compose**: `http://localhost:28283/metrics`
- **Grafana**: `http://localhost:23000`

### Debug Options

```bash
# Enable debug logging
--log.level=debug

# Use stdout exporter for immediate inspection
--exporter.stdout

# Enable performance profiling
--debug.pprof
```

### Key Metrics

- `kepler_node_cpu_watts{}`: Total node power consumption
- `kepler_process_cpu_watts{}`: Individual process power
- `kepler_container_cpu_watts{}`: Container-level aggregation
- `kepler_vm_cpu_watts{}`: Virtual machine power attribution

## Conclusion

Kepler's power attribution system provides accurate, proportional distribution of hardware energy consumption to individual workloads. By using CPU utilization as the primary attribution factor and maintaining strict energy conservation, Kepler enables fine-grained energy accounting for modern containerized and virtualized environments.

The implementation balances accuracy with performance, providing thread-safe concurrent access while minimizing the overhead of continuous power monitoring.

docs/index.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
# Kepler Documentation

Welcome to the Kepler documentation. Kepler is a Prometheus exporter that measures energy consumption at the container, pod, VM, process, and node levels by reading hardware sensors and attributing power to workloads based on CPU utilization.

## Documentation Sections

### [User Documentation](user/)

- [Installation Guide](user/installation.md) - How to install and deploy Kepler

### [Developer Documentation](developer/)

- [Power Attribution Guide](developer/power-attribution-guide.md) - How Kepler measures and attributes power consumption
- [Pre-commit Setup](developer/pre-commit.md) - Development workflow setup

### [Configuration](configuration/)

- [Configuration Guide](configuration/configuration.md) - Configuring Kepler for your environment

### [Design Documents](design/)

- [Release Process](design/release.md) - Kepler release process and lifecycle

### [Metrics](metrics/)

- [Metrics Documentation](metrics/metrics.md) - Available Prometheus metrics exported by Kepler

## Quick Start

For a quick start, see the [Installation Guide](user/installation.md) and the main project README.

## Contributing

For information on contributing to Kepler, please see the [Developer Documentation](developer/) section.
