Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
292 changes: 292 additions & 0 deletions docs/developer/proposal/EP_004-ACPI-support.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,292 @@
# EP-004: ACPI Support for Platform Power/Energy Metrics

**Status**: Draft
**Author**: Ray Huang
**Created**: 2025-08-28
**Last Updated**: 2025-08-28

## Summary

The proposal aims to add ACPI support for collecting power metrics in Kepler, specifically for nodes that do not have Redfish capabilities.
This enhancement will allow users to gather platform power metrics from a wider range of hardware, improving the flexibility of Kepler in diverse hardware environments.

## Problem Statement

Not all nodes have Redfish for getting platform power metrics, so this proposal aims to add another option to Kepler that allows users to collect power metrics with ACPI that do not support Redfish.

### Current Limitations

1. Nodes without Redfish support cannot get platform power metrics with Kepler.
2. RAPL may not be available on all hardware, limiting the ability to monitor power metrics.
3. Redfish support is unable to provide platform energy metrics.

## Goals

- **Primary**: Add ACPI power meter monitoring option to Kepler.
- **Seamless Integration**: Integrate with existing Kepler architecture and service patterns
- **Standard Metrics**: Provide platform power metrics via Prometheus following Kepler conventions
- **Multi-Environment Support**: Support Kubernetes and standalone deployments

## Non-Goals

- Replace Redfish support. (Just complementary)
- Support all ACPI features such as setting power limits. (focus on reading power_meter)
- Integration with virtualized environments (e.g., VMs without ACPI pass-through).
- Backward compatibility with legacy ACPI versions (<4.0) or non-Linux systems.

## Requirements

### Functional Requirements

- Use [golang sysfs package](https://github.com/prometheus/procfs) which is used by RAPL to read ACPI power_meter so that no additional dependencies.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what path(s) will be read for ACPI power by procfs? is reading /sys/class/hwmon supported in procfs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally thought reading /sys/bus/acpi/drivers/power_meter/ACPI000D:XX/power[1-*]_average.
But I am not sure if hwmon path and this path which is much common in different OS, this path should be the one that most OS support so that less issues will occur when OS changes. I am not sure which one is better.

I found that golang sysfs module doesn't support ACPI reading both the path I mentioned and hwmon path. I will change using sysfs module into reading file directly.

Thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what permissions are needed to read acpi power?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Ubuntu 24.04 host I accessed before, the power[1-*]_average are readable for all users. So we only need to mount into container, and this has already done because of RAPL (mount host sysfs). There are no additional capabilities needed for reading.

- Generate `kepler_platform_watts{source="acpi"}` and
`kepler_platform_joules_total{source="acpi"}` metrics
- Follow Kepler's configuration patterns and coding conventions

### Non-Functional Requirements

- **Performance**: Try to minimize overhead during metric collection.
- **Reliability**: Handle module loading failures and sysfs read errors without crashing Kepler.
- **Security**: Same as RAPL part. Only read access for host's sysfs.
- **Maintainability**: Modular code structure.
- **Testability**: Mock sysfs for unit/integration tests.

## Proposed Solution

### High-Level Architecture

Add ACPI option which is an alternative of Redfish to Kepler.
ACPI can collect platform power metrics from reading sysfs. The platform collector is able to reuse the one that Redfish uses if merged.

```text
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: mermaid (like in MSR proposal) is a better way to draw this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fix this, but I think I need to open another PR after all things are discussed and checked. Otherwise, the merge commit for resolving conflict will also be included.

┌─────────────────┐ ┌───────────────────┐
│ CPU Power │ │ Platform Power │
│ (RAPL) │ │ (Redfish or ACPI) │
└─────────────────┘ └───────────────────┘
│ │
▼ ▼
┌─────────────────┐ ┌────────────────────┐
│ Power Monitor │ │ Power Monitor │
│ (Attribution) │ │ (For ACPI, │
│ │ │ no attribution) │
└─────────────────┘ └────────────────────┘
│ │
▼ ▼
┌─────────────────┐ ┌────────────────────┐
│ Power Collector │ │ Platform Collector │
│ │ │ │
└─────────────────┘ └────────────────────┘
└──────────┬─────────────┘
┌──────────────────┐
│ Prometheus │
│ Exporter │
└──────────────────┘
```

### Aggregate power metrics

According to [the ACPI spec for power meter](https://docs.kernel.org/hwmon/acpi_power_meter.html#special-features), there may be more than one power meter available per platform (`/sys/bus/acpi/drivers/power_meter/ACPI000D:XX/power[1-*]_average`).
To acquire power metrics for whole platform, I will aggregate metrics from all available power meters.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aggregate in kepler? with sum or average ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original idea is using sum to add them together, but I found that there are no docs mentioning that multiple power meters here. I think this requires more discussion before implementing.


## Detailed Design

### Package Structure

```text
cmd/
├── kepler/
│ └── main.go # Add ACPI power monitor when enabled
config/
│ └── config.go # Add config for ACPI
internal/
├── device/
│ └── platform/
│ └── acpi/
| ├── testdata/
| | └── sys/bus/acpi/drivers/power_meter/...
│ ├── service.go # Power monitor for ACPI implementation
│ ├── acpi_power_meter_test.go
│ └── acpi_power_meter.go # Implements power_meter interface which read sysfs to get metrics
└── exporter/prometheus/collector/
└── platform_collector.go # Platform power metrics collector
```

### API/Interface Changes

[Describe any changes to public APIs, service interfaces, or data structures. Show code snippets for new or modified interfaces.]

```go
// acpi_power_meter.go - Implements powerMeter interface
type acpiPowerMeter struct {
logger *slog.Logger
devicePath string
}

// Service for ACPI power meter
type Service struct {
logger *slog.Logger
powerMeter acpiPowerMeter
}
```

## Configuration

### Main Configuration Changes

`ACPI.Enabled` is False by default.

```go
type Platform struct {
Redfish Redfish `yaml:"redfish"`
ACPI ACPI `yaml:"acpi"`
}

type ACPI struct {
Enabled *bool `yaml:"enabled"`
}
```

**CLI flags**

```
--platform.acpi.enabled=true # Enable ACPI monitoring
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that for the actual feature implementation, lets introduce this as an experimental feature, get feedback and then promote it as a supported feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, what should I do for an experimental feature? Change --platform.acpi.enabled=true into --experimental.platform.acpi.enabled=true?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats right .. May be for proposals, we should add along with status: Implemented, a field for maturity: Experimental|Stable .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn’t seem to be any document explaining how to handle experimental features. For example, how to configure them or when to promote them to stable. I think a guide is needed. Otherwise, it may be hard for people to decide which feature should be an experimental one.

```

### Configuration File (if applicable)

```yaml
# add lines to /etc/kepler/config.yaml
platform:
acpi:
enabled: false
```

### Security Considerations

This requires sysfs read privilege which is originally requested by RAPL part.

## Deployment Examples

### Kubernetes Environment

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kepler
spec:
template:
spec:
containers:
- name: kepler
args:
- --kube.enable=true
- --kube.node-name=$(NODE_NAME)
- --platform.acpi.enabled=true
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
```

### Standalone Deployment

```bash
# Run Kepler with ACPI support
./kepler --platform.acpi.enabled=true
```

## Testing Strategy

### Test Coverage

- **Unit Tests**:
- Verify acpiPowerMeter correctly reads and parses a single power1_average file.
- Test aggregation of multiple power meters (power1_average, power2_average) into a single metric.
- Simulate sysfs file read failures and verify error handling.
- Test calculation of `kepler_platform_joules_total` by multiplying wattage with elapsed time.
- **Integration Tests**:
- Validate that enabling `ACPI.Enabled=true` initializes the ACPI power meter service.
- Ensure metrics are correctly labeled with `source="acpi"` and exposed via the Prometheus exporter.
- **E2E Tests**:
- Deploy Kepler in a Kubernetes cluster and verify metrics are available in Prometheus.
- Test a node without ACPI support and ensure Kepler falls back to other available collectors (e.g., RAPL).

### Test Infrastructure

- Testdata Structure: Mimic real sysfs structure under `internal/device/platform/acpi/testdata/`
- Fake multiple power meters (`ACPI000D:xx`) or multiple measurements (`power*_average`) for testing
- e.g., `testdata/sys/bus/acpi/drivers/power_meter/ACPI000D:00/power1_average`, with sample values for power readings (e.g., 450500000 microwatts for 450.5 watts).
Copy link
Collaborator

@sthaha sthaha Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can there be a different unit as well? Eg. milliwatts instead of micro? We have had cases in the past (with old kepler) where the power reported was abnormally high ..
See: #1774 -> https://github.com/sustainable-computing-io/kepler-metal-ci/blob/main/docs/train-validate-e2e/2024-09-09_22-41-38/AbsPower/BPFOnly/ExponentialRegressionTrainer/report-v0.7.11-212-gac5ee8b8.md#platform---idle

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


## Migration and Compatibility

### Backward Compatibility
No conflict with existing RAPL support.

### Migration Path

1. **Phase 1**: Update Kepler to the new version with ACPI support.
2. **Phase 2**: Configure `platform.acpi.enabled=true` in `/etc/kepler/config.yaml` or via CLI.
3. **Phase 3**: Deploy updated DaemonSet or restart standalone Kepler.

### Rollback Strategy

Set `ACPI.Enabled` to false and restart the Kepler service.

## Metrics Output

ACPI provides native power **average** metrics, so we can expose `kepler_platform_joules_total` by multiplying `power[1-*]_average` with `power[1-*]_average_interval`.

```prometheus
# Platform power metrics with new source
kepler_platform_watts{source="acpi",node_name="worker-1"} 450.5
kepler_platform_joules_total{source="acpi",node_name="worker-1"} 123456.789
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
kepler_platform_joules_total{source="acpi",node_name="worker-1"} 123456.789

As discussed, we can't support energy joules total just like redfish from instantaneous power. In this case the 'average_power' is only with in the average_interval (which seems like a rolling window). I don't think we should read the power over the average_interval and integrate it.

Hence, lets not support energy computation from power since it is incorrect (based on my reading of the documentation: https://www.kernel.org/doc/html/latest/hwmon/acpi_power_meter.html

Incidentally I found https://github.com/torvalds/linux/blob/master/Documentation/hwmon/sysfs-interface.rst#energy but ACPI doesn't seem to implement it. Lets support this feature if and when kernel support it as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I am going to implement is reading the power[1-*]_average every power[1-*]_average_interval so that every reading will occur in the rolling window. I don't think the result will be incorrect. But the overhead might be higher depends on different intervals.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we going to validate that the result is valid? IMHO, we should avoid providing a feature that isn't supported by the driver and can't be validated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I got it.


# Existing CPU power metrics (unchanged)
kepler_node_cpu_watts{zone="package",node_name="worker-1"} 125.2
kepler_node_cpu_joules_total{zone="package",node_name="worker-1"} 89234.567
```

## Implementation Plan

### Phase 1: Foundation

- Implement ACPI-related configuration and validation
- Implement ACPI power meter service structure

### Phase 2: Core Functionality

- Read and parse ACPI power meter data
- Prometheus exporter with ACPI source

### Phase 3: Testing and Documentation

- Kubernetes deployment testing
- ACPI-related testing
- User documentation and deployment guides

## Risks and Mitigations

### Technical risks

- **Several measurements under one power meter**: Aggregate `power*_average` to the power meter reading value.
- **Several ACPI power meters detected**: Implement aggregation logic for to combine readings from multiple ACPI power meters (`ACPI000D:xx`).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to take another look at this. We need to ensure that there is no hierarchy/topology in these directories otherwise we run the risk of double counting .. E.g.

E.g. what if ..

  • Multiple ACPI000D:XX devices detected
  • If meter A measures total platform power
  • And meter B measures a subset (CPU power)
  • Then A + B would be the wrong information to provide - right?

We need ...

  1. Real world data /evidence of Multiple Meters .. and the structure

  2. Do you know if we can make use of https://github.com/torvalds/linux/blob/master/drivers/hwmon/acpi_power_meter.c#L254 ?

Copy link
Contributor Author

@ExplorerRay ExplorerRay Aug 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out.

If there are multiple ACPI000D devices, we can use measures/ directory to check which device is measured by this power meter. ref

About multiple measurements power[1-*]_xxx part, I haven't found a reference that telling how to determine how power1, power2 are generated and what they measure.

The host I can access currently only has ACPI000D:00 and power1_xxx in it, so I can only refer to docs or code to know how multiple power meters or measurements work.


### Operational risks

- **No kernel module or device available**: Show error message and fallback to operation without ACPI enabled.

## Success Metrics

- **Functional Metric**: All nodes with `ACPI000D:xx` available will be able to expose platform power metrics.
- **Adoption Metric**: Documentation enables successful deployment by operations teams.

## Open Questions

1. **Multiple sources for same metrics**: Redfish and ACPI can both provide platform power metrics. Should I prioritize one over the other, or can Prometheus handle multiple sources automatically?

2. **Platform power metrics**: This question is related to the first one. ACPI reports the power average instead of instant power like Redfish. Should we use the same metrics name in Prometheus? Or should I use something like `kepler_platform_average_watts` for ACPI?

3. **About the E2E test**: I think the Github runner doesn't expose ACPI metrics. How do we do the testing in that case?
3 changes: 2 additions & 1 deletion docs/developer/proposal/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,8 @@ This directory contains Enhancement Proposals (EPs) for major features and chang

| ID | Title | Status | Author | Created |
|----|-------|--------|--------|---------|
| [EP-000](EP_TEMPLATE.md) | Enhancement Proposal Template | Accepted |Sunil Thaha | 2025-01-18 |
| [EP-000](EP_TEMPLATE.md) | Enhancement Proposal Template | Accepted | Sunil Thaha | 2025-01-18 |
| [EP-004](EP_004-ACPI-support.md) | ACPI Support for Platform Power/Energy Metrics | Draft | Ray Huang | 2025-08-28 |

## Proposal Status

Expand Down