Energy-Aware Scalability Prediction Model

Overview

The Power Prediction Model is a machine learning-based system that predicts power consumption for FFmpeg transcoding workloads. This document covers two predictors:

PowerPredictor (v0.1): Simple univariate predictor based on stream count
MultivariatePredictor (v0.2): Advanced multivariate predictor with ensemble models and confidence intervals

Key Features:

Automatic model selection (linear vs polynomial regression)
Ensemble models (Random Forest, Gradient Boosting)
Robust scenario name parsing
Handles missing data gracefully
Provides prediction confidence through R² scores and confidence intervals
Hardware-aware model storage and versioning
Exports predictions to CSV and Prometheus metrics

MultivariatePredictor (v0.2) - Advanced Features

Architecture

The MultivariatePredictor extends the basic PowerPredictor with:

Multiple Input Features:

stream_count: Number of concurrent transcoding streams
bitrate_mbps: Bitrate in megabits per second
total_pixels: Sum of width × height × fps across all outputs
cpu_usage_pct: Mean CPU usage percentage during scenario
encoder_type: One-hot encoded (x264, NVENC, etc.)
hardware_cpu_model: Hashed or one-hot encoded CPU model
container_cpu_pct: Docker container CPU overhead percentage
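
As an illustration of how the total_pixels feature could be derived from a set of output renditions, here is a minimal sketch. The helper name `total_pixels` and the shape of the `outputs` list are assumptions made for this example; they are not taken from the advisor package.

```python
from typing import Dict, Iterable


def total_pixels(outputs: Iterable[Dict[str, int]]) -> int:
    """Sum width * height * fps over all outputs (pixels per second)."""
    return sum(o["width"] * o["height"] * o["fps"] for o in outputs)


# Hypothetical output ladder: one 1080p30 and one 720p30 rendition.
outputs = [
    {"width": 1920, "height": 1080, "fps": 30},
    {"width": 1280, "height": 720, "fps": 30},
]
pixels = total_pixels(outputs)
print(f"total_pixels = {pixels}")
```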

Ensemble of Regression Models:

Linear Regression: Baseline model
Polynomial Regression (degree=2,3): Non-linear relationships
RandomForestRegressor: Handles complex interactions
GradientBoostingRegressor: Often the most accurate model on tabular data (with XGBoost fallback)

Prediction Targets:

mean_power_watts: Mean power consumption
total_energy_joules: Total energy consumed
efficiency_score: Direct efficiency prediction

Confidence Intervals:

Bootstrapped prediction intervals
Configurable confidence level (default: 95%)
Shows prediction uncertainty

Hardware Awareness:

Per-hardware model storage
Automatic hardware fingerprinting
Fallback to universal model if hardware unknown
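
The per-hardware lookup with a universal fallback might look roughly like the sketch below. The function names, the sanitising rule, and the `universal` directory name are all assumptions for illustration; only the idea of one model directory per CPU model comes from this document.

```python
import re
from pathlib import Path
from typing import Optional


def hardware_key(cpu_model: str) -> str:
    """Turn a CPU model string into a filesystem-safe directory name."""
    return re.sub(r"[^A-Za-z0-9]+", "_", cpu_model).strip("_")


def resolve_model_dir(models_root: Path, cpu_model: Optional[str]) -> Path:
    """Prefer the per-hardware directory; otherwise use the universal one."""
    if cpu_model:
        candidate = models_root / hardware_key(cpu_model)
        if candidate.is_dir():
            return candidate
    # Hypothetical name for the hardware-independent fallback directory.
    return models_root / "universal"


print(hardware_key("Intel i7-9700K"))
```

A caller would pass the detected CPU model, or None when fingerprinting fails, and load the model file from the returned directory.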

Usage Examples

Basic Training and Prediction

from advisor import MultivariatePredictor

# Create predictor with ensemble models
predictor = MultivariatePredictor(
    models=['linear', 'poly2', 'rf', 'gbm'],
    confidence_level=0.95,
    n_bootstrap=100,
    cv_folds=5
)

# Train on scenario results
success = predictor.fit(scenarios, target='mean_power_watts')

if success:
    # Make prediction with confidence intervals
    prediction = predictor.predict({
        'stream_count': 6,
        'bitrate_mbps': 3.0,
        'total_pixels': 1920*1080*30*60,
        'cpu_usage_pct': 80.0,
        'encoder_type': 'x264',
        'hardware_cpu_model': 'Intel_i7_9700K',
        'container_cpu_pct': 7.0
    }, return_confidence=True)

    print(f"Predicted power: {prediction['mean']:.2f} W")
    print(f"Confidence interval: [{prediction['ci_low']:.2f}, {prediction['ci_high']:.2f}] W")
    print(f"Confidence width: {prediction['ci_width']:.2f} W")
    print(f"Model used: {prediction['model']}")

Model Information

info = predictor.get_model_info()
print(f"Trained: {info['trained']}")
print(f"Best model: {info['best_model']}")
print(f"R² score: {info['best_score']['r2']:.4f}")
print(f"RMSE: {info['best_score']['rmse']:.2f} W")
print(f"Training samples: {info['n_samples']}")
print(f"Features: {', '.join(info['feature_names'])}")

Save and Load Models

from pathlib import Path

# Save trained model
model_path = Path('advisor/models/Intel_i7_9700K/power_model_v1.pkl')
predictor.save(model_path)

# Load saved model
loaded_predictor = MultivariatePredictor.load(model_path)

Batch Predictions

# Predict for multiple configurations efficiently
features_list = [
    {'stream_count': 2, 'bitrate_mbps': 2.5},  # ... plus the remaining features
    {'stream_count': 4, 'bitrate_mbps': 2.5},  # ... plus the remaining features
    {'stream_count': 8, 'bitrate_mbps': 5.0},  # ... plus the remaining features
]

predictions = predictor.predict_batch(features_list, return_confidence=True)

for i, pred in enumerate(predictions):
    print(f"Config {i+1}: {pred['mean']:.2f} ± {pred['ci_width']/2:.2f} W")

CLI Integration

The multivariate predictor is integrated into analyze_results.py:

# Use multivariate predictor for analysis
python3 scripts/analyze_results.py --multivariate

# Generate predictions for specific stream counts
python3 scripts/analyze_results.py --multivariate --predict-future 1,2,4,8,12,16

# Use simple predictor (backward compatible)
python3 scripts/analyze_results.py --predict-future 1,2,4,8,12

Prometheus Metrics

The results_exporter automatically trains the multivariate predictor and exposes metrics:

# Predicted power consumption
results_scenario_predicted_power_watts{run_id="test_results_20231215_143022"}

# Predicted energy consumption
results_scenario_predicted_energy_joules{run_id="test_results_20231215_143022"}

# Confidence interval bounds
results_scenario_prediction_confidence_low{run_id="test_results_20231215_143022"}
results_scenario_prediction_confidence_high{run_id="test_results_20231215_143022"}

Grafana Dashboards

Two new dashboards are available:

Future Load Predictions (future-load-predictions.json)
- Measured vs Predicted power comparison
- Prediction confidence intervals
- Prediction accuracy gauge
- Confidence interval width
- Detailed prediction results table

Efficiency Forecasting (efficiency-forecasting.json)
- Energy efficiency scores by scenario
- Efficiency rankings table
- Top 5 most efficient configurations
- Efficiency score distribution

Model Selection

The predictor automatically selects the best model based on cross-validation R² scores:

Training 5 models on 12 samples...
  linear: R²=0.9234, RMSE=15.23
  poly2: R²=0.9567, RMSE=11.45
  poly3: R²=0.9601, RMSE=10.89
  rf: R²=0.9823, RMSE=7.34
  gbm: R²=0.9891, RMSE=5.67

Best model: gbm (R²=0.9891, RMSE=5.67)

Confidence Intervals

Prediction uncertainty is quantified using bootstrapped confidence intervals:

Training Phase: Store training data (X, y)
Bootstrap Resampling: Create N bootstrap samples (default: 100)
Model Training: Train model on each bootstrap sample
Prediction: Generate N predictions for the same input
Confidence Bounds: Calculate percentiles (e.g., 2.5% and 97.5% for 95% CI)
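
The five steps above can be sketched with the standard library alone. This is a simplified illustration, not the MultivariatePredictor implementation: it assumes a single feature (stream count), a closed-form linear fit in place of the ensemble, and nearest-rank percentiles. The function names are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b1 * x_bar, b1


def bootstrap_interval(xs, ys, x_new, n_bootstrap=100, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the prediction at x_new."""
    if len(set(xs)) < 2:
        raise ValueError("need at least two distinct x values")
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    while len(preds) < n_bootstrap:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        sample_x = [xs[i] for i in idx]
        if len(set(sample_x)) < 2:
            continue  # degenerate resample: a line cannot be fitted
        b0, b1 = fit_line(sample_x, [ys[i] for i in idx])
        preds.append(b0 + b1 * x_new)
    preds.sort()
    alpha = (1.0 - confidence) / 2.0
    low = preds[int(alpha * (n_bootstrap - 1))]  # nearest-rank percentile
    high = preds[round((1.0 - alpha) * (n_bootstrap - 1))]
    return low, high


# Illustrative measurements (stream count, mean watts).
streams = [1, 2, 4, 8]
watts = [45.0, 80.0, 150.0, 280.0]
ci_low, ci_high = bootstrap_interval(streams, watts, x_new=6)
print(f"95% CI at 6 streams: [{ci_low:.1f}, {ci_high:.1f}] W")
```

Fixing the seed keeps the interval reproducible between runs, which is convenient when comparing dashboards or test output.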

Example output:

Predicted Power: 213.7 W
95% Confidence Interval: [202.3, 225.1] W
Confidence Width: 22.8 W

Interpretation:

Narrow CI (< 10 W): High confidence, model is certain
Medium CI (10-30 W): Moderate confidence, some uncertainty
Wide CI (> 30 W): Low confidence, model is uncertain

PowerPredictor (v0.1) - Simple Univariate Model

The original PowerPredictor remains available for backward compatibility and simple use cases.

Mathematical Model

Linear Regression (< 6 unique stream counts)

Used when training data contains fewer than 6 unique stream count values. Provides a simple, stable model suitable for small datasets.

Formula:

Power(streams) = β₀ + β₁ × streams

Parameters:

β₀ (intercept): Baseline power consumption representing idle/overhead power
β₁ (coefficient): Incremental power per additional stream (watts per stream)
streams: Number of concurrent transcoding streams

Example Interpretation: If β₀ = 40W and β₁ = 15W/stream, then:

0 streams: 40W (baseline/idle)
4 streams: 40 + (15 × 4) = 100W
8 streams: 40 + (15 × 8) = 160W

Assumptions:

Linear scaling: Each additional stream adds constant power
No thermal throttling or frequency scaling effects
Consistent hardware behavior across workload range

Polynomial Regression (≥ 6 unique stream counts)

Used when training data contains 6 or more unique stream count values. Captures non-linear effects in power consumption.

Formula:

Power(streams) = β₀ + β₁ × streams + β₂ × streams²

Parameters:

β₀ (intercept): Baseline power consumption
β₁ (linear coefficient): Linear component of power scaling
β₂ (quadratic coefficient): Non-linear scaling effects
streams: Number of concurrent transcoding streams

What the Quadratic Term Captures:

Thermal Throttling: At high loads, CPUs may reduce frequency to manage heat
Cache Contention: More streams compete for L3 cache, reducing efficiency
Memory Bandwidth Saturation: DRAM bandwidth becomes bottleneck
CPU Frequency Scaling: Turbo boost behavior changes with core utilization
Power Delivery Limits: VRM (Voltage Regulator Module) constraints

Example Interpretation: If β₀ = 35W, β₁ = 18W/stream, β₂ = -0.5W/stream²:

2 streams: 35 + (18 × 2) + (-0.5 × 4) = 69W
4 streams: 35 + (18 × 4) + (-0.5 × 16) = 99W
8 streams: 35 + (18 × 8) + (-0.5 × 64) = 147W (reduced efficiency)

The negative β₂ indicates diminishing returns: each additional stream adds less power than a purely linear model would predict.
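
A short worked example makes the effect of the quadratic term concrete. Using the example coefficients, the marginal power dP/d(streams) = β₁ + 2β₂ × streams shrinks as load grows and reaches zero at streams = -β₁ / (2β₂). Beyond that point the fitted curve predicts falling power, which is one reason to treat extrapolation with care. The function names are illustrative only.

```python
B0, B1, B2 = 35.0, 18.0, -0.5  # example coefficients from the text


def power(streams: float) -> float:
    """Quadratic model: b0 + b1 * s + b2 * s^2."""
    return B0 + B1 * streams + B2 * streams ** 2


def marginal_power(streams: float) -> float:
    """Derivative of the model: watts added per extra stream near s."""
    return B1 + 2 * B2 * streams


turning_point = -B1 / (2 * B2)  # stream count where the fitted curve peaks

for s in (2, 4, 8):
    print(f"{s} streams: {power(s):.0f} W, marginal {marginal_power(s):.0f} W/stream")
print(f"Curve peaks at {turning_point:.0f} streams ({power(turning_point):.0f} W)")
```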

Data Requirements

Input Data Structure

The model expects scenario data from ResultsAnalyzer with the following structure:

scenarios = [
    {
        'name': '2 Streams @ 2500k',           # String with stream count
        'power': {
            'mean_watts': 80.0,                # Mean power during test
            'median_watts': 79.5,              # Not used by predictor
            'min_watts': 75.0,                 # Not used by predictor
            'max_watts': 85.0,                 # Not used by predictor
            'total_energy_joules': 4800.0      # Not used by predictor
        },
        'bitrate': '2500k',                    # Not used by predictor
        'resolution': '1280x720',              # Not used by predictor
        'fps': 30,                             # Not used by predictor
        'duration': 60.0                       # Not used by predictor
    },
    # ... more scenarios
]

Required Fields:

name: String containing stream count information
power.mean_watts: Float representing average power consumption

Optional Fields: All other fields are ignored by the predictor but used by other analysis components.
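
Before training, it can help to check each scenario for the two required fields. The helper below is a hypothetical sketch and not part of ResultsAnalyzer or PowerPredictor; it only encodes the requirements listed above.

```python
from typing import Any, Dict, List


def missing_required_fields(scenario: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent or of the wrong type."""
    missing = []
    name = scenario.get("name")
    if not isinstance(name, str) or not name.strip():
        missing.append("name")
    power = scenario.get("power")
    mean_watts = power.get("mean_watts") if isinstance(power, dict) else None
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(mean_watts, bool) or not isinstance(mean_watts, (int, float)):
        missing.append("power.mean_watts")
    return missing


ok = {"name": "2 Streams @ 2500k", "power": {"mean_watts": 80.0}}
bad = {"name": "4 Streams @ 2500k", "power": {}}
print(missing_required_fields(ok))
print(missing_required_fields(bad))
```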

Stream Count Inference

The model automatically extracts stream counts from scenario names using pattern matching.

Supported Patterns:

Pattern	Example	Extracted Count
`N stream(s)`	`"4 Streams @ 2500k"`	4
`N-stream`	`"8-stream test"`	8
`single stream`	`"Single Stream @ 1080p"`	1
Leading number	`"6 concurrent streams"`	6
Case insensitive	`"12 STREAMS Test"`	12

Non-Matchable Patterns:

"Baseline (Idle)" → None (no stream count)
"Multi Stream Test" → None (ambiguous count)
"High Quality Encode" → None (no stream count)

Scenarios without extractable stream counts are automatically filtered out during training.
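
The pattern table above could be implemented with a few regular expressions. The following is a sketch under that assumption; the actual `_infer_stream_count` method may use different patterns or ordering.

```python
import re
from typing import Optional

_COUNT_PATTERNS = [
    re.compile(r"(\d+)[\s-]*streams?\b", re.IGNORECASE),  # "4 Streams", "8-stream"
    re.compile(r"^\s*(\d+)\b"),                           # leading number
]
_SINGLE = re.compile(r"\bsingle\s+stream\b", re.IGNORECASE)


def infer_stream_count(name: str) -> Optional[int]:
    """Extract a stream count from a scenario name, or None if absent."""
    for pattern in _COUNT_PATTERNS:
        match = pattern.search(name)
        if match:
            return int(match.group(1))
    if _SINGLE.search(name):
        return 1
    return None


for example in ("4 Streams @ 2500k", "Single Stream @ 1080p", "Baseline (Idle)"):
    print(f"{example!r} -> {infer_stream_count(example)}")
```

Because the leading-number pattern requires a word boundary after the digits, a name that merely starts with a resolution such as "1080p" is not mistaken for a stream count.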

Data Quality Guidelines

Minimum Requirements:

1 valid data point (model will train but predictions will be poor)
At least mean_watts power measurement for each scenario

Recommended for Linear Model:

4+ data points with different stream counts
Even distribution across stream count range
Example: [1, 2, 4, 8] streams

Recommended for Polynomial Model:

7+ data points with different stream counts
Wide range of stream counts
Example: [1, 2, 3, 4, 6, 8, 12] streams

Data Collection Best Practices:

Run each test for 60+ seconds to get stable power readings
Allow 10-15 second stabilization before measurement
Maintain consistent hardware configuration across tests
Keep ambient temperature stable
Use same codec, preset, and quality settings
Measure RAPL (Running Average Power Limit) counters for accuracy

Model Training Algorithm

Training Process

1. Data Extraction
   ├─ Parse scenario names to infer stream counts
   ├─ Extract mean_watts from power measurements
   └─ Filter scenarios missing either value

2. Feature Engineering
   ├─ X (features): Stream counts as numpy array [n_samples, 1]
   ├─ y (target): Power measurements as numpy array [n_samples]
   └─ Count unique stream values

3. Model Selection
   ├─ If unique_streams < 6:
   │   └─ Use Linear Regression
   └─ If unique_streams ≥ 6:
       └─ Use Polynomial Regression (degree=2)

4. Feature Transformation (Polynomial only)
   ├─ Input: [streams]
   ├─ Transform: [1, streams, streams²]
   └─ Example: [4] → [1, 4, 16]

5. Model Fitting
   ├─ Algorithm: Ordinary Least Squares (OLS)
   ├─ Objective: Minimize Σ(y_true - y_pred)²
   └─ Solver: sklearn LinearRegression

6. Model Validation
   ├─ Calculate R² score
   ├─ R² = 1 - (SS_res / SS_tot)
   │   Where:
   │   SS_res = Σ(y_true - y_pred)²  # Residual sum of squares
   │   SS_tot = Σ(y_true - y_mean)²  # Total sum of squares
   └─ Log R² for quality assessment
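
Steps 2 to 6 can be condensed into a short sketch. Note the substitution: this example uses `numpy.polyfit` instead of sklearn's LinearRegression with PolynomialFeatures. Both solve the same least-squares problem, but the function name and return structure here are assumptions made for illustration.

```python
import numpy as np


def fit_power_model(streams, watts, poly_threshold=6):
    """Pick linear or quadratic by unique stream count, fit, and score."""
    x = np.asarray(streams, dtype=float)
    y = np.asarray(watts, dtype=float)

    # Model selection: quadratic only with enough distinct stream counts.
    degree = 2 if np.unique(x).size >= poly_threshold else 1

    # Least-squares fit; coefficients are returned highest power first.
    coeffs = np.polyfit(x, y, degree)

    # R^2 = 1 - SS_res / SS_tot on the training data.
    y_pred = np.polyval(coeffs, x)
    ss_res = float(np.sum((y - y_pred) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"degree": degree, "coeffs": coeffs, "r2": r2}


model = fit_power_model([1, 2, 4, 8], [45.0, 80.0, 150.0, 280.0])
print(f"degree={model['degree']}, R2={model['r2']:.4f}")
```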

R² Score Interpretation

R² Value	Interpretation	Action
0.95 - 1.00	Excellent fit	High confidence predictions
0.85 - 0.95	Good fit	Reliable predictions
0.70 - 0.85	Moderate fit	Use with caution
0.50 - 0.70	Poor fit	Consider more data
< 0.50	Very poor fit	Model not reliable
Negative	Model worse than mean	Do not use predictions
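
The table translates directly into a small lookup, sketched below. The function name is hypothetical, and treating each range's lower bound as inclusive is an assumption, since the table's ranges share their endpoints.

```python
def interpret_r2(r2: float) -> str:
    """Map an R^2 score to the interpretation used in the table above."""
    if r2 < 0:
        return "Model worse than mean"
    if r2 < 0.50:
        return "Very poor fit"
    if r2 < 0.70:
        return "Poor fit"
    if r2 < 0.85:
        return "Moderate fit"
    if r2 < 0.95:
        return "Good fit"
    return "Excellent fit"


for score in (0.9891, 0.9234, 0.62, -0.3):
    print(f"R2={score:+.4f}: {interpret_r2(score)}")
```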

Prediction Methodology

Prediction Algorithm

1. Input Validation
   └─ Check if model is trained (return None if not)

2. Feature Preparation
   ├─ Create feature array: X = [[streams]]
   └─ For polynomial: Transform to [1, streams, streams²]

3. Model Prediction
   ├─ Linear: Power = β₀ + β₁ × streams
   └─ Polynomial: Power = β₀ + β₁ × streams + β₂ × streams²

4. Post-Processing
   ├─ Clamp to non-negative: max(0, prediction)
   └─ Return as float (watts)
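
The prediction steps can be expressed in a few lines. This sketch assumes the fitted coefficients are available as a plain list ordered [β₀, β₁] or [β₀, β₁, β₂]; the real PowerPredictor delegates to its sklearn model instead, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def predict_power(coeffs: Optional[Sequence[float]], streams: float) -> Optional[float]:
    """Evaluate the fitted polynomial and clamp the result to >= 0 watts."""
    if coeffs is None:  # model not trained
        return None
    raw = sum(b * streams ** k for k, b in enumerate(coeffs))
    return float(max(0.0, raw))


print(predict_power([40.0, 15.0], 4))         # linear example
print(predict_power([35.0, 18.0, -0.5], 8))   # polynomial example
print(predict_power(None, 4))                 # untrained model
```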

Interpolation vs Extrapolation

Interpolation (within training range): Generally reliable

Training data: [2, 4, 8] streams
Prediction:    6 streams ✓ (between 4 and 8)
Confidence:    High

Extrapolation (outside training range): Use with caution

Training data: [2, 4, 8] streams
Prediction:    16 streams ⚠ (beyond 8)
Confidence:    Moderate (within 2x range)

Prediction:    64 streams ✗ (far beyond 8)
Confidence:    Low (> 2x range, avoid)

Extrapolation Risks:

Linear model assumes constant scaling (may diverge from reality)
Polynomial model can diverge rapidly outside training range
Real systems may have thermal limits not captured in model
CPU throttling behavior may change at extreme loads
Power supply limits may cap maximum power
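
A simple guard based on the 2x rule of thumb above could flag risky predictions before they are used. The function name is hypothetical, and treating targets below the smallest training value as moderate-confidence extrapolation is an assumption, since the text only discusses the upper end.

```python
from typing import Sequence


def prediction_confidence(train_streams: Sequence[int], target: int) -> str:
    """Classify a target stream count relative to the training range."""
    low, high = min(train_streams), max(train_streams)
    if low <= target <= high:
        return "high"      # interpolation
    if target > 2 * high:
        return "low"       # far beyond the training range, avoid
    return "moderate"      # extrapolation within 2x of the maximum


training = [2, 4, 8]
for target in (6, 16, 64):
    print(f"{target} streams: {prediction_confidence(training, target)} confidence")
```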

Usage Examples

Basic Training and Prediction

from advisor import PowerPredictor

# Create predictor
predictor = PowerPredictor()

# Load scenarios (from ResultsAnalyzer)
scenarios = [
    {'name': '2 Streams @ 2500k', 'power': {'mean_watts': 80.0}},
    {'name': '4 Streams @ 2500k', 'power': {'mean_watts': 150.0}},
    {'name': '8 Streams @ 1080p', 'power': {'mean_watts': 280.0}},
]

# Train model
success = predictor.fit(scenarios)
if success:
    print("Model trained successfully!")

    # Make predictions (predict() returns None on an untrained model,
    # so only call it after a successful fit)
    power_6 = predictor.predict(6)
    print(f"Predicted power for 6 streams: {power_6:.2f} W")

    power_12 = predictor.predict(12)
    print(f"Predicted power for 12 streams: {power_12:.2f} W")
else:
    print("Failed to train (no valid data)")

Checking Model Information

# Get model metadata
info = predictor.get_model_info()

print(f"Trained: {info['trained']}")
print(f"Model Type: {info['model_type']}")
print(f"Training Samples: {info['n_samples']}")
print(f"Stream Range: {info['stream_range']}")

# Example output:
# Trained: True
# Model Type: linear
# Training Samples: 3
# Stream Range: (2, 8)

Integration with analyze_results.py

The model is automatically integrated when running analysis:

# Run analysis (includes power predictions)
python3 scripts/analyze_results.py test_results/test_results_20231215_143022.json

# Output includes:
# 1. Standard analysis report
# 2. Power scalability predictions section
# 3. Measured vs predicted comparison table
# 4. CSV export with predicted_mean_power_w column

Output Formats

Console Output

==================================================================================================
POWER SCALABILITY PREDICTIONS
==================================================================================================

Model Type: LINEAR
Training Samples: 4
Stream Range: 1 - 8 streams

Predicted Power Consumption:
──────────────────────────────────────────────────────────────────────────────────────────────────
   1 streams:    45.23 W
   2 streams:    78.45 W
   4 streams:   145.12 W
   8 streams:   278.67 W
  12 streams:   412.23 W

──────────────────────────────────────────────────────────────────────────────────────────────────
MEASURED vs PREDICTED COMPARISON
──────────────────────────────────────────────────────────────────────────────────────────────────
(Shows model fit quality on training data)
Streams    Measured (W)    Predicted (W)   Diff (W)
──────────────────────────────────────────────────────────────────────────────────────────────────
1          45.00           45.23           +0.23
2          80.00           78.45           -1.55
4          150.00          145.12          -4.88
8          280.00          278.67          -1.33
──────────────────────────────────────────────────────────────────────────────────────────────────

CSV Export

The predicted_mean_power_w column is added to the analysis CSV:

name,bitrate,resolution,fps,duration,mean_power_w,predicted_mean_power_w,...
"2 Streams @ 2500k",2500k,1280x720,30,60.0,80.0,78.45,...
"4 Streams @ 2500k",2500k,1280x720,30,60.0,150.0,145.12,...
"8 Streams @ 1080p",5000k,1920x1080,30,60.0,280.0,278.67,...

Column Meaning:

mean_power_w: Actual measured power from Prometheus/RAPL
predicted_mean_power_w: Model prediction based on stream count

Limitations and Caveats

Model Assumptions

Consistent Hardware: Same CPU, RAM, cooling across all measurements
Consistent Configuration: Same FFmpeg preset, codec, quality settings
Stream Count Primary Factor: Assumes power scales mainly with stream count
Stable Environment: Constant ambient temperature, no thermal throttling

Not Accounted For

Different Codecs: H.264 vs H.265 vs AV1 have different power profiles
Different Resolutions: 720p vs 1080p vs 4K per stream
Different Bitrates: 2500k vs 5000k per stream
Different Presets: ultrafast vs medium vs slow
Ambient Temperature: Heat affects CPU frequency and power
Power Management: Governor settings (performance vs powersave)
Background Load: Other processes competing for CPU
Turbo Boost State: Enabled vs disabled
NUMA Effects: Multi-socket systems with non-uniform memory access

When Model May Be Inaccurate

Small Datasets: < 3 training points
Extrapolation: Predicting > 2x max training stream count
Heterogeneous Data: Mixed codecs, resolutions, or settings
Thermal Throttling: Training data includes throttled measurements
Inconsistent Measurements: Wide variance in power readings
Low R² Score: < 0.70 indicates poor model fit

Use Cases

1. Capacity Planning

Scenario: Determine how many concurrent streams a server can handle within power budget.

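from advisor import PowerPredictor

predictor = PowerPredictor()
predictor.fit(benchmark_scenarios)  # benchmark_scenarios: list of scenario dicts

# Power budget: 300W
max_power = 300.0
for streams in range(1, 20):
    predicted = predictor.predict(streams)
    if predicted > max_power:
        print(f"Max streams within {max_power}W: {streams - 1}")
        break
else:
    # The loop finished without exceeding the budget
    print(f"All tested stream counts (1-19) stay within {max_power}W")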

2. Cost Estimation

Scenario: Estimate monthly energy costs for different workload sizes.

# Predict power for target workload
streams = 10
power_watts = predictor.predict(streams)

# Calculate monthly energy
hours_per_month = 730
kwh_per_month = (power_watts * hours_per_month) / 1000

# Calculate cost (assuming $0.12/kWh)
cost_per_month = kwh_per_month * 0.12

print(f"{streams} streams: {kwh_per_month:.2f} kWh/month = ${cost_per_month:.2f}")

3. Thermal Management

Scenario: Identify safe operating limits before load testing.

# Check predicted power at different scales
for streams in [4, 8, 12, 16]:
    power = predictor.predict(streams)
    print(f"{streams} streams → {power:.0f}W")

    if power > 250:  # Server cooling limit
        print("    Exceeds thermal capacity")

4. Infrastructure Sizing

Scenario: Determine PDU (Power Distribution Unit) requirements for data center.

# Calculate total rack power for 10 servers
servers = 10
streams_per_server = 8

power_per_server = predictor.predict(streams_per_server)
total_rack_power = power_per_server * servers

print(f"Total rack power: {total_rack_power:.0f}W")
print(f"Required PDU capacity: {total_rack_power * 1.2:.0f}W (20% headroom)")

Model Validation

Assessing Prediction Quality

Check R² Score: Should be > 0.70 for usable predictions, and ≥ 0.85 for reliable ones (see R² Score Interpretation)

# Logged automatically during training
# INFO:root:PowerPredictor trained on 5 data points, R² = 0.9234

Review Comparison Table: Differences should be small

Streams    Measured (W)    Predicted (W)   Diff (W)
2          80.00           78.45           -1.55    ✓ Good
4          150.00          145.12          -4.88    ✓ Good
8          280.00          320.50          +40.50   ✗ Poor

Hold-Out Validation: Train on a subset and test on the held-out data points

# Train on subset
train_scenarios = scenarios[:-2]
predictor.fit(train_scenarios)

# Test on held-out data
test_scenarios = scenarios[-2:]
for scenario in test_scenarios:
    streams = predictor._infer_stream_count(scenario['name'])
    predicted = predictor.predict(streams)
    actual = scenario['power']['mean_watts']
    error = abs(predicted - actual) / actual * 100
    print(f"{scenario['name']}: {error:.1f}% error")

Improving Model Accuracy

Collect More Data:

Add scenarios with different stream counts
Fill gaps in stream count range
Add replicate measurements for averaging

Ensure Data Quality:

Verify stable power readings (low stddev)
Check for thermal throttling during tests
Confirm consistent test duration (60+ seconds)
Validate RAPL measurements are accurate

Consider Polynomial Model:

Collect ≥ 6 unique stream counts
Model will automatically switch to polynomial
Better captures non-linear scaling effects

Standardize Test Conditions:

Same FFmpeg preset across all tests
Same resolution and bitrate per stream
Same ambient temperature
Same power management settings

Troubleshooting

Model Won't Train

Issue: predictor.fit() returns False

Causes:

No scenarios with valid power data
No scenarios with parseable stream counts
All scenarios filtered out

Solutions:

# Debug: Check what data was extracted
predictor = PowerPredictor()
for scenario in scenarios:
    streams = predictor._infer_stream_count(scenario['name'])
    power = scenario.get('power', {}).get('mean_watts')
    print(f"{scenario['name']}: streams={streams}, power={power}")

Poor Predictions

Issue: Large differences between measured and predicted

Causes:

Low R² score (< 0.70)
Non-linear effects in data but using linear model
Inconsistent measurements in training data
Extrapolating far beyond training range

Solutions:

Collect more training data (aim for 6+ unique stream counts)
Check for outliers in training data
Review test conditions for consistency
Avoid predictions > 2x max training stream count

Negative Predictions

Issue: Model predicts negative power

This should not happen - predictions are clamped to non-negative values. If you see this, it's a bug.

Technical Implementation Details

Dependencies

numpy>=1.20.0          # Array operations, linear algebra
scikit-learn>=1.3.0    # Machine learning (LinearRegression, PolynomialFeatures)

Key Classes and Methods

PowerPredictor:

__init__(): Initialize empty model
fit(scenarios): Train on scenario data
predict(streams): Predict power for N streams
get_model_info(): Get model metadata
_infer_stream_count(name): Parse stream count from name

sklearn Components:

LinearRegression: OLS regression model
PolynomialFeatures(degree=2): Feature transformation for quadratic terms

Code Location

Model implementation: advisor/modeling.py
Integration: analyze_results.py
Tests: tests/test_modeling.py
Documentation: docs/power-prediction-model.md (this file)

Future Enhancements

Potential Improvements

Multi-Variable Models: Incorporate resolution, bitrate, codec
Time-Series Predictions: Account for thermal buildup over time
Confidence Intervals: Provide prediction uncertainty ranges
GPU Power Modeling: Extend to NVIDIA/AMD GPU transcoding
Ensemble Models: Combine multiple models for robustness
Automated Hyperparameter Tuning: Optimize polynomial degree
Feature Selection: Identify most predictive variables
Cross-Platform Validation: Test on different CPU architectures

Contributing

To improve the model:

Collect diverse training data (different workloads, hardware)
Document any prediction errors or limitations discovered
Suggest additional features or variables to incorporate
Share R² scores and model performance metrics

References

RAPL (Running Average Power Limit): Intel's power measurement interface
Ordinary Least Squares (OLS): Statistical method for linear regression
scikit-learn Documentation: https://scikit-learn.org/stable/modules/linear_model.html
Polynomial Regression: https://en.wikipedia.org/wiki/Polynomial_regression
R² Score: https://en.wikipedia.org/wiki/Coefficient_of_determination

License

This component follows the same license as the main ffmpeg-rtmp project.

Contact

For questions or issues related to the power prediction model:

Open an issue on the GitHub repository
Include your R² score and training data characteristics
Provide example predictions showing unexpected behavior
