rbln-mon

A simple way to track NPU and GPU resource use during inference. Built to monitor Rebellions NPU performance and compare it directly against GPU results.


Features

  • Tracks NPU and GPU usage, memory, power, and temperature
  • Works with both Rebellions NPU and NVIDIA GPU
  • Uses a clean decorator interface
  • Handles multiple devices
  • Has zero dependencies (relies only on system commands)

Installation

git clone https://github.com/IGWPark/rbln-mon.git
cd rbln-mon
pip install -e .

With Visualization Support

pip install -e .[viz]

Requirements

Core Library

  • Python 3.10 or higher
  • For NPU: rbln-stat (Rebellions drivers)
  • For GPU: nvidia-smi (NVIDIA drivers)
  • No third-party Python dependencies; uses only the standard-library subprocess module

Optional

  • For visualization:

    pip install rbln-mon[viz]  # installs plotly

  • For notebooks:
    • NPU → torch, transformers, optimum-rbln
    • GPU → torch, transformers

Tested Configurations

Library

  • Python 3.12

Notebooks (examples only)

  • torch: 2.7.0
  • transformers: 4.51.3
  • optimum-rbln: 0.8.3 (for NPU examples)
  • plotly: 6.3.1 (for visualization)
  • nbformat: 5.10.4 (for notebooks)

NPU

  • Rebellions NPU: RBLN-CA22
  • rbln-stat: 1.3.73

GPU

  • Driver: 560.35.03
  • CUDA: 12.6

Usage

Tracking your device performance is as easy as adding one decorator. You don't have to change your existing code; just tag your function :D

Simple Tracking

from rbln_mon import track

@track()  # Auto-detect backend
def run_model_inference():
    model = load_model()
    result = model(input_data)
    return result

NPU Tracking

@track(backend='npu', device_ids=[0], interval=0.1)
def generate_text(prompt, max_tokens=100):
    tokens = tokenizer.encode(prompt)
    output = model.generate(tokens, max_length=max_tokens)
    return tokenizer.decode(output)

GPU Tracking

@track(backend='gpu', device_ids=[0], interval=0.2)
def batch_inference(inputs, batch_size=32):
    results = []
    for i in range(0, len(inputs), batch_size):
        batch = inputs[i:i+batch_size]
        output = model(batch)
        results.extend(output)
    return results

Multi-Device Tracking

@track(backend='gpu', device_ids=[0, 1, 2])
def multi_gpu_inference(inputs):
    parallel_model = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
    return parallel_model(inputs)

Output Example

NPU Tracking: inference(batch_size=4, max_tokens=64)
Duration: 5.93s | Samples: 29 | Backend: npu/1.3.73

Device 0 (RBLN-CA22):
  Util:  avg= 39.2%  peak= 76.9%
  Mem:   peak=  2.08GB / 15.7GB
  Power: avg= 35.9W  peak= 53.8W
  Temp:  40C -> 46C

Advanced Usage

Add Metadata

Metadata helps label your runs for easier comparison later. You can track details like model type, batch size, or experiment name.

@track(backend='npu', save=True, metadata={'description': 'Batch Size 4', 'model': 'llama-7b'})
def run_llama_inference(prompt, batch_size=4):
    return model.generate(prompt, batch_size=batch_size)

Each run saves with its metadata, so you can quickly find or compare experiments afterward.

Save Tracking Data

@track(backend='gpu', save=True, save_dir='my_experiments')
def benchmark_inference(test_data, iterations=100):
    for i in range(iterations):
        result = model(test_data)
    return result

Visualization

from rbln_mon.visualize import plot_metrics, compare_runs, summary_table, list_saved_runs

# List all saved runs
runs = list_saved_runs('rbln_mon_logs')

# Compare multiple runs
compare_runs(*runs)
summary_table(*runs)
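
To plot a single run, pass it to plot_metrics (a minimal sketch; the exact signature is an assumption, by analogy with compare_runs):

# Plot the time-series metrics for the most recent run (assumed call)
plot_metrics(runs[-1])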

Saved Data Format

When you enable save=True, the tracking data is stored as JSON in this format:

{
  "function": "inference",
  "args": {"batch_size": 4, "max_tokens": 64},
  "metadata": {"description": "Batch Size 4"},
  "backend": {"type": "npu", "version": "1.3.73"},
  "devices": [{"device_id": 0, "name": "RBLN-CA22", "mem_total_gb": 15.7}],
  "duration": 5.93,
  "samples": [{"device_id": 0, "t": 0.1, "util": 45.2, "mem_used_gb": 1.2, "power_w": 38.5, "temp_c": 42}],
  "summary": {"device_0": {"util_avg": 39.2, "util_peak": 76.9, "mem_peak_gb": 2.08}}
}
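
Because the log is plain JSON, you can post-process runs without rbln-mon itself. A minimal sketch (the file name is hypothetical; saved files live in your save_dir):

import json
from collections import defaultdict
from pathlib import Path

# Load one saved run (path is hypothetical; see save_dir)
run = json.loads(Path('rbln_mon_logs/inference.json').read_text())

# Recompute per-device average utilization from the raw samples
utils = defaultdict(list)
for sample in run['samples']:
    utils[sample['device_id']].append(sample['util'])

for device_id, values in utils.items():
    print(f"device {device_id}: util avg={sum(values) / len(values):.1f}%")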

Parameters

  • backend: 'auto' | 'gpu' | 'npu' (default: 'auto')
  • device_ids: List of devices to track (default: [0])
  • interval: Time between polls in seconds (default: 0.2)
  • save: Whether to save tracking data as JSON (default: False)
  • save_dir: Folder to save tracking data (default: 'rbln_mon_logs')
  • metadata: Extra info to store with each run (default: None)
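
Putting the parameters together (a sketch; run_experiment, model, and the metadata values are placeholders):

from rbln_mon import track

@track(
    backend='npu',                    # or 'gpu', or leave as 'auto'
    device_ids=[0, 1],                # poll two devices
    interval=0.1,                     # sample every 100 ms
    save=True,                        # write a JSON log after the run
    save_dir='rbln_mon_logs',         # the default folder
    metadata={'model': 'llama-7b'},   # free-form labels for later comparison
)
def run_experiment(inputs):
    return model(inputs)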

Troubleshooting

Common Issues

  1. No supported backend found: make sure you have either nvidia-smi (for GPU) or rbln-stat (for NPU) installed and on your PATH.

  2. Permission denied: some systems require elevated privileges to read GPU or NPU stats.

  3. ImportError for visualization: install the visualization dependencies with pip install -e .[viz].

  4. Device not found: check the available devices:

    from rbln_mon.backends import NvidiaBackend, RebellionsBackend
    print("GPU devices:", NvidiaBackend.list_devices())
    print("NPU devices:", RebellionsBackend.list_devices())

Examples

You'll find more examples in the notebooks/ folder:

  • 01_npu_example.ipynb - NPU inference tracking
  • 02_gpu_example.ipynb - GPU inference tracking
  • 03_visualization.ipynb - Data visualization samples
