# rbln-mon

A simple way to track resource use for NPU and GPU during inference. Built to monitor Rebellions NPU performance and compare it directly with GPU results.

## Features
- Tracks NPU and GPU usage, memory, power, and temperature
- Works with both Rebellions NPU and NVIDIA GPU
- Uses a clean decorator interface
- Handles multiple devices
- Has zero dependencies (relies only on system commands)
## Installation

```bash
git clone https://github.com/IGWPark/rbln-mon.git
cd rbln-mon
pip install -e .

# Optional: install visualization extras
pip install -e .[viz]
```

## Requirements

- Python 3.10 or higher
- For NPU: `rbln-stat` (Rebellions drivers)
- For GPU: `nvidia-smi` (NVIDIA drivers)
- No Python dependencies for core tracking, only `subprocess`
- For visualization: `pip install rbln-mon[viz]` (installs plotly)
- For notebooks:
  - NPU: torch, transformers, optimum-rbln
  - GPU: torch, transformers
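Because the core tracker shells out to system tools rather than importing vendor libraries, polling a device needs nothing beyond `subprocess`. A minimal sketch of what one GPU poll can look like, assuming `nvidia-smi`'s standard `--query-gpu` CSV output (the helper names here are illustrative, not rbln-mon's actual internals):

```python
import subprocess

def parse_smi_csv(line):
    # Parse one CSV line produced by:
    #   nvidia-smi --query-gpu=utilization.gpu,memory.used,power.draw,temperature.gpu \
    #              --format=csv,noheader,nounits
    util, mem_mb, power_w, temp_c = (float(x) for x in line.split(","))
    return {"util": util, "mem_used_gb": mem_mb / 1024,
            "power_w": power_w, "temp_c": temp_c}

def query_gpu(device_id=0):
    # One poll of a single GPU via subprocess; no Python packages required.
    out = subprocess.check_output(
        ["nvidia-smi", f"--id={device_id}",
         "--query-gpu=utilization.gpu,memory.used,power.draw,temperature.gpu",
         "--format=csv,noheader,nounits"],
        text=True)
    return parse_smi_csv(out.strip())
```

The same pattern applies to `rbln-stat` on the NPU side, with a different command line and field layout.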
## Tested Environment

- Python 3.12
- torch: 2.7.0
- transformers: 4.51.3
- optimum-rbln: 0.8.3 (for NPU examples)
- plotly: 6.3.1 (for visualization)
- nbformat: 5.10.4 (for notebooks)
- Rebellions NPU: RBLN-CA22
- rbln-stat: 1.3.73
- NVIDIA driver: 560.35.03
- CUDA: 12.6
## Usage

Tracking your device performance is as easy as adding one decorator. You don't have to change your existing code, just tag your function :D
```python
from rbln_mon import track

@track()  # Auto-detect backend
def run_model_inference():
    model = load_model()
    result = model(input_data)
    return result
```

NPU tracking:

```python
@track(backend='npu', device_ids=[0], interval=0.1)
def generate_text(prompt, max_tokens=100):
    tokens = tokenizer.encode(prompt)
    output = model.generate(tokens, max_length=max_tokens)
    return tokenizer.decode(output)
```

GPU tracking:

```python
@track(backend='gpu', device_ids=[0], interval=0.2)
def batch_inference(inputs, batch_size=32):
    results = []
    for i in range(0, len(inputs), batch_size):
        batch = inputs[i:i+batch_size]
        output = model(batch)
        results.extend(output)
    return results
```

Multiple devices:

```python
@track(backend='gpu', device_ids=[0, 1, 2])
def multi_gpu_inference(inputs):
    # Wrap the global model; rebinding the name `model` inside the
    # function would raise UnboundLocalError.
    dp_model = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
    return dp_model(inputs)
```

Example output for NPU tracking:

```
NPU Tracking: inference(batch_size=4, max_tokens=64)
Duration: 5.93s | Samples: 29 | Backend: npu/1.3.73
  Device 0 (RBLN-CA22):
    Util:  avg= 39.2%  peak= 76.9%
    Mem:   peak= 2.08GB / 15.7GB
    Power: avg= 35.9W  peak= 53.8W
    Temp:  40C -> 46C
```
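Under the hood, a decorator like this only needs to sample in the background while the wrapped function runs. The following is an illustrative sketch of that pattern, not rbln-mon's actual implementation; `sample_fn` stands in for a device query such as an `nvidia-smi` or `rbln-stat` poll:

```python
import functools
import threading
import time

def track_sketch(interval=0.2, sample_fn=time.monotonic):
    # Sketch of a polling decorator: a daemon thread calls sample_fn
    # every `interval` seconds until the wrapped call returns.
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            samples, stop = [], threading.Event()

            def poll():
                while not stop.is_set():
                    samples.append(sample_fn())
                    stop.wait(interval)

            thread = threading.Thread(target=poll, daemon=True)
            start = time.monotonic()
            thread.start()
            try:
                result = fn(*args, **kwargs)
            finally:
                stop.set()
                thread.join()
            # Expose the collected samples for inspection after the call.
            wrapper.last_report = {"duration": time.monotonic() - start,
                                   "samples": samples}
            return result
        return wrapper
    return decorator
```

A smaller `interval` gives a finer-grained utilization curve at the cost of more polling overhead, which is why the NPU example above uses `interval=0.1`.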
## Saving Runs with Metadata

Metadata helps label your runs for easier comparison later. You can track details like model type, batch size, or experiment name.
```python
@track(backend='npu', save=True,
       metadata={'description': 'Batch Size 4', 'model': 'llama-7b'})
def run_llama_inference(prompt, batch_size=4):
    return model.generate(prompt, batch_size=batch_size)
```

Each run saves with its metadata, so you can quickly find or compare experiments afterward.
```python
@track(backend='gpu', save=True, save_dir='my_experiments')
def benchmark_inference(test_data, iterations=100):
    for i in range(iterations):
        result = model(test_data)
    return result
```

Visualizing saved runs:

```python
from rbln_mon.visualize import plot_metrics, compare_runs, summary_table, list_saved_runs

# List all saved runs
runs = list_saved_runs('rbln_mon_logs')

# Compare multiple runs
compare_runs(*runs)
summary_table(*runs)
```

When you enable `save=True`, the tracking data is stored as JSON in this format:
```json
{
  "function": "inference",
  "args": {"batch_size": 4, "max_tokens": 64},
  "metadata": {"description": "Batch Size 4"},
  "backend": {"type": "npu", "version": "1.3.73"},
  "devices": [{"device_id": 0, "name": "RBLN-CA22", "mem_total_gb": 15.7}],
  "duration": 5.93,
  "samples": [{"device_id": 0, "t": 0.1, "util": 45.2, "mem_used_gb": 1.2, "power_w": 38.5, "temp_c": 42}],
  "summary": {"device_0": {"util_avg": 39.2, "util_peak": 76.9, "mem_peak_gb": 2.08}}
}
```

## `@track` Parameters

- `backend`: `'auto'` | `'gpu'` | `'npu'` (default: `'auto'`)
- `device_ids`: list of devices to track (default: `[0]`)
- `interval`: time between polls in seconds (default: `0.2`)
- `save`: whether to save tracking data as JSON (default: `False`)
- `save_dir`: folder to save tracking data (default: `'rbln_mon_logs'`)
- `metadata`: extra info to store with each run (default: `None`)
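Because the save format is plain JSON, saved runs can also be post-processed with just the standard library. A sketch under the schema shown above (the `load_runs` and `util_avg` helpers are mine, not part of rbln-mon):

```python
import glob
import json
import statistics

def load_runs(save_dir="rbln_mon_logs"):
    # Read every saved run from save_dir (matches the default `save_dir`).
    runs = []
    for path in sorted(glob.glob(f"{save_dir}/*.json")):
        with open(path) as f:
            runs.append(json.load(f))
    return runs

def util_avg(run, device_id=0):
    # Recompute average utilization for one device from the raw samples.
    vals = [s["util"] for s in run["samples"] if s["device_id"] == device_id]
    return statistics.mean(vals) if vals else 0.0
```

This makes it easy to feed tracking data into your own analysis tools when the bundled `plotly` visualization is not enough.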
## Troubleshooting

- **No supported backend found**: make sure you have either `nvidia-smi` (for GPU) or `rbln-stat` (for NPU) installed and in your PATH.
- **Permission denied**: some systems need elevated privileges to read GPU or NPU stats.
- **ImportError for visualization**: install visualization dependencies with `pip install -e .[viz]`.
- **Device not found**: check available devices:

  ```python
  from rbln_mon.backends import NvidiaBackend, RebellionsBackend

  print("GPU devices:", NvidiaBackend.list_devices())
  print("NPU devices:", RebellionsBackend.list_devices())
  ```
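The "No supported backend found" condition boils down to a PATH lookup for the two CLI tools. A minimal sketch of how `backend='auto'` detection can work (my illustrative helper, not necessarily rbln-mon's exact logic):

```python
import shutil

def detect_backend():
    # 'auto' selection sketch: prefer the Rebellions NPU tool if present,
    # fall back to NVIDIA, otherwise fail with the error shown above.
    if shutil.which("rbln-stat"):
        return "npu"
    if shutil.which("nvidia-smi"):
        return "gpu"
    raise RuntimeError(
        "No supported backend found: install nvidia-smi or rbln-stat")
```

Running `detect_backend()` by hand is a quick way to confirm which backend the decorator will pick on your machine.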
## Examples

You'll find more examples in the `notebooks/` folder:

- `01_npu_example.ipynb`: NPU inference tracking
- `02_gpu_example.ipynb`: GPU inference tracking
- `03_visualization.ipynb`: Data visualization samples