Releases: lablup/all-smi
v0.17.2
Release v0.17.2
New Features
None
Improvements
- Use `with_global_system()` in Jetson, Tenstorrent, and NVIDIA readers for consistency with AMD, Apple Silicon, and local collector patterns (#121)
Bug Fixes
- Fix file descriptor leaks in Jetson, Tenstorrent, and NVIDIA readers by replacing per-call `System::new()` with a shared global system instance (#121, fixes #120)
CI/CD Improvements
None
Technical Details
- NVIDIA Jetson (`nvidia_jetson.rs`): `get_process_info()` and `get_gpu_processes()` now use `with_global_system()`
- Tenstorrent (`tenstorrent.rs`): `get_process_info()` now uses `with_global_system()`
- NVIDIA (`nvidia.rs`): Removed `system: Mutex<System>` field, migrated `get_process_info()` to `with_global_system()`
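The shared-instance pattern behind this fix can be sketched with the standard library alone. This is a minimal illustration, not the all-smi implementation: `SystemState` is a hypothetical stand-in for `sysinfo::System`, and the signature of `with_global_system()` shown here is an assumption.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for an expensive, handle-owning object
/// such as `sysinfo::System` (keeps the sketch free of external crates).
pub struct SystemState {
    pub refresh_count: u64,
}

impl SystemState {
    fn new() -> Self {
        SystemState { refresh_count: 0 }
    }

    fn refresh(&mut self) {
        self.refresh_count += 1;
    }
}

// One process-wide instance, created lazily on first use.
static GLOBAL_SYSTEM: OnceLock<Mutex<SystemState>> = OnceLock::new();

/// Run `f` with exclusive access to the shared instance.
/// (Assumed signature; the real helper may differ.)
pub fn with_global_system<R>(f: impl FnOnce(&mut SystemState) -> R) -> R {
    let lock = GLOBAL_SYSTEM.get_or_init(|| Mutex::new(SystemState::new()));
    // Recover the guard even if another thread panicked while holding it.
    let mut guard = lock.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    f(&mut *guard)
}

/// A reader call that refreshes and reads the shared instance
/// instead of constructing a new one on every call.
pub fn get_process_info() -> u64 {
    with_global_system(|sys| {
        sys.refresh();
        sys.refresh_count
    })
}
```

Because every reader goes through the same instance, whatever file descriptors the underlying object holds are opened once rather than once per call.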
Dependencies
None
Breaking Changes
None
Known Issues
None
What's Changed
- fix: use with_global_system() to prevent FD leak in Jetson, Tenstorrent, and NVIDIA readers (#120) by @inureyes in #121
Full Changelog: v0.17.1...v0.17.2
v0.17.1
Release v0.17.1
New Features
None
Improvements
None
Bug Fixes
- Fix file descriptor leak in API mode by reusing resource handles (#119)
  - Cache `Nvml` handle in `NvidiaGpuReader` and reuse across `get_gpu_info()`/`get_process_info()` calls, with graceful reinit on handle invalidation (e.g., GPU hot-unplug)
  - Reuse `sysinfo::System` instance across `get_process_info()` calls instead of creating a new one every iteration
  - Create `Disks` instance once before the API metrics collection loop and refresh in-place each cycle via `disks.refresh(true)`
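A cached handle with re-initialization on failure, as described for the `Nvml` handle above, might look roughly like the following. `Handle` and `HandleCache` are hypothetical names, and the sketch treats any error as an invalidation, whereas a real reader would presumably inspect the error kind first.

```rust
/// Hypothetical stand-in for a driver handle such as an NVML instance.
pub struct Handle {
    pub id: u32,
}

/// Creates the handle on first use and rebuilds it after a failure.
pub struct HandleCache {
    handle: Option<Handle>,
    created: u32,
}

impl HandleCache {
    pub fn new() -> Self {
        HandleCache { handle: None, created: 0 }
    }

    /// Number of handles created so far (useful to confirm reuse).
    pub fn created(&self) -> u32 {
        self.created
    }

    /// Run `op` with the cached handle. If `op` fails, drop the handle
    /// and retry exactly once with a freshly created one.
    pub fn with_handle<T>(
        &mut self,
        mut op: impl FnMut(&Handle) -> Result<T, String>,
    ) -> Result<T, String> {
        for attempt in 0..2 {
            if self.handle.is_none() {
                self.created += 1;
                self.handle = Some(Handle { id: self.created });
            }
            let handle = self.handle.as_ref().expect("handle was just initialized");
            match op(handle) {
                Ok(value) => return Ok(value),
                Err(err) if attempt == 1 => return Err(err),
                // First failure: assume the handle is stale and rebuild it.
                Err(_) => self.handle = None,
            }
        }
        unreachable!("the loop always returns on the second attempt")
    }
}
```

Repeated successful calls reuse one handle, so no new driver resources are opened per iteration; only a failure triggers a rebuild.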
CI/CD Improvements
None
Technical Details
- Closes #118
Dependencies
None
Breaking Changes
None
Known Issues
None
v0.17.0
Release v0.17.0
This release adds GPU process filtering and improves sort stability in the process list, along with storage API enhancements for the library.
New Features
- Add GPU process filter toggle ('f' key) to show only processes with GPU memory usage
- Add `get_storage_info()` method to public library API for accessing disk/storage information
- Add `StorageReader` trait and `LocalStorageReader` implementation
- Export `StorageInfo` and `StorageReader` in prelude for convenient imports
Improvements
- Improve process list sort stability using PID as secondary sort key to prevent constant position shifts
- Update status bar to indicate filter status (`[Filter:GPU]`) when active
- Update help screen to show the new 'F' shortcut for GPU process filter
Bug Fixes
None
CI/CD Improvements
None
Technical Details
- Sort stability implemented by using PID as secondary sort key when primary values are equal
- Storage reader follows existing reader pattern (GpuReader, CpuReader, etc.)
- Docker-aware filtering in storage reader
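The secondary sort key and the GPU-only filter can be illustrated with a small sketch. `ProcessRow` and its fields are hypothetical; the point is only that ties on the primary key are broken by PID, so rows with equal values keep a fixed relative order between refreshes.

```rust
use std::cmp::Ordering;

/// Hypothetical row type for the process list.
#[derive(Debug, Clone, PartialEq)]
pub struct ProcessRow {
    pub pid: u32,
    pub gpu_memory_bytes: u64,
}

/// Order by GPU memory (descending), breaking ties by PID (ascending).
pub fn compare_rows(a: &ProcessRow, b: &ProcessRow) -> Ordering {
    b.gpu_memory_bytes
        .cmp(&a.gpu_memory_bytes)
        .then_with(|| a.pid.cmp(&b.pid))
}

/// Sort the list in place using the two-level ordering.
pub fn sort_rows(rows: &mut [ProcessRow]) {
    rows.sort_by(compare_rows);
}

/// Keep only processes that currently use GPU memory.
pub fn gpu_only(rows: &[ProcessRow]) -> Vec<ProcessRow> {
    rows.iter()
        .filter(|row| row.gpu_memory_bytes > 0)
        .cloned()
        .collect()
}
```

Since PIDs are unique, the resulting order no longer depends on the order in which processes were collected.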
Dependencies
None
Breaking Changes
None
Known Issues
None
Full Changelog: v0.16.0...v0.17.0
What's Changed
- Add get_storage_info() method to public library API by @inureyes in #116
- feat: add GPU process filter and improve sort stability by @inureyes in #117
v0.16.0
This release introduces a proper library API for external Rust projects, making it easy to integrate GPU/NPU monitoring into custom applications.
New Features
- High-level `AllSmi` client struct with ergonomic API for querying GPU/NPU, CPU, memory, and chassis information
- Unified `Error` type using `thiserror` for cleaner error handling in library usage
- `prelude` module for convenient imports (`use all_smi::prelude::*`)
- Comprehensive library API manual with usage examples
Improvements
- Improved thread safety documentation with `#[must_use]` attributes
- Better API ergonomics for library consumers
Bug Fixes
- Fix Launchpad offline builds by including `.cargo/config.toml` in source package
- Add `cargo vendor` step for Launchpad builds to handle dependencies without internet access
CI/CD Improvements
- None
Technical Details
- Library API provides methods: `get_gpu_info()`, `get_cpu_info()`, `get_memory_info()`, `get_process_info()`, `get_chassis_info()`
- `AllSmi` struct is `Send + Sync` for thread-safe usage
- Configuration via `AllSmiConfig` for sample interval and verbose mode
Dependencies
- axum: 0.8.7 → 0.8.8
- tower-http: 0.6.6 → 0.6.8
- sysinfo: 0.37.0 → 0.37.2
- regex: 1.11.2 → 1.12.2
- futures-util: 0.3.30 → 0.3.31
Breaking Changes
- None
Known Issues
- None
Full Changelog: v0.15.2...v0.16.0
What's Changed
- chore: update cargo dependencies by @inureyes in #108
- feat: add proper library API for external Rust projects by @inureyes in #109
- fix: add cargo vendor step for Launchpad offline builds by @appleparan in #111
New Contributors
- @appleparan made their first contribution in #111
v0.15.2
Summary
Fix Rebellions NPU detection compatibility with rbln SDK 2.0.x by implementing a backward-compatible deserializer for the npu field.
New Features
None
Improvements
None
Bug Fixes
- Rebellions NPU detection: Implement backward-compatible deserializer for the `npu` field to support both SDK 1.x (string format) and SDK 2.0.x (integer format)
  - SDK 1.x outputs `"npu": "0"` (string)
  - SDK 2.0.x outputs `"npu": 0` (integer)
  - Uses serde Visitor pattern to handle both types gracefully
CI/CD Improvements
None
Technical Details
- Changed `npu` field type from `String` to `u32` for rbln SDK 2.0.x compatibility
- Added custom serde deserializer using Visitor pattern that accepts both string and integer types
- No external dependencies added for the deserialization logic
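The normalization the deserializer performs can be shown without serde. The sketch below is a standard-library stand-in, not the actual Visitor: the hypothetical `RawNpu` enum models the two input shapes, and each match arm does what the Visitor's `visit_str` and `visit_u64` methods would do.

```rust
/// The two shapes the `npu` value can take (hypothetical helper type).
#[derive(Debug, Clone, PartialEq)]
pub enum RawNpu {
    /// SDK 1.x style: `"npu": "0"`
    Text(String),
    /// SDK 2.0.x style: `"npu": 0`
    Number(u64),
}

/// Convert either representation into the `u32` index.
pub fn normalize_npu(raw: &RawNpu) -> Result<u32, String> {
    match raw {
        // String input: parse the digits, rejecting anything non-numeric.
        RawNpu::Text(text) => text
            .trim()
            .parse::<u32>()
            .map_err(|err| format!("invalid npu index {:?}: {}", text, err)),
        // Integer input: accept it only if it fits in u32.
        RawNpu::Number(value) => {
            if *value <= u64::from(u32::MAX) {
                Ok(*value as u32)
            } else {
                Err(format!("npu index {} does not fit in u32", value))
            }
        }
    }
}
```

Both SDK output styles end up as the same `u32`, so the rest of the reader does not need to know which SDK version produced the data.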
Dependencies
None
Breaking Changes
None
Known Issues
None
What's Changed
- Fix Rebellions NPU detection with rbln SDK 2.0.x by @gspark-etri in #105
New Contributors
- @gspark-etri made their first contribution in #105
Full Changelog: v0.15.1...v0.15.2
v0.15.1
Summary
This patch release fixes a memory leak on Apple Silicon Macs.
New Features
None
Improvements
None
Bug Fixes
- Fix memory leak in IOReportIterator by releasing delta CFDictionaryRef (#104)
CI/CD Improvements
None
Technical Details
- On Apple Silicon, the IOReportIterator now properly releases CFDictionaryRef objects returned by IOReportCreateSamplesDelta to prevent memory accumulation during metrics collection
Dependencies
None
Breaking Changes
None
Known Issues
None
What's Changed
Full Changelog: v0.15.0...v0.15.1
v0.15.0
Release v0.15.0
This release adds Unix Domain Socket support for API mode, improves Windows CPU temperature monitoring with a fallback chain, optimizes binary size, and completes the repository organization migration.
New Features
- Unix Domain Socket Support: API mode now supports Unix Domain Sockets (UDS) for local IPC on Linux and macOS
  - Platform-specific default paths: `/tmp/all-smi.sock` (macOS), `/var/run/all-smi.sock` or `/tmp/all-smi.sock` (Linux)
  - Socket permissions set to `0600` for security (owner-only access)
  - Dual listener mode: TCP and UDS can run simultaneously using `--port` and `--socket` options
  - UDS-only mode supported with `--port 0 --socket`
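Binding a socket and restricting it to the owner can be sketched with the standard library as follows. This is an illustration of the `0600` step only, not the all-smi server code; `bind_owner_only` is a hypothetical helper, and the sketch is Unix-only.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Bind a Unix domain socket at `path` and restrict it to the owner (0600).
pub fn bind_owner_only(path: &Path) -> io::Result<UnixListener> {
    // Remove a stale socket file left behind by a previous run, if any.
    match fs::remove_file(path) {
        Ok(()) => {}
        Err(err) if err.kind() == io::ErrorKind::NotFound => {}
        Err(err) => return Err(err),
    }

    let listener = UnixListener::bind(path)?;

    // Tighten permissions after binding; until this call the file has
    // whatever mode the process umask allowed.
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
    Ok(listener)
}
```

Note the short window between `bind` and `set_permissions`; placing the socket in a directory only the owner can enter is one way to close it.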
Improvements
- Windows CPU Temperature Fallback Chain: Graceful handling of `WBEM_E_NOT_FOUND` errors when WMI thermal zones are unavailable
  - Multiple temperature sources: ACPI Thermal Zones, AMD Ryzen Master SDK, Intel WMI, LibreHardwareMonitor
  - CPU vendor detection for optimal source selection
  - OnceCell caching for source availability status
  - Silent fallback when all sources unavailable (no error spam)
- Binary Size Optimization: Added release profile with optimizations
  - Strip debug symbols (`strip = true`)
  - Link-Time Optimization (`lto = true`)
  - Single codegen unit for better optimization
  - Abort on panic (removes unwind code)
  - Optimize for size (`opt-level = "z"`)
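The temperature fallback chain with cached availability can be approximated as below. All names are hypothetical and the probes are plain function pointers rather than real WMI or SDK calls; the sketch only shows the "probe in order once, remember the result, stay silent if nothing works" behaviour.

```rust
use std::cell::OnceCell;

/// One candidate temperature source (hypothetical type).
pub struct TemperatureSource {
    pub name: &'static str,
    /// Returns a reading in degrees Celsius, or None if unavailable.
    pub probe: fn() -> Option<f32>,
}

/// Tries sources in order and caches which one works.
pub struct FallbackChain {
    sources: Vec<TemperatureSource>,
    // Index of the first working source; the inner None means "none available".
    selected: OnceCell<Option<usize>>,
}

impl FallbackChain {
    pub fn new(sources: Vec<TemperatureSource>) -> Self {
        FallbackChain { sources, selected: OnceCell::new() }
    }

    /// Read a temperature, probing the chain only on the first call.
    pub fn read(&self) -> Option<f32> {
        let selected = *self
            .selected
            .get_or_init(|| self.sources.iter().position(|s| (s.probe)().is_some()));
        // Silent fallback: no source means no reading, not an error.
        selected.and_then(|index| (self.sources[index].probe)())
    }

    /// Name of the cached source, if the chain has been probed and one worked.
    pub fn selected_name(&self) -> Option<&'static str> {
        self.selected
            .get()
            .copied()
            .flatten()
            .map(|index| self.sources[index].name)
    }
}
```

Caching the outcome means an unavailable source is not re-queried on every refresh, which is what avoids repeated error output.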
Bug Fixes
None
CI/CD Improvements
None
Technical Details
- Repository organization changed from `inureyes` to `lablup`
- Unix socket file is automatically cleaned up on graceful shutdown
- Signal handlers for SIGTERM/SIGINT cleanup
Dependencies
- Various dependency updates via `cargo update`
Breaking Changes
None
Known Issues
- Unix Domain Socket support is currently Unix-only (Linux, macOS); Windows support pending Rust ecosystem maturity
Full Changelog: v0.14.0...v0.15.0
What's Changed
- feat: Add Unix Domain Socket support for API mode by @inureyes in #100
- fix: Add Windows CPU temperature fallback chain to handle WBEM_E_NOT_FOUND by @inureyes in #103
v0.14.0
v0.14.0 Release
This release adds a Windows x64 build target, native macOS APIs for Apple Silicon (no sudo required), and chassis/node-level power monitoring, and removes the legacy powermetrics implementation.
New Features
- Windows x64 Build Target: Added Windows x64 as a build target for cross-platform support (#98)
- Native macOS APIs: Added native macOS APIs (IOReport, SMC, NSProcessInfo) for Apple Silicon metrics collection without requiring sudo (#89)
- Chassis/Node-level Monitoring: Added comprehensive node-level monitoring with per-node power consumption tracking and BMC metrics support (#86)
- Total Power Metrics: Exposed total power metrics for Apple Silicon (CPU+GPU+ANE combined) (#83)
Improvements
- CPU Usage Optimization: Cached expensive system calls to reduce CPU utilization during monitoring (#96)
- Legacy Powermetrics Removal: Removed legacy powermetrics implementation in favor of native macOS APIs (#93)
- Repository Migration: Updated repository URLs from inureyes to lablup organization (#85)
Bug Fixes
- Fixed missing build.rs in Dockerfile for protobuf compilation
- Fixed missing proto directory in Dockerfile for TPU support
- Fixed Ubuntu PPA builds to use rust-1.85-all package for Cargo.lock v4 compatibility (#95)
- Added missing build dependencies for Ubuntu PPA workflow (#92)
CI/CD Improvements
- Enhanced Dockerfile configuration for TPU and protobuf support
- Improved Ubuntu PPA build workflow with proper Rust 1.85 support
Technical Details
- Native macOS integration using IOReport API for power/residency metrics
- Apple SMC integration for actual temperature readings
- NSProcessInfo.thermalState binding for thermal pressure monitoring
- Connection staggering and pooling optimizations maintained
Dependencies
None
Breaking Changes
- Legacy powermetrics mode has been removed; native macOS APIs are now the default
- The `--features powermetrics` build flag is no longer available
Known Issues
None
What's Changed
- feat: expose total power metrics for Apple Silicon by @inureyes in #83
- feat: Add Chassis/Node-level monitoring with per-node power and BMC metrics by @inureyes in #86
- feat: Add native macOS APIs for Apple Silicon metrics (no sudo required) by @inureyes in #89
- fix: Add missing build dependencies for Ubuntu PPA workflow by @inureyes in #92
- fix: Use Ubuntu's rust-1.85-all package for PPA builds to handle Cargo.lock v4 by @inureyes in #95
- refactor: remove legacy powermetrics implementation by @inureyes in #93
- Optimize CPU usage by caching expensive system calls by @inureyes in #96
- feat: Add Windows x64 build target by @inureyes in #98
Full Changelog: v0.13.1...v0.14.0
v0.13.1
Release v0.13.1
This release adds comprehensive Google Cloud TPU monitoring support and significantly reduces CPU utilization through optimized polling and rendering.
New Features
- Google Cloud TPU Monitoring Support: Full support for monitoring Google Cloud TPU accelerators
- Support for all TPU generations: v2, v3, v4, v5e, v5p, v6e, v6 Trillium, v7 Ironwood
- Native gRPC client for TPU runtime metrics (localhost:8431) with adaptive polling
- TensorCore utilization and HLO (High-Level Operations) metrics display
- Multiple detection methods: sysfs, PCI scanning, PJRT C API, tpu-info CLI
- API mode JSON output with detailed TPU metrics
- Background metrics collection with streaming tpu-info runner
Improvements
- CPU Utilization Optimization: Reduced CPU usage by up to 90% in idle conditions and 70% during active monitoring
- Reduced UI poll frequency from 20 Hz (50ms) to 10 Hz (100ms)
- Reduced render frequency from 30 FPS to 10 FPS
- Content hash-based skip rendering using FNV-1a hashing
- Data version tracking to skip re-renders when data is unchanged
- Memory Optimization: Reduced memory churn from ~10MB/sec to ~640KB/sec (16x reduction)
- BufferWriter pre-allocation reduced from 1MB to 64KB
- Eliminated 3 HashMap clones per frame by updating in-place
- Process content lines directly with iterator instead of collecting to Vec
- Reuse String allocations via clear()+push_str()
- Scroll & Tab Tracking: Restored smooth text scrolling animation and added tab_scroll_offset tracking for immediate tab bar re-renders
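Hash-based skip rendering can be sketched as follows. The 64-bit FNV-1a variant is an assumption (the notes do not state the width), and `FrameGate` is a hypothetical name; the idea is simply to hash the rendered content and skip the draw when the hash matches the previous frame.

```rust
const FNV_OFFSET_BASIS: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

/// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Remembers the hash of the last drawn frame (hypothetical helper).
pub struct FrameGate {
    last_hash: Option<u64>,
}

impl FrameGate {
    pub fn new() -> Self {
        FrameGate { last_hash: None }
    }

    /// Returns true when `frame` differs from the last frame that was drawn.
    pub fn should_render(&mut self, frame: &str) -> bool {
        let hash = fnv1a_64(frame.as_bytes());
        if self.last_hash == Some(hash) {
            false
        } else {
            self.last_hash = Some(hash);
            true
        }
    }
}
```

A hash collision would cause a changed frame to be skipped once; with a 64-bit hash this is unlikely enough that a later data change would normally correct it.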
Bug Fixes
None
CI/CD Improvements
None
Technical Details
- Robust TPU detection with fallback chain: sysfs → PJRT → gRPC → CLI
- TUI display with TPU info line showing memory, TensorCore utilization, HLO queue status
- Full metrics export in JSON format for programmatic access
Dependencies
- Added tonic and prost for gRPC TPU metrics client
- tonic: 0.13 → 0.14
- tonic-prost: 0.13 → 0.14
- prost: 0.13 → 0.14
- prost-types: 0.13 → 0.14
- tonic-prost-build: 0.13 → 0.14
- wmi: 0.17 → 0.18 (Windows)
- libloading: 0.8 → 0.9 (Linux)
Breaking Changes
None
Known Issues
None
What's Changed
- fix: reduce CPU utilization with optimized polling and rendering by @inureyes in #78
- feat: add comprehensive Google TPU monitoring support by @inureyes in #79
Release v0.13.0 - v0.13.1
This release upgrades key dependencies including tonic/prost to 0.14 for improved gRPC support and optimizes build dependencies.
Technical Details
- Upgrade tonic to 0.14, prost to 0.14 for improved gRPC performance
- Upgrade wmi to 0.18 for Windows monitoring
- Upgrade libloading to 0.9 for dynamic library loading
- Upgrade tonic-build to 0.14 and optimize build dependencies
Full Changelog: v0.12.0...v0.13.1
v0.13.0
Release v0.13.0
This release adds comprehensive Google Cloud TPU monitoring support and significantly reduces CPU utilization through optimized polling and rendering.
New Features
- Google Cloud TPU Monitoring Support: Full support for monitoring Google Cloud TPU accelerators
- Support for all TPU generations: v2, v3, v4, v5e, v5p, v6e, v6 Trillium, v7 Ironwood
- Native gRPC client for TPU runtime metrics (localhost:8431) with adaptive polling
- TensorCore utilization and HLO (High-Level Operations) metrics display
- Multiple detection methods: sysfs, PCI scanning, PJRT C API, tpu-info CLI
- API mode JSON output with detailed TPU metrics
- Background metrics collection with streaming tpu-info runner
Improvements
- CPU Utilization Optimization: Reduced CPU usage by up to 90% in idle conditions and 70% during active monitoring
- Reduced UI poll frequency from 20 Hz (50ms) to 10 Hz (100ms)
- Reduced render frequency from 30 FPS to 10 FPS
- Content hash-based skip rendering using FNV-1a hashing
- Data version tracking to skip re-renders when data is unchanged
- Memory Optimization: Reduced memory churn from ~10MB/sec to ~640KB/sec (16x reduction)
- BufferWriter pre-allocation reduced from 1MB to 64KB
- Eliminated 3 HashMap clones per frame by updating in-place
- Process content lines directly with iterator instead of collecting to Vec
- Reuse String allocations via clear()+push_str()
- Scroll & Tab Tracking: Restored smooth text scrolling animation and added tab_scroll_offset tracking for immediate tab bar re-renders
Bug Fixes
None
CI/CD Improvements
None
Technical Details
- Robust TPU detection with fallback chain: sysfs → PJRT → gRPC → CLI
- TUI display with TPU info line showing memory, TensorCore utilization, HLO queue status
- Full metrics export in JSON format for programmatic access
Dependencies
- Added tonic and prost for gRPC TPU metrics client
Breaking Changes
None
Known Issues
None
What's Changed
- fix: reduce CPU utilization with optimized polling and rendering by @inureyes in #78
- feat: add comprehensive Google TPU monitoring support by @inureyes in #79
Full Changelog: v0.12.0...v0.13.0