Releases: lablup/all-smi

v0.17.2

08 Feb 10:02

Release v0.17.2

New Features

None

Improvements

  • Use with_global_system() in Jetson, Tenstorrent, and NVIDIA readers for consistency with AMD, Apple Silicon, and local collector patterns (#121)

Bug Fixes

  • Fix file descriptor leaks in Jetson, Tenstorrent, and NVIDIA readers by replacing per-call System::new() with shared global system instance (#121, fixes #120)

CI/CD Improvements

None

Technical Details

  • NVIDIA Jetson (nvidia_jetson.rs): get_process_info() and get_gpu_processes() now use with_global_system()
  • Tenstorrent (tenstorrent.rs): get_process_info() now uses with_global_system()
  • NVIDIA (nvidia.rs): Removed system: Mutex<System> field, migrated get_process_info() to with_global_system()
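
The shared-instance pattern behind this fix can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the `System` struct below is a stand-in for `sysinfo::System`, and the signature of `with_global_system()` is an assumption based only on its name in these notes.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, OnceLock};

/// Counts how many times the stand-in `System` has been constructed.
static CONSTRUCTED: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for `sysinfo::System` (assumption: the real type holds OS resources).
pub struct System {
    refreshes: u64,
}

impl System {
    fn new() -> Self {
        CONSTRUCTED.fetch_add(1, Ordering::SeqCst);
        System { refreshes: 0 }
    }

    /// Pretend refresh; returns how many refreshes this instance has served.
    pub fn refresh(&mut self) -> u64 {
        self.refreshes += 1;
        self.refreshes
    }
}

/// Process-wide instance, created lazily on first use.
static GLOBAL_SYSTEM: OnceLock<Mutex<System>> = OnceLock::new();

/// Hypothetical signature: run `f` with exclusive access to the shared instance.
pub fn with_global_system<R>(f: impl FnOnce(&mut System) -> R) -> R {
    let lock = GLOBAL_SYSTEM.get_or_init(|| Mutex::new(System::new()));
    // Recover the guard even if an earlier caller panicked while holding it.
    let mut guard = lock.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    f(&mut *guard)
}

/// Number of `System` values constructed so far.
pub fn constructed_count() -> usize {
    CONSTRUCTED.load(Ordering::SeqCst)
}
```

Every caller goes through the same lazily created instance, so repeated calls construct it once instead of once per call.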

Dependencies

None

Breaking Changes

None

Known Issues

None

What's Changed

  • fix: use with_global_system() to prevent FD leak in Jetson, Tenstorrent, and NVIDIA readers (#120) by @inureyes in #121

Full Changelog: v0.17.1...v0.17.2

v0.17.1

08 Feb 07:52

Pre-release

Release v0.17.1

New Features

None

Improvements

None

Bug Fixes

  • Fix file descriptor leak in API mode by reusing resource handles (#119)
    • Cache Nvml handle in NvidiaGpuReader and reuse across get_gpu_info() / get_process_info() calls, with graceful reinit on handle invalidation (e.g., GPU hot-unplug)
    • Reuse sysinfo::System instance across get_process_info() calls instead of creating a new one every iteration
    • Create Disks instance once before the API metrics collection loop and refresh in-place each cycle via disks.refresh(true)
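
The handle-caching idea with reinitialization on invalidation might look roughly like the sketch below. All names here (`Handle`, `QueryError`, `CachedReader`) are hypothetical stand-ins; the real reader wraps an NVML handle, which is not reproduced.

```rust
/// Stand-in for an expensive driver handle (for example an NVML handle).
pub struct Handle {
    pub generation: u32,
}

#[derive(Debug, PartialEq)]
pub enum QueryError {
    /// The cached handle is no longer usable (for example after a hot-unplug).
    Invalidated,
    /// Any other failure; returned to the caller unchanged.
    Other(String),
}

pub struct CachedReader {
    handle: Option<Handle>,
    inits: u32,
}

impl CachedReader {
    pub fn new() -> Self {
        CachedReader { handle: None, inits: 0 }
    }

    /// How many times a handle has been created.
    pub fn init_count(&self) -> u32 {
        self.inits
    }

    /// Returns the cached handle, creating it on first use or after invalidation.
    fn handle(&mut self) -> &Handle {
        if self.handle.is_none() {
            self.inits += 1;
            self.handle = Some(Handle { generation: self.inits });
        }
        self.handle.as_ref().expect("handle initialised above")
    }

    /// Runs `op` against the cached handle; on `Invalidated`, drops the handle,
    /// creates a fresh one, and retries exactly once.
    pub fn query<T>(
        &mut self,
        op: impl Fn(&Handle) -> Result<T, QueryError>,
    ) -> Result<T, QueryError> {
        let first = op(self.handle());
        match first {
            Err(QueryError::Invalidated) => {
                self.handle = None;
                op(self.handle())
            }
            other => other,
        }
    }
}
```

Retrying only once keeps a permanently missing device from turning into a tight reinit loop.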

CI/CD Improvements

None

Technical Details

Dependencies

None

Breaking Changes

None

Known Issues

None

v0.17.0

13 Jan 10:47

Release v0.17.0

This release adds GPU process filtering and improves sort stability in the process list, along with storage API enhancements for the library.

New Features

  • Add GPU process filter toggle ('f' key) to show only processes with GPU memory usage
  • Add get_storage_info() method to public library API for accessing disk/storage information
  • Add StorageReader trait and LocalStorageReader implementation
  • Export StorageInfo and StorageReader in prelude for convenient imports

Improvements

  • Improve process list sort stability using PID as secondary sort key to prevent constant position shifts
  • Update status bar to indicate filter status ([Filter:GPU]) when active
  • Update help screen to show the new 'f' shortcut for GPU process filter

Bug Fixes

None

CI/CD Improvements

None

Technical Details

  • Sort stability implemented by using PID as secondary sort key when primary values are equal
  • Storage reader follows existing reader pattern (GpuReader, CpuReader, etc.)
  • Docker-aware filtering in storage reader
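
The PID tie-break mentioned above can be illustrated as follows. The `ProcessRow` struct and its field names are assumptions made for this sketch; only the idea of using PID as the secondary key comes from the notes.

```rust
/// Hypothetical row of the process list.
#[derive(Debug, Clone, PartialEq)]
pub struct ProcessRow {
    pub pid: u32,
    pub gpu_memory_bytes: u64,
}

/// Sorts by GPU memory (largest first) and breaks ties by ascending PID.
pub fn sort_by_gpu_memory(rows: &mut [ProcessRow]) {
    rows.sort_by(|a, b| {
        b.gpu_memory_bytes
            .cmp(&a.gpu_memory_bytes)
            .then_with(|| a.pid.cmp(&b.pid))
    });
}
```

`sort_by` is already a stable sort, but stability only preserves the input order of equal rows, and that input order can differ from one refresh to the next. The explicit PID key makes the result independent of input order, which is what stops rows from swapping places.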

Dependencies

None

Breaking Changes

None

Known Issues

None

What's Changed

  • Add get_storage_info() method to public library API by @inureyes in #116
  • feat: add GPU process filter and improve sort stability by @inureyes in #117

Full Changelog: v0.16.0...v0.17.0

v0.16.0

04 Jan 09:20

v0.16.0

This release introduces a proper library API for external Rust projects, making it easy to integrate GPU/NPU monitoring into custom applications.

New Features

  • High-level AllSmi client struct with ergonomic API for querying GPU/NPU, CPU, memory, and chassis information
  • Unified Error type using thiserror for cleaner error handling in library usage
  • prelude module for convenient imports (use all_smi::prelude::*)
  • Comprehensive library API manual with usage examples

Improvements

  • Improved thread-safety documentation and added #[must_use] attributes
  • Better API ergonomics for library consumers

Bug Fixes

  • Fix Launchpad offline builds by including .cargo/config.toml in source package
  • Add cargo vendor step for Launchpad builds to handle dependencies without internet access

CI/CD Improvements

None

Technical Details

  • Library API provides methods: get_gpu_info(), get_cpu_info(), get_memory_info(), get_process_info(), get_chassis_info()
  • AllSmi struct is Send + Sync for thread-safe usage
  • Configuration via AllSmiConfig for sample interval and verbose mode

Dependencies

  • axum: 0.8.7 → 0.8.8
  • tower-http: 0.6.6 → 0.6.8
  • sysinfo: 0.37.0 → 0.37.2
  • regex: 1.11.2 → 1.12.2
  • futures-util: 0.3.30 → 0.3.31

Breaking Changes

None

Known Issues

None

What's Changed

  • chore: update cargo dependencies by @inureyes in #108
  • feat: add proper library API for external Rust projects by @inureyes in #109
  • fix: add cargo vendor step for Launchpad offline builds by @appleparan in #111

Full Changelog: v0.15.2...v0.16.0

v0.15.2

02 Jan 01:57

Summary

Fix Rebellions NPU detection compatibility with rbln SDK 2.0.x by implementing a backward-compatible deserializer for the npu field.

New Features

None

Improvements

None

Bug Fixes

  • Rebellions NPU detection: Implement backward-compatible deserializer for the npu field to support both SDK 1.x (string format) and SDK 2.0.x (integer format)
    • SDK 1.x outputs "npu": "0" (string)
    • SDK 2.0.x outputs "npu": 0 (integer)
    • Uses serde Visitor pattern to handle both types gracefully

CI/CD Improvements

None

Technical Details

  • Changed npu field type from String to u32 for rbln SDK 2.0.x compatibility
  • Added custom serde deserializer using Visitor pattern that accepts both string and integer types
  • No external dependencies added for the deserialization logic
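
The two accepted shapes of the npu field can be illustrated without serde. The sketch below is a standard-library stand-in, not the release's Visitor implementation: the hypothetical `RawValue` enum plays the role of the two visitor callbacks (one for strings, one for integers), and both branches produce the same `u32`.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the two JSON shapes of the `npu` field.
pub enum RawValue<'a> {
    /// SDK 1.x style: "npu": "0"
    Str(&'a str),
    /// SDK 2.0.x style: "npu": 0
    UInt(u64),
}

/// Normalises either shape to a `u32` device index.
pub fn npu_index(value: RawValue) -> Result<u32, String> {
    match value {
        RawValue::Str(text) => text
            .trim()
            .parse::<u32>()
            .map_err(|e| format!("invalid npu string {:?}: {}", text, e)),
        RawValue::UInt(number) => u32::try_from(number)
            .map_err(|_| format!("npu index {} does not fit in u32", number)),
    }
}
```

Rejecting out-of-range or non-numeric input with an error, rather than defaulting to 0, keeps a malformed record from being mistaken for device 0.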

Dependencies

None

Breaking Changes

None

Known Issues

None

Full Changelog: v0.15.1...v0.15.2

v0.15.1

30 Dec 16:43

Summary

This patch release fixes a memory leak on Apple Silicon Macs.

New Features

None

Improvements

None

Bug Fixes

  • Fix memory leak in IOReportIterator by releasing delta CFDictionaryRef (#104)

CI/CD Improvements

None

Technical Details

  • On Apple Silicon, the IOReportIterator now properly releases CFDictionaryRef objects returned by IOReportCreateSamplesDelta to prevent memory accumulation during metrics collection
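
One way to make this kind of release hard to forget is to tie it to `Drop`. The sketch below is only an analogy under stated assumptions: `DeltaSample` stands in for an owned `CFDictionaryRef`, a global counter stands in for CoreFoundation's bookkeeping, and no real IOReport or `CFRelease` call is made.

```rust
use std::sync::atomic::{AtomicIsize, Ordering};

/// Number of delta samples currently alive (stand-in for unreleased CF objects).
static LIVE_SAMPLES: AtomicIsize = AtomicIsize::new(0);

/// Stand-in for an owned delta dictionary that the caller must release.
pub struct DeltaSample {
    pub value: u64,
}

/// Stand-in for the call that creates a new owned delta sample.
fn create_samples_delta(value: u64) -> DeltaSample {
    LIVE_SAMPLES.fetch_add(1, Ordering::SeqCst);
    DeltaSample { value }
}

impl Drop for DeltaSample {
    fn drop(&mut self) {
        // Real code would release the underlying CoreFoundation object here.
        LIVE_SAMPLES.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Hypothetical iterator that creates one delta sample per step.
pub struct SampleIterator {
    remaining: u64,
}

impl SampleIterator {
    pub fn new(count: u64) -> Self {
        SampleIterator { remaining: count }
    }
}

impl Iterator for SampleIterator {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        if self.remaining == 0 {
            return None;
        }
        let delta = create_samples_delta(self.remaining);
        self.remaining -= 1;
        // Copy out what is needed; `delta` is dropped (released) on return.
        Some(delta.value)
    }
}

/// Number of samples that have been created but not yet released.
pub fn live_samples() -> isize {
    LIVE_SAMPLES.load(Ordering::SeqCst)
}
```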

Dependencies

None

Breaking Changes

None

Known Issues

None

What's Changed

  • fix: prevent memory leak in IOReportIterator on Apple Silicon by @inureyes in #104

Full Changelog: v0.15.0...v0.15.1


v0.15.0

30 Dec 15:35

Pre-release

Release v0.15.0

This release adds Unix Domain Socket support for API mode, improves Windows CPU temperature monitoring with a fallback chain, optimizes binary size, and completes the repository organization migration.

New Features

  • Unix Domain Socket Support: API mode now supports Unix Domain Sockets (UDS) for local IPC on Linux and macOS
    • Platform-specific default paths: /tmp/all-smi.sock (macOS), /var/run/all-smi.sock or /tmp/all-smi.sock (Linux)
    • Socket permissions set to 0600 for security (owner-only access)
    • Dual listener mode: TCP and UDS can run simultaneously using --port and --socket options
    • UDS-only mode supported with --port 0 --socket
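
A minimal sketch of binding a Unix Domain Socket with owner-only permissions, using only the standard library. The `SocketGuard` type is hypothetical, and removing the socket file in `Drop` is a simplification assumed for this sketch; it does not reproduce the project's signal-handler cleanup.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::net::UnixListener;
use std::path::{Path, PathBuf};

/// Owns a bound Unix socket and removes its file when dropped.
pub struct SocketGuard {
    listener: UnixListener,
    path: PathBuf,
}

impl SocketGuard {
    /// Binds `path`, replacing a stale socket file, and restricts it to 0600.
    pub fn bind(path: &Path) -> io::Result<Self> {
        // A leftover file from an unclean exit would make bind() fail.
        match fs::remove_file(path) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::NotFound => {}
            Err(e) => return Err(e),
        }
        let listener = UnixListener::bind(path)?;
        // Owner read/write only.
        fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
        Ok(SocketGuard { listener, path: path.to_path_buf() })
    }

    pub fn listener(&self) -> &UnixListener {
        &self.listener
    }

    pub fn path(&self) -> &Path {
        &self.path
    }

    /// Permission bits of the socket file (lower nine bits only).
    pub fn mode(&self) -> io::Result<u32> {
        Ok(fs::metadata(&self.path)?.permissions().mode() & 0o777)
    }
}

impl Drop for SocketGuard {
    fn drop(&mut self) {
        // Best-effort cleanup; the file may already be gone.
        let _ = fs::remove_file(&self.path);
    }
}
```

Note that there is a short window between `bind` and `set_permissions` in which the file carries the default mode; placing the socket in a directory that is already private avoids relying on that window being harmless.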

Improvements

  • Windows CPU Temperature Fallback Chain: Graceful handling of WBEM_E_NOT_FOUND errors when WMI thermal zones are unavailable
    • Multiple temperature sources: ACPI Thermal Zones, AMD Ryzen Master SDK, Intel WMI, LibreHardwareMonitor
    • CPU vendor detection for optimal source selection
    • OnceCell caching for source availability status
    • Silent fallback when all sources unavailable (no error spam)
  • Binary Size Optimization: Added release profile with optimizations
    • Strip debug symbols (strip = true)
    • Link-Time Optimization (lto = true)
    • Single codegen unit for better optimization
    • Abort on panic (removes unwind code)
    • Optimize for size (opt-level = "z")
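
The Windows temperature fallback chain described above, with cached source availability, could be sketched as follows. The `TemperatureSource` type and probe functions are hypothetical, and the standard library's `OnceLock` is used here in place of whichever OnceCell type the project uses.

```rust
use std::sync::OnceLock;

/// One candidate temperature source in the chain (hypothetical type).
pub struct TemperatureSource {
    pub name: &'static str,
    probe: fn() -> Option<f32>,
    /// Set on first use: did the source respond at all?
    available: OnceLock<bool>,
}

impl TemperatureSource {
    pub fn new(name: &'static str, probe: fn() -> Option<f32>) -> Self {
        TemperatureSource { name, probe, available: OnceLock::new() }
    }

    /// Reads the source, skipping it permanently if the first probe failed.
    fn read(&self) -> Option<f32> {
        let mut first_reading = None;
        let available = *self.available.get_or_init(|| {
            first_reading = (self.probe)();
            first_reading.is_some()
        });
        if !available {
            return None;
        }
        // Reuse the reading taken during the availability check, if any.
        first_reading.or_else(|| (self.probe)())
    }
}

/// Tries each source in order and returns the first reading.
/// Returns `None` silently when every source is unavailable.
pub fn read_cpu_temperature(chain: &[TemperatureSource]) -> Option<f32> {
    chain.iter().find_map(|source| source.read())
}
```

Caching the availability flag means a source that is missing on this machine is probed once rather than on every collection cycle.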

Bug Fixes

None

CI/CD Improvements

None

Technical Details

  • Repository organization changed from inureyes to lablup
  • Unix socket file is automatically cleaned up on graceful shutdown
  • Signal handlers for SIGTERM/SIGINT cleanup

Dependencies

  • Various dependency updates via cargo update

Breaking Changes

None

Known Issues

  • Unix Domain Socket support is currently Unix-only (Linux, macOS); Windows support pending Rust ecosystem maturity

What's Changed

  • feat: Add Unix Domain Socket support for API mode by @inureyes in #100
  • fix: Add Windows CPU temperature fallback chain to handle WBEM_E_NOT_FOUND by @inureyes in #103

Full Changelog: v0.14.0...v0.15.0

v0.14.0

24 Dec 15:23

v0.14.0 Release

This release adds Windows x64 build target, native macOS APIs for Apple Silicon (no sudo required), chassis/node-level power monitoring, and removes the legacy powermetrics implementation.

New Features

  • Windows x64 Build Target: Added Windows x64 as a build target for cross-platform support (#98)
  • Native macOS APIs: Added native macOS APIs (IOReport, SMC, NSProcessInfo) for Apple Silicon metrics collection without requiring sudo (#89)
  • Chassis/Node-level Monitoring: Added comprehensive node-level monitoring with per-node power consumption tracking and BMC metrics support (#86)
  • Total Power Metrics: Exposed total power metrics for Apple Silicon (CPU+GPU+ANE combined) (#83)

Improvements

  • CPU Usage Optimization: Cached expensive system calls to reduce CPU utilization during monitoring (#96)
  • Legacy Powermetrics Removal: Removed legacy powermetrics implementation in favor of native macOS APIs (#93)
  • Repository Migration: Updated repository URLs from inureyes to lablup organization (#85)

Bug Fixes

  • Fixed missing build.rs in Dockerfile for protobuf compilation
  • Fixed missing proto directory in Dockerfile for TPU support
  • Fixed Ubuntu PPA builds to use rust-1.85-all package for Cargo.lock v4 compatibility (#95)
  • Added missing build dependencies for Ubuntu PPA workflow (#92)

CI/CD Improvements

  • Enhanced Dockerfile configuration for TPU and protobuf support
  • Improved Ubuntu PPA build workflow with proper Rust 1.85 support

Technical Details

  • Native macOS integration using IOReport API for power/residency metrics
  • Apple SMC integration for actual temperature readings
  • NSProcessInfo.thermalState binding for thermal pressure monitoring
  • Connection staggering and pooling optimizations maintained

Dependencies

None

Breaking Changes

  • Legacy powermetrics mode has been removed; native macOS APIs are now the default
  • The --features powermetrics build flag is no longer available

Known Issues

None

What's Changed

  • feat: expose total power metrics for Apple Silicon by @inureyes in #83
  • feat: Add Chassis/Node-level monitoring with per-node power and BMC metrics by @inureyes in #86
  • feat: Add native macOS APIs for Apple Silicon metrics (no sudo required) by @inureyes in #89
  • fix: Add missing build dependencies for Ubuntu PPA workflow by @inureyes in #92
  • fix: Use Ubuntu's rust-1.85-all package for PPA builds to handle Cargo.lock v4 by @inureyes in #95
  • refactor: remove legacy powermetrics implementation by @inureyes in #93
  • Optimize CPU usage by caching expensive system calls by @inureyes in #96
  • feat: Add Windows x64 build target by @inureyes in #98

Full Changelog: v0.13.1...v0.14.0

v0.13.1

22 Dec 15:55

Release v0.13.1

This release adds comprehensive Google Cloud TPU monitoring support and significantly reduces CPU utilization through optimized polling and rendering.

New Features

  • Google Cloud TPU Monitoring Support: Full support for monitoring Google Cloud TPU accelerators
    • Support for all TPU generations: v2, v3, v4, v5e, v5p, v6e, v6 Trillium, v7 Ironwood
    • Native gRPC client for TPU runtime metrics (localhost:8431) with adaptive polling
    • TensorCore utilization and HLO (High-Level Operations) metrics display
    • Multiple detection methods: sysfs, PCI scanning, PJRT C API, tpu-info CLI
    • API mode JSON output with detailed TPU metrics
    • Background metrics collection with streaming tpu-info runner

Improvements

  • CPU Utilization Optimization: Reduced CPU usage by up to 90% in idle conditions and 70% during active monitoring
    • Reduced UI poll frequency from 20 Hz (50ms) to 10 Hz (100ms)
    • Reduced render frequency from 30 FPS to 10 FPS
    • Content hash-based skip rendering using FNV-1a hashing
    • Data version tracking to skip re-renders when data is unchanged
  • Memory Optimization: Reduced memory churn from ~10MB/sec to ~640KB/sec (16x reduction)
    • BufferWriter pre-allocation reduced from 1MB to 64KB
    • Eliminated 3 HashMap clones per frame by updating in-place
    • Process content lines directly with iterator instead of collecting to Vec
    • Reuse String allocations via clear()+push_str()
  • Scroll & Tab Tracking: Restored smooth text scrolling animation and added tab_scroll_offset tracking for immediate tab bar re-renders
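
Content-hash skip rendering with FNV-1a can be sketched as below. The 64-bit variant and the `FrameGate` type are assumptions for illustration; the notes do not say which hash width the project uses or how the check is wired into the render loop.

```rust
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    bytes.iter().fold(FNV_OFFSET_BASIS, |hash, &byte| {
        (hash ^ u64::from(byte)).wrapping_mul(FNV_PRIME)
    })
}

/// Remembers the hash of the last rendered frame (hypothetical helper).
pub struct FrameGate {
    last_hash: Option<u64>,
}

impl FrameGate {
    pub fn new() -> Self {
        FrameGate { last_hash: None }
    }

    /// Returns true when `content` differs from the last rendered frame.
    pub fn should_render(&mut self, content: &str) -> bool {
        let hash = fnv1a_64(content.as_bytes());
        if self.last_hash == Some(hash) {
            return false;
        }
        self.last_hash = Some(hash);
        true
    }
}
```

A render loop would build the frame text, call `should_render`, and write to the terminal only when it returns true. A hash collision would skip one changed frame, which the next change corrects.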

Bug Fixes

None

CI/CD Improvements

None

Technical Details

  • Robust TPU detection with fallback chain: sysfs → PJRT → gRPC → CLI
  • TUI display with TPU info line showing memory, TensorCore utilization, HLO queue status
  • Full metrics export in JSON format for programmatic access

Dependencies

  • Added tonic and prost for gRPC TPU metrics client
  • tonic: 0.13 → 0.14
  • tonic-prost: 0.13 → 0.14
  • prost: 0.13 → 0.14
  • prost-types: 0.13 → 0.14
  • tonic-prost-build: 0.13 → 0.14
  • wmi: 0.17 → 0.18 (Windows)
  • libloading: 0.8 → 0.9 (Linux)

Breaking Changes

None

Known Issues

None

What's Changed

  • fix: reduce CPU utilization with optimized polling and rendering by @inureyes in #78
  • feat: add comprehensive Google TPU monitoring support by @inureyes in #79

Release v0.13.0 - v0.13.1

This release upgrades key dependencies including tonic/prost to 0.14 for improved gRPC support and optimizes build dependencies.

Technical Details

  • Upgrade tonic to 0.14, prost to 0.14 for improved gRPC performance
  • Upgrade wmi to 0.18 for Windows monitoring
  • Upgrade libloading to 0.9 for dynamic library loading
  • Upgrade tonic-build to 0.14 and optimize build dependencies

Full Changelog: v0.12.0...v0.13.1

v0.13.0

22 Dec 15:26

Pre-release

Release v0.13.0

This release adds comprehensive Google Cloud TPU monitoring support and significantly reduces CPU utilization through optimized polling and rendering.

New Features

  • Google Cloud TPU Monitoring Support: Full support for monitoring Google Cloud TPU accelerators
    • Support for all TPU generations: v2, v3, v4, v5e, v5p, v6e, v6 Trillium, v7 Ironwood
    • Native gRPC client for TPU runtime metrics (localhost:8431) with adaptive polling
    • TensorCore utilization and HLO (High-Level Operations) metrics display
    • Multiple detection methods: sysfs, PCI scanning, PJRT C API, tpu-info CLI
    • API mode JSON output with detailed TPU metrics
    • Background metrics collection with streaming tpu-info runner

Improvements

  • CPU Utilization Optimization: Reduced CPU usage by up to 90% in idle conditions and 70% during active monitoring
    • Reduced UI poll frequency from 20 Hz (50ms) to 10 Hz (100ms)
    • Reduced render frequency from 30 FPS to 10 FPS
    • Content hash-based skip rendering using FNV-1a hashing
    • Data version tracking to skip re-renders when data is unchanged
  • Memory Optimization: Reduced memory churn from ~10MB/sec to ~640KB/sec (16x reduction)
    • BufferWriter pre-allocation reduced from 1MB to 64KB
    • Eliminated 3 HashMap clones per frame by updating in-place
    • Process content lines directly with iterator instead of collecting to Vec
    • Reuse String allocations via clear()+push_str()
  • Scroll & Tab Tracking: Restored smooth text scrolling animation and added tab_scroll_offset tracking for immediate tab bar re-renders

Bug Fixes

None

CI/CD Improvements

None

Technical Details

  • Robust TPU detection with fallback chain: sysfs → PJRT → gRPC → CLI
  • TUI display with TPU info line showing memory, TensorCore utilization, HLO queue status
  • Full metrics export in JSON format for programmatic access

Dependencies

  • Added tonic and prost for gRPC TPU metrics client

Breaking Changes

None

Known Issues

None

What's Changed

  • fix: reduce CPU utilization with optimized polling and rendering by @inureyes in #78
  • feat: add comprehensive Google TPU monitoring support by @inureyes in #79

Full Changelog: v0.12.0...v0.13.0