omnipkg v2.0.0 - The Python Hypervisor Release


@1minds3t 1minds3t released this 08 Dec 20:26
· 484 commits to main since this release

Release Date: 2025-12-08

This release marks a fundamental paradigm shift from "Package Loader" to "Distributed Runtime Architecture." OmniPkg 2.0 introduces a persistent daemon kernel, universal GPU IPC, and hardware-level isolation, effectively functioning as an Operating System for Python environments.

We have shattered the performance barrier. What once took 2 seconds now takes 60 milliseconds. What once crashed due to ABI conflicts now runs concurrently on the same GPU.

🚀 Major Architectural Breakthroughs

  • Universal GPU IPC (Pure Python/ctypes):

    • Implemented a custom, framework-agnostic CUDA IPC protocol (UniversalGpuIpc) using raw ctypes.
    • Performance: Achieved ~1.5ms latency for tensor handoffs, beating PyTorch's native IPC by ~30% and the v1.x Hybrid SHM path by ~9x.
    • Enables true zero-copy data transfer between isolated processes without relying on framework-specific hooks.
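A framework-agnostic CUDA IPC protocol boils down to shipping an opaque 64-byte memory handle plus tensor metadata between processes. The sketch below shows one plausible wire format for that handoff; the struct layout, function names, and the placeholder handle are illustrative assumptions, not omnipkg's actual protocol (a real handle comes from `cudaIpcGetMemHandle` called through ctypes against libcudart, and the receiver maps it with `cudaIpcOpenMemHandle`):

```python
import struct

# CUDA IPC memory handles are 64-byte opaque blobs (CUDA_IPC_HANDLE_SIZE).
HANDLE_SIZE = 64
# Assumed wire format: handle, number of dims, then up to 4 dims as uint64.
HEADER = struct.Struct(f"<{HANDLE_SIZE}sB4Q")

def pack_tensor_ref(handle: bytes, shape: tuple) -> bytes:
    """Serialize an opaque IPC handle plus tensor shape for the peer process."""
    dims = list(shape) + [0] * (4 - len(shape))
    return HEADER.pack(handle, len(shape), *dims)

def unpack_tensor_ref(payload: bytes):
    """Recover the handle and shape on the receiving side."""
    handle, ndim, *dims = HEADER.unpack(payload)
    return handle, tuple(dims[:ndim])

# Demo with a placeholder handle; the receiver would map the real one
# and read the device memory zero-copy, no framework hooks required.
fake_handle = b"\x01" * HANDLE_SIZE
wire = pack_tensor_ref(fake_handle, (3, 224, 224))
handle, shape = unpack_tensor_ref(wire)
print(shape)  # (3, 224, 224)
```

Because only a fixed-size handle and a few integers cross the socket, the payload stays under 100 bytes regardless of tensor size, which is what makes millisecond-scale handoffs possible.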
  • Persistent Worker Daemon ("The Kernel"):

    • Replaced ad-hoc subprocess spawning with a persistent, self-healing worker pool (WorkerPoolDaemon).
    • Reduces environment context switching time from ~2000ms (process spawn) to ~60ms (warm activation).
    • Implements an "Elastic Lung" architecture: Workers morph into required environments on-demand and purge themselves back to a clean slate.
  • Selective Hardware Virtualization (CUDA Hotswapping):

    • Implemented dynamic LD_LIBRARY_PATH injection at the worker level.
    • The daemon now scans active bubbles to inject the exact CUDA runtime libraries required by the specific framework version (e.g., loading CUDA 11 libs for TF 2.13 while the host runs CUDA 12).
    • Result: Successfully ran TensorFlow 2.12 (CPU), TF 2.13 (CPU), and TF 2.20 (GPU) simultaneously in a single orchestration flow without crashing.
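Dynamic library injection at the worker boundary is conceptually simple: prepend the bubble's CUDA library directory to `LD_LIBRARY_PATH` in the child's environment before the dynamic linker runs. A minimal sketch, assuming a hypothetical helper and an illustrative bubble path:

```python
import os
import subprocess
import sys

def launch_with_cuda_libs(bubble_lib_dir: str, argv: list) -> subprocess.CompletedProcess:
    """Launch a worker with the bubble's CUDA runtime libs prepended to
    LD_LIBRARY_PATH (hypothetical helper; the directory name is illustrative)."""
    env = os.environ.copy()
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = bubble_lib_dir + (os.pathsep + existing if existing else "")
    return subprocess.run(argv, env=env, capture_output=True, text=True)

# The child sees the injected path before any CUDA library is loaded,
# so the dynamic linker resolves CUDA 11 libs even on a CUDA 12 host.
proc = launch_with_cuda_libs(
    "/opt/bubbles/tf-2.13/nvidia/cuda_runtime/lib",   # illustrative path
    [sys.executable, "-c", "import os; print(os.environ['LD_LIBRARY_PATH'])"],
)
print(proc.stdout.strip())
```

The key constraint is timing: the injection must happen before the framework's first `import`, because once the linker has bound `libcudart`, changing the path has no effect in that process.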

⚡ Core Enhancements

  • Fail-Safe Cloaking: Added _force_restore_owned_cloaks() to guarantee filesystem restoration even during catastrophic process failures or OOM events. No more "zombie" cloaked files.
  • Global Shutdown Silencer: Implemented an atexit hook that synchronizes CUDA contexts and redirects stderr to /dev/null during final interpreter shutdown, eliminating harmless but noisy C++ "driver shutting down" warnings.
  • Composite Bubble Injection: The loader now automatically constructs "Meta-Bubbles" at runtime, merging the requested package bubble with its binary dependencies (NVIDIA libs, Triton) on the fly.
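The fail-safe cloaking idea, restoring renamed files even after a crash, can be sketched as a walk-and-rename pass. The suffix and function name below are illustrative assumptions, not omnipkg's actual `_force_restore_owned_cloaks()` internals:

```python
import os
import tempfile

CLOAK_SUFFIX = ".omnipkg_cloaked"   # illustrative suffix

def force_restore_owned_cloaks(root: str) -> int:
    """Walk a tree and rename any '<name>.omnipkg_cloaked' file back to
    '<name>', mirroring the fail-safe restore idea (hypothetical names)."""
    restored = 0
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if fname.endswith(CLOAK_SUFFIX):
                src = os.path.join(dirpath, fname)
                dst = os.path.join(dirpath, fname[: -len(CLOAK_SUFFIX)])
                if not os.path.exists(dst):   # never clobber a live file
                    os.replace(src, dst)
                    restored += 1
    return restored

# Demo: simulate a file left cloaked by a crashed process, then restore it.
tmp = tempfile.mkdtemp()
open(os.path.join(tmp, "module.py" + CLOAK_SUFFIX), "w").close()
count = force_restore_owned_cloaks(tmp)
print(count, os.path.exists(os.path.join(tmp, "module.py")))
```

Because the pass is idempotent and driven purely by on-disk state, it works regardless of how the previous process died: an OOM kill leaves the same suffixed files behind as a clean crash.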

🐛 Critical Fixes

  • PyTorch 1.13+ Compatibility: Patched the worker daemon to handle TypedStorage serialization changes in newer PyTorch versions, preventing crashes during native IPC.
  • Deadlock Prevention: Implemented ThreadPoolExecutor in the daemon manager to allow recursive worker calls (Worker A calling Worker B) without deadlocking the socket.
  • Lazy Loading: Made psutil and torch imports lazy within the daemon to prevent "poisoning" the process with default environment versions before isolation takes effect.
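The lazy-loading fix follows a standard pattern: wrap the module in a proxy that defers the real import until first attribute access, so the daemon never binds the host environment's version before isolation is in place. A minimal sketch of that pattern (using a stdlib module as a stand-in for `psutil`/`torch`):

```python
import importlib

class LazyModule:
    """Defer importing a module until first attribute access, so the daemon
    process is not 'poisoned' with the default environment's version before
    isolation takes effect (sketch of the lazy-import pattern)."""
    def __init__(self, name):
        self._name = name
        self._mod = None
    def __getattr__(self, attr):
        if self._mod is None:
            self._mod = importlib.import_module(self._name)
        return getattr(self._mod, attr)

# Stand-in demo: 'decimal' here, 'psutil'/'torch' in the daemon.
calc = LazyModule("decimal")
# No import has happened yet; the first attribute access triggers it.
print(calc.Decimal("1.5") + calc.Decimal("2.5"))
```

Nothing touches `sys.modules` until the proxy is actually used, which is exactly the window the daemon needs to finish configuring the bubble's paths first.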

📊 Benchmarks (vs v1.x)

| Metric                | v1.x (Hybrid) | v2.0 (Universal) | Speedup |
|-----------------------|---------------|------------------|---------|
| IPC Tensor Handoff    | 14 ms         | 1.5 ms           | 9.3x    |
| Context Switch (Cold) | ~2500 ms      | ~2500 ms         | 1.0x    |
| Context Switch (Warm) | ~2000 ms      | ~60 ms           | 33x     |
| Recursive Depth       | 5 levels      | Unlimited        | —       |

📦 Upgrade

# Via pip
pip install --upgrade omnipkg

# Via omnipkg itself
8pkg upgrade

Welcome to the Singularity.