Releases: athrva98/polyinfer

v0.2.0 IREE Vulkan GPU-Specific Targets

26 Dec 19:50
e7316a8

Release Notes

v0.2.0

IREE Backend Enhancements

Vulkan GPU-Specific Targets

Added comprehensive Vulkan GPU target support for optimized inference across multiple vendors:

import polyinfer as pi

# AMD RX 7000 series
model = pi.load("model.onnx", device="vulkan", vulkan_target="rdna3")

# NVIDIA RTX 40 series
model = pi.load("model.onnx", device="vulkan", vulkan_target="ada")

# Intel Arc
model = pi.load("model.onnx", device="vulkan", vulkan_target="arc")

Supported GPU Targets:

| Vendor   | Targets                             | GPUs                   |
|----------|-------------------------------------|------------------------|
| AMD      | rdna3, rdna2, gfx1100, rx7900xtx    | RX 7000/6000 series    |
| NVIDIA   | ada, ampere, turing, sm_89, rtx4090 | RTX 40/30/20 series    |
| Intel    | arc, arc_a770, arc_a750             | Arc A-series           |
| ARM      | valhall, valhall4, mali_g715        | Mali GPUs              |
| Qualcomm | adreno                              | Snapdragon mobile GPUs |
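Several of the preset names above are aliases for the same underlying architecture (for example, rtx4090 and sm_89 both denote Ada Lovelace). A minimal sketch of how such an alias lookup might behave — the table and function below are illustrative, not polyinfer's actual internals:

```python
# Illustrative alias table: preset name -> (vendor, architecture).
# Pairings mirror the table above; the real VULKAN_TARGETS dict in
# polyinfer.backends.iree may store richer metadata per target.
VULKAN_ALIASES = {
    "rdna3": ("AMD", "gfx1100"),
    "rx7900xtx": ("AMD", "gfx1100"),
    "ada": ("NVIDIA", "sm_89"),
    "rtx4090": ("NVIDIA", "sm_89"),
    "arc_a770": ("Intel", "arc"),
    "mali_g715": ("ARM", "valhall4"),
}

def resolve_target(name: str) -> tuple[str, str]:
    """Map a user-facing target alias to a (vendor, architecture) pair."""
    try:
        return VULKAN_ALIASES[name]
    except KeyError:
        known = ", ".join(sorted(VULKAN_ALIASES))
        raise ValueError(f"Unknown Vulkan target {name!r}; known: {known}") from None

print(resolve_target("rtx4090"))  # ('NVIDIA', 'sm_89')
```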

New Compilation Options

Added IREECompileOptions dataclass for fine-grained control over compilation:

from polyinfer.backends.iree import IREECompileOptions

opts = IREECompileOptions(
    opt_level=3,           # Optimization level (0-3)
    strip_debug=True,      # Strip debug info for smaller binaries
    data_tiling=True,      # Enable cache-friendly tiling
    vulkan_target="rdna3", # GPU-specific optimizations
    opset_version=17,      # Upgrade ONNX opset
    extra_flags=["--custom-flag"],
)

Improved Error Handling

  • IREECompilationError now provides actionable suggestions for common failures
  • Proper exception chaining (from e) for better debugging
  • Clear error messages for missing dependencies and unsupported operators

MLIR Export with Options

Enhanced MLIR export workflow with compilation options:

# Export with opset upgrade
backend = pi.get_backend("iree")
mlir = backend.emit_mlir("model.onnx", "model.mlir", opset_version=17)

# Compile with GPU-specific target
vmfb = backend.compile_mlir(
    "model.mlir",
    device="vulkan",
    vulkan_target="rdna3",
    opt_level=3,
    data_tiling=True,
)

New Exports

The following are now exported from polyinfer.backends.iree:

  • VulkanTarget - Dataclass for Vulkan GPU targets
  • VulkanGPUVendor - Enum for GPU vendors (AMD, NVIDIA, Intel, ARM, Qualcomm)
  • VULKAN_TARGETS - Dict of all predefined GPU targets
  • IREECompileOptions - Compilation options dataclass
  • IREECompilationError - Exception with actionable suggestions

Testing

Added comprehensive test coverage for new IREE features:

  • TestIREECompileOptions - Compile options and flag generation
  • TestVulkanTargets - GPU target presets and resolution
  • TestIREEErrorHandling - Error handling and exception chaining
  • TestMLIRExportWithOptions - MLIR export with compilation options

Bug Fixes

  • Fixed nested if statements in Vulkan target resolution (ruff SIM102)
  • Added proper exception chaining in 4 compilation error handlers (ruff B904)
  • Fixed tempfile handling with context manager (ruff SIM115)
  • Fixed OpenVINO backend name matching in tests

Dependencies

No new dependencies required. Existing IREE installation (iree-base-compiler[onnx], iree-base-runtime) provides all functionality.

v0.1.0 - First ever release

20 Dec 19:57
adb423c

v0.1.0

PolyInfer is a unified ML inference library that automatically selects the fastest backend for your hardware.

Highlights

  • Zero Configuration: Just pip install polyinfer[nvidia] - CUDA, cuDNN, and TensorRT are auto-installed and configured
  • Auto Backend Selection: Automatically picks the fastest backend for your device
  • 450+ FPS on YOLOv8n: TensorRT acceleration with no manual optimization needed
  • Cross-Platform: Windows, Linux, WSL2, macOS, and Google Colab support

Features

Multi-Backend Support

  • ONNX Runtime: CPU, CUDA, TensorRT EP, DirectML, ROCm, CoreML
  • OpenVINO: Intel CPU, integrated GPU, NPU (AI Boost)
  • TensorRT: Native TensorRT for maximum NVIDIA performance
  • IREE: CPU, Vulkan, CUDA with MLIR export capability
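Auto backend selection can be pictured as walking a per-device priority list and taking the first backend that is installed. The priorities and names below are an assumed sketch, not polyinfer's actual selection logic (which may also benchmark candidates):

```python
# Assumed priority order per device (illustrative names).
BACKEND_PRIORITY = {
    "cuda": ["tensorrt", "onnxruntime-cuda", "iree-cuda"],
    "cpu": ["openvino", "onnxruntime-cpu", "iree-cpu"],
    "vulkan": ["iree-vulkan"],
}

def select_backend(device: str, available: set[str]) -> str:
    """Return the first available backend in priority order for `device`."""
    for name in BACKEND_PRIORITY.get(device, []):
        if name in available:
            return name
    raise RuntimeError(f"No backend available for device {device!r}")

# TensorRT missing, so the CUDA execution provider wins:
print(select_backend("cuda", {"onnxruntime-cuda", "iree-cpu"}))
```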

Unified API

import polyinfer as pi

# Auto-select fastest backend
model = pi.load("model.onnx", device="cuda")
output = model(input_data)

# Compare all backends
pi.compare("model.onnx", input_shape=(1, 3, 640, 640))

Quantization Support

  • Dynamic/Static INT8, UINT8, INT4, FP16 quantization
  • ONNX Runtime and OpenVINO (NNCF) backends
  • TensorRT FP16/INT8 via load options

pi.quantize("model.onnx", "model_int8.onnx", method="dynamic")
pi.convert_to_fp16("model.onnx", "model_fp16.onnx")

MLIR Export

Export models to MLIR for custom hardware targets and advanced optimizations:

mlir = pi.export_mlir("model.onnx", "model.mlir")
vmfb = pi.compile_mlir("model.mlir", device="vulkan")

Structured Logging

Configurable verbosity levels for debugging and production use.
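Assuming polyinfer logs through the standard library logging module under a "polyinfer" logger name (an assumption; the release notes do not name the logger), verbosity can be adjusted like this:

```python
import logging

# Assumed logger name "polyinfer"; adjust if the library uses another.
logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")

log = logging.getLogger("polyinfer")
log.setLevel(logging.DEBUG)    # verbose, for debugging
log.setLevel(logging.WARNING)  # quiet, for production
```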

CLI Tools

polyinfer info           # Show system info and backends
polyinfer benchmark model.onnx --device tensorrt
polyinfer run model.onnx --device cuda

Installation

pip install polyinfer[nvidia]   # NVIDIA GPU
pip install polyinfer[intel]    # Intel CPU/GPU/NPU
pip install polyinfer[amd]      # AMD GPU (Windows DirectML)
pip install polyinfer[cpu]      # CPU only
pip install polyinfer[all]      # Everything

Performance

YOLOv8n @ 640x640 (RTX 5060)

| Backend            | Latency | FPS |
|--------------------|---------|-----|
| TensorRT           | 2.2 ms  | 450 |
| CUDA               | 6.6 ms  | 151 |
| OpenVINO (CPU)     | 16.2 ms | 62  |
| ONNX Runtime (CPU) | 22.6 ms | 44  |

Requirements

  • Python 3.10, 3.11, or 3.12
  • numpy >= 1.24
  • onnx >= 1.14

What's Next

  • PyTorch model direct loading
  • More quantization options
  • Additional backend integrations

Full documentation: https://github.com/athrva98/polyinfer