Releases: offerrall/pyimagecuda
v0.1.3
0.1.2
0.1.1
[0.1.1] - 2026-01-14
Added
- Hue adjustment (Adjust.hue()): Shift all colors around the color wheel
  - Rotates hue in HSV color space while preserving brightness and saturation
  - Useful for color correction, creative effects, and global color transformations
  - Example: Adjust.hue(image, 180) creates complementary colors
- Vibrance adjustment (Adjust.vibrance()): Smart saturation that protects already-saturated colors
  - Boosts muted colors while protecting vibrant tones (especially skin tones)
  - Adjusts saturation in inverse proportion to the current color saturation
  - Includes skin tone protection for natural-looking portraits (hue ranges 0-50° and 330-360°)
  - Example: Adjust.vibrance(image, 0.5) enhances colors without oversaturating skin
- Chroma key (Effect.chroma_key()): Remove specific color ranges (e.g., green screen)
  - Configurable target color, threshold, smoothness, and spill suppression
  - Example: Effect.chroma_key(image, target_color=(0,255,0), threshold=0.2)
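To make the hue and vibrance behavior above concrete, here is a minimal pure-Python sketch that operates on a single RGB pixel with the standard library's colorsys module. It is not the library's CUDA implementation. The function names shift_hue and vibrance, the skin_factor parameter, and the exact boost formula are assumptions chosen for illustration only.

```python
import colorsys


def clamp01(x):
    """Clamp a value to the [0, 1] range on both sides."""
    return max(0.0, min(1.0, x))


def shift_hue(rgb, degrees):
    """Rotate the hue of one RGB pixel (floats in 0..1) by `degrees`."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    # Early return: gray pixels have no hue, and a full turn is a no-op.
    if s == 0.0 or degrees % 360 == 0:
        return rgb
    h = (h + degrees / 360.0) % 1.0
    return colorsys.hsv_to_rgb(h, s, v)


def vibrance(rgb, amount, skin_factor=0.5):
    """Boost saturation more for muted colors than for saturated ones.

    `skin_factor` (hypothetical) scales the boost down for hues in the
    0-50 and 330-360 degree ranges mentioned in the release notes.
    """
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    if amount == 0.0 or s == 0.0:
        return rgb
    boost = amount * (1.0 - s)  # shrinks toward 0 as saturation approaches 1
    hue_deg = h * 360.0
    if hue_deg <= 50.0 or hue_deg >= 330.0:
        boost *= skin_factor
    s = clamp01(s * (1.0 + boost))
    return colorsys.hsv_to_rgb(h, s, v)


complement = shift_hue((1.0, 0.0, 0.0), 180)  # pure red rotated half a turn
livelier = vibrance((0.4, 0.4, 0.6), 0.5)     # muted blue gains saturation
```

In this sketch, rotating pure red by 180° lands on cyan, which matches the complementary-color behavior the hue example describes, and a fully saturated pixel passes through vibrance unchanged because its boost is zero.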
Internal
- Added RGB↔HSV color space conversion helpers in common.h for future color operations
- Optimized adjustments with early returns for no-op values and achromatic pixels
- Implemented proper bidirectional clamping for safe value ranges
Documentation
- Added hue adjustment documentation with examples and use cases
- Added vibrance adjustment documentation with examples and use cases
- Added chroma key documentation with examples and use cases
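The chroma key added in this release exposes a target color, a threshold, and a smoothness setting. One common way such parameters interact is sketched below for a single pixel. This is an assumption about the general technique, not the library's kernel: the function name chroma_key_alpha is hypothetical, colors are taken as floats in 0..1 rather than the 0-255 values in the release example, distance is plain Euclidean RGB distance, and spill suppression is left out.

```python
import math


def chroma_key_alpha(rgb, target=(0.0, 1.0, 0.0), threshold=0.2, smoothness=0.1):
    """Return an alpha in [0, 1]: 0 removes the pixel, 1 keeps it."""
    # Normalize the Euclidean RGB distance so it falls in [0, 1].
    distance = math.dist(rgb, target) / math.sqrt(3.0)
    if distance <= threshold:
        return 0.0  # close enough to the key color: fully transparent
    if smoothness <= 0.0 or distance >= threshold + smoothness:
        return 1.0  # far from the key color: fully opaque
    # Linear ramp across the smoothness band for soft edges.
    return (distance - threshold) / smoothness


green_screen = chroma_key_alpha((0.0, 1.0, 0.0))  # keyed out
foreground = chroma_key_alpha((1.0, 0.0, 0.0))    # kept
edge = chroma_key_alpha((0.4, 1.0, 0.0))          # partially transparent
```

The smoothness band is what avoids hard, aliased edges around the subject: pixels just outside the threshold fade in gradually instead of switching from transparent to opaque.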
v0.1.0
[0.1.0] - 2026-01-07
Added
- Transform.rotate() now supports multiple interpolation methods: nearest, bilinear, bicubic, lanczos
- Transform.zoom() for viewport-based zooming with interpolation support
- GLResource.copy_from() now supports an optional sync parameter for async GPU-to-GPU transfers
  - sync=True (default): Blocks until the copy completes (safe)
  - sync=False: Returns immediately for high-performance pipelines (advanced)
Performance
- Lanczos resize: 2.7× faster
- Bicubic resize: 1.1× faster
- Bilinear resize: 1.09× faster
Fixed
- Critical bug in Lanczos kernel causing incorrect weights in edge cases
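The release does not say what the incorrect weights were, so the following is only a reference sketch of the textbook Lanczos window and of normalized 1-D tap weights, useful for sanity-checking any implementation. The function names and the default a=3 are assumptions, and this is not the library's CUDA kernel.

```python
import math


def lanczos(x, a=3):
    """Textbook Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0.0:
        return 1.0  # limit of the sinc product at the origin
    if abs(x) >= a:
        return 0.0  # outside the window support
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_weights(center, a=3):
    """Return (tap_index, weight) pairs around `center`, normalized to sum to 1."""
    base = math.floor(center)
    taps = range(base - a + 1, base + a + 1)  # 2 * a taps
    raw = [lanczos(center - t, a) for t in taps]
    total = sum(raw)
    return [(t, w / total) for t, w in zip(taps, raw)]


weights = lanczos_weights(2.5)  # six taps, symmetric around 2.5
```

Two properties are worth checking at the edges: the weights should sum to 1 after normalization, and when the sample position falls exactly on a source pixel, essentially all of the weight should land on that pixel.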
Changed
- Refactored sampling functions to common.h (eliminated code duplication)
- Added fmaf(), #pragma unroll, and register optimizations
- Simplified resize kernels from ~275 to ~100 lines
- Transform.rotate() now uses shared sampling functions from common.h
Benchmarks (1080p → 4K)
| Method | Before | After | Gain |
|---|---|---|---|
| Lanczos | 10.96ms | 4.06ms | 2.7× |
| Bicubic | 0.53ms | 0.48ms | 1.1× |
| Bilinear | 0.47ms | 0.43ms | 1.09× |
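The Gain column is simply the ratio of the Before and After timings. A short sketch that recomputes it from the table values, with no data beyond what the table already lists:

```python
# (before_ms, after_ms) pairs copied from the benchmark table above.
timings_ms = {
    "Lanczos": (10.96, 4.06),
    "Bicubic": (0.53, 0.48),
    "Bilinear": (0.47, 0.43),
}


def speedup(before_ms, after_ms):
    """Speedup factor: how many times faster the new timing is."""
    return before_ms / after_ms


gains = {name: speedup(*pair) for name, pair in timings_ms.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.2f}x faster")
```

The ratios round to the 2.7×, 1.1×, and 1.09× figures reported for this release.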
v0.0.10
v0.0.10 - Improved Memory Management
Highlights
This release improves the buffer sizing system to provide more flexibility and better memory utilization on the GPU.
New Features
Dynamic Buffer Capacity System
- Buffers now track total pixel capacity instead of fixed width×height dimensions
- Reuse the same buffer for different aspect ratios within capacity
buffer = Image(4096, 4096) # 16M pixel capacity
buffer.resize(8192, 2048) # Landscape (16M pixels)
buffer.resize(2048, 8192) # Portrait (16M pixels)
buffer.resize(4096, 4096) # Square (16M pixels)
New resize() Method
- Atomic dimension updates - change width and height together safely
- Prevents validation order issues
img = Image(1920, 1080)
img.resize(3840, 540) # Works: same 2,073,600-pixel total as 1920x1080
Changes
API Changes
- Image.get_max_capacity() now returns int (total pixels) instead of tuple[int, int]
# Before
max_w, max_h = img.get_max_capacity() # (1920, 1080)
# After
max_pixels = img.get_max_capacity() # 2,073,600
Internal Improvements
- Removed internal ensure_capacity() utility function
- All operations now use buffer.resize() directly
- Cleaner buffer reuse patterns across all modules
Documentation
- Updated buffer sizing examples
- Clarified capacity concepts in Image & Memory guide
- All examples updated to reflect new API
Migration Guide
If you used get_max_capacity():
# Before
max_w, max_h = img.get_max_capacity()
if width * height > max_w * max_h:
    raise ValueError("Too large")
# After
max_pixels = img.get_max_capacity()
if width * height > max_pixels:
    raise ValueError("Too large")
If you set dimensions individually:
# Before (could fail depending on order)
img.width = 3840
img.height = 540
# After (recommended)
img.resize(3840, 540)
# Individual setters still work for simple cases
img.width = 1920
img.height = 1080
Breaking Changes
- Image.get_max_capacity() return type changed from tuple[int, int] to int
  - Impact: Low. This method is rarely used directly in user code
  - Fix: Use total pixel capacity instead of individual dimensions (see migration guide)
Technical Details
- All C/CUDA kernels remain unchanged and 100% compatible
- Buffer indexing works identically
- Zero performance impact - same GPU operations, cleaner Python API
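The capacity model in this release, and the reason individual setters "could fail depending on order", can be illustrated with a small pure-Python model. CapacityBuffer is a hypothetical stand-in written for this note, not the library's Image class, and its validation rules are an assumption based on the description above.

```python
class CapacityBuffer:
    """Toy model of a buffer that tracks total pixel capacity."""

    def __init__(self, width: int, height: int) -> None:
        self._capacity = width * height  # fixed at allocation time
        self._width = width
        self._height = height

    def get_max_capacity(self) -> int:
        return self._capacity

    def _validate(self, width: int, height: int) -> None:
        if width <= 0 or height <= 0:
            raise ValueError("Dimensions must be positive")
        if width * height > self._capacity:
            raise ValueError("Requested size exceeds buffer capacity")

    @property
    def width(self) -> int:
        return self._width

    @width.setter
    def width(self, value: int) -> None:
        # Checked against the current height, so assignment order matters.
        self._validate(value, self._height)
        self._width = value

    @property
    def height(self) -> int:
        return self._height

    @height.setter
    def height(self, value: int) -> None:
        self._validate(self._width, value)
        self._height = value

    def resize(self, width: int, height: int) -> None:
        # Both dimensions are validated together, then updated together.
        self._validate(width, height)
        self._width, self._height = width, height


img = CapacityBuffer(1920, 1080)
img.resize(3840, 540)  # same pixel count, different aspect ratio
```

In this model, assigning width = 3840 on a fresh 1920×1080 buffer is rejected because the intermediate 3840×1080 shape exceeds the capacity, while resize(3840, 540) passes because the product stays at 2,073,600 pixels.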
0.0.9
New Features
OpenGL Integration (Alpha)
- Added GLResource for GPU-to-GPU display via CUDA-OpenGL interop
- Preview updates in roughly 1 ms for 2K images
- Zero CPU overhead for real-time rendering applications
- See docs
Direct U8 Save
- Added save_u8() to save ImageU8 buffers without F32 conversion
- Useful for display+save workflows (convert once, use twice)
Example
# Convert once
u8_buffer = ImageU8(1920, 1080)
convert_float_to_u8(u8_buffer, f32_image)
# Use for both display and save
gl_resource.copy_from(u8_buffer) # Display
save_u8(u8_buffer, "output.png") # Save
Installation
pip install --upgrade pyimagecuda
0.0.8
Performance Optimizations
Added early-exit optimizations to avoid unnecessary GPU operations:
Operations that now skip processing when no change is needed:
- Resize: Skips if dimensions are already the target size (copies instead)
- Crop: Skips if cropping entire image (copies instead)
- Gaussian Blur: Skips if radius=0 or sigma≈0 (copies instead)
- Sharpen: Skips if strength≈0 (copies instead)
- Emboss: Skips if strength≈0 (copies instead)
- Brightness: Skips if factor≈0
- Contrast: Skips if factor≈1.0
- Saturation: Skips if factor≈1.0
- Gamma: Skips if gamma≈1.0
- Opacity: Skips if factor≈1.0
- Sepia: Skips if intensity≈0
- Rounded Corners: Skips if radius=0
- All Blend modes: Skips if opacity≈0
These optimizations reduce GPU load and improve performance in conditional image processing pipelines.
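The early-exit pattern described above can be sketched in plain Python on a list of pixel values. The function name adjust_contrast, the tolerance eps, and the contrast formula are assumptions for illustration. The library applies the same idea before launching GPU work, and for several operations it copies the input instead of returning it.

```python
import math


def adjust_contrast(pixels, factor, eps=1e-6):
    """Scale pixel values (floats in 0..1) around mid-gray by `factor`."""
    # Early exit: a factor of about 1.0 changes nothing, so skip the work.
    if math.isclose(factor, 1.0, abs_tol=eps):
        return pixels
    return [min(1.0, max(0.0, (p - 0.5) * factor + 0.5)) for p in pixels]


pixels = [0.25, 0.5, 0.75]
unchanged = adjust_contrast(pixels, 1.0)  # skipped: same list comes back
stretched = adjust_contrast(pixels, 2.0)  # processed and clamped to 0..1
```

A tolerance check rather than exact equality is what the "≈" in the list above implies: values computed from sliders or animation curves rarely hit exactly 1.0 or 0.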
0.0.7
New Features
- NumPy Bridge: Added from_numpy() and to_numpy() for seamless integration with OpenCV, Pillow, Matplotlib, and the entire Python image processing ecosystem.
Improvements
- Documentation: Expanded IO docs with NumPy integration examples and best practices.
- Testing: Added general test suite (will be expanded in future releases).
- Benchmarks: Added comprehensive performance comparisons against Pillow and OpenCV across multiple operations (blur, resize, rotate, flip, crop, blend).
NumPy Bridge Example
import cv2
from pyimagecuda import from_numpy, to_numpy, blur
# Load with OpenCV
frame = cv2.imread("photo.jpg")
# Process on GPU
img = from_numpy(frame)
blur(img, 10)
# Back to CPU
result = to_numpy(img)
cv2.imwrite("output.jpg", result)
0.0.6
New Features
Text Rendering Module
Added text rendering with professional typography control.
from pyimagecuda import Text, save
text_img = Text.create(
    "Hello World",
    font="Arial Bold",
    size=48,
    color=(1.0, 1.0, 1.0, 1.0),
    bg_color=(0.0, 0.0, 0.0, 1.0)
)
save(text_img, 'output.png')
Features:
- System font support with weight/style variations
- Pango markup for rich text (bold, italic, colors, superscript/subscript)
- Letter spacing and line spacing control
- Text alignment (left, center, right)
- Independent text and background colors
- Buffer reuse for batch processing
Rich text example:
text_img = Text.create(
    'Normal <b>Bold</b> <i>Italic</i>\n'
    '<span foreground="orange">Orange</span> and <sub>subscript</sub>',
    size=40
)
API
Text.create()
Text.create(
    text: str,
    font: str = "Sans",
    size: float = 12.0,
    color: tuple[float, float, float, float] = (0.0, 0.0, 0.0, 1.0),
    bg_color: tuple[float, float, float, float] = (0.0, 0.0, 0.0, 0.0),
    align: Literal['left', 'centre', 'right'] = 'left',
    justify: bool = False,
    spacing: int = 0,
    letter_spacing: float = 0.0,
    dst_buffer: Image | None = None,
    u8_buffer: ImageU8 | None = None
) -> Image | None
Returns a new Image with the rendered text, or None if dst_buffer is provided.
Documentation
New page: Text Rendering Guide
Installation
pip install --upgrade pyimagecuda
0.0.5
v0.0.5: Linux Support
This release introduces native support for Linux distributions.
Features
- Linux Support: Added manylinux wheels compatible with modern Linux distributions (Ubuntu 20.04+, Fedora, RHEL 8+, WSL2).
- Zero Dependencies: CUDA runtimes are statically linked. No local CUDA Toolkit installation is required on Linux or Windows.
- Size: Library remains lightweight (~1.2 MB on Linux, ~600 KB on Windows).
Compatibility
- Python: 3.10, 3.11, 3.12, 3.13
- Platforms: Windows (x64), Linux (x86_64)
- Requirements: Standard NVIDIA GPU Drivers.
Installation
pip install pyimagecuda
Changelog
- Implemented CMake configuration for Linux shared objects (.so).
- Configured CI/CD pipeline for manylinux_2_28 builds.
- Updated documentation with cross-platform requirements.