Releases: ebowwa/ai-proxy-core
v0.4.42: Complete Fix for Gemini Video Validation
What's Changed
- Complete fix for Gemini video validation - consistent field naming (#37)
Bug Fixes (Completes v0.4.41 fix)
- Fixed inconsistent camelCase/snake_case in file path video processing
- File path inputs now correctly use `inline_data` and `mime_type` (snake_case)
- Ensures consistency with base64 input handling from v0.4.41
- Fully resolves Issue #35: Video input fails with Gemini
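The snake_case part shape the fix standardizes on can be sketched as follows. `build_video_part` is a hypothetical helper for illustration; the actual conversion lives inside ai-proxy-core's Gemini handler.

```python
import mimetypes

def build_video_part(path: str) -> dict:
    """Read a local video file and wrap it in a snake_case inline_data part."""
    mime_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as f:
        data = f.read()
    # snake_case keys (inline_data, mime_type) match the base64 path fixed in
    # v0.4.41; the pre-fix code mixed in camelCase (inlineData, mimeType),
    # which Gemini's Pydantic models rejected.
    return {"inline_data": {"mime_type": mime_type or "video/mp4", "data": data}}
```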
Important Note
v0.4.41 included only a partial fix (PR #36). This release completes it with PR #37.
Full Changelog: v0.4.41...v0.4.42
v0.4.41: Fix Gemini Video Content Validation
What's Changed
- Fixed Gemini video content validation error (#36)
Bug Fixes
- Video content now properly wrapped in `types.Part` objects with `inline_data`
- Prevents "Extra inputs are not permitted" Pydantic validation errors
- Maintains full compatibility with base64 and file path video inputs
- Preserves existing MIME type detection logic
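The wrapping described above can be sketched like this. The real code uses the google-genai SDK's `types.Part` and `types.Blob`; the stand-in dataclasses below are assumptions used only to show the shape the fix produces.

```python
from dataclasses import dataclass

# Stand-ins for google.genai's types.Blob / types.Part, for illustration only.
@dataclass
class Blob:
    mime_type: str
    data: bytes

@dataclass
class Part:
    inline_data: Blob

def wrap_video(data: bytes, mime_type: str = "video/mp4") -> Part:
    # Wrapping the raw bytes in a Part (rather than passing a bare dict with
    # extra keys) is what avoids Pydantic's "Extra inputs are not permitted".
    return Part(inline_data=Blob(mime_type=mime_type, data=data))
```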
Full Changelog: v0.4.40...v0.4.41
v0.4.40
Release 0.4.40
- README: Add Gemini 2.5 Flash Image docs (text-to-image, edit, fusion), aliases, return_images usage, availability notes
- Version bump to 0.4.40 in pyproject.toml, setup.py, src/__init__.py
- PyPI: https://pypi.org/project/ai-proxy-core/0.4.40/
- PR: #34
Link to Devin run: https://app.devin.ai/sessions/ac6ad5798ea7497ba0758163a0425b10
Requested by: Elijah Arbee (@ebowwa)
v0.4.3 - Universal System Instruction Abstraction
✨ Enhancement Release
This release provides universal system_instruction abstraction across all AI providers, building on the fix from v0.4.2.
What's New
- Universal system_instruction: Same parameter now works seamlessly across all providers
- OpenAI/Ollama: Automatically converts to system message in messages array
- Gemini: Passes as native system_instruction parameter
- Anthropic: Converts to system parameter
- Zero code changes needed: Use the same `system_instruction` parameter everywhere
- Backward compatible: All existing code continues to work
Example Usage
```python
from ai_proxy_core import CompletionClient

client = CompletionClient()

# Same system_instruction works for all providers!
response = await client.create_completion(
    messages=[{"role": "user", "content": "Hello"}],
    model="gpt-4",  # or "gemini-1.5-flash", "claude-3", "llama2"
    system_instruction="You are a helpful pirate. Speak like a pirate."
)
```

Installation

```
pip install --upgrade ai-proxy-core==0.4.3
```

Links
- 📦 PyPI Package
- 🐛 Related Issue #32
- 💻 Commit: 2ef184e
Full Changelog: v0.4.2...v0.4.3
v0.4.2 - OpenAI Provider Fix
🐛 Bug Fix Release
This release fixes a critical issue with the OpenAI provider that was causing failures when using the unified CompletionClient.
What's Fixed
- Issue #32: OpenAI provider no longer fails with `unexpected keyword argument 'system_instruction'` error
- Added parameter filtering to exclude Gemini-specific parameters (`system_instruction`, `safety_settings`)
- Ensures seamless cross-provider compatibility
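A minimal sketch of the filtering approach: drop Gemini-only keyword arguments before forwarding the call to the OpenAI client. The filtered parameter names come from these notes; `filter_openai_kwargs` itself is a hypothetical helper, not the library's actual function.

```python
# Gemini-specific parameters that OpenAI's client would reject.
GEMINI_ONLY_PARAMS = {"system_instruction", "safety_settings"}

def filter_openai_kwargs(kwargs: dict) -> dict:
    """Return only the kwargs safe to pass to the OpenAI chat completions API."""
    return {k: v for k, v in kwargs.items() if k not in GEMINI_ONLY_PARAMS}
```

Note that v0.4.3 later improved on plain filtering by converting `system_instruction` into an OpenAI system message instead of discarding it.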
Installation
```
pip install --upgrade ai-proxy-core==0.4.2
```

Testing
The fix has been verified with comprehensive tests included in `test_openai_fix_v0.4.2.py`
Links
- 📦 PyPI Package
- 🐛 Issue #32
- 💻 Commit: fff44dd
Full Changelog: v0.4.1...v0.4.2
v0.4.1: Explicit Model Selection for Image Generation
🔧 Breaking Changes
This release corrects the image generation API to require explicit model selection. No automatic fallback or model selection.
Migration Required
```python
# Old (v0.4.0)
provider = GPT4oImageProvider(api_key="...")
response = provider.generate(prompt="...")

# New (v0.4.1)
provider = OpenAIImageProvider(api_key="...")
response = provider.generate(
    model=ImageModel.DALLE_3,  # REQUIRED
    prompt="..."
)
```

✨ What's New
- Explicit Model Selection: Every request must specify which model to use
- Three Models Available:
  - `dall-e-2`: Multiple images, editing, 256x256 to 1024x1024
  - `dall-e-3`: Styles, HD quality, up to 1792x1024, revised prompts
  - `gpt-image-1`: Token pricing, 4K resolution (4096x4096), better instruction following
- Model-Specific Validation: Each model validates its specific parameters
- Token Usage Tracking: GPT-Image-1 returns token usage information
📦 Installation
```
pip install ai-proxy-core==0.4.1
```

🚀 Quick Start
```python
from ai_proxy_core import OpenAIImageProvider, ImageModel

provider = OpenAIImageProvider(api_key="your-key")

# Must specify model explicitly
response = provider.generate(
    model=ImageModel.DALLE_3,
    prompt="Modern app icon",
    size="1024x1024",
    quality="hd"
)
```

🔍 Available Models
| Model | Max Size | Features | Special |
|---|---|---|---|
| dall-e-2 | 1024x1024 | Edit, Multiple images (n≤10) | Lowest cost |
| dall-e-3 | 1792x1024 | Styles, HD quality | Best for art |
| gpt-image-1 | 4096x4096 | Token pricing, Better instructions | Best quality |
🐛 Fixes
- Fixed GPT-Image-1 response_format parameter handling
- Corrected model-specific size validation
- Improved error messages for invalid models
Full Changelog: v0.4.0...v0.4.1
v0.4.0: GPT-4o Image Generation
🎨 Image Generation Support
This release adds comprehensive image generation capabilities to ai-proxy-core using OpenAI's DALL-E 3 API, with an abstract provider pattern for future GPT-4o native generation.
✨ Features
- Image Generation Provider: New `GPT4oImageProvider` for DALL-E 3 image generation
- Multiple Sizes: Support for SQUARE (1024x1024), LANDSCAPE (1792x1024), and PORTRAIT (1024x1792)
- Quality Options: STANDARD and HD quality settings
- Style Options: VIVID and NATURAL generation styles
- Image Editing: Edit existing images with natural language prompts and optional masks
- Azure Support: Full Azure OpenAI integration with `AzureGPT4oImageProvider`
- Localization: Generate images with localized text for app internationalization
- C2PA Metadata: Extract content authenticity metadata from generated images
📦 Installation
```
pip install ai-proxy-core==0.4.0
```

With OpenAI support:

```
pip install "ai-proxy-core[openai]==0.4.0"
```

🚀 Quick Start
```python
from ai_proxy_core import GPT4oImageProvider, ImageSize, ImageQuality

provider = GPT4oImageProvider(api_key="your-openai-key")
response = provider.generate(
    prompt="A modern app icon with turquoise background",
    size=ImageSize.SQUARE,
    quality=ImageQuality.HD
)

with open("icon.png", "wb") as f:
    f.write(response["image"])
```

🌍 Localized Generation
```python
# Generate app icons with localized text
locales = {"en": "CleanShots", "ja": "クリーンショット"}
for locale, text in locales.items():
    response = provider.generate(
        prompt=f"App icon with text '{text}'",
        size=ImageSize.SQUARE
    )
```

📝 Notes
- DALL-E 3 may struggle with accurate non-Latin text rendering
- The provider uses an abstract pattern for easy extension to other services
- Full integration with the existing `BaseCompletions` architecture
🔗 Links
Full Changelog: v0.3.9...v0.4.0
ai-proxy-core v0.3.9
Patch release with audio support improvements and docs.
Highlights
- Gemini Live WebSocket: Allow 16-bit PCM audio passthrough; non-PCM inputs return a clear message (WebM→PCM conversion not yet implemented).
- POST /chat/completions (Google/Gemini): Accept audio inputs via data URL audio_url and OpenAI-style input_audio objects (base64 + format).
- README updated to document the above features.
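The two accepted audio input shapes described above can be sketched as plain request messages. The structure mirrors the OpenAI chat format; the part type names (`audio_url`, `input_audio`) are taken from these notes, and the PCM bytes are a placeholder.

```python
import base64

pcm_bytes = b"\x00\x00" * 8000          # placeholder 16-bit PCM samples
b64 = base64.b64encode(pcm_bytes).decode()

# Shape 1: data-URL audio_url part
data_url_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Transcribe this audio."},
        {"type": "audio_url", "audio_url": {"url": f"data:audio/pcm;base64,{b64}"}},
    ],
}

# Shape 2: OpenAI-style input_audio part (base64 + format)
input_audio_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Transcribe this audio."},
        {"type": "input_audio", "input_audio": {"data": b64, "format": "pcm"}},
    ],
}
```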
Pull Requests
- #24: Enable PCM audio passthrough for Gemini Live WS; accurate error for non-PCM (fixes #23)
- #25: docs: update README for Gemini Live PCM and /chat/completions audio support (refs #23)
Install
- PyPI: https://pypi.org/project/ai-proxy-core/0.3.9/
- pip install -U ai-proxy-core
Notes
- Tag: v0.3.9
- Thanks @ebowwa