
Releases: ebowwa/ai-proxy-core

v0.4.42: Complete Fix for Gemini Video Validation

28 Aug 20:01


What's Changed

  • Complete fix for Gemini video validation - consistent field naming (#37)

Bug Fixes (Completes v0.4.41 fix)

  • Fixed inconsistent camelCase/snake_case in file path video processing
  • File path inputs now correctly use inline_data and mime_type (snake_case)
  • Ensures consistency with base64 input handling from v0.4.41
  • Fully resolves Issue #35: Video input fails with Gemini
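
The snake_case payload this release standardizes on can be illustrated with a minimal sketch. The helper name and dict layout below are illustrative, not the package's internals; they only show the field names (inline_data, mime_type) that file path inputs now share with base64 inputs:

```python
import mimetypes


def build_video_part(path: str, data: bytes) -> dict:
    """Build a Gemini-style inline video part using snake_case keys.

    Illustrative only: shows the snake_case fields (inline_data,
    mime_type) this release uses consistently, replacing the
    camelCase variants (inlineData, mimeType) in the file path route.
    """
    mime_type, _ = mimetypes.guess_type(path)
    return {
        "inline_data": {
            "mime_type": mime_type or "video/mp4",
            "data": data,
        }
    }


part = build_video_part("clip.mp4", b"\x00\x01")
```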

Important Note

v0.4.41 included only a partial fix (PR #36). This release completes it with PR #37.

Credits

Full Changelog: v0.4.41...v0.4.42

v0.4.41: Fix Gemini Video Content Validation

28 Aug 09:11


What's Changed

  • Fixed Gemini video content validation error (#36)

Bug Fixes

  • Video content now properly wrapped in types.Part objects with inline_data
  • Prevents "Extra inputs are not permitted" Pydantic validation errors
  • Maintains full compatibility with base64 and file path video inputs
  • Preserves existing MIME type detection logic
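
The failure mode can be reproduced in miniature: a strict schema rejects unexpected top-level keys, so video bytes must be nested under inline_data rather than passed as loose fields. This sketch mimics that check in plain Python; the real validation is done by Pydantic inside the google-genai SDK, and the field set here is illustrative:

```python
ALLOWED_PART_FIELDS = {"text", "inline_data", "file_data"}


def validate_part(part: dict) -> dict:
    """Reject parts with unexpected top-level keys, mimicking the
    'Extra inputs are not permitted' Pydantic error."""
    extra = set(part) - ALLOWED_PART_FIELDS
    if extra:
        raise ValueError(f"Extra inputs are not permitted: {sorted(extra)}")
    return part


# Before the fix: video fields passed at the top level -> rejected
try:
    validate_part({"mime_type": "video/mp4", "data": b"..."})
except ValueError as e:
    print(e)

# After the fix: content wrapped under inline_data -> accepted
ok = validate_part({"inline_data": {"mime_type": "video/mp4", "data": b"..."}})
```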

Credits

Full Changelog: v0.4.40...v0.4.41

v0.4.40

27 Aug 03:19


Release 0.4.40

Link to Devin run: https://app.devin.ai/sessions/ac6ad5798ea7497ba0758163a0425b10
Requested by: Elijah Arbee (@ebowwa)

v0.4.3 - Universal System Instruction Abstraction

21 Aug 19:44


✨ Enhancement Release

This release provides universal system_instruction abstraction across all AI providers, building on the fix from v0.4.2.

What's New

  • Universal system_instruction: Same parameter now works seamlessly across all providers
    • OpenAI/Ollama: Automatically converts to system message in messages array
    • Gemini: Passes as native system_instruction parameter
    • Anthropic: Converts to system parameter
  • Zero code changes needed: Use the same system_instruction parameter everywhere
  • Backward compatible: All existing code continues to work
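
The per-provider translation described above can be sketched roughly as follows. The function name and payload shapes are illustrative, not the package's internals; they only mirror the three mappings listed in the bullets:

```python
def apply_system_instruction(provider: str, messages: list, system_instruction: str) -> dict:
    """Illustrative routing of a single system_instruction parameter.

    - OpenAI/Ollama: prepend a {"role": "system"} message
    - Gemini: pass through as a native system_instruction kwarg
    - Anthropic: pass as the top-level `system` parameter
    """
    if provider in ("openai", "ollama"):
        return {"messages": [{"role": "system", "content": system_instruction}, *messages]}
    if provider == "gemini":
        return {"messages": messages, "system_instruction": system_instruction}
    if provider == "anthropic":
        return {"messages": messages, "system": system_instruction}
    raise ValueError(f"unknown provider: {provider}")


payload = apply_system_instruction(
    "openai",
    [{"role": "user", "content": "Hello"}],
    "Speak like a pirate.",
)
```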

Example Usage

from ai_proxy_core import CompletionClient

client = CompletionClient()

# Same system_instruction works for all providers!
response = await client.create_completion(
    messages=[{"role": "user", "content": "Hello"}],
    model="gpt-4",  # or "gemini-1.5-flash", "claude-3", "llama2"
    system_instruction="You are a helpful pirate. Speak like a pirate."
)

Installation

pip install --upgrade ai-proxy-core==0.4.3

Links

Full Changelog: v0.4.2...v0.4.3

v0.4.2 - OpenAI Provider Fix

21 Aug 19:28


🐛 Bug Fix Release

This release fixes a critical issue with the OpenAI provider that was causing failures when using the unified CompletionClient.

What's Fixed

  • Issue #32: OpenAI provider no longer fails with an unexpected keyword argument 'system_instruction' error
  • Added parameter filtering to exclude Gemini-specific parameters (system_instruction, safety_settings)
  • Ensures seamless cross-provider compatibility
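
The filtering step can be sketched as follows. The parameter set comes from the notes above; the function name is illustrative, not the package's internal API:

```python
GEMINI_ONLY_PARAMS = {"system_instruction", "safety_settings"}


def filter_openai_kwargs(kwargs: dict) -> dict:
    """Drop Gemini-specific parameters before calling the OpenAI SDK,
    avoiding 'unexpected keyword argument' TypeErrors."""
    return {k: v for k, v in kwargs.items() if k not in GEMINI_ONLY_PARAMS}


clean = filter_openai_kwargs({
    "model": "gpt-4",
    "temperature": 0.2,
    "system_instruction": "Be brief.",
    "safety_settings": [],
})
```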

Installation

pip install --upgrade ai-proxy-core==0.4.2

Testing

The fix has been verified by comprehensive tests included in test_openai_fix_v0.4.2.py.

Links

Full Changelog: v0.4.1...v0.4.2

v0.4.1: Explicit Model Selection for Image Generation

19 Aug 23:17


🔧 Breaking Changes

This release corrects the image generation API to require explicit model selection; there is no automatic fallback or default model.

Migration Required

# Old (v0.4.0)
provider = GPT4oImageProvider(api_key="...")
response = provider.generate(prompt="...")

# New (v0.4.1)  
provider = OpenAIImageProvider(api_key="...")
response = provider.generate(
    model=ImageModel.DALLE_3,  # REQUIRED
    prompt="..."
)

✨ What's New

  • Explicit Model Selection: Every request must specify which model to use
  • Three Models Available:
    • dall-e-2: Multiple images, editing, 256x256 to 1024x1024
    • dall-e-3: Styles, HD quality, up to 1792x1024, revised prompts
    • gpt-image-1: Token pricing, 4K resolution (4096x4096), better instruction following
  • Model-Specific Validation: Each model validates its specific parameters
  • Token Usage Tracking: GPT-Image-1 returns token usage information
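
Model-specific size validation might look like the sketch below. The allowed-size sets are illustrative, assembled loosely from the size notes in this release, and the function name and error text are assumptions rather than the package's actual implementation:

```python
# Illustrative per-model size sets (not the library's authoritative list)
ALLOWED_SIZES = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
    "gpt-image-1": {"1024x1024", "4096x4096"},
}


def validate_size(model: str, size: str) -> None:
    """Reject sizes a given model cannot produce."""
    try:
        allowed = ALLOWED_SIZES[model]
    except KeyError:
        raise ValueError(f"unknown model: {model}")
    if size not in allowed:
        raise ValueError(f"{model} does not support {size}; choose one of {sorted(allowed)}")


validate_size("dall-e-3", "1792x1024")  # passes silently
```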

📦 Installation

pip install ai-proxy-core==0.4.1

🚀 Quick Start

from ai_proxy_core import OpenAIImageProvider, ImageModel

provider = OpenAIImageProvider(api_key="your-key")

# Must specify model explicitly
response = provider.generate(
    model=ImageModel.DALLE_3,
    prompt="Modern app icon",
    size="1024x1024",
    quality="hd"
)

🔍 Available Models

Model        Max Size    Features                             Special
dall-e-2     1024x1024   Edit, multiple images (n≤10)         Lowest cost
dall-e-3     1792x1024   Styles, HD quality                   Best for art
gpt-image-1  4096x4096   Token pricing, better instructions   Best quality

🐛 Fixes

  • Fixed GPT-Image-1 response_format parameter handling
  • Corrected model-specific size validation
  • Improved error messages for invalid models

Full Changelog: v0.4.0...v0.4.1

v0.4.0: GPT-4o Image Generation

19 Aug 22:54
a330c06


🎨 Image Generation Support

This release adds comprehensive image generation capabilities to ai-proxy-core using OpenAI's DALL-E 3 API, with an abstract provider pattern for future GPT-4o native generation.

✨ Features

  • Image Generation Provider: New GPT4oImageProvider for DALL-E 3 image generation
  • Multiple Sizes: Support for SQUARE (1024x1024), LANDSCAPE (1792x1024), and PORTRAIT (1024x1792)
  • Quality Options: STANDARD and HD quality settings
  • Style Options: VIVID and NATURAL generation styles
  • Image Editing: Edit existing images with natural language prompts and optional masks
  • Azure Support: Full Azure OpenAI integration with AzureGPT4oImageProvider
  • Localization: Generate images with localized text for app internationalization
  • C2PA Metadata: Extract content authenticity metadata from generated images

📦 Installation

pip install ai-proxy-core==0.4.0

With OpenAI support:

pip install "ai-proxy-core[openai]==0.4.0"

🚀 Quick Start

from ai_proxy_core import GPT4oImageProvider, ImageSize, ImageQuality

provider = GPT4oImageProvider(api_key="your-openai-key")

response = provider.generate(
    prompt="A modern app icon with turquoise background",
    size=ImageSize.SQUARE,
    quality=ImageQuality.HD
)

with open("icon.png", "wb") as f:
    f.write(response["image"])

🌍 Localized Generation

# Generate app icons with localized text
locales = {"en": "CleanShots", "ja": "クリーンショット"}

for locale, text in locales.items():
    response = provider.generate(
        prompt=f"App icon with text '{text}'",
        size=ImageSize.SQUARE
    )
    # Save one icon per locale
    with open(f"icon_{locale}.png", "wb") as f:
        f.write(response["image"])

📝 Notes

  • DALL-E 3 may struggle with accurate non-Latin text rendering
  • The provider uses an abstract pattern for easy extension to other services
  • Full integration with existing BaseCompletions architecture

🔗 Links


Full Changelog: v0.3.9...v0.4.0

What's Changed

  • feat: GPT-4o Image Generation Provider v0.4.0 by @ebowwa in #31


ai-proxy-core v0.3.9

14 Aug 08:05
68b586e


Patch release with audio support improvements and docs.

Highlights

  • Gemini Live WebSocket: Allow 16-bit PCM audio passthrough; non-PCM inputs return a clear message (WebM→PCM conversion not yet implemented).
  • POST /chat/completions (Google/Gemini): Accept audio inputs via data URL audio_url and OpenAI-style input_audio objects (base64 + format).
  • README updated to document the above features.
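
An OpenAI-style input_audio payload for POST /chat/completions, as described above, can be assembled like this. The message layout follows the OpenAI chat content-part format (base64 data + format string); the helper name, prompt text, and format value are illustrative:

```python
import base64


def build_audio_message(audio_bytes: bytes, fmt: str = "pcm16") -> dict:
    """Wrap raw audio as an OpenAI-style input_audio content part
    (base64-encoded data plus a format string), as accepted by the
    /chat/completions route for Google/Gemini models."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe this audio."},
            {
                "type": "input_audio",
                "input_audio": {
                    "data": base64.b64encode(audio_bytes).decode("ascii"),
                    "format": fmt,
                },
            },
        ],
    }


msg = build_audio_message(b"\x00\x01\x02\x03")
```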

Pull Requests

  • #24 Enable PCM audio passthrough for Gemini Live WS; accurate error for non-PCM (fixes #23)
  • #25 docs: update README for Gemini Live PCM and /chat/completions audio support (refs #23)

Install

pip install --upgrade ai-proxy-core==0.3.9