[3.13] Update AI Proxy documentation for audio, video, and image editing/generation and rerank across providers #3352

@tomek-labuk

Description

Jobs to be done

In 3.13, we are expanding coverage beyond text to multimodal AI capabilities. Update the provider feature matrix and associated examples for ai-proxy and ai-proxy-advanced to cover:

  • Audio input and audio output support across providers (AWS Bedrock, Gemini, Azure)
  • Video support notes and examples where applicable (upload and processing flows)
  • Image generation and image understanding support across providers
  • Gemini rerank endpoint support and example

Implementation tickets:

Definition of Done

  • Provider capability matrix updated in

  • New plugin examples include:

    • Audio-to-audio example (Gemini)
    • Audio input and transcription example (Bedrock, Azure)
    • Image input example (Gemini, Azure)
    • Image generation example (Bedrock, Azure)
    • Video usage notes and supported paths
    • Gemini rerank request and response example
  • Provider differences and API capabilities clearly documented

    • Format requirements (audio types, image formats, etc.)
    • Streaming vs non-streaming availability
    • Model naming examples

Information

Person of contact: Wangchong

Size

L
