Merged
Changes from 122 commits
Commits
136 commits
d2a8238
site: fix dependency versions and config for site (#1286)
missBerg Oct 6, 2025
d437517
ci: format Go with golangci-lint (#1289)
anuraaga Oct 7, 2025
f14724b
fix(site/docs): helm CRD installation section heading spelling error …
tiswanso Oct 7, 2025
57221d1
docs: add pyhton agent tracing example (#1295)
nacx Oct 7, 2025
dba1989
ci: skip test_e2e_aigw on doc chaneg & fix docker job requirement (#1…
mathetake Oct 7, 2025
cffb9cd
fix: makes gRPC's MaxRecvMsgSize for the extension server (#1291)
owayss Oct 7, 2025
c041ff1
ci: format markdown files with prettier (#1290)
anuraaga Oct 7, 2025
89bc99d
mcp: relaxes the tight backendRef count limit on MCPRoute (#1284)
mathetake Oct 7, 2025
2ae8746
fix: make mcp span names informative (#1288)
codefromthecrypt Oct 7, 2025
fe43a82
fix: response_format schema type (#1299)
xiaolin593 Oct 8, 2025
309ef40
ci: migrate from codespell to misspell (#1300)
anuraaga Oct 8, 2025
0344926
translator: add marshal/unmarshal for tool choice (#1297)
aabchoo Oct 8, 2025
32ce458
feat: add gemini safety ratings to ChatCompletion responses (#1287)
sukumargaonkar Oct 8, 2025
cd8fef4
fix(site/docs): grammar fixes and k8s sidecar ref for concepts sectio…
tiswanso Oct 8, 2025
8533378
docs: fix mcp agent example (#1308)
nacx Oct 8, 2025
1b96d5d
mcp: configure access logs for upstream mcp servers (#1302)
nacx Oct 8, 2025
2d01dee
mcp: configure heartbeats to be less chatty (#1294)
nacx Oct 8, 2025
70613ad
test(mcp): drops learn-microsoft as it's flaky (#1310)
mathetake Oct 8, 2025
3df1d05
deps: upgrade EG Go dependency (#1309)
mathetake Oct 8, 2025
b5ed37c
feat: cleanly implement HEALTHCHECK for aigw docker (#1314)
codefromthecrypt Oct 9, 2025
d41fc34
feat: imagepull secrets for extproc image (#1311)
johnugeorge Oct 9, 2025
bb0b50c
aigw: refactors logging so there is basic status output (#1316)
codefromthecrypt Oct 9, 2025
8b6a8ce
fix(controller): add cases for Azure apikey/credentials to backendSec…
tiswanso Oct 9, 2025
9acb6e0
fix: remove aws knowledge MCP server (#1321)
codefromthecrypt Oct 9, 2025
9d64bdf
deps: update EG Go dependency (#1319)
codefromthecrypt Oct 9, 2025
10680e0
fix: request memory corruption on fallback (#1322)
johnugeorge Oct 10, 2025
3f10c30
fix: check sidecar container for rolling update (#1323)
yuzisun Oct 10, 2025
6519927
mcp: fix github public server tests after they renamed their tools (#…
nacx Oct 10, 2025
d3ff4ff
feat: support inferencepool v1 (#1033)
Xunzhuo Oct 10, 2025
0d73e09
feat: update cached token usage stats from cloud providers (#1276)
yuzisun Oct 10, 2025
ef4e0b1
cli: allow setting the Envoy version in standalone mode (#1156)
nacx Oct 10, 2025
09bf4c7
chore: remove flakey test for test forwarder (#1329)
codefromthecrypt Oct 10, 2025
7b55cbc
api: enforce the limitation on k8s service at CRD (#1328)
mathetake Oct 10, 2025
89c83cb
fix: show envoy stdout/stderr in aigw run (#1327)
codefromthecrypt Oct 10, 2025
d5d89ca
docs(api): ensure optional attributes on optional fields for MCP CRDs…
mathetake Oct 10, 2025
37b358b
fix: correctly URL encode ARNs when using inference profiles on Bedro…
adam-weber Oct 11, 2025
1c30a22
docs: add tracing implementation to the mcp design (#1339)
nacx Oct 11, 2025
9d73117
fix: fix very bad poll interval which causes test flakes (#1343)
codefromthecrypt Oct 12, 2025
36e4793
feat: azure-openai embeddings translator (#1257)
ion-elgreco Oct 12, 2025
91ad9bf
feat: add OpenAI image generation endpoint support
PatilHrushikesh Sep 22, 2025
70d09c0
feat: add metrics and tracing support for image generation
PatilHrushikesh Sep 22, 2025
2d22b18
chore: integrate image generation endpoint into main application
PatilHrushikesh Sep 22, 2025
da2622b
feat(aigw): consolidate admin server into a single port (#1236)
codefromthecrypt Sep 29, 2025
a4e904a
feat: create embeddings tracing implementation (#1240)
codefromthecrypt Sep 29, 2025
88b2465
feat: add metrics and tracing support for image generation
nutanix-Hrushikesh Sep 22, 2025
c723caa
extproc: update image generation processor metrics and handling
nutanix-Hrushikesh Sep 25, 2025
0abf064
metrics: refine image generation metrics and labels
nutanix-Hrushikesh Sep 25, 2025
f36b7d2
feat(extproc): implement image generation processor with OpenAI integ…
nutanix-Hrushikesh Oct 1, 2025
bc72988
feat(translator): add OpenAI image generation translation layer
nutanix-Hrushikesh Oct 1, 2025
35a2434
feat(observability): add comprehensive monitoring for image generation
nutanix-Hrushikesh Oct 1, 2025
0221b8a
refactor(extproc): enhance utilities and testing infrastructure
nutanix-Hrushikesh Oct 1, 2025
f348973
chore(deps): update dependencies and clean up legacy code
nutanix-Hrushikesh Oct 1, 2025
e4dffe6
test: add comprehensive tests for image generation tracing
nutanix-Hrushikesh Oct 1, 2025
23236c8
test: enhance image generation processor tests
nutanix-Hrushikesh Oct 1, 2025
1e826f6
feat: enhance image generation metrics
nutanix-Hrushikesh Oct 1, 2025
4ae452f
test: improve OpenInference image generation tests
nutanix-Hrushikesh Oct 1, 2025
e2d32a1
feat: add OpenAI image generation endpoint support
nutanix-Hrushikesh Oct 1, 2025
d1274f0
fix: remove duplicate waitUntilKubectl function
nutanix-Hrushikesh Oct 6, 2025
98b9cd0
feat: add SetOriginalModel method to ImageGenerationMetrics interface
nutanix-Hrushikesh Oct 6, 2025
d117c71
feat: update image generation translator constructor for tracing support
nutanix-Hrushikesh Oct 6, 2025
b2738f2
feat: align image generation processor with embeddings model handling…
nutanix-Hrushikesh Oct 6, 2025
cf5362f
chore(lint): fix revive, unconvert, and testifylint in image generati…
nutanix-Hrushikesh Oct 6, 2025
ccf4363
tests(testopenai): add image-generation cassette and request scaffolding
nutanix-Hrushikesh Oct 8, 2025
0afc3b8
tests(openinference): add cached span for image-generation-basic tests
nutanix-Hrushikesh Oct 8, 2025
7f57409
tests(openinference): refine span recording tests and proxy behavior
nutanix-Hrushikesh Oct 8, 2025
a290878
extproc: implement image generation processor and tests
nutanix-Hrushikesh Oct 8, 2025
40f195a
apischema(openai): extend schema to support image generation
nutanix-Hrushikesh Oct 8, 2025
b99b923
docs/compose: update docker-compose and .env for image features
nutanix-Hrushikesh Oct 8, 2025
11cdf26
docs(aigw): README and docker-compose updates for image generation (g…
nutanix-Hrushikesh Oct 9, 2025
f42a731
tests+api: use smaller model gpt-image-1-mini instead of dall-e-2; up…
nutanix-Hrushikesh Oct 9, 2025
1cac2de
tests(testopenai): add wait before server close to ensure cassette re…
nutanix-Hrushikesh Oct 9, 2025
d297900
fix: resolve linting issues
nutanix-Hrushikesh Oct 10, 2025
dbc5000
feat: update image generation model configuration
nutanix-Hrushikesh Oct 10, 2025
ee25037
refactor: consolidate error handling in image generation tracing
nutanix-Hrushikesh Oct 10, 2025
3fb95f0
fix: set response model from actual response body in image generation…
nutanix-Hrushikesh Oct 13, 2025
498172d
fix: move requireMCPSpan to a non-test file (#1336)
abolishgenocidenow Oct 13, 2025
e7d71af
docs: fixes broken supported-endpoints table (#1345)
mathetake Oct 13, 2025
9d38e2d
ci: add missing EG_VERSION env var in inference e2e (#1352)
nacx Oct 13, 2025
61185f3
feat: allow to configure the namespaces to watch (#1351)
nacx Oct 13, 2025
1d2839f
chore(deps): bump the go group across 1 directory with 21 updates (#1…
dependabot[bot] Oct 13, 2025
9ed5b2e
test: use aigw also for llm in goose e2e (#1354)
codefromthecrypt Oct 13, 2025
7d3200c
docs: mention the commit-hash-tagged helm chart (#1355)
mathetake Oct 13, 2025
55b00b7
docs: convert the goose example from an E2E to an example (#1356)
codefromthecrypt Oct 13, 2025
3633b8f
feat: implement otel tracing and metrics for completions endpoint (#1…
codefromthecrypt Oct 13, 2025
7c95a1d
fix: use internal id for routerProcessorsPerReqID map (#1344)
yuzisun Oct 13, 2025
a73b70f
mcp: properly handle missing session ID header (#1366)
nacx Oct 14, 2025
3ab4ec5
chore(controller_test): add OIDCtoken cases for SecurityPolicyIndexFu…
tiswanso Oct 14, 2025
790508f
test: consolidates e2e-ish aigw cli tests into e2e-aigw (#1359)
mathetake Oct 14, 2025
dd65197
fix gitignore, remove bedrock test, remove unnessesary debug logs, re…
nutanix-Hrushikesh Oct 14, 2025
f3386f7
feat(aigw): replace custom admin monitor with one in func-e (#1341)
codefromthecrypt Oct 14, 2025
7e8c0bb
mcp: make session encryption seed rotatable (#1357)
mathetake Oct 14, 2025
90b4eb8
feat: include model information in ChatCompletion responses for GCP A…
sukumargaonkar Oct 14, 2025
dccbce2
mcp: add configured header attribtues to metrics and spans (#1342)
nacx Oct 14, 2025
e6c1b62
fix: add stream value in gcp anthropic body (#1370)
alexagriffith Oct 15, 2025
6014aad
feat: support "CachedInputToken" type in "llmRequestCosts" (#1315)
everpeace Oct 15, 2025
0e28416
feat: first party Anthropic (api.anthropic.com) support (#1369)
mathetake Oct 15, 2025
1f58846
fix: distinguish /messages endpoint metrics from /chat/completions (#…
mathetake Oct 15, 2025
b6fafe9
chore(deps): bump the go group across 1 directory with 21 updates (#1…
dependabot[bot] Oct 13, 2025
522b36f
test: use aigw also for llm in goose e2e (#1354)
codefromthecrypt Oct 13, 2025
4331f2c
docs: convert the goose example from an E2E to an example (#1356)
codefromthecrypt Oct 13, 2025
74bf967
feat: implement otel tracing and metrics for completions endpoint (#1…
codefromthecrypt Oct 13, 2025
3b94cce
feat(aigw): replace custom admin monitor with one in func-e (#1341)
codefromthecrypt Oct 14, 2025
d9ef8fd
tests(extproc): add VCR test for OpenAI image generation
nutanix-Hrushikesh Oct 16, 2025
618505e
Merge branch 'main' into image-generation
nutanix-Hrushikesh Oct 17, 2025
6b537a3
fix(metrics): correctly instantiate per-request scope metrics
nutanix-Hrushikesh Oct 17, 2025
7314a8a
fix(metrics): set model before recording image metrics
nutanix-Hrushikesh Oct 17, 2025
1ec80f5
fix: clarify request model attribution and error wrapping
nutanix-Hrushikesh Oct 17, 2025
7d8bc39
docs: add and clarify supported endpoints page for image generation
nutanix-Hrushikesh Oct 17, 2025
27dfe48
tests(extproc): add OTEL image generation metrics and tracing
nutanix-Hrushikesh Oct 17, 2025
4b38b27
fix: only fail-fast on unexpected 5xx in testupstream_test.go
nutanix-Hrushikesh Oct 17, 2025
67bd662
extproc: images(OpenAI->OpenAI): wrap non-JSON upstream errors; strea…
nutanix-Hrushikesh Oct 19, 2025
48d6abb
docs: add Image Generation column to provider compatibility table
nutanix-Hrushikesh Oct 20, 2025
4e24c3c
Merge remote-tracking branch 'upstream/main' into image-generation
nutanix-Hrushikesh Oct 20, 2025
55a34f5
chore: remove unnecessary debug logging
nutanix-Hrushikesh Oct 21, 2025
6afd3ed
fix: clear env vars for config tests
nutanix-Hrushikesh Oct 21, 2025
aef452b
test: add test coverage for image generation
nutanix-Hrushikesh Oct 21, 2025
01a6cc1
Merge branch 'main' of github.com:envoyproxy/ai-gateway into image-ge…
nutanix-Hrushikesh Oct 21, 2025
a0edec0
fix: add missing attributes to image generation span
nutanix-Hrushikesh Oct 21, 2025
11acc32
fix: rename embeddings container
nutanix-Hrushikesh Oct 22, 2025
c3e078e
fix: update README to remove completion service
nutanix-Hrushikesh Oct 22, 2025
e81223a
chore: remove unnecessary debug logging and add defer cancel for context
nutanix-Hrushikesh Oct 22, 2025
39bd095
Merge branch 'main' into image-generation
nutanix-Hrushikesh Oct 22, 2025
7fa70f0
chore: run precommit to fix formatting issues
nutanix-Hrushikesh Oct 22, 2025
6f5f8d5
fix: add completion service to docker-compose-otel.yaml
nutanix-Hrushikesh Oct 22, 2025
716c31d
chore: undo unintended edits
nutanix-Hrushikesh Oct 22, 2025
3889088
fix: update docker-compose to use separate AIGW for image generation
nutanix-Hrushikesh Oct 22, 2025
5ea089e
fix: remove image generation support from docker-compose files and re…
nutanix-Hrushikesh Oct 22, 2025
66eb0b7
chore: remove redundant ImageGenerationError implementation and use o…
nutanix-Hrushikesh Oct 22, 2025
3fb4f0e
chore: revert GenAI metric constants to original structure
nutanix-Hrushikesh Oct 23, 2025
b7af1cc
fix: remove image generation specific tracing attributes
nutanix-Hrushikesh Oct 23, 2025
b96aa89
fix: remove .env.images file
nutanix-Hrushikesh Oct 23, 2025
e0ff580
Merge branch 'main' into image-generation
nutanix-Hrushikesh Oct 23, 2025
949a70a
fix: remove gen_ai.operation.name attribute from image generation span
nutanix-Hrushikesh Oct 23, 2025
e88071d
updates metrics.md
mathetake Oct 23, 2025
c955e6f
Merge remote-tracking branch 'origin/main' into image-generation
mathetake Oct 23, 2025
2d25398
more
mathetake Oct 23, 2025
1 change: 1 addition & 0 deletions .env.ollama
Original file line number Diff line number Diff line change
Expand Up @@ -2,3 +2,4 @@ CHAT_MODEL=qwen2.5:0.5b
THINKING_MODEL=qwen3:1.7b
COMPLETION_MODEL=qwen2.5:0.5b
EMBEDDINGS_MODEL=all-minilm:33m
IMAGE_GENERATION_MODEL=dall-e-2
Member:
how is this Ollama?

Contributor Author:
I know this won't work, but there is no image generation model available with Ollama.

1 change: 1 addition & 0 deletions .gitignore
Expand Up @@ -49,3 +49,4 @@ inference-extension-conformance-test-report.yaml
.mcp.json

.goose
/aigw
31 changes: 29 additions & 2 deletions cmd/aigw/README.md
Expand Up @@ -56,12 +56,38 @@ Here are values we use for Ollama:
```

- MCP (Model Context Protocol) tool call:

```bash
docker compose run --rm mcp
```

This calls the kiwi MCP server through aigw's MCP Gateway at `/mcp`.

4. **Shutdown the example stack**:
- Image generation:
- Using service:
```bash
docker compose run --rm image-generation
```
- Using curl (save to file):
```bash
curl -s \
-X POST http://localhost:1975/v1/images/generations \
-H "Authorization: Bearer unused" \
-H "Content-Type: application/json" \
-d '{"model":"gpt-image-1-mini","prompt":"A watercolor painting of a red fox in a birch forest","size":"1024x1024","quality":"low"}' \
| jq -r '.data[0].b64_json' | base64 -d > image.png
```

4. **Create embeddings**:

The `embeddings` service uses `curl` to send an embeddings request
to the AI Gateway CLI (aigw) which routes it to Ollama.

```bash
docker compose run --rm embeddings
```

5. **Shutdown the example stack**:

`down` stops the containers and removes the volumes used by the stack.
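
For readers who prefer Go over the `curl | jq | base64` pipeline in the image generation step above, the decode step could be sketched roughly as follows. This is a minimal sketch, assuming the response body carries the image at `data[0].b64_json` as the `jq` filter implies; `imageResponse` and `decodeFirstImage` are hypothetical names that do not appear in this PR.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// imageResponse models only the field the curl pipeline reads
// (.data[0].b64_json). A real response may carry more fields.
type imageResponse struct {
	Data []struct {
		B64JSON string `json:"b64_json"`
	} `json:"data"`
}

// decodeFirstImage is a hypothetical helper (not part of this PR). It returns
// the raw bytes of the first image in an image generation response body.
func decodeFirstImage(body []byte) ([]byte, error) {
	var resp imageResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	if len(resp.Data) == 0 || resp.Data[0].B64JSON == "" {
		return nil, errors.New("response contains no b64_json image")
	}
	return base64.StdEncoding.DecodeString(resp.Data[0].B64JSON)
}

func main() {
	// "aGVsbG8=" is base64 for "hello"; a real payload would hold image bytes.
	img, err := decodeFirstImage([]byte(`{"data":[{"b64_json":"aGVsbG8="}]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(img), "bytes decoded")
}
```

Writing the returned bytes with `os.WriteFile("image.png", img, 0o644)` would match the `> image.png` redirect in the curl example.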

Expand Down Expand Up @@ -156,8 +182,9 @@ This configures the OTLP endpoint to otel-tui on port 4318.
```bash
COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm chat-completion
COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm create-embeddings
COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm completion
COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm mcp
# Image generation
COMPOSE_PROFILES="<profile>" docker compose -f docker-compose-otel.yaml run --build --rm image-generation
```

3. **Check telemetry output**:
Expand Down
4 changes: 4 additions & 0 deletions cmd/aigw/config_test.go
Expand Up @@ -84,6 +84,7 @@ func TestReadConfig(t *testing.T) {
// Clear any existing env vars
t.Setenv("OPENAI_API_KEY", "")
t.Setenv("OPENAI_BASE_URL", "")
t.Setenv("AZURE_OPENAI_API_KEY", "")
Member:
how is this relevant to this PR?

Contributor Author:
Tests were failing locally, so I added this to fix them, but I'll remove it.


for k, v := range tt.envVars {
t.Setenv(k, v)
Expand All @@ -104,6 +105,9 @@ func TestReadConfig(t *testing.T) {
}

t.Run("error when file and no OPENAI_API_KEY", func(t *testing.T) {
// Ensure both OpenAI and Azure keys are unset so readConfig errors
t.Setenv("OPENAI_API_KEY", "")
t.Setenv("AZURE_OPENAI_API_KEY", "")
_, err := readConfig("", nil, false)
require.Error(t, err)
require.EqualError(t, err, "you must supply at least OPENAI_API_KEY or AZURE_OPENAI_API_KEY or a config file path")
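
Why the extra `t.Setenv` calls matter can be illustrated with a small stand-in. This is only a sketch: `checkInputs` is a hypothetical function, not the real `readConfig`, and it assumes that an empty variable is treated the same as an unset one. The error text is copied from the assertion above.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoCredentials reuses the message asserted in the test hunk above.
var errNoCredentials = errors.New("you must supply at least OPENAI_API_KEY or AZURE_OPENAI_API_KEY or a config file path")

// checkInputs is a hypothetical stand-in for the validation done by
// readConfig. Assumption: an empty value counts as "not supplied".
func checkInputs(configPath string, getenv func(string) string) error {
	if configPath != "" {
		return nil
	}
	if getenv("OPENAI_API_KEY") != "" || getenv("AZURE_OPENAI_API_KEY") != "" {
		return nil
	}
	return errNoCredentials
}

func main() {
	// Clean environment: the error path is reached, as the test expects.
	empty := func(string) string { return "" }
	// A key leaking in from the developer's shell hides the error path.
	leaked := func(k string) string {
		if k == "AZURE_OPENAI_API_KEY" {
			return "from-developer-shell"
		}
		return ""
	}
	fmt.Println("clean env errors:", checkInputs("", empty) != nil)
	fmt.Println("leaked env errors:", checkInputs("", leaked) != nil)
}
```

Under that assumption, a test that expects the error only passes reliably when both variables are cleared first, which is what the added lines do.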
Expand Down
36 changes: 18 additions & 18 deletions cmd/aigw/docker-compose-otel.yaml
Expand Up @@ -123,24 +123,6 @@ services:
- OPENAI_BASE_URL=http://aigw:1975/v1
- OPENAI_API_KEY=unused

# completion is the standard OpenAI client (`openai` in pip), instrumented
# with the following OpenTelemetry instrumentation libraries:
# - openinference-instrumentation-openai (completions spans)
# - opentelemetry-instrumentation-httpx (HTTP client spans and trace headers)
completion:
build:
context: ../../tests/internal/testopeninference
dockerfile: Dockerfile.openai_client
target: completion
container_name: completion
profiles: ["test"]
env_file:
- ../../.env.ollama
- .env.otel.${COMPOSE_PROFILES:-console}
environment:
- OPENAI_BASE_URL=http://aigw:1975/v1
- OPENAI_API_KEY=unused
Comment on lines 126 to 142
Member:
Do not delete unrelated things.

Member:
can you not lie?

# completion is the standard OpenAI client (`openai` in pip), instrumented
# with the following OpenTelemetry instrumentation libraries:
# - openinference-instrumentation-openai (completions spans)
# - opentelemetry-instrumentation-httpx (HTTP client spans and trace headers)
completion:
build:
context: ../../tests/internal/testopeninference
dockerfile: Dockerfile.openai_client
target: completion
container_name: completion
profiles: ["test"]
env_file:
- ../../.env.ollama
- .env.otel.${COMPOSE_PROFILES:-console}
environment:
- OPENAI_BASE_URL=http://aigw:1975/v1
- OPENAI_API_KEY=unused

Contributor Author:
My bad, I think I was referring to an old file.


# mcp is a test client for calling MCP tools through aigw.
# TODO: add client tracing + mcp propagation to this
mcp:
Expand All @@ -162,3 +144,21 @@ services:
- flyTo=LAX
- --tool-arg
- departureDate=15/12/2025

# image-generation is a simple curl-based test client for sending image
# generation requests to aigw.
image-generation:
image: golang:1.25
container_name: image-generation
profiles: ["test"]
env_file:
- ../../.env.ollama
command:
- sh
- -c
- |
curl -s -w %{http_code} \
-X POST http://aigw:1975/v1/images/generations \
-H "Authorization: Bearer unused" \
-H "Content-Type: application/json" \
-d "{\"model\":\"$$IMAGE_GENERATION_MODEL\",\"prompt\":\"A watercolor painting of a red fox in a birch forest\",\"size\":\"1024x1024\",\"quality\":\"low\"}"
20 changes: 20 additions & 0 deletions cmd/aigw/docker-compose.yaml
Expand Up @@ -135,3 +135,23 @@ services:
- flyTo=LAX
- --tool-arg
- departureDate=15/12/2025

# image-generation is a simple curl-based test client for sending image
# generation requests to aigw.
image-generation:
image: golang:1.25
container_name: image-generation
profiles: ["test"]
env_file:
- ../../.env.ollama
command:
- sh
- -c
- |
curl -s -w %{http_code} \
-X POST http://aigw:1975/v1/images/generations \
-H "Authorization: Bearer unused" \
-H "Content-Type: application/json" \
-d "{\"model\":\"$$IMAGE_GENERATION_MODEL\",\"prompt\":\"A watercolor painting of a red fox in a birch forest\",\"size\":\"1024x1024\",\"quality\":\"low\"}"
extra_hosts: # localhost:host-gateway trick doesn't work with aigw
- "host.docker.internal:host-gateway"
4 changes: 4 additions & 0 deletions cmd/aigw/main_test.go
Expand Up @@ -71,6 +71,7 @@ Flags:
{
name: "run no arg",
args: []string{"run"},
env: map[string]string{"OPENAI_API_KEY": "", "AZURE_OPENAI_API_KEY": ""},
Member:
same

rf: func(context.Context, cmdRun, runOpts, io.Writer, io.Writer) error { return nil },
expPanicCode: ptr.To(80),
},
Expand Down Expand Up @@ -190,6 +191,9 @@ func TestCmdRun_Validate(t *testing.T) {

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Ensure a clean environment for validation.
t.Setenv("OPENAI_API_KEY", "")
t.Setenv("AZURE_OPENAI_API_KEY", "")
for k, v := range tt.envVars {
t.Setenv(k, v)
}
Expand Down
3 changes: 3 additions & 0 deletions cmd/extproc/mainlib/main.go
Expand Up @@ -234,6 +234,8 @@ func Main(ctx context.Context, args []string, stderr io.Writer) (err error) {
messagesMetrics := metrics.NewMessagesFactory(meter, metricsRequestHeaderAttributes)
completionMetrics := metrics.NewCompletionFactory(meter, metricsRequestHeaderAttributes)
embeddingsMetrics := metrics.NewEmbeddingsFactory(meter, metricsRequestHeaderAttributes)
imageGenerationMetrics := metrics.NewImageGenerationFactory(meter, metricsRequestHeaderAttributes)()

mcpMetrics := metrics.NewMCP(meter, metricsRequestHeaderAttributes)

tracing, err := tracing.NewTracingFromEnv(ctx, os.Stdout, spanRequestHeaderAttributes)
Expand All @@ -248,6 +250,7 @@ func Main(ctx context.Context, args []string, stderr io.Writer) (err error) {
server.Register(path.Join(flags.rootPrefix, "/v1/chat/completions"), extproc.ChatCompletionProcessorFactory(chatCompletionMetrics))
server.Register(path.Join(flags.rootPrefix, "/v1/completions"), extproc.CompletionsProcessorFactory(completionMetrics))
server.Register(path.Join(flags.rootPrefix, "/v1/embeddings"), extproc.EmbeddingsProcessorFactory(embeddingsMetrics))
server.Register(path.Join(flags.rootPrefix, "/v1/images/generations"), extproc.ImageGenerationProcessorFactory(imageGenerationMetrics))
server.Register(path.Join(flags.rootPrefix, "/v1/models"), extproc.NewModelsProcessor)
server.Register(path.Join(flags.rootPrefix, "/anthropic/v1/messages"), extproc.MessagesProcessorFactory(messagesMetrics))
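
The registrations above build each route with `path.Join(flags.rootPrefix, ...)`. A short sketch of how the standard library joins and cleans these paths may help when reasoning about the new `/v1/images/generations` route. `routePath` is a hypothetical wrapper, and the `/custom` prefix values are assumptions chosen for illustration only.

```go
package main

import (
	"fmt"
	"path"
)

// routePath is a hypothetical wrapper mirroring the path.Join call in the
// diff. path.Join cleans the result, so duplicate slashes are collapsed.
func routePath(rootPrefix, endpoint string) string {
	return path.Join(rootPrefix, endpoint)
}

func main() {
	// The prefixes below are illustrative, not defaults taken from the PR.
	for _, prefix := range []string{"/", "/custom", "/custom/"} {
		fmt.Printf("%-10q -> %s\n", prefix, routePath(prefix, "/v1/images/generations"))
	}
}
```

Because `path.Join` normalizes the result, a prefix with or without a trailing slash yields the same registered path.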

Expand Down
1 change: 1 addition & 0 deletions go.mod
Expand Up @@ -8,6 +8,7 @@ require (
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.0
github.com/a8m/envsubst v1.4.3
github.com/alecthomas/kong v1.12.1
github.com/andybalholm/brotli v1.2.0
github.com/anthropics/anthropic-sdk-go v1.14.0
github.com/aws/aws-sdk-go-v2 v1.39.3
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.2
Expand Down
4 changes: 4 additions & 0 deletions go.sum
Expand Up @@ -41,6 +41,8 @@ github.com/alecthomas/kong v1.12.1 h1:iq6aMJDcFYP9uFrLdsiZQ2ZMmcshduyGv4Pek0MQPW
github.com/alecthomas/kong v1.12.1/go.mod h1:p2vqieVMeTAnaC83txKtXe8FLke2X07aruPWXyMPQrU=
github.com/alecthomas/repr v0.4.0 h1:GhI2A8MACjfegCPVq9f1FLvIBS+DrQ2KQBFZP1iFzXc=
github.com/alecthomas/repr v0.4.0/go.mod h1:Fr0507jx4eOXV7AlPV6AVZLYrLIuIeSOWtW57eE/O/4=
github.com/andybalholm/brotli v1.2.0 h1:ukwgCxwYrmACq68yiUqwIWnGY0cTPox/M94sVwToPjQ=
github.com/andybalholm/brotli v1.2.0/go.mod h1:rzTDkvFWvIrjDXZHkuS16NPggd91W3kUSvPlQ1pLaKY=
github.com/anthropics/anthropic-sdk-go v1.14.0 h1:EzNQvnZlaDHe2UPkoUySDz3ixRgNbwKdH8KtFpv7pi4=
github.com/anthropics/anthropic-sdk-go v1.14.0/go.mod h1:WTz31rIUHUHqai2UslPpw5CwXrQP3geYBioRV4WOLvE=
github.com/antlr4-go/antlr/v4 v4.13.1 h1:SqQKkuVZ+zWkMMNkjy5FZe5mr5WURWnlpmOuzYWrPrQ=
Expand Down Expand Up @@ -448,6 +450,8 @@ github.com/x448/float16 v0.8.4 h1:qLwI1I70+NjRFUR3zs1JPUCgaCXSh3SW62uAKT1mSBM=
github.com/x448/float16 v0.8.4/go.mod h1:14CWIYCyZA/cWjXOioeEpHeN/83MdbZDRQHoFcYsOfg=
github.com/xiang90/probing v0.0.0-20221125231312-a49e3df8f510 h1:S2dVYn90KE98chqDkyE9Z4N61UnQd+KOfgp5Iu53llk=
github.com/xiang90/probing v0.0.0-20221125231312-a49e3df8f510/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU=
github.com/xyproto/randomstring v1.0.5 h1:YtlWPoRdgMu3NZtP45drfy1GKoojuR7hmRcnhZqKjWU=
github.com/xyproto/randomstring v1.0.5/go.mod h1:rgmS5DeNXLivK7YprL0pY+lTuhNQW3iGxZ18UQApw/E=
github.com/yosida95/uritemplate/v3 v3.0.2 h1:Ed3Oyj9yrmi9087+NczuL5BwkIc4wvTb5zIM+UJPGz4=
github.com/yosida95/uritemplate/v3 v3.0.2/go.mod h1:ILOh0sOhIJR3+L/8afwt/kE++YT040gmv5BQTMR2HP4=
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
Expand Down
4 changes: 4 additions & 0 deletions internal/apischema/openai/openai.go
Expand Up @@ -57,6 +57,10 @@ const (

// ModelTextEmbedding3Small is the cheapest model usable with /embeddings.
ModelTextEmbedding3Small = "text-embedding-3-small"

// ModelGPTImage1Mini is the smallest/cheapest Images model usable with
// /v1/images/generations. Use with size "1024x1024" and quality "low".
ModelGPTImage1Mini = "gpt-image-1-mini"
)

// ChatCompletionContentPartRefusalType The type of the content part.
Expand Down