Commit a0fae87
feat: add completion types and optimize union unmarshal (envoyproxy#1242)
**Description**
This PR adds support for OpenAI's legacy `/completions` endpoint to Envoy AI
Gateway, enabling compatibility with models like `babbage-002` and
`gpt-3.5-turbo-instruct`. This is useful for cost-optimized or fine-tuned
setups (e.g., vLLM/llama.cpp) that need suffix parameters or token prompts.
It also refactors the union types (`ContentUnion`, `EmbeddingRequestInput`,
`PromptUnion`) to use custom unmarshaling, giving **1.6x–10x speedups**,
**60%–95% less memory**, and **52%–93% fewer allocations** vs.
openai-go. These gains are critical for batch/token-heavy workloads.
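
To illustrate the general idea behind custom union unmarshaling, here is a minimal sketch that picks the target type from the leading bytes of the JSON value and then decodes once. It is not the code from this PR: the `PromptUnion` shape, its `Value` field, and the `parsePrompt` helper are assumptions made for illustration. The four accepted shapes (string, string array, token array, array of token arrays) follow the OpenAI completions API's `prompt` parameter.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// PromptUnion is an illustrative stand-in, not the gateway's actual type.
// Value holds one of: string, []string, []int64, or [][]int64.
type PromptUnion struct {
	Value any
}

// UnmarshalJSON picks the target type from the leading bytes and decodes once,
// instead of attempting each candidate type in turn.
func (p *PromptUnion) UnmarshalJSON(data []byte) error {
	data = bytes.TrimSpace(data)
	if len(data) == 0 {
		return errors.New("prompt: empty input")
	}
	if data[0] == '"' {
		var s string
		if err := json.Unmarshal(data, &s); err != nil {
			return err
		}
		p.Value = s
		return nil
	}
	if data[0] != '[' {
		return fmt.Errorf("prompt: unexpected leading byte %q", data[0])
	}
	// Inspect the first element to tell the three array shapes apart.
	rest := bytes.TrimSpace(data[1:])
	if len(rest) == 0 {
		return errors.New("prompt: truncated array")
	}
	switch rest[0] {
	case '"':
		var v []string
		if err := json.Unmarshal(data, &v); err != nil {
			return err
		}
		p.Value = v
	case '[':
		var v [][]int64
		if err := json.Unmarshal(data, &v); err != nil {
			return err
		}
		p.Value = v
	default:
		// Numbers (or an empty array) are decoded as a token list.
		var v []int64
		if err := json.Unmarshal(data, &v); err != nil {
			return err
		}
		p.Value = v
	}
	return nil
}

// parsePrompt is a hypothetical helper that decodes one raw JSON prompt value.
func parsePrompt(raw string) (any, error) {
	var p PromptUnion
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return nil, err
	}
	return p.Value, nil
}

func main() {
	for _, raw := range []string{`"hi"`, `["a","b"]`, `[1,2,3]`, `[[1,2],[3]]`} {
		v, err := parsePrompt(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%T\n", v) // string, []string, []int64, [][]int64
	}
}
```

Dispatching on the first byte avoids the failed decode attempts (and their error allocations) that a try-each-type approach incurs, which is one plausible source of the kind of savings described above.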
*Key Changes*
- New types: `CompletionRequest`, `CompletionResponse`, `PromptUnion`.
- Model constants for legacy testing.
- Unified `Usage` across APIs.
- New Benchmarks!
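
As a rough illustration of how allocation differences like those claimed above can be measured, the sketch below compares a hypothetical try-each-type decoder against a first-byte dispatch decoder using the standard library's `testing.AllocsPerRun`. The function names (`decodeTryEach`, `decodeDispatch`, `allocsFor`) are invented for this example, the dispatch logic assumes compact JSON with no leading whitespace, and none of this reproduces the PR's benchmarks or its numbers.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"testing"
)

// decodeTryEach is a hypothetical baseline: attempt each candidate type in turn.
func decodeTryEach(data []byte) (any, error) {
	var s string
	if err := json.Unmarshal(data, &s); err == nil {
		return s, nil
	}
	var ss []string
	if err := json.Unmarshal(data, &ss); err == nil {
		return ss, nil
	}
	var toks []int64
	if err := json.Unmarshal(data, &toks); err == nil {
		return toks, nil
	}
	return nil, errors.New("unsupported prompt")
}

// decodeDispatch chooses the target type from the leading bytes, then decodes once.
// Assumption: the input is compact JSON without leading whitespace.
func decodeDispatch(data []byte) (any, error) {
	switch {
	case len(data) == 0:
		return nil, errors.New("empty prompt")
	case data[0] == '"':
		var s string
		if err := json.Unmarshal(data, &s); err != nil {
			return nil, err
		}
		return s, nil
	case len(data) > 1 && data[0] == '[' && data[1] == '"':
		var ss []string
		if err := json.Unmarshal(data, &ss); err != nil {
			return nil, err
		}
		return ss, nil
	case data[0] == '[':
		var toks []int64
		if err := json.Unmarshal(data, &toks); err != nil {
			return nil, err
		}
		return toks, nil
	}
	return nil, errors.New("unsupported prompt")
}

// allocsFor reports the average number of heap allocations per call of f(data).
func allocsFor(f func([]byte) (any, error), data []byte) float64 {
	return testing.AllocsPerRun(100, func() { _, _ = f(data) })
}

func main() {
	data := []byte(`[1,2,3,4,5,6,7,8]`)
	fmt.Println("try-each allocs/op:", allocsFor(decodeTryEach, data))
	fmt.Println("dispatch allocs/op:", allocsFor(decodeDispatch, data))
}
```

The absolute counts depend on the Go version and input, so the sketch prints them rather than asserting specific values; for timing and memory figures, a regular `go test -bench` benchmark with `b.ReportAllocs()` is the usual tool.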
**Related Issues/PRs**
Starts envoyproxy#1231
---------
Signed-off-by: Adrian Cole <[email protected]>
1 parent cf48588 commit a0fae87
File tree
40 files changed, +7207 -4978 lines changed
- cmd/aigw
- internal
  - apischema/openai
  - extproc
    - translator
  - tracing
    - openinference/openai
- tests
  - extproc
  - internal
    - testopenai
      - cassettes
    - testopeninference
      - spans