Commit ec504d6

feat/ai proxy vertex ai compatible (#3324)
1 parent b22aa7b commit ec504d6

7 files changed, +802 -1 lines changed

plugins/wasm-go/extensions/ai-proxy/README.md

Lines changed: 82 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -331,6 +331,20 @@ Express Mode is a simplified access mode introduced by Vertex AI; an API Key alone is enough to
331331
| `apiTokens` | array of string | Required | - | API Key used by Express Mode, obtained from API & Services > Credentials in the Google Cloud Console |
332332
| `geminiSafetySetting` | map of string | Optional | - | Content filtering and safety level settings for Gemini AI. See [Safety settings](https://ai.google.dev/gemini-api/docs/safety-settings) |
333333

334+
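**OpenAI Compatible Mode** (using the Vertex AI Chat Completions API):
335+
336+
Vertex AI provides an OpenAI-compatible Chat Completions API endpoint that accepts OpenAI-format requests and returns OpenAI-format responses directly, with no protocol conversion. See the [Vertex AI OpenAI compatibility documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/migrate/openai/overview).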
337+
338+
| Name | Data Type | Requirement | Default | Description |
339+
|-----------------------------|---------------|--------|--------|-------------------------------------------------------------------------------|
340+
| `vertexOpenAICompatible` | boolean | Optional | false | Enables OpenAI compatible mode. When enabled, the Vertex AI OpenAI-compatible Chat Completions API is used |
341+
| `vertexAuthKey` | string | Required | - | Google Service Account JSON Key used for authentication |
342+
| `vertexRegion` | string | Required | - | Google Cloud region (e.g. us-central1, europe-west4) |
343+
| `vertexProjectId` | string | Required | - | Google Cloud project ID |
344+
| `vertexAuthServiceName` | string | Required | - | Service name used for OAuth2 authentication |
345+
346+
**Note**: OpenAI compatible mode and Express Mode are mutually exclusive; `apiTokens` and `vertexOpenAICompatible` cannot be configured at the same time.
347+
334348
#### AWS Bedrock
335349

336350
The type for AWS Bedrock is bedrock. It supports two authentication methods:
@@ -2082,6 +2096,74 @@ provider:
20822096
}
20832097
```
20842098

2099+
### Proxying Google Vertex Services with the OpenAI Protocol (OpenAI Compatible Mode)
2100+
2101+
OpenAI compatible mode uses the Vertex AI OpenAI-compatible Chat Completions API. Both requests and responses use the OpenAI format, so no protocol conversion is needed.
2102+
2103+
**Configuration**
2104+
2105+
```yaml
2106+
provider:
2107+
type: vertex
2108+
vertexOpenAICompatible: true
2109+
vertexAuthKey: |
2110+
{
2111+
"type": "service_account",
2112+
"project_id": "your-project-id",
2113+
"private_key_id": "your-private-key-id",
2114+
"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
2115+
"client_email": "your-service-account@your-project.iam.gserviceaccount.com",
2116+
"token_uri": "https://oauth2.googleapis.com/token"
2117+
}
2118+
vertexRegion: us-central1
2119+
vertexProjectId: your-project-id
2120+
vertexAuthServiceName: your-auth-service-name
2121+
modelMapping:
2122+
"gpt-4": "gemini-2.0-flash"
2123+
"*": "gemini-1.5-flash"
2124+
```
2125+
2126+
**Request Example**
2127+
2128+
```json
2129+
{
2130+
"model": "gpt-4",
2131+
"messages": [
2132+
{
2133+
"role": "user",
2134+
"content": "Hello, who are you?"
2135+
}
2136+
],
2137+
"stream": false
2138+
}
2139+
```
2140+
2141+
**Response Example**
2142+
2143+
```json
2144+
{
2145+
"id": "chatcmpl-abc123",
2146+
"choices": [
2147+
{
2148+
"index": 0,
2149+
"message": {
2150+
"role": "assistant",
2151+
"content": "Hello! I am a Gemini model developed by Google. I can help answer questions, provide information, and hold a conversation. How can I help you?"
2152+
},
2153+
"finish_reason": "stop"
2154+
}
2155+
],
2156+
"created": 1729986750,
2157+
"model": "gemini-2.0-flash",
2158+
"object": "chat.completion",
2159+
"usage": {
2160+
"prompt_tokens": 12,
2161+
"completion_tokens": 35,
2162+
"total_tokens": 47
2163+
}
2164+
}
2165+
```
2166+
20852167
### Proxying AWS Bedrock Services with the OpenAI Protocol
20862168

20872169
AWS Bedrock supports two authentication methods:

plugins/wasm-go/extensions/ai-proxy/README_EN.md

Lines changed: 79 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -277,6 +277,20 @@ Express Mode is a simplified access mode introduced by Vertex AI. You can quickl
277277
| `apiTokens` | array of string | Required | - | API Key for Express Mode, obtained from Google Cloud Console under API & Services > Credentials |
278278
| `vertexGeminiSafetySetting` | map of string | Optional | - | Gemini model content safety filtering settings. |
279279

280+
**OpenAI Compatible Mode** (using Vertex AI Chat Completions API):
281+
282+
Vertex AI provides an OpenAI-compatible Chat Completions API endpoint, allowing you to use OpenAI format requests and responses directly without protocol conversion. See [Vertex AI OpenAI Compatibility documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/migrate/openai/overview).
283+
284+
| Name | Data Type | Requirement | Default | Description |
285+
|-----------------------------|------------------|---------------| ------ |-------------------------------------------------------------------------------------------------------------------------------------------------------------|
286+
| `vertexOpenAICompatible` | boolean | Optional | false | Enable OpenAI compatible mode. When enabled, uses Vertex AI's OpenAI-compatible Chat Completions API |
287+
| `vertexAuthKey` | string | Required | - | Google Service Account JSON Key for authentication |
288+
| `vertexRegion` | string | Required | - | Google Cloud region (e.g., us-central1, europe-west4) |
289+
| `vertexProjectId` | string | Required | - | Google Cloud Project ID |
290+
| `vertexAuthServiceName` | string | Required | - | Service name for OAuth2 authentication |
291+
292+
**Note**: OpenAI Compatible Mode and Express Mode are mutually exclusive. You cannot configure both `apiTokens` and `vertexOpenAICompatible` at the same time.
293+
280294
#### AWS Bedrock
281295

282296
For AWS Bedrock, the corresponding `type` is `bedrock`. It supports two authentication methods:
@@ -1848,6 +1862,71 @@ provider:
18481862
}
18491863
```
18501864

1865+
### Utilizing OpenAI Protocol Proxy for Google Vertex Services (OpenAI Compatible Mode)
1866+
1867+
OpenAI Compatible Mode uses Vertex AI's OpenAI-compatible Chat Completions API. Both requests and responses use OpenAI format, requiring no protocol conversion.
1868+
1869+
**Configuration Information**
1870+
```yaml
1871+
provider:
1872+
type: vertex
1873+
vertexOpenAICompatible: true
1874+
vertexAuthKey: |
1875+
{
1876+
"type": "service_account",
1877+
"project_id": "your-project-id",
1878+
"private_key_id": "your-private-key-id",
1879+
"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
1880+
"client_email": "your-service-account@your-project.iam.gserviceaccount.com",
1881+
"token_uri": "https://oauth2.googleapis.com/token"
1882+
}
1883+
vertexRegion: us-central1
1884+
vertexProjectId: your-project-id
1885+
vertexAuthServiceName: your-auth-service-name
1886+
modelMapping:
1887+
"gpt-4": "gemini-2.0-flash"
1888+
"*": "gemini-1.5-flash"
1889+
```
1890+
1891+
**Request Example**
1892+
```json
1893+
{
1894+
"model": "gpt-4",
1895+
"messages": [
1896+
{
1897+
"role": "user",
1898+
"content": "Hello, who are you?"
1899+
}
1900+
],
1901+
"stream": false
1902+
}
1903+
```
1904+
1905+
**Response Example**
1906+
```json
1907+
{
1908+
"id": "chatcmpl-abc123",
1909+
"choices": [
1910+
{
1911+
"index": 0,
1912+
"message": {
1913+
"role": "assistant",
1914+
"content": "Hello! I am Gemini, an AI model developed by Google. I can help answer questions, provide information, and engage in conversations. How can I assist you today?"
1915+
},
1916+
"finish_reason": "stop"
1917+
}
1918+
],
1919+
"created": 1729986750,
1920+
"model": "gemini-2.0-flash",
1921+
"object": "chat.completion",
1922+
"usage": {
1923+
"prompt_tokens": 12,
1924+
"completion_tokens": 35,
1925+
"total_tokens": 47
1926+
}
1927+
}
1928+
```
1929+
18511930
### Utilizing OpenAI Protocol Proxy for AWS Bedrock Services
18521931

18531932
AWS Bedrock supports two authentication methods:

plugins/wasm-go/extensions/ai-proxy/provider/provider.go

Lines changed: 4 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -387,6 +387,9 @@ type ProviderConfig struct {
387387
// @Title zh-CN Vertex token刷新提前时间
388388
// @Description zh-CN 用于Google服务账号认证,access token过期时间判定提前刷新,单位为秒,默认值为60秒
389389
vertexTokenRefreshAhead int64 `required:"false" yaml:"vertexTokenRefreshAhead" json:"vertexTokenRefreshAhead"`
390+
// @Title zh-CN Vertex AI OpenAI兼容模式
391+
// @Description zh-CN 启用后将使用Vertex AI的OpenAI兼容API,请求和响应均使用OpenAI格式,无需协议转换。与Express Mode(apiTokens)互斥。
392+
vertexOpenAICompatible bool `required:"false" yaml:"vertexOpenAICompatible" json:"vertexOpenAICompatible"`
390393
// @Title zh-CN 翻译服务需指定的目标语种
391394
// @Description zh-CN 翻译结果的语种,目前仅适用于DeepL服务。
392395
targetLang string `required:"false" yaml:"targetLang" json:"targetLang"`
@@ -540,6 +543,7 @@ func (c *ProviderConfig) FromJson(json gjson.Result) {
540543
if c.vertexTokenRefreshAhead == 0 {
541544
c.vertexTokenRefreshAhead = 60
542545
}
546+
c.vertexOpenAICompatible = json.Get("vertexOpenAICompatible").Bool()
543547
c.targetLang = json.Get("targetLang").String()
544548

545549
if schemaValue, ok := json.Get("responseJsonSchema").Value().(map[string]interface{}); ok {
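
The `vertexTokenRefreshAhead` annotation in this file says the access token is treated as expired a configurable number of seconds early, with 60 seconds as the default applied in `FromJson`. The token cache itself is not part of this diff, so the following is only a minimal sketch of that check; `needsRefresh` and `defaultRefreshAhead` are hypothetical names, and timestamps are plain Unix seconds to keep the example small.

```go
package main

import "fmt"

// defaultRefreshAhead mirrors the 60-second default applied in FromJson.
const defaultRefreshAhead int64 = 60

// needsRefresh is a hypothetical helper, not a function from the plugin.
// It reports whether a token expiring at expiresAtUnix should be refreshed
// at nowUnix when it is treated as expired aheadSeconds early.
// A zero aheadSeconds falls back to the default, as in FromJson.
func needsRefresh(expiresAtUnix, nowUnix, aheadSeconds int64) bool {
	if aheadSeconds == 0 {
		aheadSeconds = defaultRefreshAhead
	}
	return nowUnix >= expiresAtUnix-aheadSeconds
}

func main() {
	// Token expires in 30s: inside the 60s window, so refresh now.
	fmt.Println(needsRefresh(1030, 1000, 0))
	// Token expires in 10 minutes: still usable.
	fmt.Println(needsRefresh(1600, 1000, 0))
}
```

Refreshing early avoids sending a request with a token that expires while the request is in flight.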

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go

Lines changed: 95 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -21,6 +21,7 @@ import (
2121
"github.com/higress-group/wasm-go/pkg/log"
2222
"github.com/higress-group/wasm-go/pkg/wrapper"
2323
"github.com/tidwall/gjson"
24+
"github.com/tidwall/sjson"
2425
)
2526

2627
const (
@@ -32,13 +33,17 @@ const (
3233
// Express Mode path template (without project/location)
3334
vertexExpressPathTemplate = "/v1/publishers/google/models/%s:%s"
3435
vertexExpressPathAnthropicTemplate = "/v1/publishers/anthropic/models/%s:%s"
36+
// OpenAI-compatible endpoint path template
37+
// /v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi/chat/completions
38+
vertexOpenAICompatiblePathTemplate = "/v1beta1/projects/%s/locations/%s/endpoints/openapi/chat/completions"
3539
vertexChatCompletionAction = "generateContent"
3640
vertexChatCompletionStreamAction = "streamGenerateContent?alt=sse"
3741
vertexAnthropicMessageAction = "rawPredict"
3842
vertexAnthropicMessageStreamAction = "streamRawPredict"
3943
vertexEmbeddingAction = "predict"
4044
vertexGlobalRegion = "global"
4145
contextClaudeMarker = "isClaudeRequest"
46+
contextOpenAICompatibleMarker = "isOpenAICompatibleRequest"
4247
vertexAnthropicVersion = "vertex-2023-10-16"
4348
)
4449

@@ -47,10 +52,28 @@ type vertexProviderInitializer struct{}
4752
func (v *vertexProviderInitializer) ValidateConfig(config *ProviderConfig) error {
4853
// Express Mode: if apiTokens is configured, authenticate with the API Key
4954
if len(config.apiTokens) > 0 {
55+
// Express Mode and OpenAI compatible mode are mutually exclusive
56+
if config.vertexOpenAICompatible {
57+
return errors.New("vertexOpenAICompatible is not compatible with Express Mode (apiTokens)")
58+
}
5059
// Express Mode needs no other configuration
5160
return nil
5261
}
5362

63+
// OpenAI compatible mode: requires OAuth authentication configuration
64+
if config.vertexOpenAICompatible {
65+
if config.vertexAuthKey == "" {
66+
return errors.New("missing vertexAuthKey in vertex provider config for OpenAI compatible mode")
67+
}
68+
if config.vertexRegion == "" || config.vertexProjectId == "" {
69+
return errors.New("missing vertexRegion or vertexProjectId in vertex provider config for OpenAI compatible mode")
70+
}
71+
if config.vertexAuthServiceName == "" {
72+
return errors.New("missing vertexAuthServiceName in vertex provider config for OpenAI compatible mode")
73+
}
74+
return nil
75+
}
76+
5477
// Standard mode: keep the original validation logic
5578
if config.vertexAuthKey == "" {
5679
return errors.New("missing vertexAuthKey in vertex provider config")
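
The checks in this hunk effectively select one of three modes in a fixed order: Express Mode, OpenAI compatible mode, then standard mode. The sketch below restates that order in isolation. `vertexConfig`, `validate`, and the returned mode labels are hypothetical stand-ins for `ProviderConfig` and `ValidateConfig`, and the standard-mode branch only covers the `vertexAuthKey` check because the rest of that branch is cut off by the hunk.

```go
package main

import (
	"errors"
	"fmt"
)

// vertexConfig is a simplified, hypothetical stand-in for the plugin's
// ProviderConfig; only the fields used by the checks in this hunk are kept.
type vertexConfig struct {
	APITokens        []string
	OpenAICompatible bool
	AuthKey          string
	Region           string
	ProjectID        string
	AuthServiceName  string
}

// validate returns an illustrative mode label, or an error when the
// combination of fields is rejected.
func validate(c vertexConfig) (string, error) {
	// Express Mode wins whenever apiTokens is set, and excludes the flag.
	if len(c.APITokens) > 0 {
		if c.OpenAICompatible {
			return "", errors.New("vertexOpenAICompatible is not compatible with Express Mode (apiTokens)")
		}
		return "express", nil
	}
	// OpenAI compatible mode needs the full OAuth configuration.
	if c.OpenAICompatible {
		switch {
		case c.AuthKey == "":
			return "", errors.New("missing vertexAuthKey")
		case c.Region == "" || c.ProjectID == "":
			return "", errors.New("missing vertexRegion or vertexProjectId")
		case c.AuthServiceName == "":
			return "", errors.New("missing vertexAuthServiceName")
		}
		return "openai-compatible", nil
	}
	// Standard mode: only the first check is visible in the diff.
	if c.AuthKey == "" {
		return "", errors.New("missing vertexAuthKey")
	}
	return "standard", nil
}

func main() {
	_, err := validate(vertexConfig{APITokens: []string{"key"}, OpenAICompatible: true})
	fmt.Println(err)
	mode, _ := validate(vertexConfig{
		OpenAICompatible: true,
		AuthKey:          "{}",
		Region:           "us-central1",
		ProjectID:        "your-project-id",
		AuthServiceName:  "your-auth-service-name",
	})
	fmt.Println(mode)
}
```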
@@ -101,6 +124,12 @@ func (v *vertexProvider) isExpressMode() bool {
101124
return len(v.config.apiTokens) > 0
102125
}
103126

127+
// isOpenAICompatibleMode reports whether OpenAI compatible mode is enabled.
128+
// This mode uses Vertex AI's OpenAI-compatible Chat Completions API.
129+
func (v *vertexProvider) isOpenAICompatibleMode() bool {
130+
return v.config.vertexOpenAICompatible
131+
}
132+
104133
type vertexProvider struct {
105134
client wrapper.HttpClient
106135
config ProviderConfig
@@ -184,7 +213,30 @@ func (v *vertexProvider) OnRequestBody(ctx wrapper.HttpContext, apiName ApiName,
184213
if v.config.IsOriginal() {
185214
return types.ActionContinue, nil
186215
}
216+
187217
headers := util.GetRequestHeaders()
218+
219+
// OpenAI compatible mode: do not transform the request body; only set the path and map the model
220+
if v.isOpenAICompatibleMode() {
221+
ctx.SetContext(contextOpenAICompatibleMarker, true)
222+
body, err := v.onOpenAICompatibleRequestBody(ctx, apiName, body, headers)
223+
headers.Set("Content-Length", fmt.Sprint(len(body)))
224+
util.ReplaceRequestHeaders(headers)
225+
_ = proxywasm.ReplaceHttpRequestBody(body)
226+
if err != nil {
227+
return types.ActionContinue, err
228+
}
229+
// OpenAI compatible mode requires an OAuth token
230+
cached, err := v.getToken()
231+
if cached {
232+
return types.ActionContinue, nil
233+
}
234+
if err == nil {
235+
return types.ActionPause, nil
236+
}
237+
return types.ActionContinue, err
238+
}
239+
188240
body, err := v.TransformRequestBodyHeaders(ctx, apiName, body, headers)
189241
headers.Set("Content-Length", fmt.Sprint(len(body)))
190242

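
The OpenAI compatible branch in this hunk ends with a three-way decision on the result of `getToken()`. The sketch below isolates that decision. `action`, `nextAction`, and `errTokenFetch` are hypothetical local stand-ins for the proxy-wasm `types.Action` values and the plugin's error, and the reading that a pause waits for an asynchronous token fetch is an assumption, since the code that resumes the request is not in this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// action is a local stand-in for the proxy-wasm types.Action values.
type action string

const (
	actionContinue action = "continue"
	actionPause    action = "pause"
)

// errTokenFetch is an illustrative error value for the failure case.
var errTokenFetch = errors.New("token fetch failed")

// nextAction mirrors the branch after getToken() in the diff:
//   - cached token: forward the request immediately;
//   - no cached token and no error: pause the request (assumed to wait
//     for an in-flight token fetch);
//   - error: continue and surface the error to the caller.
func nextAction(cached bool, err error) (action, error) {
	if cached {
		return actionContinue, nil
	}
	if err == nil {
		return actionPause, nil
	}
	return actionContinue, err
}

func main() {
	fmt.Println(nextAction(true, nil))
	fmt.Println(nextAction(false, nil))
	fmt.Println(nextAction(false, errTokenFetch))
}
```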
@@ -220,6 +272,32 @@ func (v *vertexProvider) TransformRequestBodyHeaders(ctx wrapper.HttpContext, ap
220272
}
221273
}
222274

275+
// onOpenAICompatibleRequestBody handles requests in OpenAI compatible mode.
276+
// It does not convert the request body format; it only maps the model and sets the path.
277+
func (v *vertexProvider) onOpenAICompatibleRequestBody(ctx wrapper.HttpContext, apiName ApiName, body []byte, headers http.Header) ([]byte, error) {
278+
if apiName != ApiNameChatCompletion {
279+
return nil, fmt.Errorf("OpenAI compatible mode only supports chat completions API")
280+
}
281+
282+
// Parse the request and map the model
283+
request := &chatCompletionRequest{}
284+
if err := v.config.parseRequestAndMapModel(ctx, request, body); err != nil {
285+
return nil, err
286+
}
287+
288+
// Set the OpenAI-compatible endpoint path
289+
path := v.getOpenAICompatibleRequestPath()
290+
util.OverwriteRequestPathHeader(headers, path)
291+
292+
// If the model was mapped, update the model field in the request body
293+
if request.Model != "" {
294+
body, _ = sjson.SetBytes(body, "model", request.Model)
295+
}
296+
297+
// Keep the OpenAI format and return the body as is (the model field may have been updated)
298+
return body, nil
299+
}
300+
223301
func (v *vertexProvider) onChatCompletionRequestBody(ctx wrapper.HttpContext, body []byte, headers http.Header) ([]byte, error) {
224302
request := &chatCompletionRequest{}
225303
err := v.config.parseRequestAndMapModel(ctx, request, body)
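
In OpenAI compatible mode the request body stays in OpenAI format and only its `model` field is rewritten according to `modelMapping`. The following sketch shows that idea under stated assumptions: it uses the standard library's `encoding/json` instead of the `sjson.SetBytes` call from the diff, and `mapModel`, `rewriteModel`, and `modelOf` are hypothetical helpers. The lookup here is an exact match followed by the `"*"` fallback; the plugin's real mapping rules may support more patterns.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mapModel resolves a requested model name: exact match first, then the
// "*" fallback, otherwise the name is returned unchanged.
func mapModel(mapping map[string]string, model string) string {
	if target, ok := mapping[model]; ok {
		return target
	}
	if target, ok := mapping["*"]; ok {
		return target
	}
	return model
}

// rewriteModel replaces only the top-level "model" field and keeps every
// other field as raw JSON, so the body remains an OpenAI-format request.
func rewriteModel(body []byte, mapping map[string]string) ([]byte, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(body, &fields); err != nil {
		return nil, err
	}
	var model string
	if raw, ok := fields["model"]; ok {
		if err := json.Unmarshal(raw, &model); err != nil {
			return nil, err
		}
	}
	mapped, err := json.Marshal(mapModel(mapping, model))
	if err != nil {
		return nil, err
	}
	fields["model"] = mapped
	return json.Marshal(fields)
}

// modelOf extracts the "model" field so the result can be inspected.
func modelOf(body []byte) string {
	var v struct {
		Model string `json:"model"`
	}
	_ = json.Unmarshal(body, &v)
	return v.Model
}

func main() {
	mapping := map[string]string{"gpt-4": "gemini-2.0-flash", "*": "gemini-1.5-flash"}
	body := []byte(`{"model":"gpt-4","messages":[{"role":"user","content":"Hello, who are you?"}],"stream":false}`)
	out, err := rewriteModel(body, mapping)
	if err != nil {
		panic(err)
	}
	fmt.Println(modelOf(out))
}
```

Unlike the diff, this sketch returns the marshalling error instead of discarding it, which keeps a failed rewrite from silently producing an empty body.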
@@ -261,6 +339,12 @@ func (v *vertexProvider) onEmbeddingsRequestBody(ctx wrapper.HttpContext, body [
261339
}
262340

263341
func (v *vertexProvider) OnStreamingResponseBody(ctx wrapper.HttpContext, name ApiName, chunk []byte, isLastChunk bool) ([]byte, error) {
342+
// OpenAI compatible mode: pass the response through, but decode Unicode escape sequences.
343+
// The Vertex AI OpenAI-compatible API returns ASCII-safe JSON, encoding non-ASCII characters as \uXXXX.
344+
if ctx.GetContext(contextOpenAICompatibleMarker) != nil && ctx.GetContext(contextOpenAICompatibleMarker).(bool) {
345+
return util.DecodeUnicodeEscapesInSSE(chunk), nil
346+
}
347+
264348
if ctx.GetContext(contextClaudeMarker) != nil && ctx.GetContext(contextClaudeMarker).(bool) {
265349
return v.claude.OnStreamingResponseBody(ctx, name, chunk, isLastChunk)
266350
}
@@ -301,6 +385,12 @@ func (v *vertexProvider) OnStreamingResponseBody(ctx wrapper.HttpContext, name A
301385
}
302386

303387
func (v *vertexProvider) TransformResponseBody(ctx wrapper.HttpContext, apiName ApiName, body []byte) ([]byte, error) {
388+
// OpenAI compatible mode: pass the response through, but decode Unicode escape sequences.
389+
// The Vertex AI OpenAI-compatible API returns ASCII-safe JSON, encoding non-ASCII characters as \uXXXX.
390+
if ctx.GetContext(contextOpenAICompatibleMarker) != nil && ctx.GetContext(contextOpenAICompatibleMarker).(bool) {
391+
return util.DecodeUnicodeEscapes(body), nil
392+
}
393+
304394
if ctx.GetContext(contextClaudeMarker) != nil && ctx.GetContext(contextClaudeMarker).(bool) {
305395
return v.claude.TransformResponseBody(ctx, apiName, body)
306396
}
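
Both response hooks pass the body through `util.DecodeUnicodeEscapes` (or its SSE variant), whose implementation is not shown in this diff. As an illustration only, here is one way such a decoder could work; `decodeUnicodeEscapes`, `parseEscape`, and `parseHex4` are hypothetical names. This sketch leaves ASCII escapes such as `\u0022` untouched so the output stays valid JSON, skips escaped backslashes, and combines UTF-16 surrogate pairs.

```go
package main

import (
	"fmt"
	"strconv"
	"unicode/utf16"
	"unicode/utf8"
)

// parseHex4 parses a single `\uXXXX` escape at the start of b.
func parseHex4(b []byte) (rune, bool) {
	if len(b) < 6 || b[0] != '\\' || b[1] != 'u' {
		return 0, false
	}
	v, err := strconv.ParseUint(string(b[2:6]), 16, 16)
	if err != nil {
		return 0, false
	}
	return rune(v), true
}

// parseEscape decodes one escape, or a surrogate pair of two escapes,
// and returns the rune plus the number of input bytes consumed (0 on failure).
func parseEscape(b []byte) (rune, int) {
	hi, ok := parseHex4(b)
	if !ok {
		return 0, 0
	}
	if !utf16.IsSurrogate(hi) {
		return hi, 6
	}
	lo, ok := parseHex4(b[6:])
	if !ok {
		return 0, 0
	}
	r := utf16.DecodeRune(hi, lo)
	if r == utf8.RuneError {
		return 0, 0
	}
	return r, 12
}

// decodeUnicodeEscapes rewrites non-ASCII `\uXXXX` escapes as UTF-8 bytes.
func decodeUnicodeEscapes(in []byte) []byte {
	out := make([]byte, 0, len(in))
	for i := 0; i < len(in); {
		if in[i] != '\\' || i+1 >= len(in) {
			out = append(out, in[i])
			i++
			continue
		}
		if in[i+1] != 'u' {
			// Copy other escapes (including `\\`) as a pair so an escaped
			// backslash followed by "u" is never mistaken for an escape.
			out = append(out, in[i], in[i+1])
			i += 2
			continue
		}
		r, n := parseEscape(in[i:])
		if n == 0 || r < utf8.RuneSelf {
			// Malformed or ASCII escape: keep the original text.
			out = append(out, in[i], in[i+1])
			i += 2
			continue
		}
		out = append(out, string(r)...)
		i += n
	}
	return out
}

func main() {
	fmt.Println(string(decodeUnicodeEscapes([]byte(`{"content":"\u4f60\u597d"}`))))
}
```

For streaming responses, an escape can be split across two chunks; this sketch does not buffer partial escapes, so a real SSE decoder would need extra handling for that case.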
@@ -510,6 +600,11 @@ func (v *vertexProvider) getRequestPath(apiName ApiName, modelId string, stream
510600
return path
511601
}
512602

603+
// getOpenAICompatibleRequestPath returns the request path for OpenAI compatible mode
604+
func (v *vertexProvider) getOpenAICompatibleRequestPath() string {
605+
return fmt.Sprintf(vertexOpenAICompatiblePathTemplate, v.config.vertexProjectId, v.config.vertexRegion)
606+
}
607+
513608
func (v *vertexProvider) buildVertexChatRequest(request *chatCompletionRequest) *vertexChatRequest {
514609
safetySettings := make([]vertexChatSafetySetting, 0)
515610
for category, threshold := range v.config.geminiSafetySetting {
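
For reference, the path built by `getOpenAICompatibleRequestPath` can be reproduced in isolation. The template string is copied from the diff; `openAICompatiblePath` is a hypothetical free function standing in for the method, and the project and region values are the placeholders from the README example.

```go
package main

import "fmt"

// Path template copied from the constant added in this commit.
const vertexOpenAICompatiblePathTemplate = "/v1beta1/projects/%s/locations/%s/endpoints/openapi/chat/completions"

// openAICompatiblePath fills the template. Note the argument order:
// project ID first, then region, matching the method in the diff.
func openAICompatiblePath(projectID, region string) string {
	return fmt.Sprintf(vertexOpenAICompatiblePathTemplate, projectID, region)
}

func main() {
	fmt.Println(openAICompatiblePath("your-project-id", "us-central1"))
	// prints "/v1beta1/projects/your-project-id/locations/us-central1/endpoints/openapi/chat/completions"
}
```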
