Add Google Model Garden's Anthropic support to Inference Plugin #134080

Jan-Kazlouski-elastic · 2025-09-03T18:28:58Z

Updates the existing Google Vertex AI inference provider integration to support completion (both streaming and non-streaming) and chat_completion (streaming only) for Anthropic models hosted in Google Model Garden.

Changes were tested locally against the following Anthropic model:

claude-3-5-haiku

Create Completion Endpoint

Success:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-completion",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}
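
The two URLs in the request above share one layout and differ only in the method suffix. A minimal Python sketch that builds them, assuming the layout seen in these examples carries over to other projects, regions, and models (the helper name `model_garden_urls` is hypothetical and not part of the plugin):

```python
def model_garden_urls(project_id: str, location: str, model: str) -> dict:
    """Build the rawPredict / streamRawPredict URLs used in the examples.

    Hypothetical helper: the URL layout is copied from the requests in this
    description, not taken from the plugin code.
    """
    base = (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/anthropic/models/{model}"
    )
    return {
        # Non-streaming requests go to :rawPredict
        "url": f"{base}:rawPredict",
        # Streaming requests go to :streamRawPredict
        "streaming_url": f"{base}:streamRawPredict",
    }


if __name__ == "__main__":
    urls = model_garden_urls("elastic-cloud-dev", "us-east5", "claude-3-5-haiku")
    print(urls["url"])
    print(urls["streaming_url"])
```

The returned dictionary can be merged into `service_settings` next to `provider` and `service_account_json`.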

With max_tokens in task settings:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-completion",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

Unknown Provider:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "unknown",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: [service_settings] Invalid value [unknown] received. [provider] must be one of [anthropic];"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: [service_settings] Invalid value [unknown] received. [provider] must be one of [anthropic];"
    },
    "status": 400
}

No Provider + No Google Vertex AI parameters:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=null, url=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_url=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict.;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with url and/or streaming_url. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=null, url=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_url=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict.;"
    },
    "status": 400
}

No URL + No Streaming URL + No Google Vertex AI parameters:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}}
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=anthropic, url=null, streaming_url=null.;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=anthropic, url=null, streaming_url=null.;"
    },
    "status": 400
}
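
The three validation failures above suggest the following rules. This is a Python sketch inferred only from the responses shown here, not the Java implementation in the PR. The function name and the shortened error texts are illustrative assumptions.

```python
SUPPORTED_PROVIDERS = {"anthropic"}  # the only provider named in the error message


def validate_service_settings(settings: dict) -> list:
    """Return a list of validation errors; an empty list means the settings pass.

    Assumption: rules are reconstructed from the error responses in this
    description. The examples only exercise the lowercase provider spelling.
    """
    errors = []
    provider = settings.get("provider")
    url = settings.get("url")
    streaming_url = settings.get("streaming_url")

    # Rule 1: a provider, when given, must be a supported one.
    if provider is not None and provider not in SUPPORTED_PROVIDERS:
        errors.append(
            f"[service_settings] Invalid value [{provider}] received. "
            "[provider] must be one of [anthropic]"
        )
        return errors

    # Rule 2: Model Garden needs a provider plus url and/or streaming_url.
    is_model_garden = provider is not None and (
        url is not None or streaming_url is not None
    )
    # Rule 3: plain Vertex AI needs location, project_id, and model.
    is_vertex_ai = all(
        settings.get(key) is not None for key in ("location", "project_id", "model")
    )
    if not (is_model_garden or is_vertex_ai):
        errors.append(
            "For Google Model Garden, you must provide provider with url and/or "
            "streaming_url. For Google Vertex AI models, you must provide "
            "location, project_id, and model."
        )
    return errors
```

With these rules, a request that carries only `provider`, or only the two URLs, is rejected, which matches the 400 responses above.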

URL + No Streaming URL (URL is default for both streaming/non-streaming):

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict"
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-completion",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

No URL + Streaming URL (Streaming URL is default for both streaming/non-streaming):

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-completion",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}
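
The fallback described in the two cases above can be sketched in Python as follows. The function name `resolve_url` is hypothetical. The behavior when both URLs are present (`url` for non-streaming, `streaming_url` for streaming) is an assumption based on the field names, not something these examples demonstrate.

```python
from typing import Optional


def resolve_url(url: Optional[str], streaming_url: Optional[str], streaming: bool) -> str:
    """Pick the URL for a request, falling back to the other one if needed.

    Assumption: with both URLs set, streaming requests use streaming_url and
    non-streaming requests use url. With only one set, it serves both kinds.
    """
    preferred, fallback = (streaming_url, url) if streaming else (url, streaming_url)
    chosen = preferred if preferred is not None else fallback
    if chosen is None:
        # Endpoint creation already rejects this case with a validation error.
        raise ValueError("either url or streaming_url must be provided")
    return chosen
```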

Not Found:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict2",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2"
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "status_exception",
                "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-completion] status [404]"
            }
        ],
        "type": "status_exception",
        "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.",
        "caused_by": {
            "type": "status_exception",
            "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-completion] status [404]"
        }
    },
    "status": 400
}

Perform Completion

Success Non Streaming:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel."
}
RS
{
    "completion": [
        {
            "result": "That's the opening line from William Gibson's groun"
        }
    ]
}

Success Streaming:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion/_stream
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel."
}
RS
event: message
data: {"completion":[{"delta":"This"},{"delta":" is"},{"delta":" the"},{"delta":" opening"}]}

event: message
data: {"completion":[{"delta":" line of William"},{"delta":" Gibson's sem"}]}

event: message
data: [DONE]

Success Non Streaming with task_settings max_tokens:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel.",
    "task_settings": {
        "max_tokens": 20
    }
}
RS
{
    "completion": [
        {
            "result": "That's the opening line from William Gibson's groun"
        }
    ]
}

Success Streaming with task_settings max_tokens:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion/_stream
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel.",
    "task_settings": {
        "max_tokens": 20
    }
}
RS
event: message
data: {"completion":[{"delta":"This"},{"delta":" is"},{"delta":" the"},{"delta":" opening"}]}

event: message
data: {"completion":[{"delta":" line of William"},{"delta":" Gibson's sem"}]}

event: message
data: [DONE]

Create Chat Completion Endpoint

Success:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ:
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS:
{
    "inference_id": "google-model-garden-anthropic-chat-completion",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

Success with task_settings max_tokens:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ:
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS:
{
    "inference_id": "google-model-garden-anthropic-chat-completion",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

Unknown Provider:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ:
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "unknown",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS:
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: [service_settings] Invalid value [unknown] received. [provider] must be one of [anthropic];"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: [service_settings] Invalid value [unknown] received. [provider] must be one of [anthropic];"
    },
    "status": 400
}

No url/streaming_url:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ:
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}}
    }
}
RS:
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=anthropic, url=null, streaming_url=null.;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: For Google Model Garden, you must provide either provider with uri and/or streaming_uri. For Google Vertex AI models, you must provide location, project_id, and model. Provided values: location=null, project_id=null, model=null, provider=anthropic, url=null, streaming_url=null.;"
    },
    "status": 400
}

Not found:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ:
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict2",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2"
    }
}
RS:
{
    "error": {
        "root_cause": [
            {
                "type": "unified_chat_completion_exception",
                "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-chat-completion] status [404]. Error message: [<!DOCTYPE html>\n<html lang=en>\n  <meta charset=utf-8>\n  <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n  <title>Error 404 (Not Found)!!1</title>\n  <style>\n    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n  </style>\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n  <p><b>404.</b> <ins>That’s an error.</ins>\n  <p>The requested URL <code>/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2</code> was not found on this server.  <ins>That’s all we know.</ins>\n]"
            }
        ],
        "type": "status_exception",
        "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.",
        "caused_by": {
            "type": "unified_chat_completion_exception",
            "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-chat-completion] status [404]. Error message: [<!DOCTYPE html>\n<html lang=en>\n  <meta charset=utf-8>\n  <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n  <title>Error 404 (Not Found)!!1</title>\n  <style>\n    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n  </style>\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n  <p><b>404.</b> <ins>That’s an error.</ins>\n  <p>The requested URL <code>/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2</code> was not found on this server.  <ins>That’s all we know.</ins>\n]"
        }
    },
    "status": 400
}

No streaming_url (url is default for both streaming/non-streaming):

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ:
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict"
    }
}
RS:
{
    "inference_id": "google-model-garden-anthropic-chat-completion",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

No url (streaming_url is default for both streaming/non-streaming):

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ:
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS:
{
    "inference_id": "google-model-garden-anthropic-chat-completion",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

Perform Chat Completion

Basic:

POST {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion/_stream
RQ
{
    "messages": [
        {
            "role": "user",
            "content": "What is deep learning?"
        }
    ],
    "max_completion_tokens": 100
}
RS
event: message
data: {"id":"msg_vrtx_01X7rphiRiVbTsKBiEJ2ejjR","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of machine learning an"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"d artificial intelligence that uses"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" artificial neural networks with"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" multiple layers ("},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"hence \""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" higher-level features from"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" raw input"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Here"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" are key characteristics"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":\n\n1"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Core Components"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Neural networks"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" with multiple hidden"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Ability"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" to learn complex patterns from"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" large datasets"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Inspired by the"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" structure of"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" human brain"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neurons\n\n2. Key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Techniques:\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Convolutional Neural"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Networks (CNNs"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":")\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}}

event: message
data: [DONE]

Complex:

POST {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion/_stream
RQ
{
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is the price of scarf?"
                }
            ]
        }
    ],
    "max_completion_tokens": 100,
    "temperature": 0.2,
    "top_p": 0.2,
    "tools": [
        {
            "type": "auto",
            "function": {
                "name": "get_current_price",
                "description": "Get the current price of a item",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "item": {
                            "id": "123"
                        }
                    }
                }
            }
        }
    ],
    "tool_choice": {
        "type": "auto",
        "function": {
            "name": "get_current_price"
        }
    }
}
RS
event: message
data: {"id":"msg_vrtx_01C6WmQ5QCsxv9WZvSwniHiw","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":2,"prompt_tokens":330,"total_tokens":332}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"I'll"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" help"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" you find the current price"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of a scarf."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" I"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"'ll"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" use the get_current"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"_price function"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" to"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" retrieve"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" this"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" information."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"id":"toolu_vrtx_01Dug9z7HJSRPfAExfssoQ1z","function":{"arguments":"{}","name":"get_current_price"},"type":null}]},"index":1}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":""},"type":null}]},"index":1}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\""},"type":null}]},"index":1}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"item\": \""},"type":null}]},"index":1}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"scarf\"}"},"type":null}]},"index":1}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"tool_use","index":0}],"model":null,"object":null,"usage":{"completion_tokens":85,"prompt_tokens":0,"total_tokens":85}}

event: message
data: [DONE]
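
In the stream above, the function arguments arrive in pieces: the first tool-call chunk carries the id, the name and `{}`, and the following chunks carry fragments of the JSON string. Below is a minimal sketch of how a client might reassemble them. It assumes (this PR does not confirm it) that a leading `{}` is an empty placeholder that can be dropped once real fragments follow. `ToolCallArgumentAssembler` is a hypothetical helper, not part of the plugin.

```java
import java.util.List;

/** Hypothetical client-side helper; not part of the Elasticsearch codebase. */
final class ToolCallArgumentAssembler {

    private ToolCallArgumentAssembler() {}

    /**
     * Joins streamed "arguments" fragments into one JSON string.
     * Assumption: a leading "{}" is an empty placeholder and is dropped
     * when any later fragment is non-empty.
     */
    static String assemble(List<String> fragments) {
        if (fragments.isEmpty()) {
            return "";
        }
        StringBuilder rest = new StringBuilder();
        for (int i = 1; i < fragments.size(); i++) {
            rest.append(fragments.get(i));
        }
        String first = fragments.get(0);
        if ("{}".equals(first) && rest.length() > 0) {
            return rest.toString();
        }
        return first + rest;
    }
}
```

Fed the five fragments from the log (`{}`, an empty string, `{"`, `item": "`, `scarf"}`), this yields `{"item": "scarf"}`.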

…ntegration # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/googlevertexai/action/GoogleVertexAiActionCreator.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/googlevertexai/request/completion/GoogleVertexAiUnifiedChatCompletionRequest.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googlevertexai/action/GoogleVertexAiUnifiedChatCompletionActionTests.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googlevertexai/completion/GoogleVertexAiChatCompletionModelTests.java # x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googlevertexai/request/completion/GoogleVertexAiUnifiedChatCompletionRequestTests.java

…ntegration

…ntegration # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

…al parameters based on transport version

…ntegration # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

…ests

…ntegration

… to support new content block types and improve parsing logic

… parser and add unit tests for response validation

…ate response parsing and error handling

…ity to validate serialization of user fields

…n model and update related tests

…ntegration # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

…equirements

Jan-Kazlouski-elastic · 2025-09-11T17:22:19Z

Hello @jonathan-buttner @dan-rubinstein
Could you please take a look at this PR? It is out of the draft state.

Jan-Kazlouski-elastic · 2025-09-11T17:23:44Z

server/src/main/java/org/elasticsearch/TransportVersions.java

    public static final TransportVersion ESQL_DOCUMENTS_FOUND_AND_VALUES_LOADED_8_19 = def(8_841_0_61);
    public static final TransportVersion ESQL_PROFILE_INCLUDE_PLAN_8_19 = def(8_841_0_62);
    public static final TransportVersion INITIAL_ELASTICSEARCH_8_19_4 = def(8_841_0_68);
+    public static final TransportVersion ML_INFERENCE_GOOGLE_MODEL_GARDEN_ADDED_8_19 = def(8_841_0_69);


Let me know if this needs to be removed. I haven't seen backports in a while, but Google Vertex AI has been there for quite some time, so we'd probably need one.

Let's remove this; we won't be backporting the changes.

…ntegration # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

…ntegration # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/googlevertexai/completion/GoogleVertexAiChatCompletionServiceSettings.java

Jan-Kazlouski-elastic · 2025-09-25T10:13:01Z

@jonathan-buttner your comments are addressed. Could you please take a look at the PR once more?

…ntegration # Conflicts: # server/src/main/resources/transport/upper_bounds/9.2.csv

jonathan-buttner

Looking good, couple more changes

jonathan-buttner · 2025-09-26T19:33:51Z

...k/inference/services/googlevertexai/completion/GoogleVertexAiChatCompletionTaskSettings.java

    }

    public GoogleVertexAiChatCompletionTaskSettings(StreamInput in) throws IOException {
        thinkingConfig = new ThinkingConfig(in);
        TransportVersion version = in.getTransportVersion();
        if (GoogleVertexAiUtils.supportsModelGarden(version)) {
-            maxTokens = Objects.requireNonNullElse(in.readOptionalInt(), DEFAULT_MAX_TOKENS);
+            maxTokens = in.readOptionalInt();


Can we use readOptionalVInt?

Good thinking. Done.

jonathan-buttner · 2025-09-26T19:34:33Z

...k/inference/services/googlevertexai/completion/GoogleVertexAiChatCompletionTaskSettings.java

@@ -124,7 +124,9 @@ public TransportVersion getMinimalSupportedVersion() {
    @Override
    public void writeTo(StreamOutput out) throws IOException {
        thinkingConfig.writeTo(out);
-        out.writeOptionalInt(maxTokens);
+        if (GoogleVertexAiUtils.supportsModelGarden(out.getTransportVersion())) {
+            out.writeOptionalInt(maxTokens);


Let's use writeOptionalVInt
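
A rough sketch of what the two comments above are after, assuming the usual variable-length layout (a presence byte, then 7 bits per byte with the high bit as a continuation flag). It uses plain `java.io` streams as stand-ins for Elasticsearch's `StreamInput`/`StreamOutput`, and `OptionalVIntSketch` and the boolean gate are hypothetical names, so treat it as an illustration rather than the plugin's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative stand-in; not the Elasticsearch stream classes. */
final class OptionalVIntSketch {

    private OptionalVIntSketch() {}

    /** Writes a presence byte, then the value as 7-bit groups, low-order group first. */
    static void writeOptionalVInt(DataOutputStream out, Integer value) throws IOException {
        if (value == null) {
            out.writeBoolean(false);
            return;
        }
        out.writeBoolean(true);
        int v = value;
        while ((v & ~0x7F) != 0) {
            out.writeByte((v & 0x7F) | 0x80); // high bit set: more bytes follow
            v >>>= 7;
        }
        out.writeByte(v);
    }

    static Integer readOptionalVInt(DataInputStream in) throws IOException {
        if (in.readBoolean() == false) {
            return null;
        }
        int result = 0;
        int shift = 0;
        int b;
        do {
            b = in.readUnsignedByte();
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    /** Writer side: the field only goes on the wire when the peer understands it. */
    static byte[] write(Integer maxTokens, boolean peerSupportsField) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (peerSupportsField) {
                writeOptionalVInt(out, maxTokens);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reader side: applies the same gate so it never consumes bytes that were not written. */
    static Integer read(byte[] data, boolean peerSupportsField) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return peerSupportsField ? readOptionalVInt(in) : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout a small value such as `max_tokens = 10` takes two bytes (presence plus one value byte), where a presence byte followed by a fixed four-byte int would take five. The gate has to be identical on the read and write paths, which is why the diff above adds the `supportsModelGarden` check to `writeTo` as well.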

DonalEvans · 2025-09-26T23:10:43Z

...sticsearch/xpack/inference/services/anthropic/AnthropicChatCompletionStreamingProcessor.java

+                delta = new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta(
+                    null,
+                    null,
+                    null,
+                    List.of(
+                        new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta.ToolCall(
+                            0,
+                            id,
+                            new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta.ToolCall.Function(
+                                input != null ? input.toString() : null,
+                                name
+                            ),
+                            null
+                        )
+                    )
+                );


For readability, this might be better as:

    var function = new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta.ToolCall.Function(
        input != null ? input.toString() : null,
        name
    );
    var toolCall = new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta.ToolCall(0, id, function, null);
    delta = new StreamingUnifiedChatCompletionResults.ChatCompletionChunk.Choice.Delta(null, null, null, List.of(toolCall));

Similar changes can be made in the parseContentBlockDelta() method.

Thanks. Done!

…ntegration # Conflicts: # server/src/main/resources/transport/upper_bounds/9.2.csv

…CompletionStreamingProcessor readability

…ntegration # Conflicts: # server/src/main/resources/transport/upper_bounds/9.2.csv

Jan-Kazlouski-elastic · 2025-09-29T11:31:06Z

@jonathan-buttner @DonalEvans

Your comments are addressed. Could you please review the fixes?

jonathan-buttner

Thanks for the changes!

Jan-Kazlouski-elastic · 2025-09-29T12:24:08Z

Create Completion Endpoint

No Provider With URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;"
    },
    "status": 400
}

Google Provider With URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "google",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;"
    },
    "status": 400
}

Google Provider No URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "google",
        "service_account_json": {{service_account_config}}
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=null, project_id=null, model_id=null;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=null, project_id=null, model_id=null;"
    },
    "status": 400
}

No URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}}
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: Google Model Garden provider=anthropic selected. Either 'uri' or 'streaming_uri' must be provided;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: Google Model Garden provider=anthropic selected. Either 'uri' or 'streaming_uri' must be provided;"
    },
    "status": 400
}
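
The four rejected requests above imply a small set of rules. The sketch below encodes them as inferred from the error messages only. The plugin's actual validation may be structured differently, and `ProviderSettingsCheck`, its parameters and its message texts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical restatement of the rules seen in the responses above. */
final class ProviderSettingsCheck {

    private ProviderSettingsCheck() {}

    /** Returns the problems for a settings combination; an empty list means it is accepted. */
    static List<String> validate(
        String provider,
        String url,
        String streamingUrl,
        String location,
        String projectId,
        String modelId
    ) {
        List<String> errors = new ArrayList<>();
        boolean googleOrUnset = provider == null || provider.equalsIgnoreCase("google");
        if (googleOrUnset) {
            if (url != null || streamingUrl != null) {
                // Matches the "No provider" and "Google provider" cases that supplied URLs.
                errors.add("url and streaming_url must not be provided for Google Vertex AI models");
            } else if (location == null || projectId == null || modelId == null) {
                // Matches "Google Provider No URLs".
                errors.add("location, project_id and model_id are required for Google Vertex AI models");
            }
        } else if (url == null && streamingUrl == null) {
            // Matches the Anthropic provider case with neither URL set.
            errors.add("either url or streaming_url must be provided for provider=" + provider);
        }
        return errors;
    }
}
```

Under these rules an `anthropic` endpoint with only `url`, only `streaming_url`, or both is accepted, which agrees with the successful creations that follow.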

Both URLs:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-1
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-completion-1",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    },
    "task_settings": {
        "max_tokens": 10
    }
}

Only Non-Streaming URL:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-2
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict"
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-completion-2",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    },
    "task_settings": {
        "max_tokens": 10
    }
}

Only Streaming URL:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-3
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-completion-3",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    },
    "task_settings": {
        "max_tokens": 10
    }
}

No Task Parameters:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-4
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-completion-4",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

Not Found:

PUT {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-5
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict2",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2"
    },
    "task_settings": {
        "max_tokens": 10
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "status_exception",
                "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-completion-5] status [404]"
            }
        ],
        "type": "status_exception",
        "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.",
        "caused_by": {
            "type": "status_exception",
            "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-completion-5] status [404]"
        }
    },
    "status": 400
}

Perform Non-Streaming Completion

Non-Streaming Both URLs:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-1
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel.",
    "task_settings": {
        "max_tokens": 20
    }
}
RS
{
    "completion": [
        {
            "result": "This is the famous opening line from William Gibson's groundbreaking cyberpunk novel \"Neu"
        }
    ]
}
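
The endpoint used above was created with `max_tokens: 10`, the request passed `max_tokens: 20`, and the reply is cut off after roughly twenty tokens, which suggests that request-level task settings take precedence over the stored ones. A small sketch of that precedence, written as an assumption drawn from these logs. `MaxTokensResolver` is a hypothetical name and the fallback is left as a parameter because the plugin's default is not shown in this thread.

```java
/** Hypothetical illustration of the override order suggested by the logs. */
final class MaxTokensResolver {

    private MaxTokensResolver() {}

    /** Request value wins, then the value stored on the endpoint, then the caller-supplied fallback. */
    static int resolve(Integer requestValue, Integer endpointValue, int fallback) {
        if (requestValue != null) {
            return requestValue;
        }
        if (endpointValue != null) {
            return endpointValue;
        }
        return fallback;
    }
}
```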

Non-Streaming Only Non-Streaming URL:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-2
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel.",
    "task_settings": {
        "max_tokens": 20
    }
}
RS
{
    "completion": [
        {
            "result": "This is the famous opening line from William Gibson's seminal cyberpunk novel \"Neurom"
        }
    ]
}

Non-Streaming Only Streaming URL:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-3
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel.",
    "task_settings": {
        "max_tokens": 20
    }
}
RS
{
    "completion": [
        {
            "result": "This is the famous opening line from William Gibson's groundbreaking cyberpunk novel \"Neu"
        }
    ]
}

Non-Streaming Without Task Settings:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-4
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel."
}
RS
{
    "completion": [
        {
            "result": "This is the famous opening line from William Gibson's groundbreaking cyberpunk novel \"Neuromancer,\" published in 1984. The line is notable for its poetic description of the sky using a technological metaphor, which was characteristic of Gibson's innovative writing style.\n\nIn the era when the book was written, a dead television channel would typically display static - a gray, fuzzy, slightly shifting monochromatic screen. So the line suggests a sky that is gray, bleak, and somewhat undefined, creating an immediate atmosphere of technological melancholy.\n\nThis sentence has become one of the most famous opening lines in science fiction literature, often cited as an example of the cyberpunk genre's aesthetic: a gritty, technological world where technology and human experience are deeply intertwined.\n\nWould you like me to discuss more about the novel, its context, or cyberpunk as a literary genre?"
        }
    ]
}

Perform Streaming Completion

Streaming Both URLs:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-1/_stream
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel.",
    "task_settings": {
        "max_tokens": 20
    }
}
RS
event: message
data: {"completion":[{"delta":"This"}]}

event: message
data: {"completion":[{"delta":" is the"}]}

event: message
data: {"completion":[{"delta":" opening"}]}

event: message
data: {"completion":[{"delta":" line of William"},{"delta":" Gibson"}]}

event: message
data: {"completion":[{"delta":"'s groun"}]}

event: message
data: {"completion":[{"delta":"dbreaking cyberpunk"}]}

event: message
data: {"completion":[{"delta":" novel \""}]}

event: message
data: {"completion":[{"delta":"Neurom"}]}

event: message
data: [DONE]
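
Each `data:` line in the stream above can carry one or several `delta` strings, and `[DONE]` marks the end. Below is a minimal sketch of a client that stitches the text back together. It is limited to the completion stream format shown here and uses a regular expression instead of a JSON parser to stay dependency-free. `CompletionStreamReader` is a hypothetical name.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical client-side reader for the completion stream format shown above. */
final class CompletionStreamReader {

    // Matches "delta":"<string body>", allowing backslash-escaped characters inside the string.
    private static final Pattern DELTA = Pattern.compile("\"delta\":\"((?:\\\\.|[^\"\\\\])*)\"");

    private CompletionStreamReader() {}

    /** Concatenates every delta in order until the [DONE] marker is reached. */
    static String collect(List<String> lines) {
        StringBuilder text = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith("data: ") == false) {
                continue; // skips "event: message" and blank separator lines
            }
            String payload = line.substring("data: ".length());
            if (payload.equals("[DONE]")) {
                break;
            }
            Matcher m = DELTA.matcher(payload);
            while (m.find()) {
                text.append(unescape(m.group(1)));
            }
        }
        return text.toString();
    }

    /** Handles only the simple escapes seen in these logs; unicode escapes are not decoded. */
    private static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break;
                    case 't': out.append('\t'); break;
                    case 'r': out.append('\r'); break;
                    default: out.append(next); // covers escaped quote, backslash and slash
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

A production client would more likely use a real JSON parser; the regular expression is only there to keep the sketch self-contained.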

Streaming Only Non-Streaming URL:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-2/_stream
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel.",
    "task_settings": {
        "max_tokens": 20
    }
}
RS
event: message
data: {"completion":[{"delta":"This is the"}]}

event: message
data: {"completion":[{"delta":" famous"},{"delta":" opening"},{"delta":" line from William"},{"delta":" Gibson"},{"delta":"'s sem"},{"delta":"inal cyb"},{"delta":"erpunk novel \""},{"delta":"Neurom"}]}

event: message
data: [DONE]

Streaming Only Streaming URL:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-3/_stream
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel.",
    "task_settings": {
        "max_tokens": 20
    }
}
RS
event: message
data: {"completion":[{"delta":"This is the"}]}

event: message
data: {"completion":[{"delta":" famous"},{"delta":" opening"},{"delta":" line from William"},{"delta":" Gibson"},{"delta":"'s sem"},{"delta":"inal cyb"},{"delta":"erpunk novel \""},{"delta":"Neurom"}]}

event: message
data: [DONE]

Streaming Without Task Settings:

POST {{base-url}}/_inference/completion/google-model-garden-anthropic-completion-4/_stream
RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel."
}
RS
event: message
data: {"completion":[{"delta":"This"},{"delta":" is"},{"delta":" the"}]}

event: message
data: {"completion":[{"delta":" opening"}]}

event: message
data: {"completion":[{"delta":" line of William"}]}

event: message
data: {"completion":[{"delta":" Gibson"},{"delta":"'s sem"}]}

event: message
data: {"completion":[{"delta":"inal cyb"},{"delta":"erpunk novel \""}]}

event: message
data: {"completion":[{"delta":"Neuromancer,\""}]}

event: message
data: {"completion":[{"delta":" published in 1984"},{"delta":". It's"}]}

event: message
data: {"completion":[{"delta":" a"},{"delta":" famous"}]}

event: message
data: {"completion":[{"delta":" an"}]}

event: message
data: {"completion":[{"delta":"d much"},{"delta":"-discusse"}]}

event: message
data: {"completion":[{"delta":"d first"}]}

event: message
data: {"completion":[{"delta":" sentence that creates"}]}

event: message
data: {"completion":[{"delta":" an"}]}

event: message
data: {"completion":[{"delta":" ev"},{"delta":"ocative, mo"}]}

event: message
data: {"completion":[{"delta":"ody image"},{"delta":"."}]}

event: message
data: {"completion":[{"delta":" \n\nIn"}]}

event: message
data: {"completion":[{"delta":" the"}]}

event: message
data: {"completion":[{"delta":" early"}]}

event: message
data: {"completion":[{"delta":" 1980s,"}]}

event: message
data: {"completion":[{"delta":" when"},{"delta":" the book was written,"},{"delta":" a"},{"delta":" television"},{"delta":" tu"}]}

event: message
data: {"completion":[{"delta":"ned to a dead channel"}]}

event: message
data: {"completion":[{"delta":" woul"}]}

event: message
data: {"completion":[{"delta":"d display"}]}

event: message
data: {"completion":[{"delta":" static"}]}

event: message
data: {"completion":[{"delta":" -"}]}

event: message
data: {"completion":[{"delta":" a gray"}]}

event: message
data: {"completion":[{"delta":", fuz"}]}

event: message
data: {"completion":[{"delta":"zy,"}]}

event: message
data: {"completion":[{"delta":" slightly"}]}

event: message
data: {"completion":[{"delta":" blu"}]}

event: message
data: {"completion":[{"delta":"ish-"},{"delta":"white noise"}]}

event: message
data: {"completion":[{"delta":". So"},{"delta":" the line"}]}

event: message
data: {"completion":[{"delta":" suggests"}]}

event: message
data: {"completion":[{"delta":" a"},{"delta":" sky"}]}

event: message
data: {"completion":[{"delta":" that"}]}

event: message
data: {"completion":[{"delta":" is"}]}

event: message
data: {"completion":[{"delta":" blank"}]}

event: message
data: {"completion":[{"delta":", neutral"}]}

event: message
data: {"completion":[{"delta":", somewhat"}]}

event: message
data: {"completion":[{"delta":" bl"},{"delta":"eak and technological"}]}

event: message
data: {"completion":[{"delta":" -"}]}

event: message
data: {"completion":[{"delta":" which"}]}

event: message
data: {"completion":[{"delta":" perfectly"},{"delta":" sets"}]}

event: message
data: {"completion":[{"delta":" the tone"}]}

event: message
data: {"completion":[{"delta":" for the cyb"},{"delta":"erpunk genre"}]}

event: message
data: {"completion":[{"delta":" Gibson"}]}

event: message
data: {"completion":[{"delta":" helpe"},{"delta":"d create"}]}

event: message
data: {"completion":[{"delta":"."}]}

event: message
data: {"completion":[{"delta":"\n\nThe"}]}

event: message
data: {"completion":[{"delta":" line"}]}

event: message
data: {"completion":[{"delta":" is"},{"delta":" considere"}]}

event: message
data: {"completion":[{"delta":"d a masterful"}]}

event: message
data: {"completion":[{"delta":" piece"},{"delta":" of descript"},{"delta":"ive writing"}]}

event: message
data: {"completion":[{"delta":","}]}

event: message
data: {"completion":[{"delta":" using"},{"delta":" a"}]}

event: message
data: {"completion":[{"delta":" then"}]}

event: message
data: {"completion":[{"delta":"-contemporary"},{"delta":" technological"}]}

event: message
data: {"completion":[{"delta":" reference"}]}

event: message
data: {"completion":[{"delta":" to create a po"}]}

event: message
data: {"completion":[{"delta":"etic an"}]}

event: message
data: {"completion":[{"delta":"d atmospheric"}]}

event: message
data: {"completion":[{"delta":" description"}]}

event: message
data: {"completion":[{"delta":" of the"}]}

event: message
data: {"completion":[{"delta":" sky"}]}

event: message
data: {"completion":[{"delta":". It immediately"}]}

event: message
data: {"completion":[{"delta":" establishes the novel"}]}

event: message
data: {"completion":[{"delta":"'s blen"},{"delta":"d of high"}]}

event: message
data: {"completion":[{"delta":"-tech imagery"}]}

event: message
data: {"completion":[{"delta":" and g"}]}

event: message
data: {"completion":[{"delta":"ritty, dystop"}]}

event: message
data: {"completion":[{"delta":"ian moo"}]}

event: message
data: {"completion":[{"delta":"d.\n\nWoul"}]}

event: message
data: {"completion":[{"delta":"d you like me to discuss"}]}

event: message
data: {"completion":[{"delta":" the"}]}

event: message
data: {"completion":[{"delta":" novel"},{"delta":","}]}

event: message
data: {"completion":[{"delta":" the"}]}

event: message
data: {"completion":[{"delta":" line"}]}

event: message
data: {"completion":[{"delta":", or cyb"}]}

event: message
data: {"completion":[{"delta":"erpunk literature"},{"delta":" further"}]}

event: message
data: {"completion":[{"delta":"?"}]}

event: message
data: [DONE]

Jan-Kazlouski-elastic · 2025-09-29T12:51:58Z

Create Chat Completion Endpoint

No Provider With URLs:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;"
    },
    "status": 400
}

Google Provider With URLs:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "google",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: 'provider' is either GOOGLE or null. For Google Vertex AI models 'uri' and 'streaming_uri' must not be provided. Remove 'url' and 'streaming_url' fields. Provided values: uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict, streaming_uri=https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict;"
    },
    "status": 400
}

Google Provider No URLs:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "google",
        "service_account_json": {{service_account_config}}
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=null, project_id=null, model_id=null;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=null, project_id=null, model_id=null;"
    },
    "status": 400
}

No URLs:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}}
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: Google Model Garden provider=anthropic selected. Either 'uri' or 'streaming_uri' must be provided;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: Google Model Garden provider=anthropic selected. Either 'uri' or 'streaming_uri' must be provided;"
    },
    "status": 400
}
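
The three rejected requests above suggest a validation rule that can be sketched roughly as follows. This is a minimal Python illustration, not the plugin's actual Java code: the function name `validate_service_settings` and the shortened error strings are hypothetical, and treating a missing `provider` as Google is inferred from the "either GOOGLE or null" wording in the error message.

```python
from typing import Optional


def validate_service_settings(settings: dict) -> Optional[str]:
    """Return a (paraphrased) error message, or None when the settings look valid."""
    # Assumption: an absent provider behaves like "google".
    provider = (settings.get("provider") or "google").lower()
    url = settings.get("url")
    streaming_url = settings.get("streaming_url")

    if provider == "google":
        # Google Vertex AI models are addressed by location/project/model, not by URL.
        if url is not None or streaming_url is not None:
            return "Remove 'url' and 'streaming_url' fields"
        missing = [k for k in ("location", "project_id", "model_id") if settings.get(k) is None]
        if missing:
            return "You must provide 'location', 'project_id', and 'model_id'"
        return None

    # Model Garden providers (e.g. anthropic) need at least one of the two URLs.
    if url is None and streaming_url is None:
        return "Either 'uri' or 'streaming_uri' must be provided"
    return None
```

Passing only `url` or only `streaming_url` with `provider: anthropic` is accepted by this sketch, which matches the successful single-URL cases shown further down.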

Both URLs:


PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-1
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-chat-completion-1",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    },
    "task_settings": {
        "max_tokens": 10
    }
}
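
The `url` and `streaming_url` values used in these requests share one prefix and differ only in the `:rawPredict` / `:streamRawPredict` suffix. A small helper along these lines could assemble both; the function name and parameters are hypothetical, and the pattern is copied from the URLs in these tests rather than from Google's documentation.

```python
def model_garden_urls(project: str, location: str, model: str, publisher: str = "anthropic") -> dict:
    """Build the two endpoint URLs following the pattern seen in the test requests."""
    base = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/publishers/{publisher}/models/{model}"
    )
    return {
        "url": f"{base}:rawPredict",                 # non-streaming endpoint
        "streaming_url": f"{base}:streamRawPredict",  # streaming endpoint
    }


urls = model_garden_urls("elastic-cloud-dev", "us-east5", "claude-3-5-haiku")
```

The resulting dictionary can be merged into `service_settings` next to `provider` and `service_account_json`.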

Only Non-Streaming URL:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-2
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict"
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-chat-completion-2",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    },
    "task_settings": {
        "max_tokens": 5
    }
}

Only Streaming URL:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-3
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-chat-completion-3",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    },
    "task_settings": {
        "max_tokens": 5
    }
}

No Task Parameters:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-4
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict"
    }
}
RS
{
    "inference_id": "google-model-garden-anthropic-chat-completion-4",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict",
        "provider": "ANTHROPIC",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

Not Found:

PUT {{base-url}}/_inference/chat_completion/google-model-garden-anthropic-chat-completion-5
RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "provider": "anthropic",
        "service_account_json": {{service_account_config}},
        "url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:rawPredict2",
        "streaming_url": "https://us-east5-aiplatform.googleapis.com/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2"
    },
    "task_settings": {
        "max_tokens": 5
    }
}
RS
{
    "error": {
        "root_cause": [
            {
                "type": "unified_chat_completion_exception",
                "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-chat-completion] status [404]. Error message: [<!DOCTYPE html>\n<html lang=en>\n  <meta charset=utf-8>\n  <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n  <title>Error 404 (Not Found)!!1</title>\n  <style>\n    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n  </style>\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n  <p><b>404.</b> <ins>That’s an error.</ins>\n  <p>The requested URL <code>/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2</code> was not found on this server.  <ins>That’s all we know.</ins>\n]"
            }
        ],
        "type": "status_exception",
        "reason": "Could not complete inference endpoint creation as validation call to service threw an exception.",
        "caused_by": {
            "type": "unified_chat_completion_exception",
            "reason": "Received an unsuccessful status code for request from inference entity id [google-model-garden-anthropic-chat-completion] status [404]. Error message: [<!DOCTYPE html>\n<html lang=en>\n  <meta charset=utf-8>\n  <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n  <title>Error 404 (Not Found)!!1</title>\n  <style>\n    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n  </style>\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n  <p><b>404.</b> <ins>That’s an error.</ins>\n  <p>The requested URL <code>/v1/projects/elastic-cloud-dev/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku:streamRawPredict2</code> was not found on this server.  <ins>That’s all we know.</ins>\n]"
        }
    },
    "status": 400
}

Streaming chat completion was also tested and confirmed to work.

Jan-Kazlouski-elastic · 2025-09-29T13:29:45Z

Perform Chat Completion

Both URLs

event: message
data: {"id":"msg_vrtx_01DtbJbxQxXDC2NM98ex3eUY","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":5,"prompt_tokens":0,"total_tokens":5}}

event: message
data: [DONE]
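
A client consuming a stream like the one above might reassemble the message as sketched below. This is an illustrative Python reader, not part of the plugin: the function name `collect_stream` is hypothetical, and it assumes every payload arrives on a single `data:` line and that the stream ends with `[DONE]`, as in these logs.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> dict:
    """Join delta contents and keep the last finish_reason and usage seen."""
    text_parts = []
    finish_reason = None
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip `event:` lines and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        usage = chunk.get("usage") or usage
        for choice in chunk.get("choices", []):
            text_parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return {"text": "".join(text_parts), "finish_reason": finish_reason, "usage": usage}
```

In the stream above, the first event carries `prompt_tokens: 12` while the final event reports `prompt_tokens: 0`; this sketch simply keeps the last `usage` object, so a caller that needs the prompt count would have to record the first one separately.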

Both URLs With Max Tokens in RQ

event: message
data: {"id":"msg_vrtx_01H9UMg6c8ey3rhtAsN6uomT","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" artificial"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neural networks with multiple layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" ("},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"hence"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" \""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" higher"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"-level features"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" from"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" raw"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" input"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Here"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" are key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" points about deep learning:"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\n1"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Core"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Characteristics"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Uses"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" artificial"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neural networks with many"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Can"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" automatically"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" learn representations"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" from data\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Mim"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ics the way"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" human brain"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" processes information"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Capable"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of learning complex patterns an"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"d features\n\n2. Key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Components"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":\n- Neural"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" networks"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" with multiple"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" hidden"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}}

event: message
data: [DONE]

Only Non-Streaming URL

event: message
data: {"id":"msg_vrtx_018Ci31jn9VXVh3F8SGuePA5","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":5,"prompt_tokens":0,"total_tokens":5}}

event: message
data: [DONE]

Only Non-Streaming URL With Max Tokens in RQ

event: message
data: {"id":"msg_vrtx_01KiN7pMLqYfXWGvsR7wfkzg","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" artificial neural networks with"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" multiple layers ("},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"hence"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" \""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ively extract higher"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"-level features from raw"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" input"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Here"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" are key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" points about deep learning:"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\n1"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Core"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Characteristics"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Uses"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neural networks with many"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" hidden"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Can"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" automatically"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" learn features"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" from data\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Capable of handling"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" complex"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":", non-linear relationships"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Requires"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" large amounts of data"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" for"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" training\n\n2. Neural"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Network"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Structure"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":\n- Input"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" layer"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}}

event: message
data: [DONE]

Only Streaming URL

event: message
data: {"id":"msg_vrtx_01AwvZxPifsMPLhWyDCy92Zo","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":5,"prompt_tokens":0,"total_tokens":5}}

event: message
data: [DONE]

Only Streaming URL With Max Tokens in RQ

event: message
data: {"id":"msg_vrtx_01MGYZfdQf2LpreTA6xgJexk","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" artificial neural networks with multiple"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" ("},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"hence \"deep\") to"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" progress"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" higher-level features from"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" raw"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" input."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Here"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" are key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" characteristics"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\n1"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Core"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Components"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Artificial"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neural networks"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Multiple hidden"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Ability"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" to learn from large"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" amounts"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of data"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Mim"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ics human"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" brain"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"'s"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neural"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" processing"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\n2. Key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Techniques"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":\n- Con"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"volutional Neural"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Networks (CNNs"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":")"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Rec"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}}

event: message
data: [DONE]

Both URLs No task settings on creation

event: message
data: {"id":"msg_vrtx_014AvDDsgg3BhcwxoQxRwuDR","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" artificial neural networks with"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" multiple layers ("},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"hence"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" \""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" higher-level features from"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" raw"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" input"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" characteristics"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" include"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":\n\n1"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Neural"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Network"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Architecture"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Consists"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of interconn"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ected layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of artificial"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neurons"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Multiple"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" hidden layers between"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" input and output layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Capable"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of learning complex"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":","},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" non-linear relationships"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\n2. Key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Features"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Automatic"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" feature extraction"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Can"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" handle"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" un"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"structured data (images"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":", text, audio)"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Self"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"-learning and adaptive"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" capabilities"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\n3. Common"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Applications\n- Image"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" recognition"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Speech"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" recognition\n- Natural"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" language processing\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Autonomous"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" vehicles\n- Medical"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" diagnosis"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Predict"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ive analytics\n\n4"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Learning"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Techniques"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Supervised learning"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Unsuperv"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ised learning\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Reinforcement learning\n\n5"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Popular"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Architectures\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Convolutional Neural"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Networks (CNNs"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":")\n- Rec"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"urrent Neural Networks (R"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"NNs)"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Generative"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Adversarial Networks ("},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"GANs)"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Transform"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ers\n\n6"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Advantages\n- High"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" accuracy"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Can"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" handle complex, large"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"-scale datasets\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Reduces"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" nee"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"d for manual feature engineering"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\n7"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Challenges\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Requires large"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" amounts"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of training"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" data\n- Comput"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ationally intensive\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Can"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" be difficult"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" to interpret"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\nDeep learning has"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" revolutionized artificial"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" intelligence by"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" enabling"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" more"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" sophisticate"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"d an"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"d nu"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"anced machine"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" learning approaches"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"end_turn","index":0}],"model":null,"object":null,"usage":{"completion_tokens":300,"prompt_tokens":0,"total_tokens":300}}

event: message
data: [DONE]

Both URLs, no task settings on creation, with max tokens in RQ

event: message
data: {"id":"msg_vrtx_018RE7gV36k9m8KGnKTeMnGa","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"claude-3-5-haiku-20241022","object":null,"usage":{"completion_tokens":5,"prompt_tokens":12,"total_tokens":17}}

event: message
data: {"id":null,"choices":[{"delta":{"content":""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" artificial"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neural networks with multiple layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" ("},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"hence"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" \""},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"deep\") to progress"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"ively extract"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" higher"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"-level features from"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" raw"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" input"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"."},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Here"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" are key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" characteristics"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n\n1"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Core"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Components"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n-"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Artificial"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" neural networks"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Multiple hidden layers"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Complex"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" pattern"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" recognition"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Ability"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" to learn from large"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" amounts of data\n\n2"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":". Key"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Techniques"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":":\n- Con"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"volutional Neural Networks ("},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"CNNs)"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":"\n- Recurrent"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{"content":" Neural Networks (R"},"index":0}],"model":null,"object":null}

event: message
data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}}

event: message
data: [DONE]
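
The chat_completion streams above arrive as many small `delta` chunks, so checking a run by eye is tedious. Below is a minimal sketch of how such a stream could be reassembled on the client side. It is not part of the plugin; the function name `collect_chat_stream` and the trimmed sample are illustrative assumptions, and only the chunk shape is taken from the logs above.

```python
import json


def collect_chat_stream(sse_text):
    """Reassemble a chat_completion SSE log into (text, finish_reason, usage).

    Hypothetical helper: assumes each event carries one `data: ` line holding
    either a JSON chunk or the literal [DONE] terminator, as in the logs above.
    """
    parts = []
    finish_reason = None
    usage = None
    for line in sse_text.splitlines():
        if not line.startswith("data: "):
            continue  # skip "event: message" lines and blank separators
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # A delta may be empty ({}) on the first and last chunks.
            parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason", finish_reason)
        # Keep the last usage object seen in the stream.
        usage = chunk.get("usage", usage)
    return "".join(parts), finish_reason, usage


# Trimmed sample built from chunks that appear in the log above.
SAMPLE = "\n".join([
    'event: message',
    'data: {"id":null,"choices":[{"delta":{"content":"Deep learning is a subset"},"index":0}],"model":null,"object":null}',
    '',
    'event: message',
    'data: {"id":null,"choices":[{"delta":{"content":" of machine learning that uses"},"index":0}],"model":null,"object":null}',
    '',
    'event: message',
    'data: {"id":null,"choices":[{"delta":{},"finish_reason":"max_tokens","index":0}],"model":null,"object":null,"usage":{"completion_tokens":100,"prompt_tokens":0,"total_tokens":100}}',
    '',
    'event: message',
    'data: [DONE]',
])

text, reason, usage = collect_chat_stream(SAMPLE)
print(text)
print(reason, usage["completion_tokens"])
```

With a helper like this, the `finish_reason` of a run (`end_turn` versus `max_tokens`) and the reported `completion_tokens` can be compared against the configured limit without reading every chunk.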

Jan-Kazlouski-elastic · 2025-09-29T13:56:49Z

Regression Tests for Google Vertex AI.

Create Completion endpoint

Success


RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": {{service_account_config}},
        "model_id": "gemini-2.5-pro",
        "location": "us-central1",
        "project_id": "project_id"
    }
}
RS
{
    "inference_id": "google-vertex-ai-completion",
    "task_type": "completion",
    "service": "googlevertexai",
    "service_settings": {
        "project_id": "project_id",
        "location": "us-central1",
        "model_id": "gemini-2.5-pro",
        "provider": "GOOGLE",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

No model_id


RQ
{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": {{service_account_config}},
        "location": "us-central1",
        "project_id": "project_id"
    }
}

RS
{
    "error": {
        "root_cause": [
            {
                "type": "validation_exception",
                "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=us-central1, project_id=1014491842772, model_id=null;"
            }
        ],
        "type": "validation_exception",
        "reason": "Validation Failed: 1: For Google Vertex AI models, you must provide 'location', 'project_id', and 'model_id'. Provided values: location=us-central1, project_id=1014491842772, model_id=null;"
    },
    "status": 400
}
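
The rule behind this error can be sketched as follows. This is an illustrative Python restatement of the behaviour visible in the response above, not the plugin's Java implementation; the function name `validate_vertex_ai_settings` is hypothetical, and the message text is copied from the response.

```python
REQUIRED_FIELDS = ("location", "project_id", "model_id")


def validate_vertex_ai_settings(service_settings):
    """Return an error message if a required field is missing, else None.

    Hypothetical sketch: mirrors the message shown in the response above for
    Google Vertex AI (non Model Garden) settings.
    """
    if all(service_settings.get(name) is not None for name in REQUIRED_FIELDS):
        return None
    provided = ", ".join(
        "{}={}".format(
            name,
            service_settings[name] if service_settings.get(name) is not None else "null",
        )
        for name in REQUIRED_FIELDS
    )
    return (
        "For Google Vertex AI models, you must provide 'location', "
        "'project_id', and 'model_id'. Provided values: " + provided + ";"
    )


# Same shape as the failing request: model_id is absent.
error = validate_vertex_ai_settings(
    {"location": "us-central1", "project_id": "project_id"}
)
print(error)
```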

Perform Non-Streaming Completion


RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel."
}
RS
{
    "completion": [
        {
            "result": "That is the iconic opening line of William Gibson's 1984 debut novel, **Neuromancer**.\n\nIt is widely regarded as one of the most brilliant and effective opening sentences in modern literature, especially within science fiction. Here’s a breakdown of why it's so powerful:\n\n### 1. Instant World-Building\nIn a single sentence, Gibson establishes the entire mood and setting of his novel. This isn't a world of blue skies and natural beauty. It's a world where the man-made has superseded the natural to such an extent that the sky itself is described in terms of a technological failure. It immediately signals a gritty, polluted, and dystopian future.\n\n### 2. Subverting Poetic Language\nThe sentence structure is traditionally poetic (\"The sky above the port...\"), but the simile at the end is jarringly modern, ugly, and technical. This clash between a classic literary form and a piece of low-tech jargon perfectly encapsulates the \"high tech, low life\" ethos of the cyberpunk genre that Gibson was pioneering.\n\n### 3. Establishing Tone\nThe image is bleak and unsettling. A \"dead channel\" implies a lack of signal, an absence of information, a void. It's not a peaceful, uniform gray, but a staticky, lifeless, and oppressive color. This sets a noir-ish, melancholic, and anxious tone that persists throughout the novel.\n\n### 4. A Generational Image\nThe meaning of the line has evolved with technology, which is a fascinating aspect of its legacy:\n*   **In 1984:** A \"dead channel\" on an analog CRT television was a screen of flickering, staticky, light-gray noise. This is the image Gibson intended—a dynamic, unpleasant, and textured sky.\n*   **Today:** For younger readers, a \"dead channel\" might be a solid blue or black screen from a modern digital TV or monitor. While different from the original intent, this new interpretation—a flat, empty, digital void—still powerfully conveys a sense of technological alienation.\n\nWilliam Gibson himself has commented on this generational shift, acknowledging that he was thinking of the specific static of an old black-and-white TV, but finds the modern \"blue screen of death\" interpretation equally valid and interesting.\n\nIn short, that one sentence is a masterclass in economical writing. It establishes the setting, tone, and central themes of *Neuromancer* before the story has even begun."
        }
    ]
}

Perform Streaming Completion


RQ
{
    "input": "The sky above the port was the color of television tuned to a dead channel."
}
RS
event: message
data: {"completion":[{"delta":"That's the legendary opening line of William Gibson's 1984 novel, ***Neuromancer***.\n\nIt is widely considered one of the most effective and iconic opening sentences in modern literature, especially within science fiction. It does a phenomenal amount of work in just 14 words.\n\nHere's"}]}

event: message
data: {"completion":[{"delta":" a breakdown of why it's so brilliant:\n\n### 1. It Establishes the Genre (Cyberpunk)\nIn a single stroke, the line marries the natural world with defunct technology.\n*   **The Natural:** \"The sky above the port\" is a classic, almost poetic, scenic description"}]}

event: message
data: {"completion":[{"delta":".\n*   **The Artificial & Decayed:** \"...the color of television tuned to a dead channel\" is jarring, modern, and specific. It's not the grey of a storm cloud; it's the specific, staticky, lifeless grey of technological failure.\n\nThis fusion of the natural world with technology ("}]}

event: message
data: {"completion":[{"delta":"often in a state of decay) is the absolute heart of the cyberpunk genre. Gibson didn't just describe a scene; he announced a new literary sensibility.\n\n### 2. It Sets the Mood (Tone)\nThe image is incredibly bleak. A \"dead channel\" evokes feelings of:\n*   **Em"}]}

event: message
data: {"completion":[{"delta":"ptiness:** A lack of signal, no information, a void.\n*   **Alienation:** It's an unnatural, disorienting sight.\n*   **Dystopia:** This is not a beautiful, clear blue sky. It's polluted, oppressive, and grim. The world the characters inhabit is broken"}]}

event: message
data: {"completion":[{"delta":", much like the television.\n\nThis is the \"low life\" part of cyberpunk's \"high tech, low life\" motto.\n\n### 3. It's a Masterclass in Imagery and Simile\nThe simile is potent and multi-sensory. When reading it, especially for the first time in"}]}

event: message
data: {"completion":[{"delta":" the 1980s, you don't just *see* the fuzzy, colorless static; you can almost *hear* the white-noise hiss that accompanies it. It's an image that's both visual and auditory, creating a powerful and immersive experience for the reader from the very first sentence"}]}

event: message
data: {"completion":[{"delta":".\n\n### The Evolution of its Meaning\nOne of the most fascinating aspects of this line is how its meaning has changed with technology.\n\n*   **In 1984:** An analog television tuned to a dead channel displayed a screen of flickering black-and-white static. This is the image Gibson intended"}]}

event: message
data: {"completion":[{"delta":".\n*   **Today:** A modern television or monitor \"tuned to a dead channel\" often displays a solid blue or black screen, perhaps with a \"No Signal\" message.\n\nWilliam Gibson himself has commented on this. He's joked that if he were writing it today, he might have to say the"}]}

event: message
data: {"completion":[{"delta":" sky was \"the color of a blue screen of death.\"\n\nThis technological shift doesn't diminish the line's power; it anchors it in a specific historical and technological moment. It makes the novel a kind of artifact of the very era it was critiquing, turning the sentence itself into a piece of retro"}]}

event: message
data: {"completion":[{"delta":"-tech. Even if a younger reader has to look up what \"television static\" looks like, the core idea—describing the natural world in terms of technological failure—remains as powerful as ever."}]}

event: message
data: [DONE]
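
Note that the completion stream uses a different chunk shape from chat_completion: each event carries `completion[].delta` as a plain string rather than `choices[].delta.content`. A minimal sketch of joining such a stream is shown here. The helper name `join_completion_stream` and the short sample deltas are illustrative placeholders, not output from the endpoint.

```python
import json


def join_completion_stream(sse_text):
    """Concatenate the `delta` strings of a streaming completion SSE log.

    Hypothetical helper: assumes the event shape shown in the log above,
    i.e. data: {"completion":[{"delta":"..."}]} terminated by data: [DONE].
    """
    pieces = []
    for line in sse_text.splitlines():
        if not line.startswith("data: "):
            continue  # skip "event: message" lines and blank separators
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream marker
        event = json.loads(payload)
        for item in event.get("completion", []):
            pieces.append(item.get("delta") or "")
    return "".join(pieces)


# Shortened, illustrative payloads in the same shape as the log above.
SAMPLE = "\n".join([
    'event: message',
    'data: {"completion":[{"delta":"first part"}]}',
    '',
    'event: message',
    'data: {"completion":[{"delta":", second part"}]}',
    '',
    'event: message',
    'data: [DONE]',
])

joined = join_completion_stream(SAMPLE)
print(joined)
```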

Create Chat Completion endpoint

RQ

{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": {{service_account_config}},
        "model_id": "gemini-2.5-pro",
        "location": "us-central1",
        "project_id": "project_id"
    }
}

RS

{
    "inference_id": "google-vertex-ai-chat-completion",
    "task_type": "chat_completion",
    "service": "googlevertexai",
    "service_settings": {
        "project_id": "project_id",
        "location": "us-central1",
        "model_id": "gemini-2.5-pro",
        "provider": "GOOGLE",
        "rate_limit": {
            "requests_per_minute": 1000
        }
    }
}

Perform Chat Completion


RQ
{
    "messages": [
        {
            "role": "user",
            "content": "What is deep learning?"
        }
    ],
    "max_completion_tokens": 100
}
RS
event: message
data: {"id":"54zaaLahA-OcmecPmbjUCA","choices":[{"delta":{"content":"Of course! Here","role":"model"},"finish_reason":"MAX_TOKENS","index":0}],"model":"gemini-2.5-pro","object":"chat.completion.chunk","usage":{"completion_tokens":4,"prompt_tokens":5,"total_tokens":103}}

event: message
data: [DONE]

Jan-Kazlouski-elastic · 2025-09-29T13:58:13Z

@jonathan-buttner
Testing is finished. All good. We're ready to merge.

Add Google Model Garden Anthropic integration

ef50c43

elasticsearchmachine added the v9.2.0 and external-contributor (pull request authored by a developer outside the Elasticsearch team) labels Sep 3, 2025

Jan-Kazlouski-elastic added 14 commits September 3, 2025 21:33

Clean up AnthropicChatCompletionStreamingProcessor

1de1067

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

ea9f933

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

9f61342

Conflicts: server/src/main/java/org/elasticsearch/TransportVersions.java

Enhance GoogleVertexAiChatCompletionServiceSettings to support optional parameters based on transport version

c3ce521

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

fc136a5

Conflicts: server/src/main/java/org/elasticsearch/TransportVersions.java

Add extractOptionalUri method and corresponding tests for URI extraction

4e6ccee

Add GoogleModelGardenProvider support to chat completion models and tests

07eed54

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

1adf958

Enhance AnthropicChatCompletionStreamingProcessor and related classes to support new content block types and improve parsing logic

a4ad1c5

Refactor AnthropicChatCompletionResponseHandler to use a custom error parser and add unit tests for response validation

b143661

Add unit tests for AnthropicChatCompletionStreamingProcessor to validate response parsing and error handling

0712434

Add unit tests for GoogleModelGardenAnthropicChatCompletionRequestEntity to validate serialization of user fields

fa44bd4

Add support for Anthropic provider in Google Vertex AI chat completion model and update related tests

8a1e710

Jan-Kazlouski-elastic marked this pull request as ready for review September 11, 2025 14:00

elasticsearchmachine added the needs:triage (requires assignment of a team area label) label Sep 11, 2025

Jan-Kazlouski-elastic added 5 commits September 11, 2025 17:02

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

ff63315

Conflicts: server/src/main/java/org/elasticsearch/TransportVersions.java

Add changelog

93cb5ad

Refactor switch case in GoogleVertexAiActionCreator to handle null case

54ff2ff

Validate service settings for Google Vertex AI model configuration

67d5d8a

Enhance Anthropic model tests to validate URI handling and provider requirements

a4540d3

Jan-Kazlouski-elastic commented Sep 11, 2025


Jan-Kazlouski-elastic and others added 4 commits September 12, 2025 15:36

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

4d1a6fa

Conflicts: server/src/main/java/org/elasticsearch/TransportVersions.java

[CI] Auto commit changes from spotless

f4be58b

Refactor switch case in GoogleVertexAiService to handle null case

d926c71

elasticsearchmachine and others added 9 commits September 25, 2025 10:20

[CI] Update transport version definitions

87c1464

Fixed unit tests

3e498ce

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

066b396

Conflicts: server/src/main/resources/transport/upper_bounds/9.2.csv

[CI] Update transport version definitions

617c090

Fix validation logic for Google Model Garden and Vertex AI settings

ffc8e4f

[CI] Update transport version definitions

ac47f22

Add validation tests for Google Vertex AI and Model Garden settings

e47e311

Refactor validation logic for Google Vertex AI and Model Garden settings

2852695

Add comment

23b86ec

jonathan-buttner reviewed Sep 26, 2025


DonalEvans reviewed Sep 26, 2025


Jan-Kazlouski-elastic added 2 commits September 29, 2025 10:05

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

d1b63bf

Conflicts: server/src/main/resources/transport/upper_bounds/9.2.csv

Update Google Vertex AI Task Settings parsing logic and AnthropicChatCompletionStreamingProcessor readability

89a744e

Jan-Kazlouski-elastic requested review from DonalEvans and jonathan-buttner September 29, 2025 11:04

Merge remote-tracking branch 'origin/main' into google-model-garden-integration

2847a1f

Conflicts: server/src/main/resources/transport/upper_bounds/9.2.csv

jonathan-buttner approved these changes Sep 29, 2025


jonathan-buttner self-assigned this Sep 29, 2025

jonathan-buttner added the >enhancement label Sep 29, 2025

Merge branch 'main' into google-model-garden-integration

55da91c

jonathan-buttner enabled auto-merge (squash) September 29, 2025 14:06

jonathan-buttner merged commit d18da3c into elastic:main Sep 29, 2025
36 checks passed

Jan-Kazlouski-elastic mentioned this pull request Oct 6, 2025

Add Google Model Garden support for completion and chat_completion tasks elastic/elasticsearch-specification#5423

Open

2 tasks


5 participants
