Inference Extension: Extract model name from request body #3836

@sjberman

Description

To support model redirects and traffic splitting, NGINX needs to extract the model name from the client request body. NGINX will then use that name to set the X-Gateway-Model-Name header on the request to the Endpoint Picker (EPP), which decides which endpoint the client request should be routed to.

This story covers only the module that extracts the model name. A follow-up story will actually use it.

Acceptance Criteria:

  • Create an NJS module that can extract the model name from the client request body
  • The NJS module should be built into the NGINX container images

Developer Notes:

  • An OpenAI-style request body should look like the following when a client makes a request to an AI workload:
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "Some question..."
    }
  ]
}
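The module's core job reduces to parsing the JSON body and reading the top-level "model" field. A minimal sketch of that logic in plain JavaScript (which njs also supports); the function name and the fallback behavior for missing or malformed bodies are illustrative assumptions, not taken from the issue or the design doc:

```javascript
// Hypothetical helper; njs would receive the body via its HTTP request
// object, but the extraction itself is plain JavaScript.
function extractModelName(body) {
  try {
    const parsed = JSON.parse(body);
    // Accept only a non-empty string value for "model".
    if (parsed && typeof parsed.model === 'string' && parsed.model !== '') {
      return parsed.model;
    }
  } catch (e) {
    // Malformed or empty JSON: fall through to the empty-string fallback.
  }
  return '';
}

const body = JSON.stringify({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Some question...' }],
});

extractModelName(body);            // 'gpt-4o'
extractModelName('not json');      // ''
extractModelName('{"model": 1}');  // ''
```

In the NGINX config, a function like this would plausibly be wired up with `js_import`/`js_set` so its result can populate X-Gateway-Model-Name, but how the request body is exposed to njs and how the header is set are left to the follow-up story.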

Design doc: https://github.com/nginx/nginx-gateway-fabric/blob/main/docs/proposals/gateway-inference-extension.md

Metadata

Labels

  • area/inference-extension: Related to the Gateway API Inference Extension
  • enhancement: New feature or request
  • refined: Requirements are refined and the issue is ready to be implemented.
  • size/medium: Estimated to be completed within a week


Status

✅ Done
