Closed
Labels
area/inference-extension — Related to the Gateway API Inference Extension · enhancement — New feature or request · refined — Requirements are refined and the issue is ready to be implemented · size/medium — Estimated to be completed within a week
Description
In order to support model redirects and traffic splitting, NGINX needs to be able to extract the model name from the client request body. The model name is then used to set the X-Gateway-Model-Name
header on the request to the Endpoint Picker (EPP), which decides which endpoint the client request should be routed to.
This story covers only writing the module that extracts the model name. A follow-up story will actually use it.
Acceptance Criteria:
- Create an NJS module that can extract the model name from the client request body
- The NJS module should be built into the NGINX container images
Developer Notes:
- OpenAI-style clients send a request body like the following when making a request to an AI workload:
  {
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Some question..."
      }
    ]
  }
Design doc: https://github.com/nginx/nginx-gateway-fabric/blob/main/docs/proposals/gateway-inference-extension.md
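A minimal sketch of the extraction logic, assuming an OpenAI-style JSON body as shown above. The function name and the js_set wiring in the comments are illustrative assumptions, not taken from the actual module:

```javascript
// Hypothetical sketch: pull the "model" field out of an OpenAI-style
// request body. extractModelName is an illustrative name, not the
// real module's API.
function extractModelName(body) {
    try {
        var parsed = JSON.parse(body);
        if (parsed && typeof parsed.model === 'string') {
            return parsed.model;
        }
    } catch (e) {
        // Non-JSON or malformed body: fall through and return ''.
    }
    return '';
}

// In an NJS module this could be exposed to nginx.conf via js_set,
// reading the buffered request body from r.requestText, e.g.:
//   function modelName(r) { return extractModelName(r.requestText); }
//   export default { modelName };
```

With wiring like the above, nginx.conf could then set the header with something along the lines of `js_set $model_name module.modelName;` followed by `proxy_set_header X-Gateway-Model-Name $model_name;` (a sketch of how the follow-up story might consume this, not the confirmed configuration).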