Merged
6 changes: 5 additions & 1 deletion site-src/api-types/inferenceobjective.md
@@ -9,6 +9,10 @@

The **InferenceObjective** API defines a set of serving objectives for the request it is associated with. This CRD currently contains only `Priority` but will be expanded to include fields such as SLO attainment.

## Usage

To associate a request to the InferencePool with a specific InferenceObjective, the calling client sets the `x-gateway-inference-objective` header on the request, with the header value set to the InferenceObjective's metadata name. If the request does not select an InferenceObjective, default values are used.
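
As a rough illustration of the header described above, the following Python sketch builds (but does not send) a request that selects an InferenceObjective. Only the header name `x-gateway-inference-objective` comes from this document; the gateway URL, the request payload, the objective name `my-objective`, and the helper `build_request` are hypothetical placeholders.

```python
import json
import urllib.request

# Header name taken from the Usage section; everything else is illustrative.
OBJECTIVE_HEADER = "x-gateway-inference-objective"


def build_request(url, payload, objective_name=None):
    """Build a POST request, optionally selecting an InferenceObjective.

    objective_name is assumed to be the metadata name of an existing
    InferenceObjective. When it is omitted, the header is not sent and
    default values apply.
    """
    headers = {"Content-Type": "application/json"}
    if objective_name:
        headers[OBJECTIVE_HEADER] = objective_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical endpoint and payload, for illustration only.
req = build_request(
    "http://gateway.example/v1/completions",
    {"prompt": "Hello"},
    objective_name="my-objective",
)
# Passing req to urllib.request.urlopen would send it to the gateway.
```

Keeping the header optional in the helper mirrors the documented behavior: requests without the header fall back to default values.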

## Spec

-The full spec of the InferenceModel is defined [here](/reference/x-spec/#inferenceobjective).
+The full spec of the InferenceObjective is defined [here](/reference/x-spec/#inferenceobjective).