Open
Labels: bug, configuration
Description:
I want to implement rate limiting based on usage cost in the Envoy AI Gateway.
I have two models that are billed differently, so each AIGatewayRoute defines the same metadataKey ("billing_charges") with its own CEL expression to report the cost for that model.
This configuration does not have the desired effect: requests to the free model appear to be charged using the other model's CEL expression.
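To make the expected behavior concrete, the two "billing_charges" formulas from the sample below can be mirrored in plain Python. This is only a sketch of the arithmetic, not the gateway's CEL evaluator; the function names and the token counts are made up for illustration.

```python
def glm45_billing_charges(input_tokens: int, output_tokens: int) -> int:
    """Mirrors cel: "0" on the free GLM45 route."""
    return 0


def glm47_billing_charges(input_tokens: int, output_tokens: int) -> int:
    """Mirrors the CEL expression on the GLM47 route."""
    return int(
        input_tokens / 1000.0 * 30.0 * 7.2
        + output_tokens / 1000.0 * 150.0 * 7.2
    )


# Hypothetical request: 1000 input tokens and 1000 output tokens.
free_cost = glm45_billing_charges(1000, 1000)
paid_cost = glm47_billing_charges(1000, 1000)
print(free_cost, paid_cost)
```

With these inputs, a request to GLM45 should deduct 0 from the quota, while the same request to GLM47 should deduct 1296. What I observe instead is a non-zero deduction on GLM45.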
Sample:
I called the free GLM45 model, but the response headers show the "billing_charges" quota decreasing, even though that route's CEL expression is "0".
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: envoy-ai-gateway-tke-glm45
  namespace: default
spec:
  llmRequestCosts:
    - metadataKey: llm_input_token
      type: InputToken
    - metadataKey: llm_output_token
      type: OutputToken
    - metadataKey: llm_total_token
      type: TotalToken
    - metadataKey: billing_charges
      type: CEL
      cel: "0"
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: envoy-ai-gateway-tke-glm47
  namespace: default
spec:
  llmRequestCosts:
    - metadataKey: llm_input_token
      type: InputToken
    - metadataKey: llm_output_token
      type: OutputToken
    - metadataKey: llm_total_token
      type: TotalToken
    - metadataKey: billing_charges
      type: CEL
      cel: "int(double(input_tokens) / 1000.0 * 30.0 * 7.2 + double(output_tokens) / 1000.0 * 150.0 * 7.2)"
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"gateway.envoyproxy.io/v1alpha1","kind":"BackendTrafficPolicy","metadata":{"annotations":{},"name":"envoy-ai-gateway-default","namespace":"default"},"spec":{"rateLimit":{"global":{"rules":[{"clientSelectors":[{"headers":[{"name":"x-user-id","type":"Distinct"}]}],"cost":{"request":{"from":"Number","number":0},"response":{"from":"Metadata","metadata":{"key":"billing_charges","namespace":"io.envoy.ai_gateway"}}},"limit":{"requests":100000,"unit":"Day"}},{"cost":{"request":{"from":"Number","number":0},"response":{"from":"Metadata","metadata":{"key":"billing_charges","namespace":"io.envoy.ai_gateway"}}},"limit":{"requests":1000000,"unit":"Day"}}]},"type":"Global"},"targetRefs":[{"group":"gateway.networking.k8s.io","kind":"Gateway","name":"envoy-ai-gateway-default"}]}}
  creationTimestamp: "2025-12-11T15:32:59Z"
  generation: 1
  name: envoy-ai-gateway-default
  namespace: default
  resourceVersion: "19951812"
  uid: b4986c34-cfe2-4758-afb0-10668815df71
spec:
  rateLimit:
    global:
      rules:
        - clientSelectors:
            - headers:
                - invert: false
                  name: x-user-id
                  type: Distinct
          cost:
            request:
              from: Number
              number: 0
            response:
              from: Metadata
              metadata:
                key: billing_charges
                namespace: io.envoy.ai_gateway
          limit:
            requests: 100000
            unit: Day
        - cost:
            request:
              from: Number
              number: 0
            response:
              from: Metadata
              metadata:
                key: billing_charges
                namespace: io.envoy.ai_gateway
          limit:
            requests: 1000000
            unit: Day
    type: Global
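The effect I expect from this policy can be sketched as a small budget model: the request itself costs 0, and each response deducts its "billing_charges" value from the per-user daily limit of 100000, so a call to the free route should leave the remaining quota unchanged. The class and method names are hypothetical, the sketch ignores the second (global) rule, and it does not claim to reflect how Envoy's rate limit service is implemented. The value 1296 is an example charge: the GLM47 formula applied to 1000 input and 1000 output tokens.

```python
from collections import defaultdict

DAILY_LIMIT_PER_USER = 100000  # limit.requests of the x-user-id rule


class DailyBudget:
    """Toy model of a per-user daily quota charged by response cost."""

    def __init__(self, limit: int = DAILY_LIMIT_PER_USER) -> None:
        self.limit = limit
        self.used = defaultdict(int)  # user id -> units consumed today

    def record_response(self, user_id: str, billing_charges: int) -> int:
        """Deduct the response cost and return the remaining quota."""
        self.used[user_id] += billing_charges
        return max(self.limit - self.used[user_id], 0)


budget = DailyBudget()
after_paid = budget.record_response("user-a", 1296)  # GLM47 call
after_free = budget.record_response("user-a", 0)     # GLM45 call, cel: "0"
print(after_paid, after_free)
```

In this model the remaining quota is 98704 after the paid call and stays at 98704 after the free call. In my cluster, the remaining quota reported in the response headers keeps dropping on GLM45 calls as well.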