3 files changed, +37 -33 lines changed
@@ -30,7 +30,6 @@ The current manifests rely on Envoy Gateway [v1.2.1](https://gateway.envoyproxy.
 ```
 Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again.
 
-
 1. **Deploy Gateway**
 
 ```bash
@@ -41,6 +40,12 @@ The current manifests rely on Envoy Gateway [v1.2.1](https://gateway.envoyproxy.
 
 ```bash
 kubectl apply -f ./manifests/ext_proc.yaml
+```
+
+1. **Deploy Envoy Gateway Custom Policies**
+
+```bash
+kubectl apply -f ./manifests/extension_policy.yaml
 kubectl apply -f ./manifests/patch_policy.yaml
 ```
 
@@ -103,35 +103,3 @@ spec:
     port: 9002
     targetPort: 9002
   type: ClusterIP
----
-apiVersion: gateway.envoyproxy.io/v1alpha1
-kind: EnvoyExtensionPolicy
-metadata:
-  name: ext-proc-policy
-  namespace: default
-spec:
-  extProc:
-    - backendRefs:
-      - group: ""
-        kind: Service
-        name: inference-gateway-ext-proc
-        port: 9002
-      processingMode:
-        request:
-          body: Buffered
-        response:
-      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
-      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly.
-      messageTimeout: 1000s
-      backendSettings:
-        circuitBreaker:
-          maxConnections: 40000
-          maxPendingRequests: 40000
-          maxParallelRequests: 40000
-        timeout:
-          tcp:
-            connectTimeout: 24h
-  targetRef:
-    group: gateway.networking.k8s.io
-    kind: HTTPRoute
-    name: llm-route
+apiVersion: gateway.envoyproxy.io/v1alpha1
+kind: EnvoyExtensionPolicy
+metadata:
+  name: ext-proc-policy
+  namespace: default
+spec:
+  extProc:
+    - backendRefs:
+      - group: ""
+        kind: Service
+        name: inference-gateway-ext-proc
+        port: 9002
+      processingMode:
+        request:
+          body: Buffered
+        response:
+      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
+      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly.
+      messageTimeout: 1000s
+      backendSettings:
+        circuitBreaker:
+          maxConnections: 40000
+          maxPendingRequests: 40000
+          maxParallelRequests: 40000
+        timeout:
+          tcp:
+            connectTimeout: 24h
+  targetRef:
+    group: gateway.networking.k8s.io
+    kind: HTTPRoute
+    name: llm-route
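
Reviewer sketch, not part of the change: with the `EnvoyExtensionPolicy` moved into its own manifest, the README now applies three files across two steps. The helper below is one hedged way to script that sequence. The function name `apply_in_order` and the `KUBECTL` override variable are assumptions introduced here for illustration; only the manifest paths and `kubectl apply -f` come from the diff.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical override: lets the sequence be exercised without a cluster,
# e.g. KUBECTL=echo. Defaults to the real kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# Apply each manifest in the order given, stopping at the first failure
# (set -e aborts the script if any apply returns non-zero).
apply_in_order() {
  local manifest
  for manifest in "$@"; do
    "$KUBECTL" apply -f "$manifest"
  done
}

# Usage, following the README steps touched by this diff:
#   apply_in_order ./manifests/ext_proc.yaml \
#                  ./manifests/extension_policy.yaml \
#                  ./manifests/patch_policy.yaml
```

Running it with `KUBECTL=echo` prints the commands instead of executing them, which gives a quick check of the order before touching a cluster.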