Skip to content

Commit c8758e8

Browse files
Fix: Merge Extra field during VirtualService merge for multiple Infer… (#58393) (#650)
* Fix: Merge Extra field during VirtualService merge for multiple InferencePools When multiple HTTPRoutes referencing different InferencePools are attached to the same Gateway, only the first InferencePool's ext_proc configuration was applied. This fix ensures all InferencePool configurations are preserved during VirtualService merge. Root cause: The mergeHTTPRoutes() function merged HTTP routes and annotations but ignored the Extra field where InferencePool ext_proc configurations are stored. As a result, only the first HTTPRoute's InferencePool configuration was retained. Changes: 1. route_collections.go: Add Extra field merge logic in mergeHTTPRoutes() - Merge InferencePool config maps instead of overwriting - Add detailed logging for debugging and verification - Log final merged result with InferencePool count 2. conversion.go: Improve shadow service lookup for InferencePools - Use KRT collection (ctx.Services) for direct K8s service lookup - Add fallback to K8s service labels when PushContext not yet updated - Add comprehensive logging for troubleshooting Impact: - Fixes multi-InferencePool deployments on same Gateway - No performance impact (configuration-time only, not runtime) - Enables scalable inference routing with multiple models per Gateway Testing: - Verified with 2 InferencePools on 1 Gateway - Both routes now have ext_proc configuration in Envoy - Confirmed via config_dump and istiod logs Fixes #58392 * Revert conversion.go changes - focus only on VirtualService merge fix Reverted conversion.go changes as they are unnecessary. The InferencePool dependency registered(krt.FetchOne on InferencePools collection) is sufficient to trigger HTTPRoute recomputation when shadow services are created during InferencePool reconciliation. Related to #58392 * Address code review feedback for InferencePool merge - Refactor route_collections.go for better readability * Pull base.Extra nil check out one level * Use early return for non-InferencePool configs * Extract type assertions to variables * Change log.Infof → log.Debugf * Add comment on route name conflict prevention - Fix test infrastructure in conversion_test.go * Add mock services for infpool-model1/model2 * Enables proper multi-route InferencePool testing - Update golden files with correct expected behavior * BackendNotFound → All references resolved * Empty destinations → Properly populated hosts All reviewer feedback addressed. Tests passing. * Add test case for multiple HTTPRoutes with InferencePools on same gateway * Add release notes for PR #58393 * Fix race condition in VirtualService merge by deep copying InferencePool configs The Config.DeepCopy() method only performs a shallow copy of the Extra field. When merging multiple VirtualServices with InferencePool configs, this caused race conditions as multiple goroutines could modify the same underlying map. This fix adds an explicit deep copy of the InferencePool configs map before merging to ensure thread safety when running with race detector enabled. Fixes unit-tests-arm64_istio test failures in CI. * fix: preserve first Extra field value and add type safety check * chore: remove unnecessary comments * fix: preserve first Extra field value, add type safety, and log unexpected types Co-authored-by: jazzsir <jazzsir@gmail.com>
1 parent f9d36e0 commit c8758e8

File tree

6 files changed

+264
-2
lines changed

6 files changed

+264
-2
lines changed

pilot/pkg/config/kube/gateway/conversion_test.go

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -158,6 +158,30 @@ var services = []*model.Service{
158158
Ports: inferencePoolPorts,
159159
Hostname: host.Name(fmt.Sprintf("%s.default.svc.domain.suffix", firstValue(InferencePoolServiceName("infpool-gen2")))),
160160
},
161+
{
162+
Attributes: model.ServiceAttributes{
163+
Namespace: "default",
164+
Labels: map[string]string{
165+
InferencePoolExtensionRefSvc: "model1-epp",
166+
InferencePoolExtensionRefPort: "9002",
167+
InferencePoolExtensionRefFailureMode: "FailClose",
168+
},
169+
},
170+
Ports: ports,
171+
Hostname: host.Name(fmt.Sprintf("%s.default.svc.domain.suffix", firstValue(InferencePoolServiceName("infpool-model1")))),
172+
},
173+
{
174+
Attributes: model.ServiceAttributes{
175+
Namespace: "default",
176+
Labels: map[string]string{
177+
InferencePoolExtensionRefSvc: "model2-epp",
178+
InferencePoolExtensionRefPort: "9002",
179+
InferencePoolExtensionRefFailureMode: "FailClose",
180+
},
181+
},
182+
Ports: ports,
183+
Hostname: host.Name(fmt.Sprintf("%s.default.svc.domain.suffix", firstValue(InferencePoolServiceName("infpool-model2")))),
184+
},
161185

162186
{
163187
Attributes: model.ServiceAttributes{

pilot/pkg/config/kube/gateway/route_collections.go

Lines changed: 52 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -682,12 +682,63 @@ func mergeHTTPRoutes(baseVirtualServices krt.Collection[RouteWithKey], opts ...k
682682
sortRoutesByCreationTime(configs)
683683
base := configs[0].DeepCopy()
684684
baseVS := base.Spec.(*istio.VirtualService)
685-
for _, config := range configs[1:] {
685+
// Deep copy the InferencePool configs map to avoid race conditions
686+
// The default DeepCopy() only does shallow copy of Extra field
687+
if base.Extra != nil {
688+
if ipConfigs, ok := base.Extra[constants.ConfigExtraPerRouteRuleInferencePoolConfigs].(map[string]kube.InferencePoolRouteRuleConfig); ok {
689+
// Create a new map to avoid modifying the shared underlying map
690+
newIPConfigs := make(map[string]kube.InferencePoolRouteRuleConfig, len(ipConfigs))
691+
for k, v := range ipConfigs {
692+
newIPConfigs[k] = v
693+
}
694+
base.Extra[constants.ConfigExtraPerRouteRuleInferencePoolConfigs] = newIPConfigs
695+
}
696+
}
697+
for i, config := range configs[1:] {
686698
thisVS := config.Spec.(*istio.VirtualService)
687699
baseVS.Http = append(baseVS.Http, thisVS.Http...)
688700
// append parents
689701
base.Annotations[constants.InternalParentNames] = fmt.Sprintf("%s,%s",
690702
base.Annotations[constants.InternalParentNames], config.Annotations[constants.InternalParentNames])
703+
// Merge Extra field (especially for InferencePool configs)
704+
if base.Extra == nil && config.Extra != nil {
705+
base.Extra = make(map[string]any)
706+
}
707+
if config.Extra != nil {
708+
for k, v := range config.Extra {
709+
// For non-InferencePool configs, keep the first value for stability
710+
if k != constants.ConfigExtraPerRouteRuleInferencePoolConfigs {
711+
if _, exists := base.Extra[k]; !exists {
712+
base.Extra[k] = v
713+
}
714+
continue
715+
}
716+
// For InferencePool configs, merge the maps
717+
baseMap, baseOk := base.Extra[k].(map[string]kube.InferencePoolRouteRuleConfig)
718+
configMap, configOk := v.(map[string]kube.InferencePoolRouteRuleConfig)
719+
if baseOk && configOk {
720+
log.Debugf("Merging InferencePool configs: adding %d route configs from VirtualService %d to base (namespace=%s)",
721+
len(configMap), i+1, config.Namespace)
722+
// Route names are composed of the HTTPRoute/VirtualService namespaced name so they can't possibly conflict
723+
for routeName, routeConfig := range configMap {
724+
baseMap[routeName] = routeConfig
725+
}
726+
} else if configOk {
727+
if _, exists := base.Extra[k]; !exists {
728+
log.Debugf("Creating new InferencePool config map from VirtualService %d (namespace=%s)", i+1, config.Namespace)
729+
base.Extra[k] = v
730+
}
731+
} else if !configOk {
732+
log.Debugf("Skipping InferencePool config from VirtualService %d due to unexpected type (namespace=%s)", i+1, config.Namespace)
733+
}
734+
}
735+
}
736+
}
737+
// Log final merged InferencePool configs
738+
if base.Extra != nil {
739+
if ipConfigs, ok := base.Extra[constants.ConfigExtraPerRouteRuleInferencePoolConfigs].(map[string]kube.InferencePoolRouteRuleConfig); ok {
740+
log.Debugf("Final merged VirtualService for key %s has %d InferencePool route configs", object.Key, len(ipConfigs))
741+
}
691742
}
692743
sortHTTPRoutes(baseVS.Http)
693744
base.Name = strings.ReplaceAll(object.Key, "/", "~")

pilot/pkg/config/kube/gateway/testdata/http.status.yaml.golden

Lines changed: 69 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,24 @@ metadata:
1616
spec: null
1717
status: {}
1818
---
19+
apiVersion: inference.networking.k8s.io/v1
20+
kind: InferencePool
21+
metadata:
22+
creationTimestamp: null
23+
name: infpool-model1
24+
namespace: default
25+
spec: null
26+
status: {}
27+
---
28+
apiVersion: inference.networking.k8s.io/v1
29+
kind: InferencePool
30+
metadata:
31+
creationTimestamp: null
32+
name: infpool-model2
33+
namespace: default
34+
spec: null
35+
status: {}
36+
---
1937
apiVersion: gateway.networking.k8s.io/v1beta1
2038
kind: GatewayClass
2139
metadata:
@@ -53,7 +71,7 @@ status:
5371
status: "True"
5472
type: Programmed
5573
listeners:
56-
- attachedRoutes: 11
74+
- attachedRoutes: 13
5775
conditions:
5876
- lastTransitionTime: fake
5977
message: No errors found
@@ -284,6 +302,56 @@ status:
284302
---
285303
apiVersion: gateway.networking.k8s.io/v1beta1
286304
kind: HTTPRoute
305+
metadata:
306+
creationTimestamp: null
307+
name: multi-route-infpool-1
308+
namespace: default
309+
spec: null
310+
status:
311+
parents:
312+
- conditions:
313+
- lastTransitionTime: fake
314+
message: Route was valid
315+
reason: Accepted
316+
status: "True"
317+
type: Accepted
318+
- lastTransitionTime: fake
319+
message: All references resolved
320+
reason: ResolvedRefs
321+
status: "True"
322+
type: ResolvedRefs
323+
controllerName: istio.io/gateway-controller
324+
parentRef:
325+
name: gateway
326+
namespace: istio-system
327+
---
328+
apiVersion: gateway.networking.k8s.io/v1beta1
329+
kind: HTTPRoute
330+
metadata:
331+
creationTimestamp: null
332+
name: multi-route-infpool-2
333+
namespace: default
334+
spec: null
335+
status:
336+
parents:
337+
- conditions:
338+
- lastTransitionTime: fake
339+
message: Route was valid
340+
reason: Accepted
341+
status: "True"
342+
type: Accepted
343+
- lastTransitionTime: fake
344+
message: All references resolved
345+
reason: ResolvedRefs
346+
status: "True"
347+
type: ResolvedRefs
348+
controllerName: istio.io/gateway-controller
349+
parentRef:
350+
name: gateway
351+
namespace: istio-system
352+
---
353+
apiVersion: gateway.networking.k8s.io/v1beta1
354+
kind: HTTPRoute
287355
metadata:
288356
creationTimestamp: null
289357
name: multiple-inferencepool-backend-refs

pilot/pkg/config/kube/gateway/testdata/http.yaml

Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -423,3 +423,79 @@ spec:
423423
name: vllm-llama3-8b-instruct-epp
424424
port:
425425
number: 9002
426+
---
427+
# Test case for multiple HTTPRoutes with InferencePools on same gateway
428+
# This tests that Extra field (InferencePool configs) are properly merged
429+
apiVersion: gateway.networking.k8s.io/v1beta1
430+
kind: HTTPRoute
431+
metadata:
432+
name: multi-route-infpool-1
433+
namespace: default
434+
spec:
435+
parentRefs:
436+
- name: gateway
437+
namespace: istio-system
438+
hostnames: ["multi-infpool.domain.example"]
439+
rules:
440+
- matches:
441+
- path:
442+
type: PathPrefix
443+
value: /model1
444+
backendRefs:
445+
- name: infpool-model1
446+
group: inference.networking.k8s.io
447+
kind: InferencePool
448+
port: 80
449+
---
450+
apiVersion: gateway.networking.k8s.io/v1beta1
451+
kind: HTTPRoute
452+
metadata:
453+
name: multi-route-infpool-2
454+
namespace: default
455+
spec:
456+
parentRefs:
457+
- name: gateway
458+
namespace: istio-system
459+
hostnames: ["multi-infpool.domain.example"]
460+
rules:
461+
- matches:
462+
- path:
463+
type: PathPrefix
464+
value: /model2
465+
backendRefs:
466+
- name: infpool-model2
467+
group: inference.networking.k8s.io
468+
kind: InferencePool
469+
port: 80
470+
---
471+
apiVersion: inference.networking.k8s.io/v1
472+
kind: InferencePool
473+
metadata:
474+
name: infpool-model1
475+
namespace: default
476+
spec:
477+
targetPorts:
478+
- number: 8000
479+
selector:
480+
matchLabels:
481+
app: model1-server
482+
endpointPickerRef:
483+
name: model1-epp
484+
port:
485+
number: 9002
486+
---
487+
apiVersion: inference.networking.k8s.io/v1
488+
kind: InferencePool
489+
metadata:
490+
name: infpool-model2
491+
namespace: default
492+
spec:
493+
targetPorts:
494+
- number: 8000
495+
selector:
496+
matchLabels:
497+
app: model2-server
498+
endpointPickerRef:
499+
name: model2-epp
500+
port:
501+
number: 9002

pilot/pkg/config/kube/gateway/testdata/http.yaml.golden

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -252,6 +252,40 @@ spec:
252252
---
253253
apiVersion: networking.istio.io/v1
254254
kind: VirtualService
255+
metadata:
256+
annotations:
257+
internal.istio.io/parents: HTTPRoute/multi-route-infpool-1.default,HTTPRoute/multi-route-infpool-2.default
258+
internal.istio.io/route-semantics: gateway
259+
creationTimestamp: null
260+
name: istio-system~gateway-istio-autogenerated-k8s-gateway-default~multi-infpool.domain.example
261+
namespace: default
262+
spec:
263+
gateways:
264+
- istio-system/gateway-istio-autogenerated-k8s-gateway-default
265+
hosts:
266+
- multi-infpool.domain.example
267+
http:
268+
- match:
269+
- uri:
270+
prefix: /model1
271+
name: default.multi-route-infpool-1.0
272+
route:
273+
- destination:
274+
host: infpool-model1-ip-aaaaf2d6.default.svc.domain.suffix
275+
port:
276+
number: 80
277+
- match:
278+
- uri:
279+
prefix: /model2
280+
name: default.multi-route-infpool-2.0
281+
route:
282+
- destination:
283+
host: infpool-model2-ip-f857bff9.default.svc.domain.suffix
284+
port:
285+
number: 80
286+
---
287+
apiVersion: networking.istio.io/v1
288+
kind: VirtualService
255289
metadata:
256290
annotations:
257291
internal.istio.io/parents: HTTPRoute/redirect-prefix-replace.default

releasenotes/notes/58393.yaml

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
apiVersion: release-notes/v2
2+
kind: bug-fix
3+
area: traffic-management
4+
issue:
5+
- 58392
6+
7+
releaseNotes:
8+
- |
9+
**Fixed** an issue where InferencePool configurations were lost during VirtualService merging when multiple HTTPRoutes referencing different InferencePools were attached to the same Gateway.

0 commit comments

Comments
 (0)