@@ -103,6 +103,14 @@ INFERENCESERVICE_NAME=<your-inferenceservice-name>

 # KServe creates a headless service by default (no stable ClusterIP)
 # Create a stable ClusterIP service for consistent routing
+
+# Option 1: Using the template file (recommended)
+# Substitute variables and apply
+sed -e "s/{{INFERENCESERVICE_NAME}}/$INFERENCESERVICE_NAME/g" \
+  -e "s/{{NAMESPACE}}/$NAMESPACE/g" \
+  service-predictor-stable.yaml | oc apply -f - -n $NAMESPACE
+
+# Option 2: Using heredoc
 cat << EOF | oc apply -f - -n $NAMESPACE
 apiVersion: v1
 kind: Service
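Review note on Option 1: the template `service-predictor-stable.yaml` is not part of this diff, so the sketch below uses a hypothetical stand-in that carries the same `{{INFERENCESERVICE_NAME}}` and `{{NAMESPACE}}` placeholders. It runs the same two `sed` expressions but prints the result instead of piping it to `oc apply`, which is one way to inspect the rendered manifest before applying it. The `-predictor-stable` name suffix and the example values are assumptions, not taken from the repository.

```bash
#!/bin/sh
# Dry-run sketch of the Option 1 substitution; nothing is sent to the cluster.
INFERENCESERVICE_NAME="my-model"   # example value (assumption)
NAMESPACE="my-namespace"           # example value (assumption)

# Hypothetical stand-in for service-predictor-stable.yaml.
TEMPLATE=$(mktemp)
cat > "$TEMPLATE" <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: {{INFERENCESERVICE_NAME}}-predictor-stable
  namespace: {{NAMESPACE}}
EOF

# Same two sed expressions as in the diff, without the trailing `oc apply`.
RENDERED=$(sed -e "s/{{INFERENCESERVICE_NAME}}/$INFERENCESERVICE_NAME/g" \
               -e "s/{{NAMESPACE}}/$NAMESPACE/g" \
               "$TEMPLATE")
rm -f "$TEMPLATE"

printf '%s\n' "$RENDERED"

# Stop if any placeholder survived the substitution.
if printf '%s\n' "$RENDERED" | grep -q '{{'; then
  echo "unresolved placeholder in rendered manifest" >&2
  exit 1
fi
```

Because the expressions use `/` as the `sed` delimiter, this approach assumes the substituted values contain no `/`, which holds for valid Kubernetes resource and namespace names.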
@@ -239,6 +247,7 @@ Find the `kserve_dynamic_cluster` section and update:
 ```

 Replace:
+
 - `my-model` with your InferenceService name
 - `my-namespace` with your namespace

@@ -709,6 +718,7 @@ oc logs -l app=semantic-router -c semantic-router -n $NAMESPACE -f
 **Important Log Events:**

 - `routing_decision`: Which model was selected and why
+
   ```json
   {
     "msg": "routing_decision",
@@ -720,6 +730,7 @@ oc logs -l app=semantic-router -c semantic-router -n $NAMESPACE -f
   ```

 - `cache_hit`/`cache_miss`: Cache performance
+
   ```json
   {
     "msg": "cache_hit",
@@ -730,6 +741,7 @@ oc logs -l app=semantic-router -c semantic-router -n $NAMESPACE -f
   ```

 - `llm_usage`: Token usage and costs
+
   ```json
   {
     "msg": "llm_usage",
@@ -742,6 +754,7 @@ oc logs -l app=semantic-router -c semantic-router -n $NAMESPACE -f
   ```

 - `pii_detection`: PII found in requests
+
   ```json
   {
     "msg": "pii_detection",
@@ -815,9 +828,11 @@ oc describe pod -l app=semantic-router -n $NAMESPACE
    - Solution: Check network policies, proxy settings

 2. **PVC not bound**: Storage not provisioned
+
    ```bash
    oc get pvc -n $NAMESPACE
    ```
+
    - Solution: Check StorageClass, provision capacity

 3. **OOM during model download**: Insufficient memory
@@ -836,21 +851,27 @@ oc logs -l app=semantic-router -c semantic-router -n $NAMESPACE --previous
 **Common Causes**:

 1. **Configuration error**: Invalid YAML or missing fields
+
    ```
    Failed to load config: yaml: unmarshal errors
    ```
+
    - Solution: Validate YAML syntax, check required fields

 2. **Invalid IP address**: Router validation failed
+
    ```
    invalid IP address format, got: my-model.svc.cluster.local
    ```
+
    - Solution: Use service ClusterIP (not DNS) in `vllm_endpoints.address` - see Step 1 for creating stable service

 3. **Missing models**: Classification models not downloaded
+
    ```
    failed to read mapping file: no such file or directory
    ```
+
    - Solution: Check init container completed successfully

 ### Cannot Connect to InferenceService
@@ -870,24 +891,30 @@ oc exec $POD -c semantic-router -n $NAMESPACE -- \
 **Common Causes**:

 1. **InferenceService not ready**:
+
    ```bash
    oc get inferenceservice -n $NAMESPACE
    ```
+
    - Solution: Wait for READY=True, check predictor logs

 2. **Wrong DNS name**: Incorrect service name in Envoy config
    - Solution: Verify format: `<inferenceservice>-predictor.<namespace>.svc.cluster.local`

 3. **Network policy blocking**: Istio/NetworkPolicy restrictions
+
    ```bash
    oc get networkpolicies -n $NAMESPACE
    ```
+
    - Solution: Add policy to allow traffic from router to predictor

 4. **PeerAuthentication conflict**: mTLS mode mismatch
+
    ```bash
    oc get peerauthentication -n $NAMESPACE
    ```
+
    - Solution: Ensure PERMISSIVE mode or adjust Envoy TLS config

 ### Predictor Pod IP Changed (If Using Pod IP Instead of Service IP)
@@ -901,6 +928,7 @@ oc exec $POD -c semantic-router -n $NAMESPACE -- \
 **Solution**:

 1. Switch to stable service approach (recommended):
+
    ```bash
    # Create stable service
    cat <<EOF | oc apply -f - -n $NAMESPACE
@@ -944,9 +972,11 @@ oc logs -l app=semantic-router -c semantic-router -n $NAMESPACE \
 **Common Causes**:

 1. **Threshold too high**: Similarity threshold prevents matches
+
    ```yaml
    similarity_threshold: 0.99  # Too strict
    ```
+
    - Solution: Lower threshold to 0.8-0.85

 2. **Cache disabled**: Not enabled in config
@@ -1066,6 +1096,7 @@ For multi-replica deployments with shared cache:

 1. Deploy Milvus in your cluster
 2. Update `configmap-router-config.yaml`:
+
    ```yaml
    semantic_cache:
      enabled: true
@@ -1077,6 +1108,7 @@ For multi-replica deployments with shared cache:
    ```

 3. Apply and restart:
+
    ```bash
    oc apply -f configmap-router-config.yaml -n $NAMESPACE
    oc rollout restart deployment/semantic-router-kserve -n $NAMESPACE
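Review note on the "Invalid IP address" cause in the troubleshooting hunk: the logged error shows the router rejecting a DNS name in `vllm_endpoints.address`. A small pre-check before editing the config could catch this earlier. The helper below is a hypothetical sketch, not part of the router, and it assumes only dotted-quad IPv4 addresses are accepted (the diff shows a DNS name being rejected but does not state the full validation rule). The sample address is an example value.

```bash
#!/bin/sh
# is_ipv4 (hypothetical helper): succeeds only for dotted-quad IPv4 strings
# whose four octets are each in the range 0..255.
is_ipv4() {
  octet='(25[0-5]|2[0-4][0-9]|1?[0-9]?[0-9])'
  printf '%s\n' "$1" | grep -Eq "^${octet}(\.${octet}){3}\$"
}

# check_address (hypothetical helper): reports whether a candidate value
# looks usable as an IP address or is something else, such as a DNS name.
check_address() {
  if is_ipv4 "$1"; then
    echo "ok: $1"
  else
    echo "not an IPv4 address: $1"
  fi
}

check_address "my-model.svc.cluster.local"   # DNS name from the logged error
check_address "172.30.10.5"                  # example ClusterIP (assumption)

# On a live cluster the ClusterIP can be read from the stable service, e.g.:
#   oc get svc <stable-service-name> -n $NAMESPACE -o jsonpath='{.spec.clusterIP}'
```

The check is purely syntactic: it does not confirm that the address belongs to the stable service or that the service is reachable.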