Commit 651ff89

Update troubleshooting.md
1 parent 914d9d2 commit 651ff89


site-src/guides/troubleshooting.md

Lines changed: 5 additions & 4 deletions
## 502 Bad Gateway or 503 Service Unavailable

### `upstream connect error or disconnect/reset ...`

The gateway can return this error when it cannot connect to its backends. It indicates that the gateway successfully identified the correct model server pod but failed to establish a connection to it. The most likely cause is a mismatch: the port number specified in the InferencePool's configuration doesn't match the port your model server is listening on, so the gateway tries to connect to the wrong port and is refused.

**Solution**: Verify the port specified in your InferencePool matches the port number exposed by your model server container, and update your InferencePool accordingly.
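
To make the relationship concrete, here is a minimal sketch of a matching pair. It is not taken from this guide: the names, the port, and the `targetPortNumber`/`extensionRef` fields follow the `v1alpha2` InferencePool API and are illustrative, so adjust them to your installed CRD version and deployment.

```yaml
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: my-inference-pool
spec:
  targetPortNumber: 8000      # must match the model server's containerPort below
  selector:
    app: my-model-server
  extensionRef:
    name: my-endpoint-picker  # the EPP service; name is illustrative
---
apiVersion: v1
kind: Pod
metadata:
  name: my-model-server
  labels:
    app: my-model-server
spec:
  containers:
  - name: vllm
    image: vllm/vllm-openai:latest
    ports:
    - containerPort: 8000     # if this were 8080 instead, the gateway's connection would be refused
```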

## 503 Service Unavailable
### `no healthy upstream`
This error indicates that the HTTPRoute and InferencePool are correctly configured, but there are no healthy pods in the pool to route traffic to. This can happen if the pods are crashing, still starting up, or failing their health checks.
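
A common variant of the health-check case is a readiness probe that never passes, so no pod ever becomes a healthy upstream. As a hedged sketch, a probe for a vLLM container might look like the following; the `/health` path and port 8000 are assumptions based on vLLM's OpenAI-compatible server, and the generous initial delay accounts for model loading time.

```yaml
# Illustrative readiness probe for a vLLM model server container.
readinessProbe:
  httpGet:
    path: /health            # health endpoint of vLLM's OpenAI-compatible server
    port: 8000
  initialDelaySeconds: 60    # allow time for model weights to load
  periodSeconds: 10
```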

The EPP's core function is to intelligently route requests to the most optimal model server...

For unexpected routing behaviors:

* Verify that the expected metrics are being emitted from the model server; some model servers aren't fully compatible with the default expected metrics, and vLLM is generally the most up-to-date in this regard. See [Supported Model Servers](https://gateway-api-inference-extension.sigs.k8s.io/implementations/model-servers/) and the quick check after this list.
* Check your [plugins](https://gateway-api-inference-extension.sigs.k8s.io/guides/epp-configuration/config-text/) configuration, especially the weights of the scorer plugins. If a weight is omitted, a default weight of 1 is used; see the sketch after this list.
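
For the first point, a quick check is to port-forward to a model server pod and fetch its `/metrics` endpoint, confirming that the expected gauges (for vLLM, names such as `vllm:num_requests_waiting`) appear. For the second point, here is an illustrative scorer-weight configuration: the `EndpointPickerConfig` layout and the plugin names follow the config-text guide linked above, but treat them as assumptions and consult that guide for the exact set supported by your version.

```yaml
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: EndpointPickerConfig
plugins:
- type: queue-scorer             # plugin names are illustrative
- type: kv-cache-scorer
schedulingProfiles:
- name: default
  plugins:
  - pluginRef: queue-scorer
    weight: 2                    # explicitly favored over the kv-cache scorer
  - pluginRef: kv-cache-scorer   # weight omitted, so it defaults to 1
```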

## Poor Performance under High Concurrency
