
[Feature request] Adding a checker to see if a custom endpoint is working properly #106


Description

@remyleone

I'm trying to run a model using the following command on my server:

docker run --gpus all --shm-size 1g -p 8080:80 -v /scratch/data:/data -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:1.1.0 --model-id bigcode/starcoder

But after I configure the endpoint in my editor (http://XXXX:8080/generate), I would like a test from the editor that tells me whether or not it can successfully connect.

It could be helpful to have this in the settings, or as a dedicated command in the LLM extension, to verify that everything is in place and to give helpful messages when it is not.
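
In the meantime, a manual check I can run from the client machine is a plain HTTP request against the server. This is just a sketch assuming the default TGI routes (a /health route that returns 200 once the model is loaded, and the /generate route the editor calls), with short timeouts so it fails fast:

# Liveness check: should print 200 once the model is loaded
curl -s -m 5 -o /dev/null -w "%{http_code}\n" http://XXXX:8080/health

# End-to-end check against the same route the editor uses
curl -s -m 30 http://XXXX:8080/generate \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{"inputs": "def hello():", "parameters": {"max_new_tokens": 10}}'

If the second request returns a JSON body with a generated_text field, the server side is fine and the problem is in the editor configuration.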

As a check on the server side, I'm using nvidia-smi to see whether additional GPU usage is happening.
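
For continuous monitoring while a request runs, something like the following keeps the GPU readout refreshing (assuming watch is available on the host):

watch -n 1 nvidia-smi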
