
Commit 6b2923a

Update status endpoint path in documentation (#2920)
The defined path, `llm/v1/status`, is incorrect and will return a 404:

```bash
curl http://localhost:8080/llm/v1/status -i
HTTP/1.1 404 NOT FOUND
```

The correct path should simply be `/status`:

```bash
$ curl http://localhost:8080/status -i
HTTP/1.1 200 OK
Server: gunicorn
Date: Wed, 17 Sep 2025 19:25:25 GMT
Connection: close
Content-Type: application/json
Content-Length: 102

{"status":"ok","model_name":"microsoft/llmlingua-2-xlm-roberta-large-meetingbank","device_map":"cpu"}
```
1 parent f2b64b0 commit 6b2923a

File tree

1 file changed: +2 −2 lines changed

  • app/_kong_plugins/ai-prompt-compressor


app/_kong_plugins/ai-prompt-compressor/index.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -136,7 +136,7 @@ The compressor service exposes both REST and JSON-RPC endpoints. You can use the
 
 * **POST `/llm/v1/compressPrompt`**: Compresses a prompt using either a compression ratio or a target token count. Supports selective compression via `<LLMLINGUA>` tags.
 
-* **GET `/llm/v1/status`**: Returns information about the currently loaded LLMLingua model and device settings (for example, CPU or GPU).
+* **GET `/status`**: Returns information about the currently loaded LLMLingua model and device settings (for example, CPU or GPU).
 
 * **POST `/`**: JSON-RPC endpoint that supports the `llm.v1.compressPrompt` method. Use this to invoke compression programmatically over JSON-RPC.
 
@@ -212,4 +212,4 @@ sequenceDiagram
 {% endmermaid %}
 <!-- vale on -->
 
-The AI Prompt Compressor plugin applies structured compression to preserve essential context of prompts sent by users, rather than trimming prompts arbitrarily or risking token overflows. This ensures the LLM receives a well-formed, focused prompt keeping token usage under control.
+The AI Prompt Compressor plugin applies structured compression to preserve essential context of prompts sent by users, rather than trimming prompts arbitrarily or risking token overflows. This ensures the LLM receives a well-formed, focused prompt keeping token usage under control.
```
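The diff above also mentions a JSON-RPC endpoint (`POST /`) that supports the `llm.v1.compressPrompt` method. As a rough sketch of what a JSON-RPC 2.0 request body for that method might look like: the method name comes from the documentation, but the parameter names (`prompt`, `compression_ratio`) are assumptions for illustration, not confirmed by the plugin docs.

```python
import json

def build_compress_request(prompt: str, ratio: float = 0.5, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body for the compressor service.

    The method name `llm.v1.compressPrompt` is taken from the docs; the
    parameter names here are hypothetical placeholders.
    """
    payload = {
        "jsonrpc": "2.0",
        "method": "llm.v1.compressPrompt",
        "params": {"prompt": prompt, "compression_ratio": ratio},
        "id": request_id,
    }
    return json.dumps(payload)

# The resulting string would be sent as the body of a POST to `/`
# with Content-Type: application/json.
body = build_compress_request("Summarize the meeting notes above.")
print(body)
```

This only constructs the request body; consult the plugin's own documentation for the actual parameter schema before calling the service.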
