Commit 7189b50

add readme
1 parent bf33f17 commit 7189b50

1 file changed: +152 -0 lines changed

README.md

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
# vllm-orchestrator-gateway

This gateway service enables controlled interaction with the FMS Guardrails Orchestrator by enforcing stricter access to its exposed endpoints. It lets you configure fixed detector pipelines and exposes a unique `/v1/chat/completions` endpoint for each configured pipeline. This makes a guardrailed model a drop-in replacement for an unguardrailed chat completions model.

### Getting started
To run the full vllm-orchestrator-gateway stack, a few services need to be running. Some of them, such as the detectors and the generation model, can be swapped for other services that follow the same APIs.

- [FMS Guardrails Orchestrator](https://github.com/foundation-model-stack/fms-guardrails-orchestrator)
- [Guardrails Regex Detector](https://github.com/trustyai-explainability/guardrails-regex-detector)
- [vLLM using Qwen/Qwen2.5-1.5B-Instruct](https://docs.vllm.ai/en/latest/getting_started/quickstart.html#openai-compatible-server)


### Sample config
The config has three main fields: `orchestrator`, `detectors`, and `routes`.

`orchestrator` gives the host and port where the orchestrator service is running.

`detectors` lists detector services that are defined in the `fms-guardrails-orchestrator` config file. For each detector you can specify whether it applies to the input, the output, or both.

`routes` lists the dynamically exposed routes that enforce detectors on endpoints, such as the `pii` route, which registers our `regex-language` detector. A route can also specify no detectors, as the `passthrough` route below does.

`fallback_message`, set per route in the `routes` field, is the response the gateway returns when a detection is found in either the input or the output.

```yaml
orchestrator:
  host: localhost
  port: 8085
detectors:
  - name: regex-language
    input: false
    output: true
    detector_params:
      regex:
        - email
        - ssn
routes:
  - name: pii
    detectors:
      - regex-language
    fallback_message: "I'm sorry, I'm afraid I can't do that."
  - name: passthrough
    detectors:
```

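
To make the relationship between `detectors` and `routes` concrete, here is a minimal Python sketch that walks a config shaped like the sample above and reports, per route, the endpoint path and which detectors run on input and on output. Everything in it is an assumption for illustration: `CONFIG` is a hand-written dict mirroring the YAML (so no YAML library is needed), `describe_routes` is a hypothetical helper, the `/<route>/v1/chat/completions` path pattern is inferred from the sample request rather than taken from the gateway source, and raising on an undefined detector name is a choice made here, not documented gateway behavior.

```python
# Hypothetical helper; none of these names come from the gateway's code base.
# CONFIG mirrors the sample YAML as a plain dict so no YAML library is needed.
CONFIG = {
    "orchestrator": {"host": "localhost", "port": 8085},
    "detectors": [
        {
            "name": "regex-language",
            "input": False,
            "output": True,
            "detector_params": {"regex": ["email", "ssn"]},
        }
    ],
    "routes": [
        {
            "name": "pii",
            "detectors": ["regex-language"],
            "fallback_message": "I'm sorry, I'm afraid I can't do that.",
        },
        # An empty YAML value (as in the passthrough route) loads as None.
        {"name": "passthrough", "detectors": None},
    ],
}


def describe_routes(config):
    """Return {route name: summary} with the assumed path and detector split."""
    known = {d["name"]: d for d in config.get("detectors") or []}
    summary = {}
    for route in config.get("routes") or []:
        names = route.get("detectors") or []
        unknown = [n for n in names if n not in known]
        if unknown:
            # Assumption: treat a reference to an undefined detector as an error.
            raise ValueError(
                f"route {route['name']!r} uses undefined detectors: {unknown}"
            )
        summary[route["name"]] = {
            # Path pattern inferred from the sample request, not from the source.
            "path": f"/{route['name']}/v1/chat/completions",
            "input": [n for n in names if known[n].get("input")],
            "output": [n for n in names if known[n].get("output")],
            "fallback_message": route.get("fallback_message"),
        }
    return summary


if __name__ == "__main__":
    for name, info in describe_routes(CONFIG).items():
        print(name, info)
```

With the sample config, the `pii` route ends up with `regex-language` on the output side only, and `passthrough` ends up with no detectors at all.
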
### Sample request
```bash
curl "localhost:8090/pii/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-1.5B-Instruct",
    "messages": [
      {
        "role": "user",
        "content": "say hello to me at [email protected]"
      },
      {
        "role": "user",
        "content": "btw here is my social 123456789"
      }
    ]
  }'
```
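
The same request can be issued from code. The sketch below uses only the Python standard library and assumes the gateway listens on `localhost:8090`, as in the curl call above; `GATEWAY_URL`, `build_chat_request`, and `send` are hypothetical names introduced for illustration. Because the route name is the only part of the URL that varies, switching between a guardrailed and an unguardrailed pipeline is a one-argument change.

```python
import json
import urllib.request

# Assumption: the gateway listens on localhost:8090, as in the curl example.
GATEWAY_URL = "http://localhost:8090"


def build_chat_request(route, messages, model="Qwen/Qwen2.5-1.5B-Instruct"):
    """Build (but do not send) a POST to the route's chat completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{GATEWAY_URL}/{route}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON reply; needs a running gateway."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_chat_request(
        "pii", [{"role": "user", "content": "btw here is my social 123456789"}]
    )
    print(req.get_method(), req.full_url)
    # print(send(req))  # uncomment once the whole stack is running
```
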
### Sample response with generation
```json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "audio": null,
        "content": "Hello! It looks like you've provided my email address and a social security number. I'm just an AI assistant and not an email or social security system. Please correct this.",
        "refusal": null,
        "role": "assistant",
        "tool_calls": null
      }
    }
  ],
  "created": 1741182909,
  "detections": null,
  "id": "chatcmpl-971213a0e09446a8b11bd447db0f3a64",
  "model": "Qwen/Qwen2.5-1.5B-Instruct",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 37,
    "prompt_tokens": 61,
    "total_tokens": 98
  },
  "warnings": null
}
```

### Sample response with found detection
```json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "audio": null,
        "content": "I'm sorry, I'm afraid I can't do that.",
        "refusal": null,
        "role": "assistant",
        "tool_calls": null
      }
    }
  ],
  "created": 1741182848,
  "detections": {
    "input": null,
    "output": [
      {
        "choice_index": 0,
        "results": [
          {
            "detection": "EmailAddress",
            "detection_type": "pii",
            "detector_id": "regex-language",
            "end": 176,
            "score": 1.0,
            "start": 152
          }
        ]
      }
    ]
  },
  "id": "16a0abbf4b0c431e885be5cfa4ff1c4b",
  "model": "Qwen/Qwen2.5-1.5B-Instruct",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 83,
    "prompt_tokens": 61,
    "total_tokens": 144
  },
  "warnings": [
    {
      "message": "Unsuitable output detected.",
      "type": "UNSUITABLE_OUTPUT"
    }
  ]
}
```
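
A client may want to tell the two sample responses apart without comparing message text. The sketch below is one way to do that in Python, assuming only the fields visible in the samples (`detections.input`, `detections.output`, `results`, `warnings`); `summarize_detections` and `warning_types` are hypothetical helper names, and the orchestrator may return fields that are not handled here.

```python
def summarize_detections(response):
    """Flatten the `detections` object into a list of small dicts ([] if none)."""
    detections = response.get("detections") or {}
    found = []
    for direction in ("input", "output"):
        for entry in detections.get(direction) or []:
            for result in entry.get("results") or []:
                found.append(
                    {
                        "direction": direction,
                        "detector_id": result.get("detector_id"),
                        "detection": result.get("detection"),
                        "span": (result.get("start"), result.get("end")),
                    }
                )
    return found


def warning_types(response):
    """Return the `type` of every warning, or [] when `warnings` is null."""
    return [w.get("type") for w in response.get("warnings") or []]


if __name__ == "__main__":
    # Trimmed copy of the "found detection" sample response.
    blocked = {
        "detections": {
            "input": None,
            "output": [
                {
                    "choice_index": 0,
                    "results": [
                        {
                            "detection": "EmailAddress",
                            "detection_type": "pii",
                            "detector_id": "regex-language",
                            "start": 152,
                            "end": 176,
                            "score": 1.0,
                        }
                    ],
                }
            ],
        },
        "warnings": [
            {"message": "Unsuitable output detected.", "type": "UNSUITABLE_OUTPUT"}
        ],
    }
    print(summarize_detections(blocked))
    print(warning_types(blocked))
```

Checking the structured `detections` and `warnings` fields is more robust than matching on `fallback_message`, since that string is configurable per route.
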
