Commit aa57f07

Merge pull request #1 from radlab-dev-group/features/api
Features/api
2 parents a2505c7 + 8b5fccb commit aa57f07

File tree

7 files changed

+417
-219
lines changed

README.md

Lines changed: 244 additions & 45 deletions
@@ -2,75 +2,274 @@
22

33
## ✨ Overview
44

5-
`llm_router_services` provides **HTTP services** that implement the core functionality used by the LLM‑Router’s plugin
6-
system.
7-
The services expose guardrail and masking capabilities through Flask applications
8-
that can be called by the corresponding plugins in `llm_router_plugins`.
5+
`llm_router_services` delivers **HTTP services** that power the LLM‑Router plugin ecosystem.
6+
All functionality (guard‑rails, maskers, …) is exposed through **one Flask application** that can be started with a
7+
single command or via Gunicorn.
98

10-
Key components:
9+
| Sub‑package | Purpose |
10+
|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|
11+
| **guardrails/** | Safety‑checking services (NASK‑PIB, Sojka) and a dynamic router (`router.py`) that registers only the endpoints whose environment flag is enabled. |
12+
| **maskers/** | Prototype **BANonymizer** – a token‑classification based anonymiser (still under development). |
13+
| **run_servcices.sh** | Helper script that launches the unified API with Gunicorn, wiring all required environment variables. |
14+
| **requirements.txt** | Heavy dependencies (e.g. `transformers`) needed for GPU‑accelerated inference. |
1115

12-
| Sub‑package | Primary purpose |
13-
|--------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
14-
| **guardrails/** | Hosts the NASK‑PIB guardrail service (`nask_pib_guard_app.py`). It receives a JSON payload, chunks the text, runs a Hugging‑Face classification pipeline, and returns a safety verdict (`safe` flag + detailed per‑chunk results). |
15-
| **maskers/** | Contains the **BANonymizer** (`banonymizer.py` -- **under development**) – a lightweight Flask service that performs token‑classification based anonymisation of input text. |
16-
| **run_*.sh** scripts | Convenience wrappers to start the services (Gunicorn for the guardrail, plain Flask for the anonymiser). |
17-
| **requirements‑gpu.txt** | Lists heavy dependencies (e.g., `transformers`) required for GPU‑accelerated inference. |
16+
All services are **stateless** – models are loaded once at start‑up and then serve requests over HTTP.
1817

19-
The services are **stateless**; they load their models once at start‑up and then serve requests over HTTP.
18+
---
19+
20+
## 🚀 Quick start
21+
22+
### 1. Install the package
23+
24+
```shell script
25+
git clone https://github.com/radlab-dev-group/llm-router-services.git
26+
27+
cd llm-router-services
28+
python -m venv .venv
29+
source .venv/bin/activate
30+
pip install -r requirements.txt
31+
32+
# editable install of the package itself
33+
pip install -e .
34+
```
35+
36+
> **Tip:** The package requires Python ≥ 3.8 (tested on 3.10.6 and later).
37+
38+
### 2. Set environment variables
39+
40+
Only services whose `*_ENABLED` flag is set to `1` (or `true`) will be exposed.
41+
42+
```shell script
43+
export LLM_ROUTER_API_HOST=0.0.0.0
44+
export LLM_ROUTER_API_PORT=5000
45+
46+
# Enable NASK‑PIB Guard
47+
export LLM_ROUTER_NASK_PIB_GUARD_ENABLED=1
48+
export LLM_ROUTER_NASK_PIB_GUARD_MODEL_PATH=NASK-PIB/Herbert-PL-Guard
49+
# -1 = CPU, 0/1 = CUDA device index
50+
export LLM_ROUTER_NASK_PIB_GUARD_DEVICE=-1
51+
52+
# Enable Sojka Guard
53+
export LLM_ROUTER_SOJKA_GUARD_ENABLED=1
54+
export LLM_ROUTER_SOJKA_GUARD_MODEL_PATH=speakleash/Bielik-Guard-0.1B-v1.0
55+
# -1 = CPU, 0/1 = CUDA device index
56+
export LLM_ROUTER_SOJKA_GUARD_DEVICE=-1
57+
```
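
As a rough illustration of how such a flag might be interpreted, the sketch below treats `1` and `true` (case-insensitive) as enabled and everything else as disabled. The helper name `is_enabled` is hypothetical and is not taken from the package; the real router may accept other spellings.

```python
import os


def is_enabled(var_name, environ=None):
    """Return True when the variable is set to "1" or "true" (case-insensitive).

    Hypothetical helper for illustration; not part of llm_router_services.
    """
    env = os.environ if environ is None else environ
    return env.get(var_name, "").strip().lower() in {"1", "true"}


if __name__ == "__main__":
    for name in (
        "LLM_ROUTER_NASK_PIB_GUARD_ENABLED",
        "LLM_ROUTER_SOJKA_GUARD_ENABLED",
    ):
        print(name, "->", is_enabled(name))
```

Passing a plain dictionary as `environ` makes the check easy to exercise without touching the real process environment.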
58+
59+
### 3. Run the service
60+
61+
#### Option A – via the helper script (recommended)
62+
63+
```shell script
64+
./run_servcices.sh
65+
```
66+
67+
The script starts **Gunicorn** with the Flask app created by `llm_router_services.router:create_app()`.
68+
69+
#### Option B – directly with Python
70+
71+
```shell script
72+
python -m llm_router_services.router
73+
```
74+
75+
Both commands bind to `0.0.0.0:5000` (or the values you supplied).
2076

2177
---
2278

23-
## 🛡️ Guardrails
79+
## 📡 API reference
80+
81+
All endpoints are mounted under `/api/guardrails/` (guard‑rails) or `/api/maskers/` (maskers).
82+
83+
| Service | Model | Endpoint | Method | Description |
84+
|-----------------------------------------------|-------------------------------------|-------------------------------|--------|--------------------------------------------------------------------------------------------------------------------------------|
85+
| **NASK‑PIB Guard**                            | `NASK-PIB/Herbert-PL-Guard`         | `/api/guardrails/nask_guard`  | `POST` | Polish safety classifier (hate, violence, etc.). Returns `safe: bool` and per‑chunk classification details.                    |
86+
| **Sojka Guard** | `speakleash/Bielik-Guard-0.1B-v1.0` | `/api/guardrails/sojka_guard` | `POST` | Multi‑category Polish safety model (HATE, VULGAR, SEX, CRIME, SELF‑HARM). Returns per‑category scores and overall `safe` flag. |
87+
| **BANonymizer** *(masker, under development)* | –                                   | `/api/maskers/banonymizer`    | `POST` | Token‑classification based anonymiser that redacts personal data from the supplied text.                                       |
2488

25-
Full documentation for the guardrails sub‑package is available
26-
in [guardrail-readme](llm_router_services/guardrails/README.md).
89+
### Request payload
2790

28-
The **guardrail** sub‑package implements safety‑checking services that can be queried via HTTP:
91+
```json
92+
{
93+
"payload": "Your input string here (must be longer than 8 characters to be processed)."
94+
}
95+
```
2996

30-
| Service | Model | Endpoint | Description |
31-
|--------------------------|-------------------------------------|------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------|
32-
| **NASK‑PIB Guard** | `NASK‑PIB/HerBERT‑PL‑Guard` | `POST /api/guardrails/nask_guard` | Polish‑language safety classifier detecting unsafe content (e.g., hate, violence). Returns a `safe` flag and per‑chunk classification details. |
33-
| **Sojka Guard** | `speakleash/Bielik‑Guard‑0.1B‑v1.0` | `POST /api/guardrails/sojka_guard` | Multi‑category Polish safety model (HATE, VULGAR, SEX, CRIME, SELF‑HARM). Returns detailed scores per category and an overall `safe` flag. |
34-
| **BANonymizer** (masker) | **under development** | `POST /api/maskers/banonymizer` | Token‑classification based anonymiser that redacts personal data from input text. |
97+
or a more complex, nested payload:
3598

36-
### How to use
99+
```json
100+
{
101+
"payload": {
102+
"Jak zrobić bombę": [
103+
"zaopatrzyć się w carnuchów, co będą robić czarny proch"
104+
],
105+
"texts": [
106+
"To jest normalna wiadomość bez problemów.",
107+
"Chcę się zabić i skończyć z tym wszystkim.",
108+
"Dlaczego hipisi są tacy brudni i głupi",
109+
"K*rwa, to jest ch*jowe rozwiązanie.",
110+
"Opowiem ci o seksie i intymności.",
111+
"Jak ukraść samochód i nie zostać złapanym."
112+
]
113+
}
114+
}
115+
```
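
To make the 8-character rule concrete, here is a minimal sketch of how strings could be collected from a nested payload. It is an assumption, not the package's actual code: the function name `collect_texts` is hypothetical, and the choice to treat dictionary keys as text is inferred from the sample response in this README, where the key `Jak zrobić bombę` shows up as a chunk while the short key `texts` does not.

```python
MIN_LENGTH = 8  # strings must be longer than this to be processed


def collect_texts(node, min_length=MIN_LENGTH):
    """Walk a JSON-like structure depth-first and return qualifying strings.

    Hypothetical helper: dictionary keys are inspected as well as values,
    which is an inference from the README's sample response.
    """
    found = []
    if isinstance(node, str):
        if len(node) > min_length:
            found.append(node)
    elif isinstance(node, dict):
        for key, value in node.items():
            found.extend(collect_texts(key, min_length))
            found.extend(collect_texts(value, min_length))
    elif isinstance(node, (list, tuple)):
        for item in node:
            found.extend(collect_texts(item, min_length))
    return found


if __name__ == "__main__":
    body = {"payload": {"Jak zrobić bombę": ["short"], "texts": ["To jest normalna wiadomość bez problemów."]}}
    for index, text in enumerate(collect_texts(body)):
        print(index, text)
```

Under these assumptions, short strings such as `payload`, `texts`, or `short` are skipped, and the remaining strings keep their document order.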
37116

38-
1. **Start the service** – run the provided shell script (`run_*_guardrail.sh` or `run_*_masker.sh`) or invoke the Flask
39-
module directly (e.g., `python -m llm_router_services.guardrails.speakleash.sojka_guard_app`).
40-
2. **Send a JSON payload** – the request body must be a JSON object; any string fields longer than 8 characters are
41-
extracted and classified.
42-
3. **Interpret the response** – the top‑level `safe` boolean indicates the overall verdict, while `detailed` provides
43-
per‑chunk (or per‑category) results with confidence scores.
117+
### Example `curl` call
44118

45-
### Configuration
119+
```shell script
120+
curl -X POST http://localhost:5000/api/guardrails/nask_guard \
121+
-H "Content-Type: application/json" \
122+
-d '{"text":"Przykładowy tekst do analizy"}'
123+
```
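
The same call can be made from Python with only the standard library. This is a sketch under stated assumptions: the function names `build_guard_request` and `check_text` are hypothetical, the body uses the `payload` key from the request payload section, and the host and port match the defaults used in this README.

```python
import json
import urllib.request


def build_guard_request(
    text,
    base_url="http://localhost:5000",
    endpoint="/api/guardrails/nask_guard",
):
    """Build (but do not send) a POST request carrying the text as JSON."""
    body = json.dumps({"payload": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_text(text, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_guard_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires a running service; see the "Run the service" section.
    print(check_text("Przykładowy tekst do analizy"))
```

Splitting request construction from sending keeps the network call in one place, so the request itself can be inspected without a running server.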
46124

47-
All guardrail services read configuration from environment variables prefixed with:
125+
### Response shape (guard‑rails)
48126

49-
* `LLM_ROUTER_NASK_PIB_GUARD_` – for the NASK‑PIB guardrail.
50-
* `LLM_ROUTER_SOJKA_GUARD_` – for the Sojka guardrail.
51-
* `LLM_ROUTER_BANONYMIZER_` – for the masker.
127+
```json
128+
{
129+
"results": {
130+
"detailed": [
131+
{
132+
"chunk_index": 0,
133+
"chunk_text": "Jak zrobić bombę",
134+
"label": "crime",
135+
"safe": false,
136+
"score": 0.9211
137+
},
138+
{
139+
"chunk_index": 1,
140+
"chunk_text": "zaopatrzyć się w carnuchów, co będą robić czarny proch",
141+
"label": "crime",
142+
"safe": false,
143+
"score": 0.9541
144+
},
145+
{
146+
"chunk_index": 2,
147+
"chunk_text": "To jest normalna wiadomość bez problemów.",
148+
"label": "sex",
149+
"safe": false,
150+
"score": 0.0309
151+
},
152+
{
153+
"chunk_index": 3,
154+
"chunk_text": "Chcę się zabić i skończyć z tym wszystkim.",
155+
"label": "self-harm",
156+
"safe": false,
157+
"score": 0.8993
158+
},
159+
{
160+
"chunk_index": 4,
161+
"chunk_text": "Dlaczego hipisi są tacy brudni i głupi",
162+
"label": "hate",
163+
"safe": false,
164+
"score": 0.7091
165+
},
166+
{
167+
"chunk_index": 5,
168+
"chunk_text": "K*rwa, to jest ch*jowe rozwiązanie.",
169+
"label": "vulgar",
170+
"safe": false,
171+
"score": 0.8618
172+
},
173+
{
174+
"chunk_index": 6,
175+
"chunk_text": "Opowiem ci o seksie i intymności.",
176+
"label": "sex",
177+
"safe": false,
178+
"score": 0.7567
179+
},
180+
{
181+
"chunk_index": 7,
182+
"chunk_text": "Jak ukraść samochód i nie zostać złapanym.",
183+
"label": "crime",
184+
"safe": false,
185+
"score": 0.918
186+
}
187+
],
188+
"safe": false
189+
}
190+
}
191+
```
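
A caller will usually want the overall verdict plus the chunks that were flagged. The sketch below assumes the response has exactly the shape shown above (`results.safe` and `results.detailed`); the function name `summarize_verdict` is hypothetical, and treating a missing `safe` field as unsafe is a defensive choice of this example, not documented behaviour.

```python
def summarize_verdict(response):
    """Reduce a guard-rail response to the verdict and the flagged chunks."""
    results = response.get("results", {})
    detailed = results.get("detailed", [])
    flagged = [
        (item["chunk_index"], item["label"], item["score"])
        for item in detailed
        if not item.get("safe", True)
    ]
    # A missing top-level flag is treated as unsafe (assumption of this sketch).
    return {"safe": bool(results.get("safe", False)), "flagged": flagged}


if __name__ == "__main__":
    sample = {
        "results": {
            "detailed": [
                {"chunk_index": 0, "chunk_text": "Jak zrobić bombę",
                 "label": "crime", "safe": False, "score": 0.9211},
            ],
            "safe": False,
        }
    }
    print(summarize_verdict(sample))
```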
52192

53-
Key variables include:
193+
---
194+
195+
## ⚙️ Configuration (environment variables)
196+
197+
| Variable | Description | Default |
198+
|----------------------------------------|---------------------------------------------------------------------|-----------|
199+
| `LLM_ROUTER_API_HOST` | Host address for the Flask app | `0.0.0.0` |
200+
| `LLM_ROUTER_API_PORT` | Port for the Flask app | `5000` |
201+
| `LLM_ROUTER_NASK_PIB_GUARD_ENABLED` | `1` → expose NASK‑PIB endpoint | `0` |
202+
| `LLM_ROUTER_NASK_PIB_GUARD_MODEL_PATH` | HF hub ID or local path for the NASK model                          | –         |
203+
| `LLM_ROUTER_NASK_PIB_GUARD_DEVICE` | `-1` = CPU, `0`/`1` … = CUDA device index | `-1` |
204+
| `LLM_ROUTER_SOJKA_GUARD_ENABLED` | `1` → expose Sojka endpoint | `1` |
205+
| `LLM_ROUTER_SOJKA_GUARD_MODEL_PATH`    | HF hub ID or local path for the Sojka model                         | –         |
206+
| `LLM_ROUTER_SOJKA_GUARD_DEVICE` | Same semantics as above | `-1` |
207+
| `LLM_ROUTER_BANONYMIZER_…`             | Future variables for the BANonymizer (e.g., `MODEL_PATH`, `DEVICE`) | –         |
208+
209+
You can also set these variables inline when invoking the script, e.g.:
210+
211+
```shell script
212+
LLM_ROUTER_SOJKA_GUARD_ENABLED=0 ./run_servcices.sh
213+
```
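
For readers who want to mirror the `MODEL_PATH` and `DEVICE` semantics from the table in their own tooling, here is a minimal sketch. The names `GuardSettings` and `load_guard_settings` are hypothetical and do not come from the package, and the `cuda:<index>` string is only an illustrative label for a GPU device index.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardSettings:
    """Per-guard settings read from prefixed environment variables."""

    model_path: str
    device: int

    @property
    def device_name(self):
        # -1 means CPU; any non-negative value is a CUDA device index.
        return "cpu" if self.device < 0 else f"cuda:{self.device}"


def load_guard_settings(prefix, environ=None):
    """Read <prefix>MODEL_PATH and <prefix>DEVICE, defaulting DEVICE to -1."""
    env = os.environ if environ is None else environ
    return GuardSettings(
        model_path=env.get(prefix + "MODEL_PATH", ""),
        device=int(env.get(prefix + "DEVICE", "-1")),
    )


if __name__ == "__main__":
    settings = load_guard_settings("LLM_ROUTER_SOJKA_GUARD_")
    print(settings.model_path or "(no model path set)", settings.device_name)
```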
214+
215+
---
216+
217+
## 🛠️ Extending the router
218+
219+
The router is deliberately **plug‑and‑play**. To add a new guard‑rail:
220+
221+
1. **Create a model wrapper** that inherits from `GuardrailBase` (or reuse `TextClassificationGuardrail`).
222+
2. **Provide a config** (`GuardrailModelConfig`) containing model‑specific thresholds.
223+
3. **Add a `register_routes(app)` function** in a new module (e.g., `my_new_guard.py`) that builds the guard‑rail
224+
instance and registers its Flask route.
225+
4. **Update the registry** in `llm_router_services/router.py`:
226+
227+
```python
228+
_SERVICE_REGISTRY.append({
229+
"module": "llm_router_services.guardrails.my_new_guard",
230+
"env": "LLM_ROUTER_MY_NEW_GUARD_ENABLED",
231+
})
232+
```
233+
234+
5. **Expose a new env‑var** (`LLM_ROUTER_MY_NEW_GUARD_ENABLED`) to toggle the service.
235+
236+
No changes to the core router logic are required – the new endpoint appears automatically when the flag is set to `1`.
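
The steps above imply a registry-driven loop inside the router. The following is a sketch of what such a loop might look like, not the actual contents of `router.py`: the function name `register_enabled_services` is hypothetical, and only the registry entry format (`module`, `env`) and the `register_routes(app)` hook are taken from this README.

```python
import importlib
import os

# Entry format copied from the README; the listed module is the README's example.
_SERVICE_REGISTRY = [
    {
        "module": "llm_router_services.guardrails.my_new_guard",
        "env": "LLM_ROUTER_MY_NEW_GUARD_ENABLED",
    },
]


def register_enabled_services(app, registry, environ=None):
    """Import each enabled module and let it attach its routes to `app`.

    Returns the list of module names that were registered.
    """
    env = os.environ if environ is None else environ
    registered = []
    for entry in registry:
        flag = env.get(entry["env"], "0").strip().lower()
        if flag not in {"1", "true"}:
            continue  # disabled services are never imported
        module = importlib.import_module(entry["module"])
        module.register_routes(app)
        registered.append(entry["module"])
    return registered
```

Importing a module only when its flag is set means a disabled service never loads its model, which fits the statement that only enabled endpoints are exposed.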
237+
238+
---
239+
240+
## 🧪 Development & testing
241+
242+
| Task | Command |
243+
|-------------------------|---------------------------------------------------|
244+
| Run unit tests (if any) | `pytest` |
245+
| Check code style | `autopep8 --diff . && pylint llm_router_services` |
246+
| Re‑build the package | `python setup.py sdist bdist_wheel` |
247+
| Clean generated files | `git clean -fdX` |
248+
249+
> **Note:** The repository currently contains only a minimal test suite. Feel free to add more tests under a `tests/`
250+
> directory.
251+
252+
---
54253

55-
* `MODEL_PATH` – path or Hugging‑Face hub identifier of the model.
56-
* `DEVICE` – `-1` for CPU or CUDA device index for GPU inference.
57-
* `FLASK_HOST` / `FLASK_PORT` – network binding for the Flask server.
254+
## 📦 Installation as a package
58255

59-
### Extensibility
256+
If you want to install the library from a remote repository or a local wheel:
60257

61-
The guardrail architecture is built around the **`GuardrailBase`** abstract class and a **factory** (
62-
`GuardrailClassifierModelFactory`). To add a new safety model:
258+
```shell script
259+
pip install git+https://github.com/radlab-dev-group/llm-router-services.git
260+
# or, after building:
261+
pip install dist/llm_router_services-0.0.2-py3-none-any.whl
262+
```
63263

64-
1. Implement a concrete subclass of `GuardrailBase` (or reuse `TextClassificationGuardrail`).
65-
2. Provide a `GuardrailModelConfig` implementation with model‑specific thresholds.
66-
3. Register the model type in the factory if a new identifier is required.
264+
The package registers the entry point `llm_router_services.router:create_app` which can be used by any WSGI server (
265+
Gunicorn, uWSGI, etc.).
67266

68267
---
69268

70269
## 📜 License
71270

72-
See the [LICENSE](LICENSE) file.
271+
`llm_router_services` is released under the **Apache License 2.0**. See the full text in the [LICENSE](LICENSE) file.
73272

74273
---
75274

76-
*Happy masking and safe routing!*
275+
*Happy masking and safe routing!* 🎉
