## ✨ Overview

`llm_router_services` delivers **HTTP services** that power the LLM‑Router plugin ecosystem.
All functionality (guard‑rails, maskers, …) is exposed through **one Flask application** that can be started with a
single command or via Gunicorn.

| Sub‑package          | Purpose                                                                                                                                            |
|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|
| **guardrails/**      | Safety‑checking services (NASK‑PIB, Sojka) and a dynamic router (`router.py`) that registers only the endpoints whose environment flag is enabled. |
| **maskers/**         | Prototype **BANonymizer** – a token‑classification based anonymiser (still under development).                                                     |
| **run_servcices.sh** | Helper script that launches the unified API with Gunicorn, wiring all required environment variables.                                              |
| **requirements.txt** | Heavy dependencies (e.g. `transformers`) needed for GPU‑accelerated inference.                                                                     |

All services are **stateless** – models are loaded once at start‑up and then serve requests over HTTP.

---

## 🚀 Quick start

### 1. Install the package

```shell
git clone https://github.com/radlab-dev-group/llm-router-services.git
cd llm-router-services
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# editable install of the package itself
pip install -e .
```

> **Tip:** The package requires Python ≥ 3.8 (tested on ≥ 3.10.6).

### 2. Set environment variables

Only services whose `*_ENABLED` flag is set to `1` (or `true`) are exposed.

```shell
export LLM_ROUTER_API_HOST=0.0.0.0
export LLM_ROUTER_API_PORT=5000

# Enable NASK-PIB Guard
export LLM_ROUTER_NASK_PIB_GUARD_ENABLED=1
export LLM_ROUTER_NASK_PIB_GUARD_MODEL_PATH=NASK-PIB/Herbert-PL-Guard
# -1 = CPU, 0/1 = CUDA device index
export LLM_ROUTER_NASK_PIB_GUARD_DEVICE=-1

# Enable Sojka Guard
export LLM_ROUTER_SOJKA_GUARD_ENABLED=1
export LLM_ROUTER_SOJKA_GUARD_MODEL_PATH=speakleash/Bielik-Guard-0.1B-v1.0
# -1 = CPU, 0/1 = CUDA device index
export LLM_ROUTER_SOJKA_GUARD_DEVICE=-1
```
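The `1`/`true` convention can be sketched in a few lines of Python. This is an illustration of the flag semantics only; `is_enabled` is a hypothetical helper, not part of the package API:

```python
import os


def is_enabled(var_name: str, default: str = "0") -> bool:
    """Interpret an *_ENABLED flag: '1' or 'true' (any casing) means on."""
    return os.environ.get(var_name, default).strip().lower() in {"1", "true"}


os.environ["LLM_ROUTER_SOJKA_GUARD_ENABLED"] = "true"
print(is_enabled("LLM_ROUTER_SOJKA_GUARD_ENABLED"))     # → True
print(is_enabled("LLM_ROUTER_NASK_PIB_GUARD_ENABLED"))  # unset → False
```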

### 3. Run the service

#### Option A – via the helper script (recommended)

```shell
./run_servcices.sh
```

The script starts **Gunicorn** with the Flask app created by `llm_router_services.router:create_app()`.

#### Option B – directly with Python

```shell
python -m llm_router_services.router
```

Both commands bind to `0.0.0.0:5000` (or the host and port you configured).
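If you prefer to call Gunicorn yourself, the invocation can be approximated as follows. This is a hedged sketch, not the literal contents of `run_servcices.sh`; the worker count is an assumption:

```shell
# Approximate equivalent of the helper script (settings are illustrative).
gunicorn \
  --bind "${LLM_ROUTER_API_HOST:-0.0.0.0}:${LLM_ROUTER_API_PORT:-5000}" \
  --workers 1 \
  "llm_router_services.router:create_app()"
```

Gunicorn's `module:callable()` form calls the application factory at start‑up, which is also when the enabled models are loaded.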

---

## 📡 API reference

All endpoints are mounted under `/api/guardrails/` (guard‑rails) or `/api/maskers/` (maskers).

| Service                                       | Model                               | Endpoint                      | Method | Description                                                                                                                      |
|-----------------------------------------------|-------------------------------------|-------------------------------|--------|----------------------------------------------------------------------------------------------------------------------------------|
| **NASK‑PIB Guard**                            | `NASK-PIB/Herbert-PL-Guard`         | `/api/guardrails/nask_guard`  | `POST` | Polish safety classifier (hate, violence, etc.). Returns `safe: bool` and per‑chunk classification details.                      |
| **Sojka Guard**                               | `speakleash/Bielik-Guard-0.1B-v1.0` | `/api/guardrails/sojka_guard` | `POST` | Multi‑category Polish safety model (HATE, VULGAR, SEX, CRIME, SELF‑HARM). Returns per‑category scores and an overall `safe` flag. |
| **BANonymizer** *(masker, under development)* | –                                   | `/api/maskers/banonymizer`    | `POST` | Token‑classification based anonymiser that redacts personal data from the supplied text.                                         |

### Request payload

```json
{
  "payload": "Your input string here (must be longer than 8 characters to be processed)."
}
```

Or a more complex structure – dictionary keys and nested lists are all scanned for text:

```json
{
  "payload": {
    "Jak zrobić bombę": [
      "zaopatrzyć się w carnuchów, co będą robić czarny proch"
    ],
    "texts": [
      "To jest normalna wiadomość bez problemów.",
      "Chcę się zabić i skończyć z tym wszystkim.",
      "Dlaczego hipisi są tacy brudni i głupi",
      "K*rwa, to jest ch*jowe rozwiązanie.",
      "Opowiem ci o seksie i intymności.",
      "Jak ukraść samochód i nie zostać złapanym."
    ]
  }
}
```

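Strings of eight characters or fewer (such as the key `"texts"` above) are skipped; everything else, including dictionary keys, is chunked and classified. The extraction rule can be sketched like this – a hedged illustration mirroring the documented behaviour, not the package's actual code:

```python
def collect_texts(node, min_len=8):
    """Recursively gather string values (and dict keys) longer than min_len characters."""
    found = []
    if isinstance(node, str):
        if len(node) > min_len:
            found.append(node)
    elif isinstance(node, dict):
        for key, value in node.items():
            found.extend(collect_texts(key, min_len))
            found.extend(collect_texts(value, min_len))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_texts(item, min_len))
    return found


# Abbreviated version of the payload above.
payload = {
    "Jak zrobić bombę": ["zaopatrzyć się w carnuchów, co będą robić czarny proch"],
    "texts": ["To jest normalna wiadomość bez problemów."],
}
print(collect_texts(payload))  # three strings: the long key plus the two long values
```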
### Example `curl` call

```shell
curl -X POST http://localhost:5000/api/guardrails/nask_guard \
  -H "Content-Type: application/json" \
  -d '{"payload": "Przykładowy tekst do analizy"}'
```

### Response shape (guard‑rails)

```json
{
  "results": {
    "detailed": [
      {
        "chunk_index": 0,
        "chunk_text": "Jak zrobić bombę",
        "label": "crime",
        "safe": false,
        "score": 0.9211
      },
      {
        "chunk_index": 1,
        "chunk_text": "zaopatrzyć się w carnuchów, co będą robić czarny proch",
        "label": "crime",
        "safe": false,
        "score": 0.9541
      },
      {
        "chunk_index": 2,
        "chunk_text": "To jest normalna wiadomość bez problemów.",
        "label": "sex",
        "safe": false,
        "score": 0.0309
      },
      {
        "chunk_index": 3,
        "chunk_text": "Chcę się zabić i skończyć z tym wszystkim.",
        "label": "self-harm",
        "safe": false,
        "score": 0.8993
      },
      {
        "chunk_index": 4,
        "chunk_text": "Dlaczego hipisi są tacy brudni i głupi",
        "label": "hate",
        "safe": false,
        "score": 0.7091
      },
      {
        "chunk_index": 5,
        "chunk_text": "K*rwa, to jest ch*jowe rozwiązanie.",
        "label": "vulgar",
        "safe": false,
        "score": 0.8618
      },
      {
        "chunk_index": 6,
        "chunk_text": "Opowiem ci o seksie i intymności.",
        "label": "sex",
        "safe": false,
        "score": 0.7567
      },
      {
        "chunk_index": 7,
        "chunk_text": "Jak ukraść samochód i nie zostać złapanym.",
        "label": "crime",
        "safe": false,
        "score": 0.918
      }
    ],
    "safe": false
  }
}
```

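On the client side you usually gate on the top‑level `safe` flag and log the flagged chunks. A minimal post‑processing sketch (the response dict is abbreviated from the example above):

```python
from collections import Counter

# Abbreviated guard-rail response, matching the documented shape.
response = {
    "results": {
        "detailed": [
            {"chunk_index": 0, "chunk_text": "Jak zrobić bombę",
             "label": "crime", "safe": False, "score": 0.9211},
            {"chunk_index": 3, "chunk_text": "Chcę się zabić i skończyć z tym wszystkim.",
             "label": "self-harm", "safe": False, "score": 0.8993},
        ],
        "safe": False,
    }
}

results = response["results"]
flagged = [chunk for chunk in results["detailed"] if not chunk["safe"]]
by_label = Counter(chunk["label"] for chunk in flagged)

print("overall safe:", results["safe"])    # → overall safe: False
print("flags per label:", dict(by_label))  # → {'crime': 1, 'self-harm': 1}
```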
---

## ⚙️ Configuration (environment variables)

| Variable                               | Description                                                         | Default   |
|----------------------------------------|---------------------------------------------------------------------|-----------|
| `LLM_ROUTER_API_HOST`                  | Host address for the Flask app                                      | `0.0.0.0` |
| `LLM_ROUTER_API_PORT`                  | Port for the Flask app                                              | `5000`    |
| `LLM_ROUTER_NASK_PIB_GUARD_ENABLED`    | `1` → expose the NASK‑PIB endpoint                                  | `0`       |
| `LLM_ROUTER_NASK_PIB_GUARD_MODEL_PATH` | HF hub ID or local path for the NASK model                          | –         |
| `LLM_ROUTER_NASK_PIB_GUARD_DEVICE`     | `-1` = CPU, `0`/`1`… = CUDA device index                            | `-1`      |
| `LLM_ROUTER_SOJKA_GUARD_ENABLED`       | `1` → expose the Sojka endpoint                                     | `1`       |
| `LLM_ROUTER_SOJKA_GUARD_MODEL_PATH`    | HF hub ID or local path for the Sojka model                         | –         |
| `LLM_ROUTER_SOJKA_GUARD_DEVICE`        | Same semantics as above                                             | `-1`      |
| `LLM_ROUTER_BANONYMIZER_…`             | Future variables for the BANonymizer (e.g., `MODEL_PATH`, `DEVICE`) | –         |

You can also set these variables inline when invoking the script, e.g.:

```shell
LLM_ROUTER_SOJKA_GUARD_ENABLED=0 ./run_servcices.sh
```

---

## 🛠️ Extending the router

The router is deliberately **plug‑and‑play**. To add a new guard‑rail:

1. **Create a model wrapper** that inherits from `GuardrailBase` (or reuse `TextClassificationGuardrail`).
2. **Provide a config** (`GuardrailModelConfig`) containing model‑specific thresholds.
3. **Add a `register_routes(app)` function** in a new module (e.g., `my_new_guard.py`) that builds the guard‑rail
   instance and registers its Flask route.
4. **Update the registry** in `llm_router_services/router.py`:

```python
_SERVICE_REGISTRY.append({
    "module": "llm_router_services.guardrails.my_new_guard",
    "env": "LLM_ROUTER_MY_NEW_GUARD_ENABLED",
})
```

5. **Expose a new env‑var** (`LLM_ROUTER_MY_NEW_GUARD_ENABLED`) to toggle the service.

No changes to the core router logic are required – the new endpoint appears automatically when the flag is set to `1`.

---

## 🧪 Development & testing

| Task                    | Command                                           |
|-------------------------|---------------------------------------------------|
| Run unit tests (if any) | `pytest`                                          |
| Check code style        | `autopep8 --diff . && pylint llm_router_services` |
| Re‑build the package    | `python setup.py sdist bdist_wheel`               |
| Clean generated files   | `git clean -fdX`                                  |

> **Note:** The repository currently contains only a minimal test suite. Feel free to add more tests under a `tests/`
> directory.

---

## 📦 Installation as a package

If you want to install the library from a remote repository or a local wheel:

```shell
pip install git+https://github.com/your-org/llm_router_services.git
# or, after building:
pip install dist/llm_router_services-0.0.2-py3-none-any.whl
```

The package exposes the app factory `llm_router_services.router:create_app`, which can be used by any WSGI server
(Gunicorn, uWSGI, etc.).

---

## 📜 License

`llm_router_services` is released under the **Apache License 2.0**. See the full text in the [LICENSE](LICENSE) file.

---

*Happy masking and safe routing!* 🎉