Commit 65a9510

AbdullahKIRMAN and sukrukirman authored

docs: enhance README with installation instructions, usage examples, and troubleshooting tips (#7)
Co-authored-by: sukrukirman <[email protected]>

1 parent dda4162 commit 65a9510

2 files changed (+271, -83 lines)

README.md

Lines changed: 169 additions & 83 deletions

[![Moderators CI](https://github.com/viddexa/moderators/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/viddexa/moderators/actions/workflows/ci.yml)
[![Moderators License](https://img.shields.io/pypi/l/moderators)](https://github.com/viddexa/moderators/blob/main/LICENSE)

Run open‑source content moderation models (NSFW, toxicity, etc.) with one line from Python or the CLI. Works with Hugging Face models or local folders. Outputs are normalized and app‑ready.

- One simple API and CLI
- Use any compatible Transformers model from the Hub or disk
- Normalized JSON output you can plug into your app
- Optional auto‑install of dependencies for a smooth first run

Note: today we ship a Transformers-based integration for image and text classification.

## Who is this for?
Developers and researchers who want to quickly evaluate or deploy moderation models without wiring up different runtimes or dealing with model‑specific output formats.

## Installation
Pick one option:

Using pip (recommended):
```bash
pip install moderators
```

Using uv:
```bash
uv venv --python 3.10
source .venv/bin/activate
uv add moderators
```

From source (cloned repo):
```bash
uv sync --extra transformers
```

Requirements:
- Python 3.10+
- For image tasks, Pillow and a DL framework (PyTorch preferred). Moderators can auto‑install these.

## Quickstart
Run a model in a few lines.

Python API:
```python
from moderators.auto_model import AutoModerator

# Load from the Hugging Face Hub (e.g., NSFW image classifier)
moderator = AutoModerator.from_pretrained("viddexa/nsfw-mini")

# Run on a local image path
result = moderator("/path/to/image.jpg")
print(result)
```

CLI:
```bash
moderators viddexa/nsfw-mini /path/to/image.jpg
```

Text example (sentiment/toxicity):
```bash
moderators distilbert/distilbert-base-uncased-finetuned-sst-2-english "I love this!"
```

## What do results look like?
You get a list of normalized prediction entries. In Python, they’re dataclasses; in the CLI, you get JSON.

Python shape (pretty-printed):
```text
[
  PredictionResult(
    source_path='',
    classifications={'NSFW': 0.9821},
    detections=[],
    raw_output={'label': 'NSFW', 'score': 0.9821}
  ),
  ...
]
```

JSON shape (CLI output):
```json
[
  {
    "source_path": "",
    "classifications": {"NSFW": 0.9821},
    "detections": [],
    "raw_output": {"label": "NSFW", "score": 0.9821}
  }
]
```

Tip (Python):
```python
from dataclasses import asdict
from moderators.auto_model import AutoModerator

moderator = AutoModerator.from_pretrained("viddexa/nsfw-mini")
result = moderator("/path/to/image.jpg")
json_ready = [asdict(r) for r in result]
print(json_ready)
```
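
If you need a JSON string (for an API response, say), the standard `json` module finishes the job. A minimal follow-up to the tip above, assuming `json_ready` from that snippet:
```python
import json

# json_ready holds plain dicts of labels and float scores, so it serializes directly.
print(json.dumps(json_ready, indent=2))
```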

## Example: Real output on a sample image
Image source:

![Example input image](https://img.freepik.com/free-photo/front-view-woman-doing-exercises_23-2148498678.jpg?t=st=1760435237~exp=1760438837~hmac=9a0a0a56f83d8fa52f424c7acdf4174dffc3e4d542e189398981a13af3f82b40&w=360)

Raw model scores:
```json
[
  { "normal": 0.9999891519546509 },
  { "nsfw": 0.000010843970812857151 }
]
```

Moderators normalized JSON shape:
```json
[
  { "source_path": "", "classifications": {"normal": 0.9999891519546509}, "detections": [], "raw_output": {"label": "normal", "score": 0.9999891519546509} },
  { "source_path": "", "classifications": {"nsfw": 0.000010843970812857151}, "detections": [], "raw_output": {"label": "nsfw", "score": 0.000010843970812857151} }
]
```

## Comparison at a glance
The table below places Moderators next to raw Transformers `pipeline()` usage.

| Feature | Transformers `pipeline()` | Moderators |
|---|---|---|
| Usage | `pipeline("task", model=...)` | `AutoModerator.from_pretrained(...)` |
| Model configuration | Manual or model-specific | Automatic via `config.json` (task inference when possible) |
| Output format | Varies by model/pipeline | Standardized `PredictionResult` / JSON |
| Requirements | Manual dependency setup | Optional automatic `pip`/`uv` install |
| CLI | None or project-specific | Built-in `moderators` CLI (JSON to stdout) |
| Extensibility | Mostly one ecosystem | Open to new integrations (same interface) |
| Error messages | Vary by model | Consistent, task/integration-guided |
| Task detection | User-provided | Auto-inferred from config when possible |
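
As a rough sketch of the first two rows in practice (the `Falconsai/nsfw_image_detection` model id is purely illustrative; any compatible classifier works the same way):
```python
# Raw Transformers: you choose the task and parse model-specific output yourself.
from transformers import pipeline

clf = pipeline("image-classification", model="Falconsai/nsfw_image_detection")
raw = clf("/path/to/image.jpg")  # e.g. [{'label': 'normal', 'score': 0.99...}, ...]

# Moderators: the task is read from config.json and the output is normalized.
from moderators.auto_model import AutoModerator

moderator = AutoModerator.from_pretrained("Falconsai/nsfw_image_detection")
result = moderator("/path/to/image.jpg")  # list of PredictionResult entries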

## Pick a model
- From the Hub: pass a model id like `viddexa/nsfw-mini` or any compatible Transformers model.
- From disk: pass a local folder that contains a `config.json` next to your weights.

Moderators detects the task and integration from the config when possible, so you don’t have to specify pipelines manually.
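
For a local folder, a minimal `config.json` for the Transformers integration has two fields (`task` is only needed when it cannot be inferred from the rest of the config):
```json
{
  "architecture": "TransformersModerator",
  "task": "image-classification"
}
```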

## Command line usage
Run models from your terminal and get normalized JSON to stdout.

Usage:
```bash
moderators <model_id_or_local_dir> <input> [--local-files-only]
```

Examples:
- Text classification:
  ```bash
  moderators distilbert/distilbert-base-uncased-finetuned-sst-2-english "I love this!"
  ```
- Image classification (local image):
  ```bash
  moderators viddexa/nsfw-mini /path/to/image.jpg
  ```

Tips:
- `--local-files-only` forces offline usage if files are cached.
- The CLI prints a single JSON array (easy to pipe or parse); a parsing sketch follows below.
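
A minimal parsing sketch in Python, assuming the `moderators` CLI is on your PATH (the model id and image path are illustrative):
```python
import json
import subprocess

# Run the CLI and capture the single JSON array it prints to stdout.
proc = subprocess.run(
    ["moderators", "viddexa/nsfw-mini", "/path/to/image.jpg"],
    capture_output=True,
    text=True,
    check=True,
)
for entry in json.loads(proc.stdout):
    print(entry["classifications"])
```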

## Examples
- Small demos and a benchmarking script: `examples/README.md`, `examples/benchmarks.py`

## FAQ
- Which tasks are supported?
  - Image and text classification via Transformers (e.g., NSFW, sentiment/toxicity). More can be added over time.
- Does it need a GPU?
  - No. CPU is fine for small models. If your framework is installed with CUDA support, it will use the GPU.
- How are dependencies handled?
  - If something is missing (e.g., `torch`, `transformers`, `Pillow`), Moderators can auto‑install it via `uv` or `pip` unless you disable that behavior:
    ```bash
    export MODERATORS_DISABLE_AUTO_INSTALL=1
    ```
- Can I run offline?
  - Yes. Use `--local-files-only` in the CLI or `local_files_only=True` in Python once the model is cached (see the sketch after this list).
- What does “normalized output” mean?
  - Regardless of the underlying pipeline, you always get the same result schema (classifications/detections/raw_output), so your app code stays simple.
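
A minimal offline sketch, assuming the model is already cached from a previous run:
```python
from moderators.auto_model import AutoModerator

# Load strictly from the local cache; raises if required files are missing.
moderator = AutoModerator.from_pretrained("viddexa/nsfw-mini", local_files_only=True)
print(moderator("/path/to/image.jpg"))
```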

## Roadmap
What’s planned:
- Ultralytics integration (YOLO family) via `UltralyticsModerator`
- Optional ONNX Runtime backend where applicable
- Simple backend switch (API/CLI flag, e.g., `--backend onnx|torch`)
- Expanded benchmarks: latency, throughput, memory on common tasks
- Documentation and examples to help you pick the right option

## Troubleshooting
- ImportError (PIL/torch/transformers):
  - Install the package (`pip install moderators`) or let auto‑install run (ensure `MODERATORS_DISABLE_AUTO_INSTALL` is unset). If you prefer manual dependency control, install the extras: `pip install "moderators[transformers]"`.
- OSError: couldn’t find `config.json` / model files:
  - Check your model id or local folder path; ensure `config.json` is present.
- HTTP errors when pulling from the Hub:
  - Verify connectivity and auth (for private models). Use offline mode if the files are already cached.
- GPU not used:
  - Ensure your framework is installed with CUDA support; a quick check is shown below.
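
A quick check, assuming PyTorch is your framework:
```python
import torch

# True only if a CUDA-enabled PyTorch build can see a usable GPU.
print(torch.cuda.is_available())
```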

## License
Apache-2.0. See `LICENSE`.