The OpenClaw2Go model registry is an open collection of model configurations for running AI models on GPU pods. Community contributions help expand the model catalog for everyone.
To contribute a model via an issue:

- Run your model on an OpenClaw2Go pod
- Export your config: `openclaw2go registry export --format issue`
- Open a New Model Issue
- Paste the exported config and test evidence
- A maintainer will review and merge your contribution
To contribute via a pull request:

- Fork this repository
- Create a new JSON file in `models/` (use an existing file as reference)
- Run validation: `python3 scripts/validate.py`
- Submit a Pull Request
CI will automatically validate your JSON and check that the HuggingFace repo exists.
Each model is a JSON file in `models/` with these fields (an illustrative example follows the table):
| Field | Required | Description |
|---|---|---|
| `id` | Yes | Unique ID in `provider/name` format (lowercase) |
| `name` | Yes | Human-readable model name |
| `type` | Yes | `llm`, `audio`, or `image` |
| `engine` | Yes | `llamacpp`, `llamacpp-audio`, `image-gen`, or `vllm` |
| `repo` | Yes | HuggingFace repository name |
| `files` | Yes | Array of files to download from the repo |
| `downloadDir` | Yes | Must start with `/workspace/models/` |
| `servedAs` | Yes (LLM) | Model name exposed via the API |
| `vram` | Yes | Object with `model` (MB) and `overhead` (MB) fields |
| `kvCacheMbPer1kTokens` | Recommended | KV cache VRAM per 1k tokens (with q8_0) |
| `defaults` | Recommended | Default `contextLength` and `port` |
| `startDefaults` | Optional | Default values such as `gpuLayers` and `parallel` |
| `extraStartArgs` | Optional | Additional CLI args for the engine |
| `provider` | Yes (LLM) | Provider config with `name` and `api` |
| `default` | Yes | Whether this is the default model for its type (usually `false`) |
| `status` | Yes | `stable`, `experimental`, or `deprecated` |
| `verifiedOn` | Optional | Array of GPU names the model has been verified on |
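
For reference, a complete entry might look like the sketch below. Every value here is an illustrative placeholder: the model, repo, file name, VRAM numbers, and the `provider` values are assumptions, not a real registry entry. Use an existing file in `models/` as the authoritative reference.

```json
{
  "id": "example/demo-7b-q4",
  "name": "Demo 7B (Q4_K_M)",
  "type": "llm",
  "engine": "llamacpp",
  "repo": "example-org/demo-7b-gguf",
  "files": ["demo-7b-q4_k_m.gguf"],
  "downloadDir": "/workspace/models/demo-7b",
  "servedAs": "demo-7b",
  "vram": { "model": 4500, "overhead": 800 },
  "kvCacheMbPer1kTokens": 60,
  "defaults": { "contextLength": 16384, "port": 8080 },
  "provider": { "name": "example", "api": "openai" },
  "default": false,
  "status": "experimental",
  "verifiedOn": ["RTX 4090"]
}
```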
VRAM values should be measured, not guessed:

- Start the model on a pod
- Run `nvidia-smi` and note the VRAM usage
- Set `vram.model` to the model weight VRAM (approximate)
- Set `vram.overhead` to the remaining VRAM minus KV cache (see the example below)
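
As a hypothetical illustration (placeholder numbers): if the pod shows roughly 5,300 MB in use with a short context (so KV cache is negligible) and the weights account for about 4,500 MB, the resulting entry would be:

```json
{
  "vram": { "model": 4500, "overhead": 800 }
}
```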
For LLM models, measure `kvCacheMbPer1kTokens`:

- Run the model with a known context length (e.g., 150k)
- Note the total VRAM used
- Calculate: `(total_vram - model_vram - overhead) / (context_length / 1000)`

This value should reflect q8_0 KV quantization (the entrypoint uses `-ctk q8_0 -ctv q8_0`). A worked example follows.
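
For example, with hypothetical numbers: weights at 16,000 MB, overhead at 1,000 MB, and 24,500 MB of total VRAM observed at a 150k context give `(24500 - 16000 - 1000) / 150 = 50`, so the entry would be:

```json
{
  "kvCacheMbPer1kTokens": 50
}
```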
Before submitting, validate your config:
```bash
python3 scripts/validate.py
python3 scripts/validate.py --check-hf  # Also verify HF repos exist
```

For security, the following constraints apply:

- `downloadDir` must start with `/workspace/models/` (path restriction; see the example below)
- `engine` must be one of the known engines (engine whitelist)
- `extraStartArgs` are passed as CLI args to known binaries only (no code execution)
- All merges require maintainer review
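
As a hypothetical illustration of the path restriction, an entry with a download directory outside `/workspace/models/` (the path below is a made-up example) would be rejected by validation:

```json
{
  "downloadDir": "/tmp/models/demo-7b"
}
```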