Commit a2a0f2c

Phase 1.5: add soup chat, soup push, DPO trainer + smoke tests
- soup chat --model ./path: interactive terminal chat with LoRA adapters (auto-detects base model, supports /quit /clear /system commands)
- soup push --model ./path --repo user/model: upload to HuggingFace Hub (auto model card generation, token from env/cache/flag)
- DPO trainer: full DPOTrainerWrapper with LoRA + quantization support (configurable dpo_beta, preference data format {prompt, chosen, rejected})
- Smoke tests: real SFT + DPO training with tiny-gpt2 (pytest -m smoke)
- SFT trainer: fallback for models without chat_template
- Updated README, schema, formats, pyproject.toml, .gitignore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent d167cd4 commit a2a0f2c
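
For reviewers unfamiliar with the `dpo_beta` setting named in the commit message, here is a minimal sketch of the standard per-pair DPO loss, assuming the sequence log-probabilities have already been computed. The function `dpo_pair_loss` and its argument names are hypothetical and are not taken from `DPOTrainerWrapper`, whose implementation is not shown in this diff.

```python
import math


def dpo_pair_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    Hypothetical helper for illustration. Inputs are summed token
    log-probabilities of each response under the policy being trained
    and under the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), computed in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy and the reference assign identical scores, the loss equals log 2; it drops below that as the policy favors the chosen response more than the reference does. Here `beta` scales that margin, which is the role the `dpo_beta: 0.1` value plays in the README example.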

16 files changed, with 1,083 additions and 13 deletions.


.claude/settings.json

Lines changed: 4 additions & 1 deletion
@@ -13,12 +13,15 @@
       "Bash(git remote*)",
       "Bash(ruff check*)",
       "Bash(ruff format*)",
+      "Bash(python -m ruff*)",
       "Bash(pytest*)",
       "Bash(python -m pytest*)",
       "Bash(pip install*)",
       "Bash(pip list*)",
       "Bash(soup *)",
-      "Bash(python -m soup_cli*)"
+      "Bash(python -m soup_cli*)",
+      "Bash(cd /c/Users/tokmo/peder/Soup && python -m pytest*)",
+      "Bash(cd /c/Users/tokmo/peder/Soup && ruff check*)"
     ]
   }
 }

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -34,3 +34,6 @@ wandb/
 # Secrets
 .env
 *.key
+
+# Internal plan (not for repo)
+plan.md

README.md

Lines changed: 58 additions & 2 deletions
@@ -86,6 +86,50 @@ training:
 output: ./output
 ```
 
+## DPO Training
+
+Train with preference data using Direct Preference Optimization:
+
+```yaml
+base: meta-llama/Llama-3.1-8B-Instruct
+task: dpo
+
+data:
+  train: ./data/preferences.jsonl
+  format: dpo
+
+training:
+  epochs: 3
+  dpo_beta: 0.1
+  lora:
+    r: 64
+    alpha: 16
+  quantization: 4bit
+```
+
+## Chat with your model
+
+```bash
+# Chat with a LoRA adapter (auto-detects base model)
+soup chat --model ./output
+
+# Specify base model explicitly
+soup chat --model ./output --base meta-llama/Llama-3.1-8B-Instruct
+
+# Adjust generation
+soup chat --model ./output --temperature 0.3 --max-tokens 256
+```
+
+## Push to HuggingFace
+
+```bash
+# Upload model to HF Hub
+soup push --model ./output --repo your-username/my-model
+
+# Make it private
+soup push --model ./output --repo your-username/my-model --private
+```
+
 ## Data Formats
 
 Soup supports these formats (auto-detected):
@@ -105,6 +149,11 @@ Soup supports these formats (auto-detected):
 {"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]}
 ```
 
+**DPO (preference pairs):**
+```json
+{"prompt": "Explain gravity", "chosen": "Gravity is a force...", "rejected": "I don't know"}
+```
+
 ## Data Tools
 
 ```bash
@@ -121,12 +170,14 @@ soup data validate ./data/train.jsonl --format alpaca
 |---|---|
 | LoRA / QLoRA fine-tuning | ✅ |
 | SFT (Supervised Fine-Tune) | ✅ |
-| DPO (Direct Preference Optimization) | 🔜 |
+| DPO (Direct Preference Optimization) | ✅ |
 | Auto batch size | ✅ |
 | Auto GPU detection (CUDA/MPS/CPU) | ✅ |
 | Live terminal dashboard | ✅ |
-| Alpaca / ShareGPT / ChatML formats | ✅ |
+| Alpaca / ShareGPT / ChatML / DPO formats | ✅ |
 | HuggingFace datasets support | ✅ |
+| Interactive model chat | ✅ |
+| Push to HuggingFace Hub | ✅ |
 | Experiment tracking | 🔜 |
 | Web dashboard | 🔜 |
 | Cloud mode (BYOG) | 🔜 |
@@ -143,7 +194,12 @@ soup data validate ./data/train.jsonl --format alpaca
 git clone https://github.com/MakazhanAlpamys/Soup.git
 cd Soup
 pip install -e ".[dev]"
+
+# Run unit tests (fast, no GPU needed)
 pytest tests/ -v
+
+# Run smoke tests (downloads tiny model, runs real training)
+pytest tests/ -m smoke -v
 ```
 
 ## License
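
The preference format added to the README above can be checked before a training run. A minimal sketch, assuming one JSON object per line with non-empty string values for `prompt`, `chosen`, and `rejected`; `validate_dpo_line` is a hypothetical helper, not part of `soup_cli`, and the real `soup data validate` command may apply different rules.

```python
import json
from typing import List

REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def validate_dpo_line(line: str) -> List[str]:
    """Return the problems found in one JSONL line (empty list if valid)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    problems = []
    for key in REQUIRED_KEYS:
        value = record.get(key)
        # Assumption: every field must be a non-blank string.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {key}")
    return problems


example = '{"prompt": "Explain gravity", "chosen": "Gravity is a force...", "rejected": "I don\'t know"}'
print(validate_dpo_line(example))
```

Returning a list of problems rather than raising on the first one lets a caller report every bad field in a line at once.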

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -33,6 +33,7 @@ dependencies = [
     "datasets>=2.14.0",
     "bitsandbytes>=0.41.0",
     "accelerate>=0.25.0",
+    "huggingface-hub>=0.16.0",
 ]
 
 [project.optional-dependencies]
@@ -57,3 +58,5 @@ select = ["E", "F", "I", "N", "W"]
 
 [tool.pytest.ini_options]
 testpaths = ["tests"]
+markers = ["smoke: slow smoke tests that download models and run training (run with: pytest -m smoke)"]
+addopts = "-m 'not smoke'"
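
With the `smoke` marker registered and `addopts = "-m 'not smoke'"`, a plain `pytest` run deselects marked tests, while `pytest -m smoke` opts back in because a `-m` given on the command line is parsed after the one from `addopts` and replaces it. A minimal sketch of how a test might be tagged; the test name and body are placeholders, not the repository's actual smoke tests.

```python
import pytest


@pytest.mark.smoke
def test_tiny_training_run():
    """Placeholder body; a real smoke test would run a short training job."""
    steps_completed = 1  # stand-in for an actual training call
    assert steps_completed > 0
```

Untagged tests need no change: they keep running under the default `-m 'not smoke'` filter.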

soup_cli/cli.py

Lines changed: 3 additions & 1 deletion
@@ -4,7 +4,7 @@
 from rich.console import Console
 
 from soup_cli import __version__
-from soup_cli.commands import data, init, train
+from soup_cli.commands import chat, data, init, push, train
 
 console = Console()
 
@@ -18,6 +18,8 @@
 # Register sub-commands
 app.command()(init.init)
 app.command()(train.train)
+app.command()(chat.chat)
+app.command()(push.push)
 app.add_typer(data.app, name="data", help="Dataset tools: inspect, convert, validate.")
 
 
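
`chat.chat` is registered above, but its body is not part of this excerpt. As a rough illustration of the `/quit`, `/clear`, and `/system` commands named in the commit message, here is a minimal sketch of slash-command handling. `ChatState` and `handle_input` are hypothetical names, and the exact semantics (for example, that `/clear` keeps the system prompt) are assumptions rather than the behavior of the actual command.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ChatState:
    """Hypothetical container for a terminal chat session."""

    system_prompt: str = ""
    history: List[Dict[str, str]] = field(default_factory=list)
    running: bool = True


def handle_input(state: ChatState, text: str) -> bool:
    """Apply a slash command; return True if the text was a command.

    Returns False for ordinary messages, which the caller would append
    to the history and send to the model.
    """
    command, _, argument = text.strip().partition(" ")
    if command == "/quit":
        state.running = False
        return True
    if command == "/clear":
        state.history.clear()  # assumption: the system prompt is kept
        return True
    if command == "/system":
        # Everything after the command name becomes the new system prompt.
        state.system_prompt = argument.strip()
        return True
    return False
```

Keeping command handling separate from the read loop makes it testable without a model or a terminal.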
