Commit e2e08c3

darrenangle and claude committed
Add SFT data generation pipeline for poetry training
Implements multi-agent orchestrator supporting both Claude Code agents and OpenRouter models (Kimi K2, DeepSeek) for generating verified poetry with dual reasoning traces (SYNTH stenographic + natural language).

Key components:
- src/abide/synth_trace.py: SYNTH trace synthesis using Baguettotron notation
- scripts/openrouter_generator.py: OpenRouter API integration with abide verification
- scripts/sft_orchestrator.py: Parallel agent coordination with configurable mix
- .claude/commands/generate-sft.md: Slash command interface
- config/sft_generation.yaml: Default configuration for top 10 learnable forms

Also adds GRPO training infrastructure for Baguettotron with variance-based form selection and DAPO loss to prevent entropy collapse.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
1 parent 13b5ecb commit e2e08c3

11 files changed, +4364 -4 lines changed

.beads/issues.jsonl

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,7 @@
5252
{"id":"abide-4ei","title":"Add Tritina (3x3-line stanzas + 1-line envoi)","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-07T19:03:41.484179717-06:00","updated_at":"2025-12-08T11:38:10.128485781-06:00","closed_at":"2025-12-08T11:38:10.128485781-06:00"}
5353
{"id":"abide-4ie","title":"Blues Poem form implementation","description":"AAa tercets with L1 repeated/varied as L2, L3 rhymes","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-07T14:29:20.833270567-06:00","updated_at":"2025-12-07T14:44:48.770427318-06:00","closed_at":"2025-12-07T14:44:48.770427318-06:00","dependencies":[{"issue_id":"abide-4ie","depends_on_id":"abide-rgl","type":"blocks","created_at":"2025-12-07T14:29:36.510251614-06:00","created_by":"daemon"}]}
5454
{"id":"abide-4k6","title":"Ghazal form implementation","description":"5-15 couplets, AA BA CA pattern with radif (refrain) and qafiya (rhyme)","status":"closed","priority":1,"issue_type":"task","created_at":"2025-12-07T14:29:16.027733238-06:00","updated_at":"2025-12-07T14:44:48.767802115-06:00","closed_at":"2025-12-07T14:44:48.767802115-06:00","dependencies":[{"issue_id":"abide-4k6","depends_on_id":"abide-rgl","type":"blocks","created_at":"2025-12-07T14:29:36.474694476-06:00","created_by":"daemon"}]}
55+
{"id":"abide-4l9w","title":"Build SFT orchestrator script","description":"Create scripts/sft_orchestrator.py that coordinates form agents, manages model mix configuration, and handles progress tracking with append-only JSONL output","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-29T16:43:15.709428915-06:00","updated_at":"2025-12-29T16:51:14.531810751-06:00","closed_at":"2025-12-29T16:51:14.531810751-06:00","dependencies":[{"issue_id":"abide-4l9w","depends_on_id":"abide-ww9u","type":"blocks","created_at":"2025-12-29T16:43:45.519612388-06:00","created_by":"darren"}]}
5556
{"id":"abide-4pq7","title":"Audit StaircasePoem verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:55:22.217891418-06:00","updated_at":"2025-12-11T11:08:01.630787869-06:00","closed_at":"2025-12-11T11:08:01.630787869-06:00"}
5657
{"id":"abide-4tf","title":"Audit ChantRoyal verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:45:53.232943326-06:00","updated_at":"2025-12-11T11:08:01.639200354-06:00","closed_at":"2025-12-11T11:08:01.639200354-06:00"}
5758
{"id":"abide-4x3i","title":"Update training script and start background training","description":"Apply optimal params to train_grpo.py and launch training","status":"open","priority":2,"issue_type":"task","created_at":"2025-12-23T20:52:41.086438031-06:00","updated_at":"2025-12-23T20:52:41.086438031-06:00","dependencies":[{"issue_id":"abide-4x3i","depends_on_id":"abide-zuf1","type":"blocks","created_at":"2025-12-23T20:53:13.411609274-06:00","created_by":"darren"}]}
@@ -329,6 +330,7 @@
329330
{"id":"abide-nu6n","title":"Audit scoring: PrecisionVerse","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:12:14.948554602-06:00","updated_at":"2025-12-23T07:20:18.47301495-06:00","closed_at":"2025-12-23T07:20:18.47301495-06:00"}
330331
{"id":"abide-nuyk","title":"Audit scoring: BurnsStanza","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:06:19.301492189-06:00","updated_at":"2025-12-23T07:23:41.581061048-06:00","closed_at":"2025-12-23T07:23:41.581061048-06:00"}
331332
{"id":"abide-o54","title":"Add Burns Stanza/Standard Habbie (AAABAB with short B)","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-07T19:04:17.055144771-06:00","updated_at":"2025-12-08T11:38:10.11266736-06:00","closed_at":"2025-12-08T11:38:10.11266736-06:00"}
333+
{"id":"abide-o7s9","title":"Implement OpenRouter generator with abide verification loop","description":"Create scripts/openrouter_generator.py that uses OpenRouter API for Kimi K2 and other models, with retry loop based on abide scores","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-29T16:42:57.629915288-06:00","updated_at":"2025-12-29T16:48:50.981580033-06:00","closed_at":"2025-12-29T16:48:50.981580033-06:00","dependencies":[{"issue_id":"abide-o7s9","depends_on_id":"abide-ww9u","type":"blocks","created_at":"2025-12-29T16:43:30.28092095-06:00","created_by":"darren"}]}
332334
{"id":"abide-ohs","title":"Add Mesostic constraint (middle letters spell word)","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-07T19:04:26.581522622-06:00","updated_at":"2025-12-08T11:38:10.10476702-06:00","closed_at":"2025-12-08T11:38:10.10476702-06:00"}
333335
{"id":"abide-ojyo","title":"Audit Quatrain verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:52:45.616301354-06:00","updated_at":"2025-12-11T11:02:03.353080894-06:00","closed_at":"2025-12-11T11:02:03.353080894-06:00"}
334336
{"id":"abide-otk","title":"Improve rentrement verification in Roundel","description":"\n## Current State\nRoundel verifies: LineCount(11), StanzaCount(3), RhymeScheme\nHas Refrain with threshold=0.5 for partial match but not included in constraints!\n\n## Issue\nThe _refrain is created but NOT added to the constraint list!\nLines 248-254 show Refrain is initialized but constraints list only has line_count and rhyme\n\n## Fix\nAdd self._refrain to the constraints list\n\n## Files\n- src/abide/forms/rondel.py:200-279\n","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-11T00:48:16.279305177-06:00","updated_at":"2025-12-11T00:55:16.467899729-06:00","closed_at":"2025-12-11T00:55:16.467899729-06:00"}
@@ -340,6 +342,7 @@
340342
{"id":"abide-p5ka","title":"Audit scoring: BlankVerse","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:05:59.015257782-06:00","updated_at":"2025-12-23T07:22:26.118064351-06:00","closed_at":"2025-12-23T07:22:26.118064351-06:00"}
341343
{"id":"abide-pa98","title":"Audit scoring: GoldenRatio","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:09:16.991904061-06:00","updated_at":"2025-12-23T07:25:43.120652676-06:00","closed_at":"2025-12-23T07:25:43.120652676-06:00"}
342344
{"id":"abide-pcj9","title":"Audit Quatina verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:52:40.540319609-06:00","updated_at":"2025-12-11T11:08:01.632623229-06:00","closed_at":"2025-12-11T11:08:01.632623229-06:00"}
345+
{"id":"abide-pldd","title":"Create /generate-sft slash command","description":"Create .claude/commands/generate-sft.md that orchestrates parallel haiku agents and OpenRouter workers to generate verified poetry with dual reasoning traces","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-29T16:43:09.731029344-06:00","updated_at":"2025-12-29T16:51:14.524834028-06:00","closed_at":"2025-12-29T16:51:14.524834028-06:00","dependencies":[{"issue_id":"abide-pldd","depends_on_id":"abide-ww9u","type":"blocks","created_at":"2025-12-29T16:43:40.447443821-06:00","created_by":"darren"}]}
343346
{"id":"abide-pms","title":"Audit BlankVerse verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:45:17.766887446-06:00","updated_at":"2025-12-11T11:05:19.343041578-06:00","closed_at":"2025-12-11T11:05:19.343041578-06:00"}
344347
{"id":"abide-pne","title":"Verify Kyrielle refrain implementation","description":"\n## Question\nKyrielle should have a refrain (last line of each stanza repeats)\nNeed to check if this is currently verified in the implementation\n\n## Investigation\n1. Read kyrielle.py to check current constraints\n2. If refrain not verified, add Refrain constraint\n3. Traditional kyrielle: 8-syllable lines, AABB or ABAB rhyme, refrain line ends each quatrain\n\n## Files\n- src/abide/forms/kyrielle.py\n","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T00:47:36.57084964-06:00","updated_at":"2025-12-11T00:56:19.809815567-06:00","closed_at":"2025-12-11T00:56:19.809815567-06:00"}
345348
{"id":"abide-psr","title":"Audit Distich verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:47:12.623018471-06:00","updated_at":"2025-12-11T11:02:03.358048374-06:00","closed_at":"2025-12-11T11:02:03.358048374-06:00"}
@@ -393,6 +396,7 @@
393396
{"id":"abide-vghk","title":"Audit Mesostic verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:50:23.02576068-06:00","updated_at":"2025-12-11T11:00:38.436874926-06:00","closed_at":"2025-12-11T11:00:38.436874926-06:00"}
394397
{"id":"abide-vhcz","title":"Audit MonotoneMountain verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:50:43.292307731-06:00","updated_at":"2025-12-11T11:06:27.973305658-06:00","closed_at":"2025-12-11T11:06:27.973305658-06:00"}
395398
{"id":"abide-vksc","title":"Audit scoring: TriangularVerse","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:15:58.712355106-06:00","updated_at":"2025-12-23T07:25:30.570550107-06:00","closed_at":"2025-12-23T07:25:30.570550107-06:00"}
399+
{"id":"abide-vl8r","title":"Implement SYNTH trace synthesizer","description":"Create abide/synth_trace.py that converts rubric results to SYNTH stenographic format with confidence markers, section headers, and tree structures","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-29T16:43:03.74349666-06:00","updated_at":"2025-12-29T16:48:50.988037073-06:00","closed_at":"2025-12-29T16:48:50.988037073-06:00","dependencies":[{"issue_id":"abide-vl8r","depends_on_id":"abide-ww9u","type":"blocks","created_at":"2025-12-29T16:43:35.356668139-06:00","created_by":"darren"}]}
396400
{"id":"abide-vnpi","title":"Audit scoring: ThunderVerse","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:15:48.568643636-06:00","updated_at":"2025-12-23T07:25:30.574649396-06:00","closed_at":"2025-12-23T07:25:30.574649396-06:00"}
397401
{"id":"abide-voyp","title":"Audit scoring: Rondeau","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:13:16.252145369-06:00","updated_at":"2025-12-23T07:26:16.531436275-06:00","closed_at":"2025-12-23T07:26:16.531436275-06:00"}
398402
{"id":"abide-vvou","title":"Audit scoring: FibonacciPoem","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:08:56.512053375-06:00","updated_at":"2025-12-23T07:22:37.129386224-06:00","closed_at":"2025-12-23T07:22:37.129386224-06:00"}
@@ -410,6 +414,7 @@
410414
{"id":"abide-wqx","title":"Add Anaphora constraint (repeated opening words)","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-07T19:04:26.633413969-06:00","updated_at":"2025-12-08T11:38:10.103710923-06:00","closed_at":"2025-12-08T11:38:10.103710923-06:00"}
411415
{"id":"abide-wr3","title":"Audit FibonacciVerse verification","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-11T10:48:42.715489289-06:00","updated_at":"2025-12-11T11:06:27.976076299-06:00","closed_at":"2025-12-11T11:06:27.976076299-06:00"}
412416
{"id":"abide-wsnb","title":"Audit scoring: EchoEnd","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:08:15.940000365-06:00","updated_at":"2025-12-23T07:26:20.0345503-06:00","closed_at":"2025-12-23T07:26:20.0345503-06:00"}
417+
{"id":"abide-ww9u","title":"Epic: SFT Data Generation Pipeline for Reasoning Model Warmup","description":"","status":"in_progress","priority":2,"issue_type":"epic","created_at":"2025-12-29T16:31:23.689218876-06:00","updated_at":"2025-12-29T16:51:19.599143358-06:00"}
413418
{"id":"abide-wz1i","title":"Audit scoring: LiteraryBallad","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:10:33.514302683-06:00","updated_at":"2025-12-23T07:23:41.580708254-06:00","closed_at":"2025-12-23T07:23:41.580708254-06:00"}
414419
{"id":"abide-x2dt","title":"Audit scoring: Ghazal","description":"Check verify() scoring - fix if too lenient","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T07:09:11.876972622-06:00","updated_at":"2025-12-23T07:22:07.927394877-06:00","closed_at":"2025-12-23T07:22:07.927394877-06:00"}
415420
{"id":"abide-x4r","title":"Add EndWordPattern to Tritina","description":"\n## Current State\nTritina only verifies: LineCount(10), StanzaCount(4), StanzaSizes([3,3,3,1])\n\n## Missing Verification \nEnd-word rotation pattern for 3 words: ABC -\u003e CAB -\u003e BCA\n- Stanza 1: A B C\n- Stanza 2: C A B \n- Stanza 3: B C A\n- Envoi: all 3 words\n\n## Implementation\nUse EndWordPattern constraint with num_words=3, rotation=[2,0,1]\nWeight: 3.0 (defining characteristic)\n\n## Files\n- src/abide/forms/tina.py:29-112\n- src/abide/constraints/relational.py (EndWordPattern)\n","status":"closed","priority":1,"issue_type":"task","created_at":"2025-12-11T00:46:31.470485088-06:00","updated_at":"2025-12-11T00:51:20.134226961-06:00","closed_at":"2025-12-11T00:51:20.134226961-06:00"}
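The four new task records above each carry a `dependencies` entry whose `depends_on_id` is the epic `abide-ww9u`. As a rough illustration of how that structure can be queried, here is a small Python sketch. The function name `dependents_of` and the trimmed `SAMPLE` records are hypothetical; only the field names (`id`, `dependencies`, `depends_on_id`) come from the diff.

```python
import json

# Trimmed copies of records from the diff; most fields are omitted for brevity.
SAMPLE = "\n".join([
    '{"id":"abide-4l9w","dependencies":[{"issue_id":"abide-4l9w","depends_on_id":"abide-ww9u","type":"blocks"}]}',
    '{"id":"abide-o7s9","dependencies":[{"issue_id":"abide-o7s9","depends_on_id":"abide-ww9u","type":"blocks"}]}',
    '{"id":"abide-pldd","dependencies":[{"issue_id":"abide-pldd","depends_on_id":"abide-ww9u","type":"blocks"}]}',
    '{"id":"abide-vl8r","dependencies":[{"issue_id":"abide-vl8r","depends_on_id":"abide-ww9u","type":"blocks"}]}',
    '{"id":"abide-ww9u","issue_type":"epic"}',
    '{"id":"abide-4pq7","status":"closed"}',
])


def dependents_of(jsonl_text, target_id):
    """Return the sorted ids of issues that list target_id in their dependencies."""
    found = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        issue = json.loads(line)
        # Records without a "dependencies" key simply have none.
        for dep in issue.get("dependencies", []):
            if dep.get("depends_on_id") == target_id:
                found.append(issue["id"])
                break
    return sorted(found)


if __name__ == "__main__":
    print(dependents_of(SAMPLE, "abide-ww9u"))
```

Reading the file line by line keeps the sketch compatible with an append-only JSONL file of any length.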

.claude/commands/generate-sft.md

Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
1+
# Generate SFT Training Data
2+
3+
Generate verified poetry SFT data with dual reasoning traces (SYNTH + natural).
4+
5+
## Arguments
6+
$ARGUMENTS
7+
8+
Parse the arguments to extract:
9+
- `--forms`: Comma-separated list of poetry forms (default: top 10 learnable)
10+
- `--num`: Number of examples per form (default: 100)
11+
- `--min-score`: Minimum abide score to accept (default: 0.8)
12+
- `--model-mix`: Fraction of examples generated by Claude agents, with the remainder going to OpenRouter (0-1, default: 0.5)
13+
- `--openrouter-model`: Model for OpenRouter (default: moonshotai/kimi-k2)
14+
- `--output`: Output JSONL file (default: data/sft_dataset.jsonl)
15+
- `--backend`: Force backend: openrouter, claude, or mixed
16+
17+
## Task
18+
19+
You are orchestrating SFT data generation for poetry forms. For each form:
20+
21+
1. **Spawn parallel haiku agents** (if using Claude backend):
22+
- Each agent generates poems for its assigned form
23+
- Agent verifies with abide until score >= min_score
24+
- Agent produces both SYNTH and natural reasoning traces
25+
26+
2. **Run OpenRouter workers** (if using OpenRouter backend):
27+
- Use the openrouter_generator.py script
28+
- Models: Kimi K2, DeepSeek, etc.
29+
- Same verification loop with abide
30+
31+
3. **Output format** (append-only JSONL):
32+
```json
33+
{
34+
"form": "Sonnet",
35+
"prompt": "Write a Sonnet about autumn in a melancholic tone",
36+
"synth_trace": "<think>\nSonnet requirements:\n├─ lines: 14 ●\n...",
37+
"natural_trace": "<think>\nI need to write a Sonnet about autumn...",
38+
"poem": "When autumn's gentle breath...",
39+
"score": 0.92,
40+
"rubric": [...],
41+
"model": "claude-haiku",
42+
"timestamp": "2024-12-29T22:30:00Z"
43+
}
44+
```
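The append-only JSONL output described in step 3 could be sketched in Python roughly as follows. This is a minimal illustration under assumptions, not the code in `scripts/sft_orchestrator.py`: the names `append_record`, `read_records`, `REQUIRED_KEYS`, and `demo` are hypothetical, and the required-key check is inferred from the example record only.

```python
import json
import os
import tempfile
from datetime import datetime, timezone

# Keys taken from the example record; treating all of them as required is an assumption.
REQUIRED_KEYS = {"form", "prompt", "synth_trace", "natural_trace",
                 "poem", "score", "rubric", "model", "timestamp"}


def append_record(path, record):
    """Append one example as a single JSON line without touching earlier lines."""
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise ValueError(f"record is missing keys: {sorted(missing)}")
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # Mode "a" only ever adds to the end of the file.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_records(path):
    """Load every non-blank line of the JSONL file as a dict."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def demo():
    """Write two records to a temporary file and read them back."""
    record = {
        "form": "Sonnet",
        "prompt": "Write a Sonnet about autumn in a melancholic tone",
        "synth_trace": "<think>\n...\n</think>",
        "natural_trace": "<think>\n...\n</think>",
        "poem": "When autumn's gentle breath...",
        "score": 0.92,
        "rubric": [],
        "model": "claude-haiku",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data", "sft_dataset.jsonl")
        append_record(path, record)
        append_record(path, {**record, "form": "Haiku"})
        return read_records(path)
```

One JSON object per line means a partially completed run still leaves a readable file, which is presumably why an append-only format was chosen.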
45+
46+
## Execution
47+
48+
Run the orchestrator with parsed arguments:
49+
50+
```bash
51+
python scripts/sft_orchestrator.py \
52+
--forms "$FORMS" \
53+
--num "$NUM" \
54+
--min-score "$MIN_SCORE" \
55+
--model-mix "$MODEL_MIX" \
56+
--output "$OUTPUT"
57+
```
58+
59+
Or for specific backends:
60+
- OpenRouter only: `--backend openrouter --model moonshotai/kimi-k2`
61+
- Claude only: `--backend claude`
62+
63+
## SYNTH Trace Format
64+
65+
The SYNTH traces use Baguettotron's stenographic notation:
66+
67+
**Logical markers:**
68+
- `→` derivation/implication
69+
- `↺` iterative refinement
70+
- `?` uncertainty
71+
- `!/※` insight/breakthrough
72+
- `≈` approximation
73+
- `∴` conclusion
74+
75+
**Confidence markers:**
76+
- `●` high confidence
77+
- `◐` medium confidence
78+
- `○` low confidence
79+
- `⚠` warning/risk
80+
81+
**Verification:**
82+
- `☐` unverified
83+
- `☑` partial verification
84+
- `✓` confirmed
85+
86+
**Structure:**
87+
```
88+
<think>
89+
FormName requirements:
90+
├─ constraint1: ●
91+
├─ constraint2: ◐
92+
└─ topic: theme
93+
94+
### 1. Task Analysis
95+
→ planning steps...
96+
97+
### 2. Composition Strategy
98+
⟨H≈0.5⟩ exploring options...
99+
100+
### 3. Verification
101+
✓ line_count: 100%
102+
☑ syllables: 85%
103+
104+
∴ FormName complete: ● high confidence
105+
</think>
106+
```
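To make the structure above concrete, here is a hedged Python sketch that renders the requirements tree and the verification section from plain data. It is not the API of `src/abide/synth_trace.py`, which this diff does not show. The function names are hypothetical, and the rule "`✓` at 100%, `☑` otherwise" is an assumption read off the example.

```python
def render_tree(form, requirements):
    """Render (name, value) pairs as a tree, closing the last branch with '└─'."""
    lines = [f"{form} requirements:"]
    for index, (name, value) in enumerate(requirements):
        branch = "└─" if index == len(requirements) - 1 else "├─"
        lines.append(f"{branch} {name}: {value}")
    return lines


def render_verification(rubric):
    """Render (constraint, score) pairs; scores are fractions in the range 0-1."""
    lines = ["### 3. Verification"]
    for name, score in rubric:
        # Assumption: a full score counts as confirmed, anything lower as partial.
        mark = "✓" if score >= 1.0 else "☑"
        lines.append(f"{mark} {name}: {round(score * 100)}%")
    return lines


def render_trace(form, requirements, rubric):
    """Wrap the tree and the verification section in <think> tags."""
    body = render_tree(form, requirements) + [""] + render_verification(rubric)
    return "<think>\n" + "\n".join(body) + "\n</think>"


if __name__ == "__main__":
    print(render_trace(
        "Sonnet",
        [("lines", "14 ●"), ("topic", "autumn")],
        [("line_count", 1.0), ("syllables", 0.85)],
    ))
```

The sketch leaves out the Task Analysis and Composition Strategy sections, since their content depends on the generating model rather than on the rubric.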
107+
108+
## Progress Tracking
109+
110+
Report progress periodically:
111+
- `[N/total] FormName: score=X.XX (model)`
112+
- Success/failure counts
113+
- Rate (poems/minute)
114+
- ETA for completion
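The progress line, rate, and ETA listed above might be computed along these lines. This is an illustrative Python sketch only: `format_progress` and its parameters are hypothetical names, and the separator and units after the bracketed prefix are assumptions.

```python
def format_progress(done, total, form, score, model, elapsed_seconds):
    """Build one progress line with rate (poems/minute) and ETA (minutes)."""
    minutes = elapsed_seconds / 60
    rate = done / minutes if minutes > 0 else 0.0
    remaining = total - done
    # With no measurable rate yet, the ETA is unknown rather than zero.
    eta = f"{remaining / rate:.1f} min" if rate > 0 else "unknown"
    return (f"[{done}/{total}] {form}: score={score:.2f} ({model})"
            f" | {rate:.1f} poems/min | ETA {eta}")


if __name__ == "__main__":
    print(format_progress(10, 20, "Sonnet", 0.92, "claude-haiku", 120))
```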
115+
116+
## Example Usage
117+
118+
```
119+
/generate-sft --forms Sonnet,Haiku --num 10 --model-mix 0.7
120+
```
121+
122+
This generates 20 poems total (10 Sonnets + 10 Haikus), with 70% using Claude agents and 30% using OpenRouter.
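The arithmetic behind that split can be sketched in Python as below. How the real orchestrator assigns poems to backends is not shown in this diff, so the per-form rounding here is an assumption, and `split_counts` is a hypothetical name.

```python
def split_counts(forms, num_per_form, claude_ratio):
    """Return, per form, how many poems go to Claude and how many to OpenRouter."""
    if not 0.0 <= claude_ratio <= 1.0:
        raise ValueError("claude_ratio must be between 0 and 1")
    plan = {}
    for form in forms:
        # round() picks the nearest whole poem (ties go to the even number).
        claude = round(num_per_form * claude_ratio)
        plan[form] = {"claude": claude, "openrouter": num_per_form - claude}
    return plan


if __name__ == "__main__":
    plan = split_counts(["Sonnet", "Haiku"], 10, 0.7)
    print(plan)
    print(sum(p["claude"] for p in plan.values()), "Claude poems in total")
```

For the example command this gives 7 Claude and 3 OpenRouter poems per form, so 14 and 6 of the 20 in total.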

config/sft_generation.yaml

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,61 @@
1+
# SFT Data Generation Configuration
2+
# Usage: python scripts/sft_orchestrator.py --config config/sft_generation.yaml
3+
4+
pipeline:
5+
# Poetry forms to generate (top 10 learnable from GRPO signal analysis)
6+
forms:
7+
- IrregularOde
8+
- ConsonantCascade
9+
- Sonnet
10+
- CoprimeVerse
11+
- Anaphora
12+
- Mesostic
13+
- Rubaiyat
14+
- PetrarchanSonnet
15+
- Etheree
16+
- BroadBallad
17+
18+
# Number of verified examples per form
19+
num_per_form: 100
20+
21+
# Minimum abide score to accept a poem
22+
min_score: 0.8
23+
24+
# Maximum retry attempts per poem
25+
max_retries: 5
26+
27+
model_mix:
28+
# Ratio of Claude Code agents vs OpenRouter workers
29+
# 0.0 = all OpenRouter, 1.0 = all Claude, 0.5 = 50/50 mix
30+
claude_ratio: 0.5
31+
32+
claude:
33+
# Model for Claude Code agents: haiku or sonnet
34+
model: haiku
35+
# Number of parallel agents
36+
parallel_agents: 4
37+
38+
openrouter:
39+
# OpenRouter API key from environment
40+
api_key_env: OPENROUTER_API_KEY
41+
42+
# Models to use (randomly selected per task)
43+
models:
44+
- moonshotai/kimi-k2
45+
# - deepseek/deepseek-r1
46+
# - anthropic/claude-sonnet-4
47+
48+
# Number of parallel workers
49+
parallel_workers: 4
50+
51+
output:
52+
# Output JSONL file (append-only)
53+
file: data/sft_dataset.jsonl
54+
55+
# Print progress checkpoint every N examples
56+
checkpoint_every: 10
57+
58+
# Topics and tones are loaded from:
59+
# - data/topics.txt (one per line)
60+
# - data/tones.txt (one per line)
61+
# If files don't exist, defaults are used.
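The `min_score`, `max_retries`, and `models` settings suggest a generate-then-verify loop. The Python sketch below shows one way such a loop could look. It is not the implementation in `scripts/openrouter_generator.py`: `generate_verified` is a hypothetical name, `generate` and `verify` are caller-supplied stand-ins for the model call and the abide scorer, and treating `max_retries` as the total number of attempts is an assumption.

```python
import random

# Values mirror config/sft_generation.yaml; the flat dict layout is an assumption.
CONFIG = {
    "min_score": 0.8,
    "max_retries": 5,
    "models": ["moonshotai/kimi-k2"],
}


def generate_verified(form, generate, verify, config=CONFIG, rng=random):
    """Try up to max_retries drafts; return the first one scoring >= min_score."""
    # One model is picked at random per task, as the config comment describes.
    model = rng.choice(config["models"])
    for attempt in range(1, config["max_retries"] + 1):
        poem = generate(form, model)   # hypothetical model call
        score = verify(form, poem)     # hypothetical abide scoring call
        if score >= config["min_score"]:
            return {"form": form, "poem": poem, "score": score,
                    "model": model, "attempts": attempt}
    return None  # every attempt fell below min_score


if __name__ == "__main__":
    scores = iter([0.4, 0.7, 0.85])
    result = generate_verified(
        "Sonnet",
        generate=lambda form, model: f"draft of a {form}",
        verify=lambda form, poem: next(scores),
    )
    print(result)
```

Returning `None` on failure lets the caller count failures separately from successes, which fits the success/failure counts mentioned in the slash command's progress tracking.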
