Commit 8468c2f
Add traditional forms GRPO training with weighted sampling
New training mode that focuses on well-known poetry forms.
Form tiers:
- Tier 1 (weight 5, ~40%): Classic forms (sonnet, haiku, limerick, etc.)
- Tier 2 (weight 3, ~39%): Well-known traditional (ghazal, sestina, etc.)
- Tier 3 (weight 1, ~21%): Less common but legitimate historical forms
Excludes 59 constraint-based/mathematical forms (lipogram, fibonacci, etc.)
that are too difficult for initial training.
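A minimal sketch of how tier-weighted sampling could work, assuming the tier weight applies to each form individually (the commit message does not say whether weights are per form or per tier). The names `TIER_WEIGHTS`, `EXAMPLE_TIERS`, `tier_shares`, and `sample_form` are hypothetical and are not taken from prompt_generator.py, and the tier 3 entries are placeholders. Under the per-form assumption, a tier's share is its weight times its form count, divided by the total. The stated shares would follow from tiers of 8, 13, and 21 forms (5*8 = 40, 3*13 = 39, 1*21 = 21, total 100), though the actual counts are not given here.

```python
import random

# Per-form weights by tier, as listed in the commit message.
TIER_WEIGHTS = {1: 5, 2: 3, 3: 1}

# Hypothetical tier membership, for illustration only.
# The real lists live in prompt_generator.py and are not shown in this commit message.
EXAMPLE_TIERS = {
    1: ["sonnet", "haiku", "limerick"],
    2: ["ghazal", "sestina"],
    3: ["placeholder_form_a", "placeholder_form_b"],
}


def tier_shares(tier_sizes, tier_weights=TIER_WEIGHTS):
    """Return the fraction of samples expected from each tier.

    Assumes every form in a tier carries that tier's weight, so a tier's
    total mass is weight * number_of_forms.
    """
    mass = {tier: tier_weights[tier] * size for tier, size in tier_sizes.items()}
    total = sum(mass.values())
    return {tier: m / total for tier, m in mass.items()}


def sample_form(tiers, rng, tier_weights=TIER_WEIGHTS):
    """Draw one form name, weighting each form by its tier's weight."""
    forms = []
    weights = []
    for tier, names in tiers.items():
        for name in names:
            forms.append(name)
            weights.append(tier_weights[tier])
    return rng.choices(forms, weights=weights, k=1)[0]


# Assumed tier sizes (8, 13, 21) chosen only because they reproduce 40/39/21.
assumed_shares = tier_shares({1: 8, 2: 13, 3: 21})

# Seeded generator so the draw is reproducible.
picked = sample_form(EXAMPLE_TIERS, random.Random(0))
```

Weighting per form rather than per tier means a large low-weight tier can still receive a noticeable share of samples, which is consistent with Tier 3 getting about 21% despite a weight of 1.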
Changes:
- prompt_generator.py: Add form tiers and weighted sampling functions
- train_grpo.py: Support ABIDE_TRADITIONAL=1 env var for traditional mode
- run_grpo_traditional.sh: New script for traditional-only training
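As a rough illustration of the ABIDE_TRADITIONAL=1 switch, the sketch below shows one way a training script could read such a flag. The helper name `traditional_mode_enabled` is hypothetical, and treating only the exact value "1" as enabled is an assumption; the actual check in train_grpo.py is not shown in this commit message.

```python
import os


def traditional_mode_enabled(environ=None):
    """Return True when ABIDE_TRADITIONAL is set to "1".

    Accepts an optional mapping so the check can be exercised without
    modifying the real process environment.
    """
    env = os.environ if environ is None else environ
    # Assumption: any value other than "1" (including unset) leaves the mode off.
    return env.get("ABIDE_TRADITIONAL", "0") == "1"
```

A wrapper script such as run_grpo_traditional.sh would presumably export the variable before launching training, but its contents are not part of this message.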
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
1 parent 82ed0a7 commit 8468c2f
3 files changed (scripts): +1115 -166 lines changed