Note: This is an experimental research library in active development. APIs may change.
Automatic reward functions for training LLMs to write poetry.
Abide is a composable constraint algebra that transforms poetic form specifications into continuous reward signals. Define a form once and get verification, scoring, and natural-language instructions automatically.
Training LLMs to write poetry is hard because:

- **Manual evaluation doesn't scale.** You can't hand-score thousands of poems during RL training.
- **Binary pass/fail loses signal.** A poem with 13/14 correct lines shouldn't score the same as garbage.
- **Form rules are complex.** Villanelles have refrains, sestinas rotate end-words, and sonnets need iambic pentameter.
- **Verification and prompts diverge.** Your reward function checks one thing while your prompt asks for another.
Abide addresses all four from a single form definition:

```python
from abide.forms import ShakespeareanSonnet
from abide.primitives import MeterType

sonnet = ShakespeareanSonnet()

# Automatic reward signal (continuous score from 0 to 1)
result = sonnet.verify(llm_output)
reward = result.score  # 0.73

# Automatic prompt generation (matches your reward function exactly)
prompt = sonnet.instruction()
# "Write a sonnet with 14 lines of iambic pentameter,
#  following the rhyme scheme ABAB CDCD EFEF GG..."

# Automatic rubric (explainable rewards)
for item in result.rubric:
    print(f"{item.criterion}: {item.score:.0%}")
# Line count: 100%
# Syllables per line: 85%
# Rhyme scheme: 67%
```

Every constraint produces smooth, continuous scores rather than binary pass/fail:
```python
from abide.constraints import SyllablesPerLine
constraint = SyllablesPerLine([5, 7, 5]) # Haiku (strict)
# Poem with 6-7-5 syllables (first line has 6 instead of 5)
result = constraint.verify("An old and silent pond\nA frog jumps into the pond\nSplash! Silence again")
print(result.score)  # 0.87 (partial credit for being close)
```
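The exact scoring curve lives inside the library, but the intuition is that near misses keep most of their credit. Here is a minimal sketch of one plausible curve; the `syllable_credit` function and its formula are illustrative, not abide's API:

```python
# Illustrative only -- not abide's actual formula. Partial credit decays
# with distance from the expected syllable count, averaged over lines.
def syllable_credit(actual: list[int], expected: list[int]) -> float:
    credits = [
        max(0.0, 1.0 - abs(a - e) / max(e, 1))
        for a, e in zip(actual, expected)
    ]
    return sum(credits) / len(credits)

print(syllable_credit([6, 7, 5], [5, 7, 5]))  # ~0.93: one near miss, little credit lost
```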
Detect and enforce meter using CMU Pronouncing Dictionary stress patterns:

```python
from abide.primitives import scan_line, MeterType, meter_score
from abide.constraints import Meter, FootLength
# Analyze Shakespeare's iambic pentameter
line = "Shall I compare thee to a summer's day"
result = scan_line(line)
print(result.binary_pattern) # "1101110101"
print(result.foot_count) # 5
print(result.dominant_meter) # MeterType.IAMB
print(result.regularity) # 0.80
# Score against expected meter
score = meter_score(line, MeterType.IAMB, expected_feet=5)
print(score) # 0.99 (accounts for natural substitutions)
# Use as a constraint
blank_verse = Meter(MeterType.IAMB, FootLength.PENTAMETER)
result = blank_verse.verify(poem)
```
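Under the hood, scansion reduces to comparing stress strings. Here is a toy version of the idea; it is not abide's implementation, which is more forgiving of natural substitutions such as an opening trochee (hence the 0.99 above versus 0.8 below):

```python
# Toy stress-pattern comparison: fraction of syllable positions that match
# the ideal iambic template ("01" repeated five times for pentameter).
IDEAL_IAMB_10 = "0101010101"

def pattern_agreement(observed: str, ideal: str = IDEAL_IAMB_10) -> float:
    if len(observed) != len(ideal):
        return 0.0  # a real scorer would align or penalize instead
    return sum(o == i for o, i in zip(observed, ideal)) / len(ideal)

print(pattern_agreement("1101110101"))  # 0.8 -- strict positional matching
```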
Analyze any poem and extract a specification that the poem passes with a perfect score:

```python
from abide.inference import analyze_poem, infer_form
# Take any poem
poem = """Do not go gentle into that good night,
Old age should burn and rave at close of day;
Rage, rage against the dying of the light..."""
# Infer its constraints
analysis = analyze_poem(poem)
print(analysis.rhyme_scheme) # "ABABAB..."
print(analysis.syllable_pattern) # [10, 10, 10, ...]
print(analysis.refrains) # [(0, [5, 11, 17]), ...]
# Generate a FormSpec the poem passes with score 1.0
spec = infer_form(poem, name="Dylan Thomas Style")
assert spec.weighted_score(poem) == 1.0
# Now use it to train/evaluate other poems in the same style
new_poem_score = spec.weighted_score(new_llm_output)
```
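This makes mining reward specs from a reference corpus mechanical. A hedged sketch, built only on the `infer_form` and `weighted_score` calls shown above; the directory path and the max-over-specs heuristic are illustrative, not part of abide:

```python
# Hypothetical corpus pipeline using abide's documented inference API.
from pathlib import Path

from abide.inference import infer_form

specs = [
    infer_form(path.read_text(), name=path.stem)
    for path in Path("reference_poems").glob("*.txt")  # illustrative path
]

def style_reward(candidate: str) -> float:
    """Reward a generation by its best score against any reference spec."""
    return max(spec.weighted_score(candidate) for spec in specs)  # assumes non-empty corpus
```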
Enforce line-length patterns for concrete poetry:

```python
from abide.constraints import LineShape, ShapeType, MeasureMode

# Diamante: 1-2-3-4-3-2-1 word pattern
diamante = LineShape(
    shape_type=ShapeType.DIAMOND,
    num_lines=7,
    mode=MeasureMode.WORDS,
)

# Etheree: 1-2-3-4-5-6-7-8-9-10 syllables
etheree = LineShape(
    lengths=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    mode=MeasureMode.SYLLABLES,
)
```
Build complex forms from simple primitives:

```python
from abide.constraints import (
    LineCount, RhymeScheme, SyllablesPerLine,
    Refrain, Meter, And, Or, WeightedSum,
)

# Villanelle: 19 lines, ABA rhyme, two rotating refrains
villanelle = And([
    LineCount(19),
    RhymeScheme("ABA ABA ABA ABA ABA ABAA"),
    Refrain(reference_line=0, repeat_at=[5, 11, 17]),  # First refrain
    Refrain(reference_line=2, repeat_at=[8, 14, 18]),  # Second refrain
])

# Weighted scoring (some constraints matter more)
sonnet = WeightedSum([
    (LineCount(14), 2.0),                  # Structure is critical
    (SyllablesPerLine([10] * 14), 1.5),    # Meter matters
    (RhymeScheme("ABABCDCDEFEFGG"), 2.0),  # Rhyme is important
], threshold=0.7)  # Minimum score to "pass"
```
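`Or` is imported above but not demonstrated. Assuming it mirrors `And`'s constructor and credits the best-matching branch (a guess from the algebra, not confirmed API), it would let one form accept alternatives:

```python
# Assumption: Or takes a list of constraints like And does and scores
# against the best-matching alternative rather than requiring all of them.
from abide.constraints import Or, RhymeScheme

sonnet_rhyme = Or([
    RhymeScheme("ABABCDCDEFEFGG"),  # Shakespearean
    RhymeScheme("ABBAABBACDECDE"),  # Petrarchan
])
result = sonnet_rhyme.verify(poem)  # credit for whichever scheme fits best
```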
Pre-built forms include:

| Form | Lines | Key Features |
|---|---|---|
| Haiku | 3 | 5-7-5 syllables |
| Tanka | 5 | 5-7-5-7-7 syllables |
| Sonnet (3 variants) | 14 | Iambic pentameter + rhyme scheme |
| Villanelle | 19 | ABA rhyme + two refrains |
| Sestina | 39 | End-word rotation across 6 stanzas |
| Pantoum | Variable | Interlocking quatrains |
| Ghazal | Variable | Couplets with radif + qafiya |
Stanza forms:

| Form | Lines/Stanza | Rhyme Scheme |
|---|---|---|
| Ottava Rima | 8 | ABABABCC |
| Rhyme Royal | 7 | ABABBCC |
| Spenserian Stanza | 9 | ABABBCBCC + alexandrine |
| Ballad Stanza | 4 | ABCB, 8-6-8-6 syllables |
| Burns Stanza | 6 | AAABAB |
Shape forms:

| Form | Pattern |
|---|---|
| Diamante | 1-2-3-4-3-2-1 words |
| Cinquain | 2-4-6-8-2 syllables |
| Etheree | 1-2-3-4-5-6-7-8-9-10 syllables |
Also included: Quatrain, Couplet (heroic, short, elegiac), Blank Verse, Ode (Pindaric, Horatian, Irregular), Ballad, Kyrielle, Epigram, Tercet, Rubaiyat, Free Verse, Rondeau, Triolet, Ballade, Blues Poem, Clerihew, Limerick, and more.
Plug the score into your RL training loop:

```python
from abide.forms import Villanelle
from abide.verifiers import RewardEnvironment

# Create a reward environment for your RL framework
env = RewardEnvironment(
    form=Villanelle(),
    reward_scale=(0.0, 1.0),  # Continuous rewards
)

# In your training loop
def compute_reward(generated_poem: str) -> float:
    return env.score(generated_poem)
```
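For group-based methods like GRPO, each completion in a rollout group is scored independently. A minimal piece of glue continuing with the `env` defined above; the `reward_fn` signature is hypothetical and should be adapted to your trainer, only `env.score` is abide's API:

```python
# Hypothetical trainer glue; only env.score comes from abide.
def reward_fn(prompts: list[str], completions: list[str]) -> list[float]:
    """Score each sampled poem in the rollout group independently."""
    return [env.score(poem) for poem in completions]
```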
The same specification generates both prompts and rewards:

```python
from abide.specs import villanelle_spec
spec = villanelle_spec()
# Generate training prompt
prompt = spec.full_instruction()
# "Write a Villanelle with the following requirements:
# - Write exactly 19 lines in 6 stanzas (5 tercets + 1 quatrain)
# - Follow the rhyme scheme ABA ABA ABA ABA ABA ABAA
# - Line 1 must be repeated exactly at lines 6, 12, 18
# - Line 3 must be repeated exactly at lines 9, 15, 19"
# Verify output with matching criteria
result = spec.verify(llm_output)
reward = spec.weighted_score(llm_output)
```

Test specific constraints in isolation:

```python
from abide.specs import shakespearean_sonnet_spec
spec = shakespearean_sonnet_spec()
# Stage 1: Just learn line count
stage1_reward = spec.verify_subset(poem, "line_count")
# Stage 2: Add syllables
stage2_reward = spec.verify_subset(poem, "line_count", "syllables")
# Stage 3: Full form
stage3_reward = spec.weighted_score(poem)
```
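One way to schedule the stages, continuing from the `spec` defined above; the stage table and promotion logic below are illustrative, not part of abide:

```python
# Illustrative curriculum: widen the verified subset as the model improves.
STAGES = [
    ("line_count",),              # stage 1: structure only
    ("line_count", "syllables"),  # stage 2: add syllable counts
    None,                         # stage 3: full weighted form
]

def stage_reward(poem: str, stage: int) -> float:
    subset = STAGES[stage]
    if subset is None:
        return spec.weighted_score(poem)
    return spec.verify_subset(poem, *subset)
```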
Install:

```bash
# Using uv (recommended)
uv add abide
# Using pip
pip install abide
```

Verify your first poem:

```python
from abide.forms import Haiku
from abide import verify
poem = """An old silent pond
A frog jumps into the pond
Splash! Silence again"""
result = verify(poem, Haiku())
print(f"Score: {result.score:.0%}") # Score: 100%
# See what matched and what didn't
for item in result.rubric:
print(f" {item.criterion}: {'PASS' if item.passed else 'FAIL'} ({item.score:.0%})")git clone https://github.com/darrenangle/abide.git
cd abide
uv sync
uv run pytest  # 435 tests
```

Project layout:

```
abide/
├── primitives/ # NLP tools: syllables, phonetics, rhyme, meter
├── constraints/ # Composable constraint types
├── forms/ # Pre-built form templates (50+)
├── specs/ # FormSpec for instruction generation
├── inference/ # Reverse-engineer forms from poems
└── verifiers/ # RL framework integration
```
We're actively running GRPO (Group Relative Policy Optimization) experiments to train language models on poetic form constraints.
Training scripts:

| Script | Description |
|---|---|
| `scripts/train_grpo.py` | Main GRPO trainer using the verifiers library |
| `scripts/train_grpo_trl.py` | TRL-based GRPO with KL regularization (beta parameter) |
| `scripts/find_learnable_forms.py` | Identify forms with high within-rollout variance (best GRPO signal) |
| `scripts/prompt_generator.py` | Generate training prompts from 140+ poetic forms |
Launch scripts:

| Script | Model | Forms | Notes |
|---|---|---|---|
| `scripts/run_grpo_trl.sh` | Gemma 3 4B | Top 10 learnable | TRL with beta=0.04 KL regularization |
| `scripts/run_grpo_learnable.sh` | Gemma 3 4B | Top 10 learnable | Verifiers library GRPO |
| `scripts/run_grpo_traditional.sh` | Gemma 3 4B | Traditional forms | Weighted sampling by form popularity |
Findings so far:

- **Learnable forms matter.** Forms with high within-rollout variance (the model sometimes succeeds and sometimes fails on the same prompt) produce a better GRPO learning signal than forms that are too easy or too hard; see the sketch after this list.
- **KL regularization helps stability.** Adding a beta parameter for KL divergence (e.g., beta=0.04) helps prevent policy collapse. We're still experimenting with the right settings.
- **Top 10 learnable forms (by GRPO signal):** Epigram, ThunderVerse, ColorSpectrum, CoprimeVerse, ElementalVerse, CharacterPalindromePoem, QuestionQuest, VowelPilgrimage, Mesostic, Terzanelle.
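A sketch of that learnability heuristic; the helper below is hypothetical and only gestures at what `scripts/find_learnable_forms.py` does, using the documented `weighted_score` call:

```python
# Hypothetical: forms whose rollout rewards vary (not all 0.0, not all 1.0)
# give GRPO's group-relative advantage something to work with.
from statistics import pvariance

def rollout_reward_variance(spec, rollouts: list[str]) -> float:
    """Variance of rewards across one prompt's rollout group."""
    rewards = [spec.weighted_score(poem) for poem in rollouts]
    return pvariance(rewards)
```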
License: MIT