AI-powered themed category generation for logic-grid puzzles. Uses the Anthropic API (Claude Sonnet 4.6 by default) to turn a theme description into fully structured puzzle categories.
npm install logic-grid-ai logic-gridlogic-grid is a peer dependency. Requires an Anthropic API key when using the default client.
import { generateTheme } from "logic-grid-ai";
import { generate } from "logic-grid";
const theme = await generateTheme({
theme: "pirate adventure",
size: 4,
categories: 4,
});
// {
// categories: [
// { name: "Pirate", values: ["Blackbeard", "Redbeard", ...], noun: "" },
// { name: "Ship", values: ["Revenge", "Kraken", ...], noun: "captain",
// verb: ["commands the", ...], ordered: true, orderingPhrases: { comparators: {...} } },
// ...
// ]
// }
const puzzle = generate({
size: 4,
categories: 4,
categoryNames: theme.categories,
});
// Clues like: "Blackbeard commands the Revenge."Generate themed categories for a logic grid puzzle.
const result = await generateTheme({
theme: "space exploration", // theme description (required)
size: 5, // values per category, 3-8 (required)
categories: 4, // number of categories, 3-8 (required)
constraints: ["kid-friendly"], // optional hints for the AI
client: myClient, // optional custom AIClient
});Returns a ThemeResult:
interface ThemeResult {
categories: Category[]; // from logic-grid
}At least one category must have ordered: true with orderingPhrases.comparators defining all 7 comparator phrases. The result is validated against structural and semantic rules (value uniqueness, noun consistency, category count, ordered category presence, etc.). If validation fails, the AI is retried with error feedback up to 3 times.
If all retries fail, generateTheme throws a ThemeGenerationError. The error carries an errors array of structured ThemeValidationError objects (each with a stable code like "no_ordered_category" or "duplicate_value" and a human-readable message), so callers can branch on the failure mode:
import { generateTheme, ThemeGenerationError } from "logic-grid-ai";
try {
const theme = await generateTheme({ theme: "...", size: 4, categories: 4 });
} catch (err) {
if (err instanceof ThemeGenerationError) {
if (err.errors.some((e) => e.code === "no_ordered_category")) {
// Show a hint about ordered categories
}
}
throw err;
}Transport-level retries (429s, 5xx, network errors) are already handled inside the Anthropic SDK with exponential backoff — they don't consume one of the 3 semantic-retry attempts.
Create the default AI client backed by the Anthropic SDK. If no key is provided, reads from ANTHROPIC_API_KEY. Pass { model } to override the default model (claude-sonnet-4-6):
import { createAnthropicClient } from "logic-grid-ai";
const fast = createAnthropicClient(undefined, { model: "claude-haiku-4-5" });
const explicit = createAnthropicClient("sk-ant-...");Implement the AIClient interface to use a different provider:
import type { AIClient } from "logic-grid-ai";
const myClient: AIClient = {
async completeJSON(prompt, schema) {
// Call your preferred API, return JSON matching the schema
},
};
const theme = await generateTheme({
theme: "cooking competition",
size: 4,
categories: 4,
client: myClient,
});Validate AI output against structural and semantic rules. Returns ThemeValidationError[] (empty = valid). Each error has a stable code, a human-readable message, and an optional category field naming the offending category. Used internally by generateTheme, but exported for custom pipelines.
import { validateThemeResult } from "logic-grid-ai";
const errors = validateThemeResult(result, 4, 4);
if (errors.length > 0) {
for (const e of errors) console.error(`[${e.code}] ${e.message}`);
}Rewrite an existing set of puzzle clues in a different voice or style. Constraint semantics are preserved — only the surface phrasing changes. Useful for re-skinning the default clue text after generation.
import { rewriteClues } from "logic-grid-ai";
const rewritten = await rewriteClues({
clues: puzzle.clues, // Clue[] from logic-grid
style: "pirate storytelling", // optional — describes the desired voice
client: myClient, // optional — custom AIClient
});
// Returns Clue[] with the same constraints and replaced text fields.All clues are rewritten in a single batched AI call. Each clue is sent alongside its constraint JSON so the AI has ground-truth semantics. Output is validated against duplicate / empty / overlong text rules; retries up to 3 times before throwing a RewriteCluesError (parallel to ThemeGenerationError — carries errors: RewriteCluesValidationError[] with codes like "empty_clue", "long_clue", "duplicate_clue").
Validate raw AI output for rewriteClues. Returns RewriteCluesValidationError[] (empty = valid). Each error has a code, a message, and an optional 1-indexed clueIndex.
import { validateRewrittenClues } from "logic-grid-ai";
const errors = validateRewrittenClues({ clues: ["..."] }, puzzle.clues.length);Translate every visible string of a logic-grid puzzle to a target locale using AI: clue text, category names, and category value labels. Intended for ahead-of-time (AOT) puzzle pipelines that produce localized corpora once and serve them statically — quality is the constraint, not latency. The package engine stays English-only; this is a post-processing layer that returns localization maps the renderer composes with the canonical puzzle.
import { translate } from "logic-grid-ai";
import { generate } from "logic-grid";
const puzzle = generate({ size: 4, categories: 4, seed: 42 });
const localized = await translate({
puzzle,
locale: "German", // also accepts BCP-47 like "de-DE"
});
// localized = {
// clues: [{ constraint, text: "Bob wohnt genau 2 Häuser vom gelben Haus entfernt." }, ...],
// categoryNames: { "House": "Haus", "Color": "Farbe", ... },
// valueLabels: { "Yellow": "Gelb", "Cat": "Katze", "Alice": "Alice", ... },
// }The original puzzle.constraints and puzzle.grid are passed through unchanged — the engine continues to operate on canonical English keys. Renderers compose categoryNames / valueLabels over the canonical grid to display localized headers. The structural validator guarantees every canonical key has a non-empty entry, so renderers can treat the maps as exhaustive and surface any missing key as an error rather than silently rendering a half-localized grid.
The function runs a two-stage AI flow:
- Translator produces all three surfaces (localized clue text, category names, value labels) in a single batched call. The constraint JSON is shown alongside each English clue as ground truth — if the source clue text is ambiguous or has drifted (e.g. via
rewriteClues), the constraint defines the meaning. - Validator round-trips each translated clue back to a constraint type and checks polarity, direction, numeric/unit preservation, and proper-noun preservation in the clue text. Failures are fed back to the translator on retry (up to 3 attempts). Completeness of
categoryNamesandvalueLabelsis enforced structurally.
const localized = await translate({
puzzle,
locale: "ja-JP",
client: createAnthropicClient(undefined, { model: "claude-sonnet-4-6" }),
validator: createAnthropicClient(undefined, {
model: "claude-opus-4-5",
temperature: 0,
}),
});Validator best practice. Single-model validation has correlated blind spots — the validator's mistakes overlap with the translator's. For production AOT pipelines, pass a
validatorclient backed by a different model than the translator. When bothclientandvalidatorare omitted, the package creates two default Anthropic clients withvalidatorattemperature: 0for low-variance (near-deterministic — Anthropic has no seed, so minor cross-run variance is still possible) verdicts.Footgun: if you pass
clientbut notvalidator, the validator reuses yourclientas-is — including its temperature. The temperature-0 default fires only when both are omitted. If you want low-variance verdicts AND a custom translator, pass both explicitly (e.g.validator: createAnthropicClient(apiKey, { temperature: 0 })).
Proper nouns stay verbatim. People names, place names, brand names, and numeric/unit literals (
1972,8%,7am) map to themselves invalueLabelsand remain unchanged in clue text. Descriptive words (colors, animals, common-noun categories) translate, with grammatical inflection in clue text expected (yellow→ bare labelgelb, inflected formsgelben/gelbeare correct in clue context).
If validation fails on every attempt, translate throws a TranslationError carrying structured errors with stable codes:
| Code | Surface | Meaning |
|---|---|---|
wrong_clue_count |
clues | AI returned a different number of clues than the source |
non_string_clue |
clues | A clue entry is not a string |
empty_translation |
clues | A clue is empty or whitespace-only |
long_translation |
clues | A clue exceeds the per-clue length budget |
duplicate_translation |
clues | Two clues are identical (case-insensitive) |
missing_category_name |
categoryNames | A canonical category from the source has no entry in categoryNames |
empty_category_name |
categoryNames | A categoryNames entry is empty or non-string |
long_category_name |
categoryNames | A categoryNames entry exceeds the per-label length budget |
duplicate_category_name |
categoryNames | Two canonical categories map to the same localized name (case-insensitive) |
missing_value_label |
valueLabels | A canonical value from the source has no entry in valueLabels |
empty_value_label |
valueLabels | A valueLabels entry is empty or non-string |
long_value_label |
valueLabels | A valueLabels entry exceeds the per-label length budget |
duplicate_value_label |
valueLabels | Two canonical values map to the same localized label (case-insensitive) |
verdict_index_mismatch |
validator | Validator returned verdicts in a different order than the source clues |
constraint_type_mismatch |
clue semantics | Validator round-trip parsed the translation as a different constraint |
direction_flip |
clue semantics | before / left_of subject/object reversed |
between_middle_swapped |
clue semantics | between / not_between middle entity swapped with an outer |
numeric_changed |
clue semantics | Numbers or units in a clue differ from the source |
proper_noun_dropped |
clue semantics | A proper noun in a clue was changed |
import { translate, TranslationError } from "logic-grid-ai";
try {
const localized = await translate({ puzzle, locale: "German" });
} catch (err) {
if (err instanceof TranslationError) {
if (err.errors.some((e) => e.code === "direction_flip")) {
// Translator flipped the subject/object on a `before` or `left_of` clue.
}
}
throw err;
}valueLabelsis checked structurally only — semantic validation is on clue text. The proper-noun-preservation check covers the translated clue text. If the AI mistranslates a proper noun invalueLabels(e.g."Alice" → "Alise") but uses it correctly in clue text, the structural validator can't distinguish a faithful from a drifted label, and the semantic validator never sees the labels. In practice every value tends to appear in at least one clue (so clue-text checks catch drift), but a value that's only ever shown viavalueLabels(in the grid header) and never referenced by a clue is a blind spot.Category.noun/verb/valueSuffix/orderingPhrasesstay English on the canonical grid. Translation rewrites clue text and provides display-label maps; it doesn't deeply translate the renderer-side metadata used byrenderClue/rewriteClues. Downstream calls to those functions on a translated puzzle will regenerate English clues from the English category fields, overwriting the translation. Plan accordingly: translate as the last step of an AOT pipeline.
AnthropicClientOptions accepts an optional temperature (default 0.8). Use 0 for low-variance responses (greedy decoding — near-deterministic, but Anthropic doesn't expose a seed so minor cross-run variance can still occur) — typically the right default for validator clients in translate():
const validator = createAnthropicClient(undefined, { temperature: 0 });- A detailed prompt describes the puzzle structure, category contract, and ordering semantics
- The AI responds via tool_use with structured JSON matching a strict schema
- The response is validated (category count, value uniqueness, noun consistency, ordered category presence, comparator completeness, etc.)
- If validation fails, errors are fed back to the AI for up to 3 retries
MIT