Generate term-to-article lists for Bible books from the unfoldingWord en_tw archive. Works in both Node.js (CLI) and React.js (browser) environments with intelligent caching.
- ✅ Universal: Works in Node.js and browser environments
- ✅ Smart Caching: File system (Node.js) or localStorage/sessionStorage (browser)
- ✅ Performance: Optimized matching with a PrefixTrie algorithm
- ✅ Case Sensitivity: Proper God/god distinction (God→kt/god, god→kt/falsegod)
- ✅ Morphological Variants: Handles plurals, possessives, verb forms
- ✅ Parentheses Normalization: "Joseph (OT)" → "Joseph" for better coverage
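The case-sensitivity and parentheses-normalization features above can be illustrated with a minimal sketch. It is not the package's implementation: it uses a simple word-level trie (the real PrefixTrie may be built differently), it does not handle morphological variants, and the names `normalizeTerm`, `buildTrie`, and `findMatches` are hypothetical. The article paths for "Joseph (OT)" and "Holy Spirit" are also made up for illustration.

```javascript
// Hypothetical sketch; none of these function names come from twl-generator.

// Strip a trailing parenthetical qualifier: "Joseph (OT)" -> "Joseph".
function normalizeTerm(term) {
  return term.replace(/\s*\([^)]*\)\s*$/, '').trim();
}

// Build a word-level prefix trie. Keys are case-sensitive, so "God" and
// "god" live on different branches and can point at different articles.
function buildTrie(entries) {
  const root = { children: new Map(), article: null };
  for (const [term, article] of entries) {
    let node = root;
    for (const word of normalizeTerm(term).split(/\s+/)) {
      if (!node.children.has(word)) {
        node.children.set(word, { children: new Map(), article: null });
      }
      node = node.children.get(word);
    }
    node.article = article;
  }
  return root;
}

// Scan left to right, taking the longest term that starts at each word.
function findMatches(trie, text) {
  const words = text.match(/[\p{L}\p{N}'’-]+/gu) ?? [];
  const matches = [];
  let i = 0;
  while (i < words.length) {
    let node = trie;
    let best = null;
    for (let j = i; j < words.length; j++) {
      node = node.children.get(words[j]);
      if (!node) break;
      if (node.article) best = { end: j + 1, article: node.article };
    }
    if (best) {
      matches.push({
        surface: words.slice(i, best.end).join(' '),
        article: best.article,
      });
      i = best.end;
    } else {
      i += 1;
    }
  }
  return matches;
}

const trie = buildTrie([
  ['God', 'kt/god'],
  ['god', 'kt/falsegod'],
  ['Joseph (OT)', 'names/josephot'], // article path assumed for illustration
  ['Holy Spirit', 'kt/holyspirit'], // article path assumed for illustration
]);

console.log(findMatches(trie, 'God spoke to Joseph about a god of stone.'));
```

Keeping the keys case-sensitive is what lets "God" and "god" resolve to different articles, while normalizing only the term side means the verse text itself is never altered.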
Install globally:
npm install -g twl-generator
Generate a TWL TSV for a Bible book (downloads USFM from Door43):
twl-generator --book rut
Generate a TWL TSV from a local USFM file:
twl-generator --usfm ./myfile.usfm
Specify output file:
twl-generator --usfm ./myfile.usfm --output ./output.tsv
You can also combine --book and --usfm (book is used for output filename and context):
twl-generator --usfm ./myfile.usfm --book rut
Generate per-verse keywords with TW matches (one JSON file per book). The output uses a flat keyed object per book: keys are "C:V" and values are ordered arrays of { surface, tw_match } using the existing trie and term logic.
Command:
twl-generator keywords [options]
Options:
- --books <ids>: Comma-separated book ids (e.g., gen,exo,mat).
- --testament <t>: old|new|all (default: all).
- --outdir <path>: Output directory (default: ./keywords).
Examples:
# Whole Bible, split per book
twl-generator keywords --outdir ./datasets
# Old Testament only
twl-generator keywords --testament old --outdir ./datasets
# Specific books
twl-generator keywords --books gen,exo,mat --outdir ./datasets
Per-book output shape (example):
{
  "1:1": [
    { "surface": "God", "tw_match": "God" },
    { "surface": "created", "tw_match": "created" },
    { "surface": "heavens", "tw_match": "heavens" },
    { "surface": "earth", "tw_match": "earth" }
  ],
  "1:2": [
    { "surface": "earth", "tw_match": "earth" },
    { "surface": "Spirit", "tw_match": "Spirit" },
    { "surface": "God", "tw_match": "God" }
  ]
}
Notes:
- The dataset uses canonical TW terms for tw_match and preserves the verse’s surface casing for surface.
- Multi-word terms and morphological variants are supported via the existing trie matcher.
- Files are named keywords_<USFMBOOK>.json (e.g., keywords_GEN.json).
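Because each per-book file is a flat object keyed by "C:V", it can be consumed with plain object iteration. The following is a minimal sketch of that, not part of the package: `countMatches` and `versesFor` are hypothetical helper names, and the sample object is copied from the output shape shown above. In practice the object would presumably come from parsing a file such as keywords_GEN.json.

```javascript
// Hypothetical helpers; they are not exported by twl-generator.

// Count how often each tw_match occurs across all verses of one book.
function countMatches(bookKeywords) {
  const counts = new Map();
  for (const entries of Object.values(bookKeywords)) {
    for (const { tw_match } of entries) {
      counts.set(tw_match, (counts.get(tw_match) ?? 0) + 1);
    }
  }
  return counts;
}

// List the "C:V" references in which a given term was matched.
function versesFor(bookKeywords, term) {
  return Object.entries(bookKeywords)
    .filter(([, entries]) => entries.some((e) => e.tw_match === term))
    .map(([ref]) => ref);
}

// Sample data in the documented per-book shape.
const sample = {
  '1:1': [
    { surface: 'God', tw_match: 'God' },
    { surface: 'created', tw_match: 'created' },
    { surface: 'heavens', tw_match: 'heavens' },
    { surface: 'earth', tw_match: 'earth' },
  ],
  '1:2': [
    { surface: 'earth', tw_match: 'earth' },
    { surface: 'Spirit', tw_match: 'Spirit' },
    { surface: 'God', tw_match: 'God' },
  ],
};

console.log(countMatches(sample).get('God')); // 2
console.log(versesFor(sample, 'Spirit')); // [ '1:2' ]
```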
Install as a dependency:
npm install twl-generator
import { generateTWLWithUsfm } from 'twl-generator';
// USFM string (can be loaded from file, API, etc.)
const usfmContent = `
\\id MAT
\\c 1
\\v 1 In the beginning...
`;
const book = 'mat';
const tsv = await generateTWLWithUsfm(book, usfmContent);
// tsv is a string in TSV format, ready to save or process
console.log(tsv);

import { generateTWLWithUsfm } from 'twl-generator';
const book = 'rut'; // Book code
const tsv = await generateTWLWithUsfm(book);
// This will fetch the USFM for the book from Door43 and return the TSV string
console.log(tsv);
- book: (string) Book code (e.g., 'mat', 'rut'). Required if usfmContent is not provided.
- usfmContent: (string, optional) USFM file content. If provided, this is used instead of fetching from Door43.
- Returns: Promise<string>, a TSV string of TWL matches.
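Since the function resolves to a single TSV string, callers will often want to split it into rows before further processing. Below is a minimal sketch of one way to do that. It assumes the first line of the TSV is a header row, which the text above does not confirm; `parseTsv` is a hypothetical helper, and the column names in the sample are placeholders, not the package's actual columns.

```javascript
// Hypothetical helper; not part of twl-generator.
// Splits a TSV string into row objects keyed by the header line.
// Assumes the first non-empty line holds the column names.
function parseTsv(tsv) {
  const lines = tsv.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return [];
  const headers = lines[0].split('\t');
  return lines.slice(1).map((line) => {
    const cells = line.split('\t');
    // Missing trailing cells become empty strings.
    return Object.fromEntries(headers.map((h, i) => [h, cells[i] ?? '']));
  });
}

// Placeholder data; real column names come from the generated TSV.
const sampleTsv = 'Reference\tTerm\tLink\n1:1\tGod\tkt/god\n1:2\tSpirit\n';

const rows = parseTsv(sampleTsv);
console.log(rows.length); // 2
console.log(rows[0].Term); // God
```

With the real output, the same helper would be applied to the string returned by `await generateTWLWithUsfm(book, usfmContent)`.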
License: MIT