feat: Add llms.txt + all-anchors.adoc generation, fix HTML entities #16

Merged
raifdmueller merged 2 commits into main from feature/llms-txt-issue-109
Feb 20, 2026

@raifdmueller
Owner

Summary

  • Adds scripts/generate-llms-txt.js: generates two new files:
    • docs/all-anchors.adoc — AsciiDoc include-based complete reference document (organized by category)
    • website/public/llms.txt — Clean Markdown for LLM consumption (~79 KB, all 50 anchors)
  • Fixes HTML entity encoding bug in anchor titles (e.g., Devil&#8217;s Advocate → Devil's Advocate) by adding decodeHtmlEntities() to scripts/extract-metadata.js
  • Regenerates website/public/data/*.json with entity-decoded titles
  • Adds generate-llms-txt step to CI workflow to keep llms.txt fresh on every build
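A rough sketch of how the two generated files could be assembled from one list of anchors. This is not the code in scripts/generate-llms-txt.js: the record shape (`title`, `category`, `file`, `summary`), the sample category names, the file paths, and the function names are all assumptions made for illustration.

```javascript
// Hypothetical anchor records. The real script presumably reads these
// from the repository; only the "Devil's Advocate" title comes from the PR.
const sampleAnchors = [
  { title: "Devil's Advocate", category: 'Category A', file: 'anchors/devils-advocate.adoc', summary: 'Argue the opposing position.' },
  { title: 'Example Anchor', category: 'Category B', file: 'anchors/example-anchor.adoc', summary: 'Placeholder entry.' },
];

// Group anchors by category, preserving first-seen category order.
function groupByCategory(anchors) {
  const groups = new Map();
  for (const anchor of anchors) {
    if (!groups.has(anchor.category)) groups.set(anchor.category, []);
    groups.get(anchor.category).push(anchor);
  }
  return groups;
}

// Build an AsciiDoc document that pulls each anchor in via include::.
function buildAllAnchorsAdoc(anchors) {
  const lines = ['= All Anchors', ''];
  for (const [category, items] of groupByCategory(anchors)) {
    lines.push(`== ${category}`, '');
    for (const item of items) {
      lines.push(`include::${item.file}[leveloffset=+2]`, '');
    }
  }
  return lines.join('\n');
}

// Build plain Markdown with one section per anchor, for llms.txt.
function buildLlmsTxt(anchors) {
  const lines = ['# Anchors', ''];
  for (const [category, items] of groupByCategory(anchors)) {
    lines.push(`## ${category}`, '');
    for (const item of items) {
      lines.push(`### ${item.title}`, '', item.summary, '');
    }
  }
  return lines.join('\n');
}
```

Both outputs share the same grouping step, so the category order stays identical in the AsciiDoc reference and in llms.txt.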

Closes

Test plan

  • node scripts/generate-llms-txt.js runs without errors
  • website/public/llms.txt contains all 50 anchors (~79 KB)
  • docs/all-anchors.adoc contains AsciiDoc includes for all categories
  • No HTML entities in generated JSON data or llms.txt
  • CI passes (generate-llms-txt step added to e2e-tests job)
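The "no HTML entities" item could be checked mechanically. The helper below is a minimal sketch, not part of this PR: the name `findHtmlEntities` is hypothetical, and the pattern will also flag entity-like text that appears intentionally (for example inside a code sample).

```javascript
// Return every substring that looks like an HTML entity:
// decimal (&#8217;), hexadecimal (&#x2019;), or named (&amp;).
function findHtmlEntities(text) {
  const pattern = /&(?:#\d+|#[xX][0-9a-fA-F]+|[a-zA-Z][a-zA-Z0-9]*);/g;
  return text.match(pattern) ?? [];
}

// A title that still carries an undecoded entity is reported.
const leftovers = findHtmlEntities('Devil&#8217;s Advocate');
// leftovers is ['&#8217;']
```

Run against the contents of website/public/llms.txt and the JSON files (read with `fs.readFileSync`, for example), a non-empty result would point to a title that was not decoded.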

🤖 Generated with Claude Code

…LM-Coding#109, LLM-Coding#110)

- Add scripts/generate-llms-txt.js: generates docs/all-anchors.adoc
  (AsciiDoc include-based full reference) and website/public/llms.txt
  (clean Markdown for LLM consumption, ~79 KB, 50 anchors)
- Fix HTML entity encoding in anchor titles (e.g., "Devil&#8217;s Advocate"
  → "Devil's Advocate") by adding decodeHtmlEntities() to extract-metadata.js
- Regenerate website/public/data/*.json with entity-decoded titles
- Add generate-llms-txt step to CI workflow (test.yml)

Closes LLM-Coding#109, Closes LLM-Coding#110

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… (CodeQL)

Replaced chained .replace() calls with a single regex pass that handles
both numeric (&#38;) and named (&amp;) entities simultaneously.
This prevents CodeQL's "double-unescaping" alert where &#38;amp; could
be decoded twice through sequential replacements.
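To illustrate the difference, here is a minimal sketch of both approaches. It is not the code from scripts/extract-metadata.js: the entity table, the regex, and `decodeChained` are assumptions, and only the function name decodeHtmlEntities comes from this PR.

```javascript
// Hypothetical table of named entities; the real script may cover a different set.
const NAMED_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: '\u00a0' };

// One alternation covers decimal, hexadecimal, and named entities.
const ENTITY_PATTERN = /&(?:#(\d+)|#[xX]([0-9a-fA-F]+)|([a-zA-Z][a-zA-Z0-9]*));/g;

// Single pass: text produced by one replacement is never scanned again.
function decodeHtmlEntities(text) {
  return text.replace(ENTITY_PATTERN, (match, dec, hex, name) => {
    if (name !== undefined) {
      // Unknown names are left untouched rather than guessed.
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, name)
        ? NAMED_ENTITIES[name]
        : match;
    }
    const codePoint = dec !== undefined ? Number.parseInt(dec, 10) : Number.parseInt(hex, 16);
    // Out-of-range values would make String.fromCodePoint throw.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// The chained pattern, for contrast: the second call re-reads the
// output of the first, so "&#38;amp;" is unescaped twice.
function decodeChained(text) {
  return text.replace(/&#38;/g, '&').replace(/&amp;/g, '&');
}

decodeHtmlEntities('&#38;amp;'); // → '&amp;' (decoded once)
decodeChained('&#38;amp;');      // → '&'     (decoded twice)
```

Because `String.prototype.replace` with a global regex resumes scanning after each match, the `&` produced from `&#38;` cannot combine with the following `amp;` into a second entity.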

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@raifdmueller merged commit 829fb95 into main on Feb 20, 2026
6 checks passed

Development

Successfully merging this pull request may close these issues.

bug: HTML entities in anchor titles (e.g. Devil&#8217;s Advocate)
feat: All-in-one page + llms.txt generation