
feat: LLMs.txt exports, automation, and expanded router#405

Merged
gregnazario merged 6 commits into main from cleanup-llms on Mar 20, 2026


@gregnazario
Collaborator

Summary

Machine-readable docs for LLMs and coding agents: curated /llms.txt, llms-small.txt, llms-full.txt, per-page .md exports, tests, and an expanded router (OpenAPI, REST, MCP, Agent Skills, Explorer, standards, Indexer GraphQL).

Key changes

  • Routes: llms-index, llms-small/llms-full endpoints, [...slug].md for Markdown; starlight-llms-txt + llmsTxtIndex override
  • Curation: src/lib/llms-curated-ids.ts, strict llms-index resolution (throws on missing ids), draft pages excluded from .md static paths
  • Pipeline: llms-html-sanitize.ts + Vitest for HTML→MD sanitization
  • CI: tests/llms-curated-ids.test.ts, llms-dist-smoke.test.ts (expects fresh pnpm build), llms-html-sanitize.test.ts
  • Config: Minimal starlightLlmsTxt, link validator excludes llms-small/llms-full links; Head.astro rel=llms-txt; /.well-known/llms.txt redirect
  • Router extras: /aptos-spec.json, /rest-api, npm MCP, Agent Skills repo, AI hub .md, GitHub org, Explorer, AIPs, Indexer GraphQL .md
  • Docs: llms-txt, build/ai (en/zh); CLAUDE.md machine-readable section + translation policy (no es/ docs; Spanish white paper PDF link allowed)
  • i18n: Remove Spanish documentation tree (es/llms-txt); Vercel /es → English redirects; remove Spanish beta banner and related UI bits
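
The strict resolution and draft exclusion mentioned under Curation and Routes can be pictured with a minimal sketch. The `DocEntry` shape and both function names below are hypothetical, chosen for illustration; the actual helpers in `src/lib/llms.ts` and `src/pages/llms-index.ts` are not shown in this conversation and may differ.

```typescript
// Hypothetical shape of a docs entry; the real collection type is not shown in this PR page.
interface DocEntry {
  id: string;
  title: string;
  draft?: boolean;
}

// Resolve curated ids in their curated order, failing loudly on any unknown id
// instead of silently dropping it from the index.
function resolveCuratedDocs(
  curatedIds: readonly string[],
  docs: readonly DocEntry[],
): DocEntry[] {
  const byId = new Map<string, DocEntry>();
  for (const doc of docs) byId.set(doc.id, doc);

  const missing = curatedIds.filter((id) => !byId.has(id));
  if (missing.length > 0) {
    throw new Error(`llms-index: unknown curated ids: ${missing.join(", ")}`);
  }
  return curatedIds.map((id) => byId.get(id)!);
}

// Draft pages are left out of the list of per-page .md paths.
function markdownExportPaths(docs: readonly DocEntry[]): string[] {
  return docs.filter((doc) => doc.draft !== true).map((doc) => `/${doc.id}.md`);
}
```

Throwing at build time, rather than skipping a missing id, means a renamed or deleted page breaks the build instead of quietly shrinking `/llms.txt`.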

Test plan

  • pnpm build && pnpm test && pnpm lint on CI
  • Spot-check https://deploy-preview/llms.txt for new sections

Made with Cursor

- Add curated ids (llms-curated-ids), llms-small/full endpoints, html→md pipeline
- Override starlight-llms-txt routes via llmsTxtIndex; trim unused plugin options
- Per-page .md exports; exclude drafts; strict llms.txt section resolution
- Vitest: curated id validation, dist smoke, html sanitize fidelity
- Document .well-known/llms.txt in en/es/zh; agent guidelines (CLAUDE.md)
- Link validator excludes llms-small/full; robots.txt LLM route comments

Made-with: Cursor
- Drop src/content/docs/es/ and SpanishBetaBanner; site locales stay en/zh only
- Redirect /es and /es/* to English on Vercel
- Trim es from Head metadata, SearchFallback, LanguageSelect cookie logic
- Update CLAUDE.md: zh translations required; no es/ pages; Spanish PDF ok on white paper page
- Refresh fix-i18n-links copy and mermaid test filters

Made-with: Cursor
- List aptos-spec.json and /rest-api in llms-index alongside corpus exports
- Document in llms-txt and build/ai (en/zh); extend dist smoke test

Made-with: Cursor
- Add Agent tooling and canonical sources block (npm MCP, Agent Skills, AI hub .md, GitHub, Explorer, AIPs, Indexer GraphQL .md)
- Document in llms-txt (en/zh) and build/ai (en/zh); extend dist smoke assertions

Made-with: Cursor
Copilot AI review requested due to automatic review settings March 20, 2026 15:27
@vercel

vercel bot commented Mar 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| aptos-docs | Ready | Preview, Comment | Mar 20, 2026 4:09pm |

Contributor

Copilot AI left a comment


Pull request overview

Adds and curates machine-readable documentation outputs for LLMs/coding agents (llms.txt index + small/full corpora + per-page .md exports), expands the router with key API/tooling links, and updates i18n to drop Spanish docs (redirect /es → English), with new tests to keep exports stable.

Changes:

  • Introduces curated /llms.txt, /llms-small.txt, /llms-full.txt generation + shared rendering/sanitization utilities.
  • Adds per-page rendered Markdown exports via [...slug].md and tests/smoke checks to validate curated IDs and built outputs.
  • Removes Spanish-docs UI bits and updates i18n tooling/config/docs/redirects to reflect only en + zh.

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `vercel.json` | Adds /es and /es/* permanent redirects to English; adds /.well-known/llms.txt redirect. |
| `tests/mermaid-rendering.test.ts` | Updates Mermaid test filtering to reflect removal of Spanish docs. |
| `tests/llms-html-sanitize.test.ts` | Adds unit tests for HTML sanitization + HTML→Markdown conversion. |
| `tests/llms-dist-smoke.test.ts` | Adds post-build smoke checks for generated llms outputs and server route modules. |
| `tests/llms-curated-ids.test.ts` | Adds tests to ensure curated doc IDs exist, are English-only, and not draft. |
| `src/starlight-overrides/PageFrame.astro` | Removes Spanish beta banner injection. |
| `src/starlight-overrides/LanguageSelect.astro` | Updates locale cookie detection to only recognize zh vs en. |
| `src/starlight-overrides/Head.astro` | Removes Spanish keyword/breadcrumb entries; adds rel="llms-txt" discovery link. |
| `src/pages/llms-index.ts` | Replaces auto-indexing with curated section index using shared llms utilities. |
| `src/pages/[...slug].md.ts` | Serves per-page rendered Markdown; excludes draft pages from static paths. |
| `src/lib/llms.ts` | Adds shared helpers for doc filtering/ordering, rendering-to-Markdown, and cache headers. |
| `src/lib/llms-html-sanitize.ts` | Adds HTML stripping + Turndown conversion for Markdown export/minify. |
| `src/lib/llms-curated-ids.ts` | Defines curated doc ID sets and English-doc inclusion rules. |
| `src/integrations/llms-txt-index.ts` | Overrides starlight-llms-txt injected routes to point to local handlers. |
| `src/endpoints/llms-small.txt.ts` | Implements curated, minified low-token corpus export. |
| `src/endpoints/llms-full.txt.ts` | Implements full rendered documentation corpus export with priority ordering. |
| `src/content/docs/zh/llms-txt.mdx` | Updates Chinese LLMs.txt documentation to match new routing/exports. |
| `src/content/docs/zh/build/ai.mdx` | Updates Chinese AI tools hub with well-known redirect + API/tooling links. |
| `src/content/docs/llms-txt.mdx` | Updates English LLMs.txt documentation to match new routing/exports. |
| `src/content/docs/build/ai.mdx` | Updates English AI tools hub with well-known redirect + API/tooling links. |
| `src/components/SpanishBetaBanner.astro` | Removes Spanish beta banner component (deleted). |
| `src/components/SearchFallback.astro` | Removes Spanish search fallback mapping. |
| `scripts/fix-i18n-links/src/main.rs` | Updates locale-discovery messaging to reflect only zh docs tree. |
| `scripts/fix-i18n-links/README.md` | Updates README examples to reflect only zh localization. |
| `public/robots.txt` | Adds explicit allow rules for common AI crawlers; references llms-small.txt. |
| `astro.config.mjs` | Excludes llms-small/full from link validation; reduces starlightLlmsTxt config to minimal and documents override behavior. |
| `CLAUDE.md` | Updates agent guidance: only en + zh, adds LLMs routing info, and documents “no Spanish docs tree” policy. |
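
The "priority ordering" noted for `src/endpoints/llms-full.txt.ts` could look roughly like the sketch below. This is an assumption about one plausible rule (listed ids first, in their listed order, then everything else alphabetically); the function name `orderByPriority` is hypothetical and the ordering actually used in the PR is not shown here.

```typescript
// Order doc ids so that ids named in `priority` come first, in the order given,
// followed by all remaining ids sorted alphabetically.
// Assumption: the real export may use a different tie-break or ranking rule.
function orderByPriority(
  ids: readonly string[],
  priority: readonly string[],
): string[] {
  const rank = new Map<string, number>();
  priority.forEach((id, index) => rank.set(id, index));

  return [...ids].sort((a, b) => {
    const ra = rank.get(a) ?? Number.MAX_SAFE_INTEGER;
    const rb = rank.get(b) ?? Number.MAX_SAFE_INTEGER;
    if (ra !== rb) return ra - rb;
    // Same rank (both unprioritized): fall back to a stable alphabetical order.
    return a < b ? -1 : a > b ? 1 : 0;
  });
}
```

Putting high-value pages first matters for a full-corpus export because consumers with a limited context window may truncate the tail of the file.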


Copilot review: replacing every `\s+` run collapsed newlines inside fenced code and broke Markdown structure. Collapse spaces/tabs only; extend tests.

Made-with: Cursor
- Add project Cursor skill for LLM exports and SEO checklists
- Align CLAUDE.md with agent guidelines (zh localization; no es maintenance)
- Link skill from Resources

Made-with: Cursor
@gregnazario
Copy link
Collaborator Author

Review follow-up (Copilot inline on `llms-html-sanitize.ts`): Already fixed on this branch in `14fe1b6` — minify uses `/[ \t]+/g` only so newlines and Markdown structure (fenced code, lists, headings) are preserved; `tests/llms-html-sanitize.test.ts` asserts the behavior. Resolved the review thread.
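
To illustrate why the character class matters, here is a small sketch contrasting the two patterns named in this thread. The function names are hypothetical and this is not the code from `llms-html-sanitize.ts`; it only demonstrates the regex behavior.

```typescript
// Pattern flagged in the review: every whitespace run, newlines included,
// is replaced by a single space, which flattens Markdown onto one line.
function collapseAllWhitespace(md: string): string {
  return md.replace(/\s+/g, " ");
}

// Pattern described in the fix: only runs of spaces and tabs are collapsed,
// so line breaks (and therefore headings, lists, and fences) survive.
function collapseSpacesAndTabs(md: string): string {
  return md.replace(/[ \t]+/g, " ");
}

const sample = "# Title\n\n- item   one\n- item\ttwo\n";

console.log(JSON.stringify(collapseAllWhitespace(sample)));
// "# Title - item one - item two "
console.log(JSON.stringify(collapseSpacesAndTabs(sample)));
// "# Title\n\n- item one\n- item two\n"
```

Note that `/[ \t]+/g` on its own still collapses leading indentation, including inside fenced code; whether the real minifier treats code blocks specially is not stated in this conversation.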

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated no new comments.



@gregnazario gregnazario merged commit aa86467 into main Mar 20, 2026
11 checks passed
@gregnazario gregnazario deleted the cleanup-llms branch March 20, 2026 16:28