guide llms.txt improvements #414

@thibaudcolas

Description

Follow-up improvements to #412 / #413, based on further testing of the files and feedback (thread: llms.txt feedback #13648).

  • Extract as a package
  • Set up CDN-level caching of text/plain and text/markdown responses
  • Remove "releases" pages from the files (they shouldn’t be relevant to answering questions about the CMS)
  • Track files’ token lengths
  • Set up evals suite for guide contents with common questions #415
  • Unit tests for lack of HTML escaping
  • Trial LLM-focused information
    • Information only present in llms.txt
    • Information only present in pages’ markdown representations
    • Information only present in llms-full.txt
    • Page only linked from llms.txt
  • Make the files discoverable directly from the site: Implement Markdown / LLM content copy features #416
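
As a rough illustration of the token-length tracking and HTML-escaping test items above, here is a minimal sketch using only the standard library. The four-characters-per-token ratio is an assumption rather than a real tokenizer, and the helper names (`estimate_tokens`, `find_escaped_entities`) are hypothetical, not part of any existing package.

```python
import html
import re

# Assumption: roughly 4 characters per token for English text. A real
# tokenizer would give exact, model-specific counts.
CHARS_PER_TOKEN = 4

# Entities that typically appear when Markdown has been HTML-escaped.
ESCAPED_ENTITY = re.compile(r"&(?:amp|lt|gt|quot|#x27|#39);")


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate: ceiling of len(text) / 4."""
    return -(-len(text) // CHARS_PER_TOKEN)


def find_escaped_entities(text: str) -> list[str]:
    """Return the HTML entities found in text, in order of appearance."""
    return ESCAPED_ENTITY.findall(text)


# Hypothetical page content, before and after accidental HTML escaping.
sample = "# Pages\n\nUse `<h1>` & `Page.objects.live()` in templates.\n"
escaped = html.escape(sample)

print(estimate_tokens(sample))
print(find_escaped_entities(sample))
print(find_escaped_entities(escaped))
```

A unit test could assert that `find_escaped_entities` returns an empty list for every generated file. Pages that legitimately document HTML entities would need an allowlist, since this check cannot tell intentional entities from accidental escaping.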

To be confirmed

Potential improvements where I’m not sure how high the ROI is:

  • TBC: remove "About" and "Contributing" pages to improve the signal-to-noise ratio
  • TBC: set a target token count so the files fit in smaller models’ context windows (for example, a 32k context)
  • TBC: add inline heading links to Markdown to encourage direct linking to relevant sections
  • TBC: change the "full" format so there is less ambiguity about documents’ start and end
  • TBC: docs-focused MCP server
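
To make the "less ambiguity about documents’ start and end" idea more concrete, here is one possible sketch. The `<<<BEGIN url>>>` / `<<<END url>>>` marker lines are a hypothetical format invented for illustration, not the current llms-full.txt layout, and the `Doc` type and example URLs are assumptions as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    url: str
    body: str


def render_full(docs: list[Doc]) -> str:
    """Join documents, wrapping each in explicit begin/end marker lines."""
    parts = [
        f"<<<BEGIN {doc.url}>>>\n{doc.body.strip()}\n<<<END {doc.url}>>>"
        for doc in docs
    ]
    return "\n\n".join(parts) + "\n"


def split_full(text: str) -> list[Doc]:
    """Recover the documents from render_full output."""
    docs: list[Doc] = []
    url = None
    lines: list[str] = []
    for line in text.splitlines():
        if url is None:
            # Outside a document: only a BEGIN marker is meaningful.
            if line.startswith("<<<BEGIN ") and line.endswith(">>>"):
                url = line[len("<<<BEGIN "):-len(">>>")]
                lines = []
        elif line == f"<<<END {url}>>>":
            docs.append(Doc(url, "\n".join(lines)))
            url = None
        else:
            lines.append(line)
    return docs


# Hypothetical pages; the second body contains a Markdown horizontal rule.
pages = [
    Doc("https://example.com/guide/a/", "# A\n\nFirst page."),
    Doc("https://example.com/guide/b/", "# B\n\n---\n\nSecond page."),
]
full = render_full(pages)
print(full)
```

Repeating the URL in the closing marker means a Markdown horizontal rule or a nested heading inside a page cannot be mistaken for a document boundary, which is the ambiguity this item is concerned with.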

Metadata

Labels

django, documentation (Improvements or additions to documentation)