Add llms.txt template and generator #1928
Co-authored-by: Jan Calanog <[email protected]>
❤️ Thank you for getting this done, @theletterf!
Co-authored-by: Martijn Laarman <[email protected]>
…ers.cs Co-authored-by: Martijn Laarman <[email protected]>
@Mpdreamz @reakaleek After careful consideration, I reverted the .md append logic in this commit because it was proving too complex to handle. The current PR generates the llms.txt file as intended and doesn't affect LLM link generation. I think we could follow up in another PR to see how best to convert the links to .md links. I still think it's a good idea, as it would make things easier for the LLM, but it requires special handling of various cases, and that risks making this PR too complicated, IMHO. Related to this, we could consider opening a separate PR to raise warnings when absolute URLs to our docs are used (they should not be).
…r into add-llmstxt-template
Pull Request Overview
This PR implements an `llms.txt` template and generator to provide structured documentation overviews for LLM consumption, following the llmstxt.org recommendations. The implementation creates a boilerplate template for the root documentation and dynamically generates navigation sections listing first-level content within each documentation category.
Key changes include:
- Added a new `LlmsNavigationEnhancer` class to generate structured navigation sections from the documentation hierarchy
- Enhanced the assembler build service to append navigation content to existing llms.txt files
- Extended LLM markdown rendering utilities to handle URL conversion for both localhost and relative URLs
- Updated the LLM markdown exporter to use a predefined template for root index files
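The URL conversion mentioned in the list above could look roughly like the following. This is a minimal Python sketch, not the PR's C# code: the function name `to_canonical_url` and the `CANONICAL_BASE` constant are hypothetical, and the canonical host is assumed from the example URL in the PR description.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: canonical host inferred from the example URL in the PR description.
CANONICAL_BASE = "https://www.elastic.co"


def to_canonical_url(url: str, base: str = CANONICAL_BASE) -> str:
    """Map localhost and root-relative URLs onto the canonical docs host.

    Hypothetical helper for illustration; other URLs are returned unchanged.
    """
    parts = urlsplit(url)
    base_parts = urlsplit(base)

    # Local preview links: keep path, query, and fragment, swap scheme and host.
    if parts.hostname in ("localhost", "127.0.0.1"):
        return urlunsplit(
            (base_parts.scheme, base_parts.netloc, parts.path, parts.query, parts.fragment)
        )

    # Root-relative links such as "/docs/solutions": prefix the canonical base.
    if not parts.scheme and not parts.netloc and url.startswith("/"):
        return base.rstrip("/") + url

    # Already absolute (or a bare fragment): leave as-is.
    return url
```

Leaving already-absolute URLs untouched keeps the helper safe to apply to every link, whatever its origin.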
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| LlmsNavigationEnhancer.cs | New class that extracts first-level navigation items and generates structured markdown sections with titles, URLs, and descriptions |
| AssemblerBuildService.cs | Integration point that enhances llms.txt files by appending generated navigation sections after assembly |
| LlmInlineRenderers.cs | Minor formatting change adding extra whitespace |
| LlmBlockRenderers.cs | Enhanced URL handling with localhost-to-canonical conversion and extracted utility methods for reuse |
| LlmMarkdownExporter.cs | Added llms.txt template constant and logic to use template for root index files |
@Mpdreamz Do we have your blessing? 🙏🏻
Co-authored-by: Martijn Laarman <[email protected]>
…rers.cs Co-authored-by: Copilot <[email protected]>
Spotted a few cases for improvement, but they're not blocking this PR. Let's get v1 of llms.txt out the door :) It's been too long.
Fixes #1689
The approach here is to create a template based on the https://llmstxt.org/ recommendations, and then append a number of generated H2 sections after the boilerplate. Each section should contain a list of the first-level sections of its category, in the same order as the site navigation. Each entry is a Markdown link to the page's .md file followed, after a colon, by its description, extracted from the `description:` frontmatter of that page. This process should happen after assembly, around the time we create the sitemap.xml.
For clarity: if, for example, we have `## Solutions` as the category (https://www.elastic.co/docs/solutions), then I would expect the list to contain the first three subsections you see when opening it, and no more. The goal is for an LLM to consume this file and get an overview of the first level of navigation within each section.
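To make the intended output shape concrete, here is a minimal Python sketch of the generation step described above. It is an illustration under assumptions, not the C# implementation in this PR: `NavItem`, `render_sections`, and `build_llms_txt` are hypothetical names, and links are emitted as given rather than rewritten to `.md`, since the .md append logic was reverted in this PR.

```python
from dataclasses import dataclass, field


@dataclass
class NavItem:
    """Hypothetical navigation node: a category or a page within it."""
    title: str
    url: str
    description: str = ""  # assumed to come from the page's `description:` frontmatter
    children: list["NavItem"] = field(default_factory=list)


def render_sections(categories: list[NavItem]) -> str:
    """Render one H2 section per category, listing only its first-level children."""
    blocks = []
    for category in categories:
        lines = [f"## {category.title}", ""]
        for item in category.children:  # navigation order; deeper levels are ignored
            entry = f"- [{item.title}]({item.url})"
            if item.description:
                entry += f": {item.description}"
            lines.append(entry)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def build_llms_txt(template: str, categories: list[NavItem]) -> str:
    """Append the generated sections after the boilerplate template."""
    return template.rstrip("\n") + "\n\n" + render_sections(categories)
```

Entries without a description are emitted as a bare link, with no trailing colon, so that missing frontmatter does not leave dangling punctuation in the list.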