LLMs.txt generation on every PR #556
7b5af51 to c960b1b
@sdserranog and @torresmateo what do you think about this pattern?
I like this pattern! I like
Yeah, but what I was worried about was the page content changing, and the summary being inaccurate.
torresmateo left a comment
    )
  );
  return new Set();
}
Bug: Error handler claims to process all files but doesn't
When getChangedFilesSince() fails (e.g., the previous SHA was rebased away), it logs "processing all files" but returns an empty Set. This causes changedFiles.has(page.path) in determinePagesToSummarize() to return false for all pages. Combined with the condition if (isChanged || !existingSummary), pages with existing summaries will be kept unchanged rather than re-processed. The actual behavior is the opposite of what the log message states: modified files silently retain stale summaries instead of being re-summarized.
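One possible fix, sketched under assumptions: let the error path return a sentinel (null here) that means "the diff is unknown", and have determinePagesToSummarize() treat that sentinel as "every page changed". The function names and the isChanged condition come from the comment above; the Page and Summary types, the runDiff callback (a hypothetical wrapper around git diff), and the return shape are illustrative and not taken from the actual script.

```typescript
type Page = { path: string; url: string };
type Summary = { url: string; title: string; description: string };

// Returns the changed paths, or null when the diff cannot be computed
// (for example, the previous SHA was rebased away).
// `runDiff` is a hypothetical callback standing in for the real git call.
function getChangedFilesSince(
  sha: string,
  runDiff: (sha: string) => string[],
): Set<string> | null {
  try {
    return new Set(runDiff(sha));
  } catch {
    console.warn(`Could not diff against ${sha}; processing all files`);
    return null; // sentinel: caller must treat every page as changed
  }
}

function determinePagesToSummarize(
  pages: Page[],
  changedFiles: Set<string> | null,
  existing: Map<string, Summary>,
): { pagesToSummarize: Page[]; pagesToKeep: Summary[] } {
  const pagesToSummarize: Page[] = [];
  const pagesToKeep: Summary[] = [];
  for (const page of pages) {
    const existingSummary = existing.get(page.url);
    // A null set means the diff failed, so nothing can be trusted as unchanged.
    const isChanged = changedFiles === null || changedFiles.has(page.path);
    if (isChanged || !existingSummary) {
      pagesToSummarize.push(page);
    } else {
      pagesToKeep.push(existingSummary);
    }
  }
  return { pagesToSummarize, pagesToKeep };
}
```

With this shape, the log message and the behavior agree: a failed diff re-summarizes every page, while a successful diff with no matches keeps existing summaries.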
    title: existingSummary.title,
    description: existingSummary.description,
  });
}
Bug: Duplicate entries when source files share same URL
The incremental update logic iterates over discovered pages by file path but looks up existing summaries by URL. When two source files map to the same URL (e.g., how-arcade-helps.mdx and how-arcade-helps/page.mdx), both pages pass through the loop independently. If one file is changed and another isn't, one ends up in pagesToSummarize and the other in pagesToKeep, resulting in duplicate entries in the final output. The generated llms.txt shows this happening with the "How Arcade helps with Agent Authorization" page appearing twice.
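A minimal sketch of one way to avoid the duplicates, assuming the pages can be grouped by URL before they are split into pagesToSummarize and pagesToKeep. The planByUrl name, the Page and Summary types, and the rule "prefer the changed file, otherwise the first one seen" are assumptions for illustration; the real script may want a different tie-break.

```typescript
type Page = { path: string; url: string };
type Summary = { url: string; title: string; description: string };

// Collapse pages that share a URL into one entry, then decide per URL
// whether to re-summarize or keep the existing summary.
function planByUrl(
  pages: Page[],
  changedFiles: Set<string>,
  existing: Map<string, Summary>,
): { pagesToSummarize: Page[]; pagesToKeep: Summary[] } {
  const byUrl = new Map<string, { page: Page; changed: boolean }>();
  for (const page of pages) {
    const changed = changedFiles.has(page.path);
    const seen = byUrl.get(page.url);
    // Keep the first file seen for a URL, unless a later one is the changed one.
    if (!seen || (changed && !seen.changed)) {
      byUrl.set(page.url, { page, changed });
    }
  }

  const pagesToSummarize: Page[] = [];
  const pagesToKeep: Summary[] = [];
  byUrl.forEach(({ page, changed }) => {
    const existingSummary = existing.get(page.url);
    if (changed || !existingSummary) {
      pagesToSummarize.push(page);
    } else {
      pagesToKeep.push(existingSummary);
    }
  });
  return { pagesToSummarize, pagesToKeep };
}
```

Because each URL is handled exactly once, a URL can land in only one of the two lists, so the final output cannot contain the same page twice.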
Rather than wait until the end of the week, we can run the generation of LLMs.txt on every PR, looking only for what has changed. Changes will be auto-committed back to your PR.
Note
Runs LLMs.txt generation on pull requests with auto-commit and updates the generator to perform incremental, git-aware regeneration with embedded metadata; refreshes llms.txt.
- Workflow: runs on pull_request (opened/synchronize/reopened) and commits public/llms.txt back to the PR branch.
- Generator (scripts/generate-llmstxt.ts): uses llms.txt metadata and git diff to only summarize changed/new pages; remove deleted pages implicitly. Embeds metadata (git-sha, generation-date) in output; preserve if no changes.
- Output (public/llms.txt): refreshed.

Written by Cursor Bugbot for commit 04fbfae. This will update automatically on new commits.
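To make the embedded-metadata idea concrete, here is a rough sketch of writing and reading git-sha and generation-date in the output, and of preserving them when nothing changed. The HTML-comment layout, the Metadata type, and all three function names are assumptions; the PR summary names only the two metadata keys, not how they are stored.

```typescript
type Metadata = { gitSha: string; generationDate: string };

// Assumed layout: one HTML comment per key at the top of llms.txt.
function renderMetadata(meta: Metadata): string {
  return [
    `<!-- git-sha: ${meta.gitSha} -->`,
    `<!-- generation-date: ${meta.generationDate} -->`,
  ].join("\n");
}

// Returns null when either key is missing, e.g. on the first run.
function parseMetadata(text: string): Metadata | null {
  const gitSha = /<!--\s*git-sha:\s*(\S+)\s*-->/.exec(text)?.[1];
  const generationDate = /<!--\s*generation-date:\s*(\S+)\s*-->/.exec(text)?.[1];
  if (!gitSha || !generationDate) return null;
  return { gitSha, generationDate };
}

// Preserve the previous metadata when no page changed, so an unchanged
// run produces a byte-identical file and nothing needs to be committed.
function nextMetadata(
  previous: Metadata | null,
  headSha: string,
  now: Date,
  hasChanges: boolean,
): Metadata {
  if (previous && !hasChanges) return previous;
  return { gitSha: headSha, generationDate: now.toISOString() };
}
```

The parsed gitSha is what a later run would diff against to find changed pages.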