feat: add docs/experimental/sections.json endpoint to list all docs sections #1556
Is there any standard for this kind of document? It reads like a "sitemap light". What other tools do you have in mind that would benefit from this? Also, what kind of traffic does this create for the JSON file itself and for the docs sections afterwards? |
Not that I know of...it's just part of our needs to be able to serve them in the MCP, which again is, at the end of the day, just a normal program.
Every app that wants to link to all the documentation sections of the docs. We technically have something similar already, which search uses (and that I'm using in the Svelte Raycast extension and in Sveltelab) https://svelte.dev/content.json ...now that I think about it, we could just include the rest of the relevant info on that page, but that would lead to a loooot of wasted data going over the wire for no reason.
This doc would be fetched when the MCP server starts up and cached for as long as the server is alive. Being hosted on serverless, however, it would probably be accessed fairly frequently (if we move off serverless, we would still need to set a TTL to fetch fresh data every once in a while). As for the individual docs sections, it really depends on when the user or the LLM decides to include them in the context. But tbf it shouldn't matter much, since both of those resources are prerendered and served from a CDN with a long TTL. |
We should actually have some sort of invalidation mechanism here, because if people are running the MCP on a long-running server they won't ever get docs updates. (I usually use the I really don't like the idea of forcing everything into one generic endpoint like |
Yeah, as I mentioned later, we should probably set a TTL in case we move to a long-running server or run it locally.
I kinda agree on |
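The TTL idea discussed above could look roughly like the following. This is a minimal sketch, not the MCP server's actual code: `createTtlCache`, `Loader`, and the injectable `now` clock are hypothetical names introduced for illustration, and concurrent refreshes are not deduplicated.

```typescript
// Sketch only: createTtlCache, Loader and `now` are hypothetical names.
type Loader<T> = () => Promise<T>;

function createTtlCache<T>(
  load: Loader<T>,
  ttlMs: number,
  now: () => number = () => Date.now(),
): () => Promise<T> {
  let cached: { value: T; fetchedAt: number } | undefined;

  return async () => {
    // Reload on first use, or once the entry is at least ttlMs old.
    if (cached === undefined || now() - cached.fetchedAt >= ttlMs) {
      cached = { value: await load(), fetchedAt: now() };
    }
    return cached.value;
  };
}
```

On a long-running server, the loader would be whatever fetches the sections list, so a docs update becomes visible at most one TTL after it is published. On serverless, the cache simply dies with the instance.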
What would be the future use case? Isn't this premature optimization? 😉 |
Well, yes and no...does it really need to be called |
How useful is this, really? It's effectively just a list of titles. If I'm an MCP then sometimes those titles will be enough to know which documents are relevant to my current task, but it feels like it's bound to be pretty hit-or-miss. What if the documentation lived in the package? Then the MCP server would have direct access to it, since it has a dependency on the package. |
The list serves the purpose of giving an initial hint to the LLM but, most importantly, to the user: it's also used by the user to add resources through the MCP, which means they can manually add the docs they need with a possibly higher degree of confidence. Other than that, it's just a way to have a list the LLM can pick from. Wdym with "lived in the package"? |
@Rich-Harris one of the ideas is to surface condensed documentation "use cases" for each docs file. This would give hints to LLMs as to when specific documentation files are useful to fetch without having to fetch the entire documentation file and eating context window for no use. You can see an early PoC of this in https://github.com/sveltejs/svelte/pull/16867/files that uses the |
This is also early days for MCPs and LLMs, so trying to determine a "valid" rationale for some specific feature is moot in my view. We're all here trying to make Svelte better, and we will need to experiment to find the best way forward. |
I mean that the
I totally agree, but by the same token (hehe) we should be careful about not adding a bunch of stuff that turns out to not be useful, but which still creates a maintenance burden. Like, I don't want us to be on the hook for maintaining a bunch of I think https://svelte.dev/docs/llms is a great example of this. Are those documents useful? Has All of which of course is also a reason not to put |
Both fetching from node_modules and from GitHub could work, but tbf it feels way worse than adding an endpoint on svelte.dev. llms.txt is indeed useful for a lot of people and it will actually be used for this very purpose. And as I've said, I specifically created the endpoint this way because it's simple and generic, meaning it could be useful for something else too. It's really not that different from the content.json used for search (actually, being so simple, it's even better because there's close to 0 maintenance burden). What bun is doing is good and we should probably do that too...but searching the web is still too chaotic for LLMs, and having an organized list coming from the MCP is much, much better for sure. If we really, really don't want to include this endpoint, we should at least set up a GitHub webhook and store the new docs in a Svelte MCP db, as both adding the files to the packages and fetching from GitHub have, imho, way worse tradeoffs. |
e18e would like to have a word if we added the documentation to the svelte npm package itself. If it needs to be on npm, I'd rather release a buddy package svelte-docs that always comes with the same version as svelte. But I like the thought of including the docs with the MCP CLI outright. Can this be made to work offline then, with local models and a local MCP installation? |
The idea is to allow the user to run a command if they want to download the latest version of the docs...once you do that, you can work on the local version totally offline (with local models). But the default would be to fetch, so you can get the very latest docs. |
How would you match the docs version to the Svelte version? If the user's project is behind and the MCP uses the latest docs, it might suggest a feature that doesn't even work in the user's app. Or is it smart enough to evaluate the "since x.y.z" comments in the docs? If it always needs to download a version of the docs first thing, I'd argue it makes even more sense to bundle them or make them a dependency: "svelte-docs":"^5.0.0" |
It should be smart enough, but we can also hint at it in the responses, saying that it should check the installed version before adding code specific to a certain version. But after all it doesn't matter too much: if the user sees a feature that doesn't fit their version, they're gonna tell the LLM themselves.
It downloads on demand unless you specifically want to download latest. |
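The "check the installed version against a since x.y.z note" idea from this exchange can be sketched as a plain version comparison. This is illustrative only: `parseVersion` and `isFeatureAvailable` are hypothetical helpers, the inputs are assumed to be plain x.y.z strings (an optional leading "v" is tolerated, pre-release tags are ignored), and nothing here reflects how the MCP actually handles versions.

```typescript
// Hypothetical helpers: names and the "x.y.z" input format are assumptions.
function parseVersion(version: string): [number, number, number] {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version.trim().replace(/^v/, ""));
  if (!match) throw new Error(`Unrecognised version: ${version}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when `installed` is the same as or newer than `since`.
function isFeatureAvailable(installed: string, since: string): boolean {
  const [aMajor, aMinor, aPatch] = parseVersion(installed);
  const [bMajor, bMinor, bPatch] = parseVersion(since);
  // Compare numerically, most significant part first.
  if (aMajor !== bMajor) return aMajor > bMajor;
  if (aMinor !== bMinor) return aMinor > bMinor;
  return aPatch >= bPatch;
}
```

The installed version would typically come from the project's own dependency metadata. Comparing the parts as numbers matters, because a string comparison would rank 5.9.1 above 5.10.0.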
One wrinkle with publishing the docs separately is that it would be harder to evolve the docs without publishing a new version of the library: if you fix a typo in the docs, do you then need to publish a patch version of both |
Agreed...honestly, there's no reason to complicate things: this endpoint is a very simple endpoint, not hard to maintain, that covers what the MCP needs, and if we ever decide it's not really useful, we can always remove it...it's not subject to semver. All the rest of the solutions will make the process slower, more bug-prone, and more annoying to code. |
At a bare minimum the URL should reflect that — |
I'm good with that...pushing the change rn 🤟🏻 |
This adds an endpoint to list all the available sections of documentation. We are gonna use this in the MCP to fetch the various sections to feed the LLM.
I've also included the full documentation, including the content, with the ?complete query param. We could use this with the stdio MCP to load all the data at startup and store that somewhere so that it is accessible offline.
Duh, I forgot we can't access query params in prerendering...I've removed this for now. We can always fetch doc by doc in the CLI, and if we decide to, we can create another endpoint for the whole docs.
And apparently @khromov was also working on this lol #1557
I'm a bit torn on which one is better...the other adds the use_case metadata but is also very specific to LLMs. This is a more general endpoint that could be used for something else, too. 🤷🏼
EDIT: I've also added the use_cases metadata since I think it could be a very good addition for llms.txt anyway. We don't have to compile all the use cases for every section, but it's nice to have the ability to do so. I've also renamed it to .json so that it's nicer to visit in the browser.
Let's see which one feels better.
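As a rough illustration of how a client might consume such a list, here is a minimal sketch. The payload shape is an assumption: the thread only establishes that the endpoint lists sections by title and can carry optional use_cases metadata, so `DocsSection`, its `slug` field, `parseSections`, and `findSections` are all hypothetical.

```typescript
// ASSUMPTION: this shape is hypothetical. The PR only says the endpoint lists
// sections (titles) and may include optional `use_cases` metadata.
interface DocsSection {
  slug: string;
  title: string;
  use_cases?: string;
}

function isDocsSection(value: unknown): value is DocsSection {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["slug"] === "string" &&
    typeof v["title"] === "string" &&
    (v["use_cases"] === undefined || typeof v["use_cases"] === "string")
  );
}

// Keeps only well-formed entries, so one bad record doesn't break the list.
function parseSections(payload: unknown): DocsSection[] {
  if (!Array.isArray(payload)) throw new Error("Expected an array of sections");
  return payload.filter(isDocsSection);
}

// Case-insensitive match against the title and the optional use cases.
function findSections(sections: DocsSection[], query: string): DocsSection[] {
  const q = query.toLowerCase();
  return sections.filter((s) =>
    `${s.title} ${s.use_cases ?? ""}`.toLowerCase().includes(q),
  );
}
```

Matching on use_cases as well as the title is the point of the extra metadata: a section can be picked by what it is useful for, without fetching its full content first.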