Can I support multiple tree sitter grammars in a single source file? #2951
Hi all, I’m working on a Node.js framework called NodeKit that combines Svelte pages with server-side executed code in a data block. For example, the following code creates a basic server-side, memory-persisted page access count:

<data>
let count = 1
export default () => {
  return {count: count++}
}
</data>
<script>
export let data
</script>
<h1>Hello, world!</h1>
<p>I’ve greeted you {data.count} times.</p>

So, the format is:

<data>
// Node.js (JavaScript)
</data>
<!-- Anything else is just regular Svelte. -->

When I was using Codium, I forked the Svelte Language Server and was able to hack it so it doesn’t barf when it sees the data block. I can re-use that language server for Helix (I’m sure my hack can be improved, but it’s better than nothing right now).

My question is about implementing the tree-sitter grammar: there is already a Svelte tree-sitter grammar included in Helix and there is also, of course, a tree-sitter grammar for JavaScript. Instead of forking the Svelte one and hacking it like I had to do for the LSP (given that, last I checked, you couldn’t have multiple LSPs acting on different regions of the same source file), I’d love to be able to use the grammars that exist, since NodeKit pages are simply a combination of two sections with different grammars.

So what I want to be able to tell Helix is: For
Thanks!
Replies: 1 comment 3 replies
Tree-sitter has injections that are built for this case. This works decently well on simple data contents like the one you list:

; runtime/queries/svelte/injections.scm
((element
  (start_tag (tag_name) @_tag_name)
  (text) @injection.content)
 (#match? @_tag_name "data")
 (#set! injection.language "javascript"))

This is how grammars like svelte and vue highlight JavaScript to begin with: they don't have a full JavaScript grammar built in. Instead, they hand off JavaScript blocks to the JavaScript parser using injections.

However, it doesn't work the way you describe: there's no way to tell tree-sitter to stop parsing for a certain block. It parses using the language for the file as the first layer, then runs any injected parsers over nodes captured with @injection.content.

In order to make this work robustly, you would need to fork tree-sitter-svelte and parse data elements the same way as script elements so that the entire inner contents are parsed as a single node.
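
To make the "single node" idea concrete, here is a toy sketch in plain JavaScript. It is not tree-sitter or Helix code, and splitNodeKitPage is a hypothetical helper name made up for illustration. It only shows the behaviour a forked grammar would presumably aim for: everything between <data> and the first </data> is kept as one raw span, so characters such as "<" or "{" inside the block are never interpreted as Svelte markup before the JavaScript layer sees them.

```javascript
// Toy illustration only: this is not tree-sitter or Helix code.
// splitNodeKitPage is a hypothetical helper, not part of NodeKit.
function splitNodeKitPage(source) {
  // Take everything between <data> and the first </data> as one raw span,
  // without interpreting characters such as "<" or "{" inside it.
  const match = /<data>([\s\S]*?)<\/data>/.exec(source)
  if (match === null) {
    // No data block: the whole file is regular Svelte.
    return { javascript: null, svelte: source }
  }
  const before = source.slice(0, match.index)
  const after = source.slice(match.index + match[0].length)
  return { javascript: match[1], svelte: before + after }
}

const page = [
  '<data>',
  'let count = 1',
  'export default () => {',
  '  return {count: count++}',
  '}',
  '</data>',
  '<script>',
  '  export let data',
  '</script>',
  '<h1>Hello, world!</h1>',
].join('\n')

const layers = splitNodeKitPage(page)
console.log(layers.javascript.trim().split('\n')[0]) // prints "let count = 1"
```

In a real grammar this job belongs to the parser itself (the way script elements are already handled), so the injection query can capture one node holding the full block contents.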