feat(fsharp): route .fsi files through the dedicated signature grammar#1162
Conversation
The tree-sitter-fsharp package ships two distinct grammars: LANGUAGE_FSHARP for .fs / .fsx source files and LANGUAGE_SIGNATURE for .fsi signature files. Both engines previously routed all three extensions through the source grammar, so bare `val` declarations in .fsi files surfaced as ERROR nodes and yielded no symbols. This change adds a separate `fsharp-signature` language for .fsi: * native: new `FSharpSignature` LanguageKind wired to LANGUAGE_SIGNATURE * WASM: new `fsharp-signature` registry entry using tree-sitter-fsharp_signature.wasm (build script now produces it) * shared F# extractor handles `value_definition` only when its first child is the `val` keyword, distinguishing signature `val foo : type` from source `let foo = ...` * function vs variable kind is inferred from the type shape; supports both `function_type` (WASM npm 0.1.0) and `curried_spec` (cargo 0.3.0) node shapes for engine parity docs check acknowledged: README's F# row already covers .fs/.fsx/.fsi and the user-facing language count is unchanged; fsharp-signature is an internal id that mirrors how ocaml-interface backs .mli files. Closes #1114
|
Claude finished @carlos-alm's task —— View job PR Review: F# Signature File SupportTodo List
Root Cause AnalysisProblem Being Solved: F# signature files ( Why This Approach: Adding a dedicated Risk Assessment: Low risk - this is an additive change that doesn't modify existing F# source file parsing. Backlog Compliance
Critical ConcernsNone identified. This is a well-architected additive feature. Code Quality Assessment✅ Excellent Architecture
✅ Comprehensive ImplementationThe implementation touches all necessary layers:
✅ Excellent Test CoverageNative tests (
WASM tests (
✅ Implementation DetailsSmart Grammar Compatibility (lines fsharp.rs:357-386): // Handles both grammar versions elegantly:
// WASM: looks for `function_type` nodes
// Native: looks for `curried_spec` containing `arguments_spec`
fn has_function_type(node: &Node) -> bool {
// Both patterns checked for cross-engine compatibility
}Proper Symbol Extraction (lines fsharp.ts:258-318):
Regression Protection:
Minor Observations
Final Recommendation
The handling of grammar version differences between WASM and native engines demonstrates excellent engineering practices. This is the kind of high-quality, well-tested feature addition that strengthens the codebase. |
Greptile SummaryThis PR adds a dedicated
Confidence Score: 5/5Safe to merge — the changes are well-isolated to the F# extractor path and all affected registries are updated consistently. All changed code paths are covered by new Rust unit tests and WASM integration tests. The val-first-child guard ensures no regression for .fs source files; the member_defn empirical finding is locked in by a regression test. The only finding is a stale sentence in a doc-comment that became inaccurate after the package bump within this same PR. The enclosing_module_name doc-comment in fsharp.rs has a sentence that no longer accurately describes the WASM grammar behaviour after the 0.3.0 tarball upgrade. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[".fsi file"] -->|extension lookup| B["fsharp-signature\nlanguage ID"]
C[".fs / .fsx file"] -->|extension lookup| D["fsharp\nlanguage ID"]
B --> E["tree-sitter-fsharp_signature.wasm\n(WASM)"]
B --> F["LANGUAGE_SIGNATURE\n(native cargo)"]
D --> G["tree-sitter-fsharp.wasm\n(WASM)"]
D --> H["LANGUAGE_FSHARP\n(native cargo)"]
E --> I["FSharpExtractor / extractFSharpSymbols"]
F --> I
G --> I
H --> I
I --> J{node.kind}
J -->|value_definition\nfirst child = 'val'| K["handleValueDefinition\nfunction_type OR curried_spec+arguments_spec"]
J -->|module_defn| L["handleModuleDefn\ndotted path accumulation"]
J -->|named_module| M["handleNamedModule"]
J -->|function_declaration_left| N["handleFunctionDecl"]
K --> O["Definition: variable or function\nwith qualified name"]
L --> P["Definition: module\n+ nextModule for children"]
Reviews (13): Last reviewed commit: "Merge branch 'main' into fix/1114-fsharp..." | Re-trigger Greptile |
| function handleValueDefinition( | ||
| node: TreeSitterNode, | ||
| ctx: ExtractorOutput, | ||
| currentModule: string | null, | ||
| ): void { | ||
| const first = node.child(0); | ||
| if (!first || first.type !== 'val') return; |
There was a problem hiding this comment.
val mutable class fields may produce false extractions in .fs source files
The value_definition handler now fires for any value_definition node whose first child is val. In F# source files, explicit class fields (val mutable count: int = 0) are also valid, and the source grammar may represent them as value_definition nodes with val as the first child—the same structure this handler targets. If so, these would silently be extracted as variable definitions in .fs files, a case not covered by the existing parity test (which only checks let x = 5). Adding a test like type C() =\n val mutable count: int = 0 through parse_source would confirm or rule out this path in both the Rust and TypeScript extractors.
There was a problem hiding this comment.
Empirically verified in both grammars (cargo 0.3.0 and WASM 0.1.0): val mutable count: int = 0 inside a class is parsed as a member_defn node — NOT value_definition — so the new val-style handler never sees it. The first-child=val guard reliably distinguishes signature val declarations from any other shape.
Added regression tests in both engines (9cfee7b) so a future grammar change cannot silently start mis-classifying class fields as variables.
| const name = currentModule ? `${currentModule}.${ident.text}` : ident.text; | ||
| ctx.definitions.push({ | ||
| name, | ||
| kind: hasFunctionType ? 'function' : 'variable', |
There was a problem hiding this comment.
Module qualification silently dropped for
val inside module Foo = ... signatures
In .fsi files, val add : int -> int nested inside module Foo = ... is indexed as add rather than Foo.add. The signature grammar's module-signature nodes appear to use a node kind other than named_module, so handleNamedModule never fires and currentModule stays null for these declarations. The test explicitly asserts name: 'add', confirming the behavior, but it means any consumer searching for the fully-qualified name Foo.add will miss it. Adding a comment to the test that this is a known limitation (and perhaps a follow-up ticket) would make the intent clearer. The same gap exists in the Rust extractor via enclosing_module_name.
There was a problem hiding this comment.
Good catch — this was a real bug, not a documentation gap. Fixed in 9cfee7b:
The cargo 0.3.0 signature grammar wraps module Foo = ... as a module_defn node (distinct from named_module), so enclosing_module_name never reached it. Both engines now handle module_defn, emit it as a module definition with the dotted parent path, and qualify nested val declarations as Foo.add. New native test signature_qualifies_val_inside_nested_module_defn covers this.
The WASM 0.1.0 grammar still emits ERROR nodes for the same construct, so the WASM-only test continues to assert add — clarifying comment now points at #1161 (the grammar version bump that will let WASM reach the new code path too).
Codegraph Impact Analysis23 functions changed → 12 callers affected across 3 files
|
…#1162) Greptile review caught two .fsi extraction corners: 1. **Module qualification dropped for `val` inside `module Foo = ...`.** The cargo 0.3.0 signature grammar wraps nested signature modules in a `module_defn` node (distinct from `named_module`), so the existing `enclosing_module_name` walk never reached it and `val add : int -> int` was indexed as `add` instead of `Foo.add`. Both engines now handle `module_defn`, emit it as a `module` definition with the dotted parent path, and qualify nested `val` declarations accordingly. The WASM 0.1.0 signature grammar still emits ERROR nodes for the same construct, so the WASM-only test continues to assert `add` (with an explicit comment pointing at the grammar bump tracked under #1161). 2. **`val mutable count: int = 0` in `.fs` source files.** Empirically confirmed in both engines that the source grammar parses this as a `member_defn` node (NOT a `value_definition`), so the new `val`-style handler never sees it. Added regression tests in both engines so a future grammar change cannot silently start mis-classifying class fields as variables.
* chore(fsharp): align npm grammar with cargo at v0.3.0 The WASM engine pulled tree-sitter-fsharp 0.1.0 from npm while the native engine used 0.3.0 from crates.io. The two versions diverged in how they parse type signatures in .fsi files: 0.1.0 emits `function_type` nodes for `a -> b` types, while 0.3.0 wraps every signature in `curried_spec` with `arguments_spec` children for function shapes. The F# extractor was forced to detect both shapes simultaneously, which is fragile — future grammar churn could silently desync further. * package.json now installs tree-sitter-fsharp from the ionide v0.3.0 GitHub tarball (npm has no 0.3.0 release; ionide is the upstream the cargo crate also tracks). Lockfile pins via SRI hash. * Both extractors now check only `curried_spec` → `arguments_spec`, removing the dead `function_type` branch from each. docs check acknowledged: README's F# row already covers .fs/.fsx/.fsi and the user-facing language count is unchanged; the grammar version is an internal implementation detail. Closes #1161 * docs(fsharp): explain tree-sitter-fsharp tarball pin (#1165)
#1162) The npm and cargo tree-sitter-fsharp 0.3.0 grammars — though sharing a version tag — still emit type signatures with different node shapes: WASM 0.3.0 produces `function_type` directly under `value_definition`, while cargo 0.3.0 wraps every signature in `curried_spec` with `arguments_spec` children for function types. #1165 removed the `function_type` branch on the assumption that both grammars had converged at v0.3.0, which broke WASM extraction: every `val name : a -> b` declaration was being indexed as a `variable` instead of a `function`. Restore the dual-shape detection in the TypeScript extractor and update the documentation accordingly. Also clarifies the nested-module test comment in fsharp-signature.test to reflect that the WASM signature grammar is now at v0.3.0 but still emits ERROR nodes for `module Foo = ...` (the fix is still pending, tracked under #1161).
|
Addressed Greptile latest review:
While verifying the test change, I discovered that the merge of #1165 into this branch silently broke WASM extraction — |
…1162) The WASM tree-sitter-fsharp signature grammar was upgraded from v0.1.0 to v0.3.0 in adcaf40. v0.3.0 emits `module_defn` for nested `module Foo = ...` blocks (v0.1.0 emitted ERROR nodes), so the existing qualification logic now fires for the WASM engine too — `val` symbols get the parent module prefix in both engines. The signature test still expected the pre-bump behaviour (bare `add`), which made it fail in CI where the grammar bump landed. Update the assertion to lock in engine parity: - assert the qualified `Foo.add` function and the outer `Foo` module - assert the unqualified `add` is NOT emitted, so any future regression where the walker drops the enclosing module is caught Also refresh the `module_defn` comment in src/extractors/fsharp.ts — it still claimed the WASM grammar emitted ERROR nodes for this construct, which became stale after the v0.3.0 bump.
Summary
fsharp-signaturelanguage id mapped toLANGUAGE_SIGNATURE(native) andtree-sitter-fsharp_signature.wasm(WASM), so.fsifiles no longer go through the.fssource grammar.value_definitionhandler that fires only when the first child is thevalkeyword, distinguishing signatureval foo : typefrom sourcelet foo = ....fsharp-signatureid sharing the F# string config.function_type(WASM npm 0.1.0) andcurried_spec(cargo 0.3.0) shapes so the two engines stay in parity despite a grammar version skew (F# tree-sitter grammar version skew: npm 0.1.0 vs cargo 0.3.0 #1161 tracks the bump). Since merged: chore(fsharp): align npm grammar with cargo at v0.3.0 #1165 (stacked on this branch) bumps the npm grammar to v0.3.0 and removes the dual-shape detection, so only thecurried_spec/arguments_specpath remains.Test plan
cargo test --package codegraph-core extractors::fsharp(3 Rust unit tests: signature val extraction, bare val extraction, source.fsparity guard)npx vitest run tests/parsers/fsharp-signature.test.ts(4 WASM tests: clean parse, bare val, nested module, error recovery)npx vitest run tests/parsers/fsharp.test.ts(5 existing.fstests still green — no regression)cargo test --package codegraph-core parser_registry(ABI check coversLANGUAGE_SIGNATURE)tests/unit/snapshot.test.tsare tracked under test: snapshot concurrent save tests fail locally on main #1117)npm run lintclean on changed filesCloses #1114
Closes #1161