fix(julia): port parameterized-type / qualified-def / qualified-import fixes to WASM#1128
fix(julia): port parameterized-type / qualified-def / qualified-import fixes to WASM#1128carlos-alm wants to merge 4 commits into
Conversation
…t fixes to WASM The native Julia extractor was fixed in #1098 for three issues that were already latent in the WASM extractor but not surfaced by the existing fixtures. Per the dual-engine policy, port the fixes so both engines produce identical results. 1. Parameterized struct names (`struct Vec{T} <: AbstractArray{T,1}`) no longer silently emit the raw type-head text as the definition name — `findBaseName` recurses through `binary_expression`, `parametrized_type_expression`, and related wrappers to locate the base identifier. 2. Qualified function defs / short-form methods inside a module no longer get double-prefixed: `function Base.show ... end` inside `module Foo` now records `Base.show` (not `Foo.Base.show`); same for short-form `Foo.bar(x, y) = x + y` inside `module Outer`. 3. `selected_import` with a qualified module (`import LinearAlgebra.BLAS: gemm`) now correctly records `LinearAlgebra.BLAS` as the import source and `gemm` as the imported name. Also fixes a related latent bug: `findChild(node, 'call_expression')` on a `function_definition` was matching the body's first call (e.g. `println(...)`) instead of the signature, because the signature is wrapped in a `signature` node. Added a `signatureCall` helper mirroring the native code. Closes #1111
|
Claude finished @carlos-alm's task —— View job PR Review: Julia WASM Extractor Bug FixesTodo List:
Root Cause AnalysisProblem Being Solved: This PR addresses real dual-engine parity bugs in the Julia WASM extractor. Three specific issues from #1098 (native engine fixes) needed to be ported to WASM:
Why This Approach: The fix correctly implements dual-engine parity by mirroring the native Rust implementation. The approach adds targeted helper functions ( Risk Assessment: Low risk. This is correcting extraction accuracy to match the native engine - no behavioral changes to the query layer or API contracts. Backlog Compliance
Critical ConcernsNone identified. This is a well-executed bug fix that addresses real dual-engine parity issues. Strengths:
Code quality highlights:
Final Recommendation
|
Codegraph Impact Analysis11 functions changed → 11 callers affected across 2 files
|
Greptile SummaryThis PR ports three Julia extractor correctness fixes from the native Rust engine (
Confidence Score: 5/5Safe to merge — all changes are correctness fixes with direct test coverage and a clean port from the already-verified native engine. Every changed code path is covered by a new or updated test (10 tests, all passing). The logic mirrors the native Rust engine exactly, and the previous discussion threads have been addressed. No new behavioral regressions were introduced. No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[struct_definition node] --> B{findChild type_head?}
B -- no --> SKIP[return early]
B -- yes --> C{findChild binary_expression?}
C -- yes --> D[collect non-operator sides]
D --> E[findBaseName sides-0 -> nameNode]
D --> F[findBaseName sides-1 -> supertypeNode]
C -- no --> G[findBaseName typeHead -> nameNode]
G --> H[supertypeNode = null]
E --> I{nameNode null?}
F --> I
H --> I
I -- yes --> SKIP
I -- no --> J[iterate struct_definition children]
J --> K{child type?}
K -- typed_expression --> L[field property]
K -- identifier --> M[untyped field property]
K -- other --> N[skip]
L --> O{supertypeNode?}
M --> O
N --> O
O -- yes --> P[ctx.classes.push extends]
O -- no --> Q[skip extends]
P --> R[ctx.definitions.push struct]
Q --> R
Reviews (6): Last reviewed commit: "fix(julia): port abstract-def / macro-de..." | Re-trigger Greptile |
| const binary = findChild(typeHead, 'binary_expression'); | ||
| if (binary) { | ||
| // Walk into each side of the binary expression to find the base-name | ||
| // identifier — handles parameterized forms like `Vec{T} <: AbstractArray{T,1}`. | ||
| const sides: TreeSitterNode[] = []; | ||
| for (let i = 0; i < binary.childCount; i++) { | ||
| const c = binary.child(i); | ||
| if (c && c.type !== 'operator') sides.push(c); | ||
| } | ||
| nameNode = sides[0] ? findBaseName(sides[0]) : null; | ||
| supertypeNode = sides[1] ? findBaseName(sides[1]) : null; | ||
| } else { | ||
| nameNode = findBaseName(typeHead); | ||
| } |
There was a problem hiding this comment.
Missing test for non-parameterized struct inheritance
The old code explicitly looked for a subtype_expression node (findChild(typeHead, 'subtype_expression')) to detect Point <: AbstractPoint. That path has been entirely removed and replaced with a binary_expression lookup. If the tree-sitter-julia grammar represents simple non-parameterized inheritance with a subtype_expression node (rather than binary_expression), findChild(typeHead, 'binary_expression') returns null, findBaseName(typeHead) recurses without entering subtype_expression (not in TYPE_HEAD_WRAPPERS), and returns null — so the entire struct is silently dropped from ctx.definitions. The new parameterized test (Vec{T} <: AbstractArray{T,1}) confirms the grammar uses binary_expression for that form, but there is no test for the simple case struct Point <: AbstractPoint to verify the same grammar node is used and the extends relationship is still recorded.
There was a problem hiding this comment.
Fixed in 8c2e148 — added a test for non-parameterized struct inheritance (struct Point <: AbstractPoint) in tests/parsers/julia.test.ts. Confirmed via AST inspection that the Julia grammar wraps both the simple and parameterized cases in a binary_expression node, so the new code path handles both correctly. The native engine already had this test (crates/codegraph-core/src/extractors/julia.rs:592), so this brings WASM to parity.
| const TYPE_HEAD_WRAPPERS: ReadonlySet<string> = new Set([ | ||
| 'binary_expression', | ||
| 'parametrized_type_expression', | ||
| 'parameterized_identifier', | ||
| 'type_parameter_list', | ||
| 'type_argument_list', | ||
| ]); | ||
|
|
||
| function findBaseName(node: TreeSitterNode): TreeSitterNode | null { | ||
| if (node.type === 'identifier') return node; | ||
| const direct = findChild(node, 'identifier'); | ||
| if (direct) return direct; | ||
| for (let i = 0; i < node.childCount; i++) { | ||
| const child = node.child(i); | ||
| if (!child) continue; | ||
| if (TYPE_HEAD_WRAPPERS.has(child.type)) { | ||
| const found = findBaseName(child); | ||
| if (found) return found; | ||
| } | ||
| } | ||
| return null; | ||
| } |
There was a problem hiding this comment.
type_parameter_list / type_argument_list in TYPE_HEAD_WRAPPERS can yield the wrong identifier
findBaseName checks findChild(node, 'identifier') before recursing, so in practice the struct name is found before the loop reaches a type_parameter_list or type_argument_list. However, if findBaseName is ever called with a node that lacks a direct identifier child and does have one of those wrapper types as a child — for example, a future call site or an unusual parameterized form — the function will recurse into type_parameter_list and return the first type-parameter identifier (e.g. T) instead of the struct name. Removing those two entries from TYPE_HEAD_WRAPPERS would eliminate the risk without affecting correctness.
There was a problem hiding this comment.
Fixed in 8c2e148 — removed type_parameter_list and type_argument_list from TYPE_HEAD_WRAPPERS in both the WASM and native engines (preserving dual-engine parity per CLAUDE.md). AST inspection confirmed Julia's grammar uses curly_expression for {T} constructs, not those node kinds, so the entries were dead code. Removing them eliminates the risk of recursing into a type-parameter list and returning a type variable as the struct name, as you noted.
…ype test (#1128) - Remove type_parameter_list / type_argument_list from TYPE_HEAD_WRAPPERS in both WASM and native engines. Julia grammar uses curly_expression for {T} constructs, so these were dead code. Removing them prevents findBaseName from ever recursing into a type-parameter list and returning a type variable (e.g. T) instead of the struct name. - Add WASM test for non-parameterized struct inheritance (struct Point <: AbstractPoint). The native engine already covers this case; the WASM side now has parity.
…lback (#1128) - Strengthen the `import Base: show` test to assert the corrected source/names shape (was only checking that imports were emitted at all, so a regression to the broken pre-fix shape would have slipped through). - Document the grammar assumption behind `signatureCall` / `signature_call` in both engines: the call_expression fallback exists only for defensive grammar-drift protection, not as a routine path — if it ever fires on a real definition, the function name will silently match the first body call_expression instead.
|
Addressed Greptile's round-2 feedback in 92ce81b:
Both Rust and TypeScript still in parity; all julia tests pass (11 WASM, 16 native). |
…1130) * fix(julia): port abstract-def / macro-def / signature-call WASM bugs The WASM Julia extractor diverged from the native Rust extractor in three ways that no existing WASM fixture exercised: - handleAbstractDef: `findChild(node, 'identifier')` only looks at direct children of `abstract_definition`, but tree-sitter-julia nests the identifier inside `type_head`. Result: no abstract type was ever recorded. Fall back to `findBaseName(typeHead)` like the native code. - handleMacroDef: `findChild(node, 'identifier')` resolves to the body's first identifier rather than the macro name (e.g. `macro mymac(x) x end` recorded `@x` instead of `@mymac`). Unwrap via `signatureCall` to reach the call_expression name. - handleCall: the guard `parent.type === 'function_definition'` never matched — the signature's call_expression is parented by `signature`, whose own parent is the function/macro definition. Result: every long-form `function greet(...) ... end` recorded `greet` as both a definition and a call. Match the native walk: skip when parent is `signature` and grandparent is `function_definition` or `macro_definition`. Adds WASM tests mirroring the native cases: extracts_abstract_type, extracts_parameterized_abstract_type_base_name, extracts_macro_def, and does_not_record_function_signature_as_call. Closes #1126 * fix(julia): align abstract-def name resolution with struct-def (#1130)
|
CI status: 26/27 checks green. The single failure ( |
Summary
src/extractors/julia.tsper the dual-engine policy.findBaseNamehelper that recurses throughbinary_expression/parametrized_type_expression/ parameterized-identifier / type-parameter / type-argument wrappers, mirroring the nativefind_base_name.!base.includes('.')inhandleFunctionDefandhandleAssignmentso qualified names likeBase.showinsidemodule Foono longer becomeFoo.Base.show.handleImportto walk intoselected_importchildren, treatingscoped_identifiermodules as the source and the trailing segment as the imported display name.signatureCallhelper to unwrapfunction_definition → signature → call_expression. Without this,findChild(node, 'call_expression')was matching the function body's first call (e.g.println(...)) and recording it as the function name — a latent bug not surfaced by the existing fixtures, but required for the qualified-def test to pass.handleAbstractDeffalls back tofindBaseName(typeHead),handleMacroDefunwraps viasignatureCall, andhandleCallskips signature-parentedcall_expressionnodes to avoid recording every long-form function name as a self-call.Test plan
npx vitest run tests/parsers/julia.test.ts— 10/10 pass, including the four new parity tests (extracts parameterized struct base name,qualified short-form method does not double-prefix,qualified function def does not double-prefix,selected_import handles qualified module).cargo test --lib julia— native side still green (16/16).Closes #1111
Closes #1126