
feat(graph): full Dart support via tree-sitter AST#73

Merged
giancarloerra merged 2 commits into
mainfrom
feat/dart-symbol-graph
Jun 11, 2026

Conversation


@giancarloerra giancarloerra commented Jun 11, 2026

Owner

Summary

Dart moves from the regex symbol fallback to full tree-sitter support. The regex fallback matches function NAME / def NAME style declarations and cannot see Dart's type-first signatures (void foo(), Future<int> baz() async), so until now classes, mixins, methods, and call sites were invisible to the symbol graph: codebase_symbol and codebase_flow returned empty for Dart, codebase_impact had no source nodes, main() was never detected as an entry point, and files were chunked by line count instead of declaration boundaries. This PR fixes all five gaps reported in the issue in one pass.
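To illustrate the gap described above, here is a minimal sketch of a keyword-first matcher. The pattern and the helper name `matchDeclarationName` are assumptions made for illustration, not the project's actual fallback regex. The point is only that a pattern anchored on `function` / `def` has nothing to latch onto in a type-first Dart signature.

```typescript
// Hypothetical keyword-first pattern standing in for the regex fallback.
// It expects a leading `function` or `def` keyword before the name.
const KEYWORD_FIRST = /^\s*(?:function|def)\s+([A-Za-z_]\w*)/;

// Returns the declared name, or null when the line does not match.
function matchDeclarationName(line: string): string | null {
  const m = KEYWORD_FIRST.exec(line);
  return m ? (m[1] ?? null) : null;
}

// Keyword-first declarations are found...
const jsName = matchDeclarationName("function foo() {");
const pyName = matchDeclarationName("def bar(x):");

// ...but Dart's type-first signatures start with a return type, so they are missed.
const dartVoid = matchDeclarationName("void foo() {}");
const dartFuture = matchDeclarationName("Future<int> baz() async {");
```

Under this assumed pattern, `jsName` and `pyName` hold the declared names while both Dart lines yield `null`, which is the behavior the AST extractor replaces.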

Changes

  • Grammar: register @ast-grep/lang-dart@0.0.7 in ensureDynamicLanguages (same family and version as the Lua grammar shipped in feat(graph): Lua symbol/call extraction + fix file discovery for whitelist .gitignore #67). The existing per-grammar libraryPath pre-validation isolates platforms without a prebuilt binary, which degrade to today's regex behavior instead of breaking.
  • Symbols (extractFromDart in graph-symbols.ts): classes, mixins (as trait), enums, extensions, typedefs (as interface), type-first top-level functions, getters and setters, and constructors in all three forms (plain, named, factory) as constructor with dotted qualified names. Two Dart grammar quirks are handled explicitly: a function is a function_signature followed by a SIBLING function_body, so scope ranges are stitched from each pair (otherwise calls inside bodies would attribute to <module>); and plain constructors live inside a generic declaration wrapper.
  • Calls: Dart has no call_expression node kind. Every invocation wraps an argument_part, so calls are recovered from those: bare calls (helper(5)), method calls (f.bar(2)), constructor invocations (Foo(1), no new in modern Dart), prefixed calls (mat.runApp()), and cascades (f..bar(3)..load() via cascade_selector). Each call is attributed to its enclosing function scope.
  • AST chunking: dart added to TOP_LEVEL_KINDS. Both function_signature and function_body are listed so the existing overlap-merge in findAstBoundaries fuses each sibling pair into one region; class/mixin/enum/extension nodes already span their bodies.
  • Entry points: dart: main added to ENTRY_POINT_NAMES; detection is name-based per language, so Dart main() now surfaces through the existing well-known-name heuristic with no new path logic.
  • README: Dart moved to Full Support in the language matrix, with a note that import/export/part edges remain regex-extracted (unchanged, they already worked and the issue asked to leave them as-is).
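The sibling signature/body quirk from the Symbols bullet can be sketched as follows. This is a simplified model, not the code in `extractFromDart`: the `SimpleNode` and `Scope` shapes, line-based ranges, and both function names are assumptions. It shows why a scope built from the signature alone would leave calls inside the body attributed to `<module>`.

```typescript
// Simplified stand-in for an AST node; real nodes carry far more detail.
interface SimpleNode {
  kind: string;
  start: number; // first line of the node
  end: number;   // last line of the node
  name?: string; // only set on signatures in this sketch
}

interface Scope {
  name: string;
  start: number;
  end: number;
}

// Pair each function_signature with the function_body that follows it as a
// sibling, and extend the scope to the end of that body.
function stitchScopes(siblings: SimpleNode[]): Scope[] {
  const scopes: Scope[] = [];
  for (let i = 0; i < siblings.length; i++) {
    const sig = siblings[i];
    if (!sig || sig.kind !== "function_signature" || !sig.name) continue;
    const next = siblings[i + 1];
    const end = next && next.kind === "function_body" ? next.end : sig.end;
    scopes.push({ name: sig.name, start: sig.start, end });
  }
  return scopes;
}

// Attribute a line to the first scope containing it, else to <module>.
function enclosingScope(scopes: Scope[], line: number): string {
  const hit = scopes.find((s) => line >= s.start && line <= s.end);
  return hit ? hit.name : "<module>";
}
```

With a signature on line 1 and a body spanning lines 1 to 4, a call on line 3 resolves to the function only because the scope was extended to the body's end.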

Out of scope, per the issue's own framing: the Flutter-specific layer (Widget awareness, pubspec-based resolution) and plain field symbols (not callable, no call-graph value).

The node kinds and ranges used by the extractor were derived empirically from the actual @ast-grep/lang-dart tree output, not assumed.

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Test coverage improvement

Testing

  • Unit tests pass (npm run test:unit): 846/846
  • Integration tests pass (npm run test:integration) — if applicable
  • TypeScript compiles cleanly (npx tsc --noEmit)
  • New tests added for new/changed functionality

New tests encode the behavior that matters, not just coverage: the symbol test asserts the exact declarations the regex fallback could never match (constructors in all three forms, getter/setter pairs, mixin as trait, type-first signatures); a dedicated test proves a top-level function's scope reaches its sibling body's closing brace (the failure mode being calls attributing to <module>); the call test covers method calls, cascades, constructor invocations, and prefixed calls; the chunking test proves an entire large class lands in one chunk where a fixed 100-line window would split it mid-body, and that a signature/body pair is never severed; the entry-point test asserts Dart main() fires the well-known-name heuristic on a non-orphan file. The old "handles Dart via regex fallback" test is removed and the fallback block retitled, since Dart no longer routes there.
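The chunking behavior these tests exercise can be pictured with a small overlap-merge sketch. It is not the body of `findAstBoundaries`; the `Region` shape, the function name, and the merge criterion (line ranges that overlap or share a line) are assumptions. The sketch also assumes a signature and its body share the line carrying the opening brace, which is what lets the pair fuse.

```typescript
// A candidate chunk boundary expressed as an inclusive line range.
interface Region {
  startLine: number;
  endLine: number;
}

// Merge regions whose line ranges overlap or share a line, so a
// signature/body sibling pair collapses into a single region.
function mergeOverlapping(regions: Region[]): Region[] {
  const sorted = [...regions].sort((a, b) => a.startLine - b.startLine);
  const merged: Region[] = [];
  for (const r of sorted) {
    const last = merged[merged.length - 1];
    if (last && r.startLine <= last.endLine) {
      // Overlap: extend the previous region instead of starting a new one.
      last.endLine = Math.max(last.endLine, r.endLine);
    } else {
      merged.push({ ...r });
    }
  }
  return merged;
}

// Signature on line 1, body on lines 1-5, and an unrelated class on lines 7-9.
const fused = mergeOverlapping([
  { startLine: 1, endLine: 1 },
  { startLine: 1, endLine: 5 },
  { startLine: 7, endLine: 9 },
]);
```

Here the first two regions fuse into lines 1 to 5 and the class region stays separate, so a chunk boundary never falls between a signature and its body.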

Backward compatibility: no migration needed. Existing indexed Dart chunks are untouched until a file changes (hash-skip); changed files have their old chunks deleted before re-insert; cached symbol graphs keep working and pick up AST symbols on their next rebuild; and platforms without a Dart prebuild keep exactly the current regex behavior.
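The hash-skip behavior can be sketched as a simple planning step. This is an illustration under assumptions, not the indexer's code: `planReindex`, the `ReindexPlan` shape, and the use of precomputed content hashes keyed by path are all hypothetical.

```typescript
interface ReindexPlan {
  skip: string[];    // unchanged files: existing chunks stay as they are
  reindex: string[]; // new or changed files: old chunks deleted, then re-inserted
}

// Compare current content hashes against the hashes stored at index time.
function planReindex(
  currentHashes: Map<string, string>,
  indexedHashes: Map<string, string>,
): ReindexPlan {
  const plan: ReindexPlan = { skip: [], reindex: [] };
  for (const [path, hash] of currentHashes) {
    if (indexedHashes.get(path) === hash) {
      plan.skip.push(path);
    } else {
      plan.reindex.push(path);
    }
  }
  return plan;
}

const plan = planReindex(
  new Map([["lib/a.dart", "h1"], ["lib/b.dart", "h2-new"]]),
  new Map([["lib/a.dart", "h1"], ["lib/b.dart", "h2-old"]]),
);
```

In this example `lib/a.dart` keeps its line-based chunks until it is next edited, while `lib/b.dart` is re-chunked along declaration boundaries.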

Checklist

  • My code follows the existing code style and conventions
  • I have added/updated JSDoc comments where appropriate
  • I have updated documentation (README.md / DEVELOPER.md) if needed
  • I have addressed all CodeRabbit review comments (or marked as resolved with explanation)
  • I have read the Contributing Guide
  • I agree to the Contributor License Agreement

Related issues

Fixes #71

Summary by CodeRabbit

  • New Features

    • Dart now has full AST-based support for symbol extraction, call-site attribution, and entry-point detection.
  • Documentation

    • Language support matrix updated to show Dart as fully supported and clarify analysis behavior.
  • Tests

    • Added unit tests validating Dart entry-point detection, symbol extraction, and AST-aware chunking behavior.

Dart previously fell through to the regex symbol fallback, which cannot
match type-first signatures (void foo(), Future<int> baz() async), so
classes, methods, and calls were invisible to codebase_symbol,
codebase_flow, and codebase_impact, and files were chunked by line
count instead of declaration boundaries.

- Register @ast-grep/lang-dart as a dynamic grammar (per-grammar
  failure isolation keeps missing prebuilds on the regex fallback).
- Add extractFromDart: classes, mixins (trait), enums, extensions,
  typedefs, type-first top-level functions, getters/setters, and
  constructors including named and factory forms. Scope ranges are
  stitched from Dart's sibling function_signature/function_body pairs.
  Calls are recovered from argument_part nodes: method calls, bare
  calls, constructor invocations, prefixed calls, and cascades.
- Add dart to TOP_LEVEL_KINDS so chunking follows declaration
  boundaries; signature and body kinds are both listed so the
  overlap-merge fuses each pair into one region.
- Add dart main() to ENTRY_POINT_NAMES for entry-point detection.
- Move Dart to Full Support in the README language matrix.
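The call recovery described in the list above can be sketched on a flattened expression. The `Part` union is a deliberate simplification of the real tree (the `member` kind is a hypothetical stand-in for a `.name` selector), and `calleesOf` is an illustrative name. The idea is that, with no `call_expression` kind, the callee is whatever name sits immediately before each `argument_part`.

```typescript
// Flattened pieces of one expression, in source order.
type Part =
  | { kind: "identifier"; text: string }
  | { kind: "member"; text: string } // hypothetical: the `bar` in `f.bar`
  | { kind: "argument_part" };       // the `(…)` that marks an invocation

// For every argument_part, take the name directly before it as the callee.
function calleesOf(parts: Part[]): string[] {
  const callees: string[] = [];
  parts.forEach((part, i) => {
    if (part.kind !== "argument_part") return;
    const prev = parts[i - 1];
    if (prev && prev.kind !== "argument_part") callees.push(prev.text);
  });
  return callees;
}

// helper(5)
const bare = calleesOf([{ kind: "identifier", text: "helper" }, { kind: "argument_part" }]);

// f.bar(2)
const method = calleesOf([
  { kind: "identifier", text: "f" },
  { kind: "member", text: "bar" },
  { kind: "argument_part" },
]);
```

A constructor invocation such as `Foo(1)` has the same shape as the bare call, which is consistent with modern Dart dropping `new`.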

coderabbitai Bot commented Jun 11, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b363c6cc-312e-42bb-8edc-1cc983ab0ca5

📥 Commits

Reviewing files that changed from the base of the PR and between 7bd9eb5 and c5a706c.

📒 Files selected for processing (2)
  • src/services/graph-symbols.ts
  • tests/unit/graph-symbols.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/unit/graph-symbols.test.ts
  • src/services/graph-symbols.ts

📝 Walkthrough

This PR upgrades Dart to full AST-driven support: adds the ast-grep Dart package and registers it; implements extractFromDart to parse symbols and calls; extends TOP_LEVEL_KINDS and ENTRY_POINT_NAMES for AST chunking and main() detection; and adds tests for symbols, calls, chunking, and entrypoints.
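The per-grammar failure isolation that makes this registration safe can be sketched as below. It does not reproduce `ensureDynamicLanguages`; the `GrammarSpec` and `RegistrationResult` shapes, the `registerGrammars` name, and the injected `validate` callback are assumptions chosen to keep the example self-contained.

```typescript
interface GrammarSpec {
  language: string;
  libraryPath: string;
}

interface RegistrationResult {
  registered: string[];    // grammars that passed pre-validation
  regexFallback: string[]; // grammars left on the regex path
}

// Validate each grammar independently so one missing prebuilt binary
// cannot prevent the others from registering.
function registerGrammars(
  specs: GrammarSpec[],
  validate: (libraryPath: string) => void, // throws when the binary is unusable
): RegistrationResult {
  const result: RegistrationResult = { registered: [], regexFallback: [] };
  for (const spec of specs) {
    try {
      validate(spec.libraryPath);
      result.registered.push(spec.language);
    } catch {
      result.regexFallback.push(spec.language);
    }
  }
  return result;
}

// Stub validator: pretend any path containing "missing" has no prebuild.
const outcome = registerGrammars(
  [
    { language: "lua", libraryPath: "/grammars/lua.node" },
    { language: "dart", libraryPath: "/grammars/missing-dart.node" },
  ],
  (p) => {
    if (p.includes("missing")) throw new Error(`no prebuilt binary at ${p}`);
  },
);
```

In this run Lua registers normally and Dart degrades to the regex path instead of breaking startup, matching the degradation the PR description promises.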

Changes

Full Dart Support via AST Extraction

| Layer / File(s) | Summary |
| --- | --- |
| Dependency and docs: package.json, src/services/code-graph.ts, README.md | Adds @ast-grep/lang-dart to dependencies and registers it in ensureDynamicLanguages(). README is updated to move Dart into "Full Support (indexing + code graph + AST chunking)" and documents which edges remain regex-extracted. |
| Dart symbol and call-site extraction: src/services/graph-symbols.ts | extractSymbolsAndCalls dispatches Dart files to a new extractFromDart which parses Dart with ast-grep, creates a <module> scope, extracts top-level types/functions and class members (constructors, methods, getters/setters), stitches signature/body ranges, and extracts raw call sites (including cascades and selector-based calls) attributed to enclosing scopes. |
| AST chunking and entry-point configuration: src/services/indexer.ts, src/constants.ts | TOP_LEVEL_KINDS is extended with Dart AST node kinds (function_signature, function_body, class/mixin/enum/extension/type aliases) to enable semantic chunking; ENTRY_POINT_NAMES adds dart: Set('main') so detectEntryPoints recognizes main(). |
| Tests: tests/unit/graph-entrypoints.test.ts, tests/unit/graph-symbols.test.ts, tests/unit/indexer.test.ts | Adds tests verifying Dart main() entry detection, comprehensive symbol extraction and call attribution (constructors, methods, cascades, prefixed calls, handling unsupported constructs), and AST-aware chunking that keeps classes and signature+body pairs in single chunks. Also updates regex-fallback tests to exclude Dart. |
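The name-based entry-point heuristic summarized above can be sketched in a few lines. Only the `dart` entry with `main` comes from this PR; the map's type, the absence of other languages, and the `isEntryPoint` helper are assumptions, and the real `detectEntryPoints` is not shown in this conversation.

```typescript
// Per-language set of well-known entry-point names.
// Only the dart entry is taken from the PR; other languages are omitted here.
const ENTRY_POINT_NAMES: Record<string, Set<string>> = {
  dart: new Set(["main"]),
};

// A symbol counts as an entry point when its name is well-known for its language.
function isEntryPoint(language: string, symbolName: string): boolean {
  return ENTRY_POINT_NAMES[language]?.has(symbolName) ?? false;
}

const dartMain = isEntryPoint("dart", "main");     // well-known name for Dart
const dartHelper = isEntryPoint("dart", "helper"); // ordinary function
```

Because the lookup is keyed by language and name only, no path logic is needed, which matches the PR's note that Dart `main()` surfaces through the existing heuristic.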

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • giancarloerra/SocratiCode#67: Both PRs add language-specific ast-grep extractors and modify dynamic language registration (Lua in #67, Dart in this PR).

"I hopped through code, with trees in sight,
parsing Dart tokens by soft moonlight.
Cascades and mains, symbols in a row,
ast-grep digs roots where regex couldn't go. 🐇🌳"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat(graph): full Dart support via tree-sitter AST' accurately and concisely describes the main change: adding comprehensive Dart language support through tree-sitter AST extraction. |
| Description check | ✅ Passed | The PR description is comprehensive and follows the template structure, covering summary, detailed changes, type of change selection, testing confirmation, and all checklist items. |
| Linked Issues check | ✅ Passed | The PR successfully addresses all primary coding objectives from issue #71: AST-aware chunking via TOP_LEVEL_KINDS [#71], symbol extraction with extractFromDart covering classes/mixins/enums/extensions/typedefs/functions/constructors [#71], call-site recovery from argument_part and cascade_selector nodes [#71], entry-point detection via ENTRY_POINT_NAMES [#71], and grammar registration in ensureDynamicLanguages [#71]. |
| Out of Scope Changes check | ✅ Passed | All code changes are scoped to the five gaps identified in issue #71: symbol extraction, call graph, entry-point detection, AST chunking, and grammar registration. Flutter-specific symbols, pubspec resolution, and field symbols are explicitly deferred as out-of-scope, consistent with the issue's framing. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/services/graph-symbols.ts`:
- Around line 407-414: The class/extension handling in extractSymbols misses
Dart "extension type" nodes; update the nodeKind check in the block that
currently tests for "class_definition" | "mixin_declaration" |
"extension_declaration" (the code that finds nameNode, calls addSym, and uses
childOfKind(..., "class_body") ?? childOfKind(..., "extension_body") then
walkMembers) to also include "extension_type_declaration" and ensure you look
for its body via childOfKind(..., "extension_type_body"); additionally, in
src/services/indexer.ts add "extension_type_declaration" to the Dart
TOP_LEVEL_KINDS constant so chunking treats these declarations as top-level
boundaries.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 11a130bf-c3ec-45c3-98a1-7d6195ab5893

📥 Commits

Reviewing files that changed from the base of the PR and between ec97746 and 7bd9eb5.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (9)
  • README.md
  • package.json
  • src/constants.ts
  • src/services/code-graph.ts
  • src/services/graph-symbols.ts
  • src/services/indexer.ts
  • tests/unit/graph-entrypoints.test.ts
  • tests/unit/graph-symbols.test.ts
  • tests/unit/indexer.test.ts

Comment thread src/services/graph-symbols.ts
The vendored @ast-grep/lang-dart 0.0.7 grammar predates Dart 3.3
extension types and parses them to ERROR nodes; the kinds a newer
tree-sitter-dart exposes (extension_type_declaration) do not exist in
this version, so matching on them would be unreachable code. Document
the limitation at the walk site and lock in the contract: no throw, no
bogus symbols from the ERROR region, and the rest of the file still
extracts normally.
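The contract stated in this commit message can be illustrated with a small tree walk. How the extractor actually achieves it is not shown here, so skipping ERROR subtrees is an assumption, as are the `AstNode` shape and the `collectSymbols` name. The sketch shows the three guarantees: no throw, no symbols from the ERROR region, and normal extraction elsewhere in the file.

```typescript
// Minimal stand-in for a parsed node.
interface AstNode {
  kind: string;
  name?: string;
  children: AstNode[];
}

// Collect names of wanted node kinds, ignoring anything under an ERROR node.
function collectSymbols(root: AstNode, wanted: Set<string>): string[] {
  const out: string[] = [];
  const visit = (node: AstNode): void => {
    if (node.kind === "ERROR") return; // unparsed region: emit nothing from it
    if (wanted.has(node.kind) && node.name) out.push(node.name);
    node.children.forEach(visit);
  };
  visit(root);
  return out;
}

// A file with a class, an unparsable extension type, then another class.
const parsedFile: AstNode = {
  kind: "program",
  children: [
    { kind: "class_definition", name: "A", children: [] },
    {
      kind: "ERROR",
      children: [{ kind: "function_signature", name: "Bogus", children: [] }],
    },
    { kind: "class_definition", name: "B", children: [] },
  ],
};

const symbols = collectSymbols(
  parsedFile,
  new Set(["class_definition", "function_signature"]),
);
```

Here `symbols` contains `A` and `B` only; the name inside the ERROR region never surfaces, and the declarations on either side of it are unaffected.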
@giancarloerra giancarloerra self-assigned this Jun 11, 2026
@giancarloerra giancarloerra merged commit 8108677 into main Jun 11, 2026
5 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature]: Full Dart support in code graph and AST chunking

1 participant