This roadmap describes the next major capabilities needed to evolve GitNexus's type-resolution layer from a strong receiver-disambiguation aid into a broader static-analysis foundation.
The roadmap assumes the current system already provides:
- explicit type extraction from declarations and parameters
- initializer / constructor inference
- loop element inference for many languages
- selected pattern binding and narrowing
- comment-based fallbacks in JS/TS, PHP, and Ruby
- constrained return-type-aware receiver inference during call processing
The remaining work is about generalisation, deeper structure modelling, and better propagation.
The type system should continue to preserve the qualities that make it practical today:
- stay conservative
- prefer explainable inference over clever but brittle inference
- limit performance overhead during ingestion
- keep per-language extractors explicit rather than over-generic
- separate "better receiver resolution" from "compiler-grade typing"
The goal is not to build a compiler. The goal is to support high-value static analysis for call graphs, impact analysis, context gathering, and downstream graph features.
The next biggest gain is not inventing a new type system layer. It is expanding the inference the system already performs so more constructs can benefit from it.
Today, return-type-aware inference already exists in constrained form inside call-processor.ts, and loop element inference already handles many identifier-based iterables.
The most valuable next move is to let those signals participate in more places, especially:
- iterable expressions rather than only iterable identifiers
- assignment propagation from call results
- doc-comment-derived file-scope bindings where local scope is insufficient
Status: COMPLETE, shipped in `feat/phase7-type-resolution` (commits `ed767e3`, `ca4c6c1`, `d79237e`).
Allow loop inference and assignment inference to see more than the current function-local environment.

```go
for _, user := range getUsers() {
    user.Save()
}
```

The iterable is a call expression, not an identifier with a local binding.
Resolved: ReturnTypeLookup introduced in Phase 7.1 exposes lookupRawReturnType. All seven typed-iteration languages (Go, TypeScript, Python, Rust, Java, Kotlin, C#) now unwrap the raw container type string to extract the element type when the iterable is a direct function call.
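
To make the unwrapping step concrete, here is a minimal sketch of how a raw container type string could be reduced to its element type. It is not the actual `extractElementTypeFromString` implementation: the function name `unwrapElementType`, the helper `hasTopLevelComma`, and the set of container shapes handled are assumptions chosen for illustration.

```typescript
// Hypothetical sketch, not the shipped helper: reduce a raw container type
// string to its element type, or return undefined when unsure.
function unwrapElementType(raw: string): string | undefined {
  const text = raw.trim();

  // Go slices and arrays: []User, [4]User, []*User
  const goSlice = /^\[\d*\]\*?(.+)$/.exec(text);
  if (goSlice) return goSlice[1].trim();

  // Array suffix: User[] (TypeScript, Java, C#)
  if (text.endsWith("[]")) return text.slice(0, -2).trim() || undefined;

  // Single-argument generics: List<User>, Vec<User>, Array<User>
  const angle = text.indexOf("<");
  if (angle > 0 && text.endsWith(">")) {
    return singleArgument(text.slice(angle + 1, -1));
  }

  // Python-style subscripts: list[User]
  const bracket = text.indexOf("[");
  if (bracket > 0 && text.endsWith("]")) {
    return singleArgument(text.slice(bracket + 1, -1));
  }

  return undefined;
}

// Stay conservative: accept only one type argument, so Map<K, V> is rejected.
function singleArgument(inner: string): string | undefined {
  const trimmed = inner.trim();
  return trimmed.length > 0 && !hasTopLevelComma(trimmed) ? trimmed : undefined;
}

// True when a comma appears outside any nested <...> or [...] pair.
function hasTopLevelComma(s: string): boolean {
  let depth = 0;
  for (let i = 0; i < s.length; i++) {
    const ch = s[i];
    if (ch === "<" || ch === "[") depth++;
    else if (ch === ">" || ch === "]") depth--;
    else if (ch === "," && depth === 0) return true;
  }
  return false;
}
```

Returning `undefined` for multi-argument generics mirrors the roadmap's preference for conservative inference: a missing binding is cheaper than a wrong one.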

```php
foreach ($this->users as $user) {
    $user->save();
}
```

If `$this->users` is typed through a class property annotation or file/class-scope doc-comment information, the current local-scope-only path may not be enough.
Resolved: Strategy C in the PHP extractForLoopBinding walks up the AST to the enclosing class_declaration, scans the declaration_list for a matching property_declaration, and extracts the element type from the @var PHPDoc comment (or PHP 7.4+ native type field). The @param workaround previously required in the fixture is gone.
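
The walk described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PHP extractor: `SketchNode` is a hypothetical stand-in for the parser's node type, its `name` and `docComment` fields are invented for brevity, and only the `@var Type[]` PHPDoc form is handled.

```typescript
// Minimal stand-in for a syntax-tree node. The real extractor works on parser
// nodes; every field here is an assumption made for illustration.
interface SketchNode {
  type: string;
  name?: string;        // property name without the leading "$"
  docComment?: string;  // PHPDoc text attached to the declaration
  parent?: SketchNode;
  children: SketchNode[];
}

// Walk up from the loop until the enclosing class declaration is found.
function findEnclosingClass(node: SketchNode): SketchNode | undefined {
  let current = node.parent;
  while (current !== undefined && current.type !== "class_declaration") {
    current = current.parent;
  }
  return current;
}

// Read the element type from a "@var User[]" annotation on the named property.
function elementTypeFromClassProperty(
  loop: SketchNode,
  propertyName: string
): string | undefined {
  const classNode = findEnclosingClass(loop);
  if (classNode === undefined) return undefined;

  const body = classNode.children.find((c) => c.type === "declaration_list");
  if (body === undefined) return undefined;

  for (const member of body.children) {
    if (member.type !== "property_declaration" || member.name !== propertyName) {
      continue;
    }
    const match = /@var\s+\\?([A-Za-z_][\w\\]*)\[\]/.exec(member.docComment ?? "");
    return match ? match[1] : undefined;
  }
  return undefined;
}
```

Keeping the class-level scan behind a single helper means the fallback only runs when the local scope has no binding for the iterable.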
The system can already infer receiver types from uniquely resolved call results in call-processor.ts. That needs to be generalised so TypeEnv can benefit from it too.
Resolved: ReturnTypeLookup (Phase 7.1) encapsulates lookupReturnType / lookupRawReturnType and is threaded through ForLoopExtractorContext (Phase 7.2) to all for-loop extractors. Phase 7.2 also added the pendingCallResults infrastructure (the PendingAssignment discriminated union in types.ts and the Tier 2b processing loop in type-env.ts), but no extractor populates it yet — var x = f() propagation is Phase 9 work.
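
As a rough picture of how a pending-assignment pass might consume such entries, consider the sketch below. Only the names `PendingAssignment` and the `kind: 'callResult'` discriminant come from this roadmap; the `copy` variant, the field names (`lhs`, `rhs`, `callee`), the `Map`-based environment, and the `resolvePending` function are assumptions for illustration.

```typescript
// Hypothetical shape: only the type name and the 'callResult' discriminant
// are taken from the roadmap; the other fields and variant are assumed.
type PendingAssignment =
  | { kind: "copy"; lhs: string; rhs: string }
  | { kind: "callResult"; lhs: string; callee: string };

// Assumed lookup signature: undefined means "unknown or ambiguous".
type LookupReturnType = (callee: string) => string | undefined;

// Resolve deferred assignments after explicit bindings have been collected.
function resolvePending(
  env: Map<string, string>,
  pending: PendingAssignment[],
  lookupReturnType: LookupReturnType
): void {
  for (const item of pending) {
    // Explicit or earlier bindings always win over inferred ones.
    if (env.has(item.lhs)) continue;

    const inferred =
      item.kind === "callResult"
        ? lookupReturnType(item.callee)
        : env.get(item.rhs);

    // Leave the variable unbound rather than guess.
    if (inferred !== undefined) env.set(item.lhs, inferred);
  }
}
```

Because the pass iterates in source order, a `copy` entry can pick up a type that a preceding `callResult` entry has just bound.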
- introduced `ReturnTypeLookup` interface and `buildReturnTypeLookup` factory in `type-env.ts`
- replaced per-extractor `(node, env)` signature with `ForLoopExtractorContext` context object for extensibility
- added `extractElementTypeFromString` to `shared.ts` as the canonical raw-string container unwrapper
- added PHP Strategy C helper (`findClassPropertyElementType`) scoped to the PHP extractor
- kept all changes backwards-compatible; explicit-type paths are untouched
- loop inference now works for direct function call iterables in all 7 typed-iteration languages
- PHP `$this->property` foreach is resolved from class-level `@var` without requiring `@param` workarounds
- `pendingCallResults` infrastructure is in place (Tier 2b loop + `PendingAssignment` union), dormant until an extractor emits `{ kind: 'callResult' }` (Phase 9)
Medium (as predicted)
The interface change touched all extractors but remained additive — no existing paths were changed.
Model class / struct fields so chained member access can be resolved more accurately.

```
user.address.city
```

Today the system may resolve `user -> User`, but it cannot generally resolve:
- `address -> Address`
- `city -> City` or scalar type

```
user.address.save()
```

Without field typing, the resolver cannot reliably identify the receiver type of `address`.
This is especially relevant for:
- Rust struct-pattern destructuring
- PHP chained property access
- richer TypeScript or Python object-based destructuring in future work
- parse field / property declarations per class or struct
- build a field-type map keyed by owning type
- teach lookup and chain-resolution logic to walk member segments
- keep this separate from the base variable-binding layer where possible
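
The steps above could take a shape like the following sketch. It assumes a two-level map (owning type name to field name to field type) kept apart from the variable-binding layer; `FieldTypeMap` and `resolveChain` are hypothetical names, and inheritance or embedding is deliberately left out.

```typescript
// Hypothetical side structure: owning type -> field name -> field type.
type FieldTypeMap = Map<string, Map<string, string>>;

// Walk member segments starting from an already-resolved base type.
// Returns undefined as soon as any segment has no known field type.
function resolveChain(
  baseType: string,
  segments: string[],
  fields: FieldTypeMap
): string | undefined {
  let current: string | undefined = baseType;
  for (const segment of segments) {
    if (current === undefined) return undefined;
    current = fields.get(current)?.get(segment);
  }
  return current;
}

// Example data for user.address.city and user.address.save().
const fieldTypes: FieldTypeMap = new Map([
  ["User", new Map([["address", "Address"]])],
  ["Address", new Map([["city", "City"]])],
]);

const cityType = resolveChain("User", ["address", "city"], fieldTypes);
const saveReceiver = resolveChain("User", ["address"], fieldTypes);
```

Here `cityType` resolves to `"City"`, and `saveReceiver` resolves to `"Address"`, which is the receiver type needed to disambiguate `save()`.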
This is the biggest unlock for richer static analysis because it allows the graph to model more than just top-level receivers.
It would materially improve:
- chained property resolution
- member-based call disambiguation
- deeper context extraction for downstream tooling
High
This is the first phase that pushes the system from variable typing into structural object modelling. It will likely require:
- schema expansion or new internal maps
- careful handling of inheritance / embedding / language-specific member semantics
- broader test coverage than earlier phases
Make return-type-driven inference a first-class input to TypeEnv, not just a downstream verification path.

```typescript
const users = repo.getUsers()
```

Desired binding:

```
users -> List<User>
```

```typescript
for (const user of getUsers()) {
  user.save()
}
```

Desired binding:

```
user -> User
```

```typescript
repo.getUsers().first()
```

If return types can propagate more systematically, later chain stages become much more resolvable.
- expose return types as reusable inference inputs inside `TypeEnv`
- distinguish raw textual return types from normalized receiver-usable types
- make method-call return inference receiver-aware where necessary
- avoid over-eager propagation when multiple call targets remain ambiguous
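
One way to picture the receiver-aware, ambiguity-averse rule in the list above is the sketch below. The `CallCandidate` shape and the `inferCallResultType` function are assumptions for illustration, and "exactly one matching candidate" is only one possible ambiguity rule.

```typescript
// Hypothetical view of a call target known to the symbol index.
interface CallCandidate {
  name: string;
  ownerType?: string;   // declaring class or struct, if any
  returnType?: string;  // raw textual return type, if declared
}

// Bind a call result only when exactly one target remains after filtering.
function inferCallResultType(
  name: string,
  receiverType: string | undefined,
  candidates: CallCandidate[]
): string | undefined {
  const matching = candidates.filter(
    (c) =>
      c.name === name &&
      // A known receiver type narrows the candidate set; an unknown one does not.
      (receiverType === undefined || c.ownerType === receiverType)
  );

  // Zero or several targets: stay conservative and produce no binding.
  if (matching.length !== 1) return undefined;
  return matching[0].returnType;
}

const knownCalls: CallCandidate[] = [
  { name: "getUsers", ownerType: "Repo", returnType: "List<User>" },
  { name: "getUsers", ownerType: "Cache", returnType: "Map<string, User>" },
];

const withReceiver = inferCallResultType("getUsers", "Repo", knownCalls);
const withoutReceiver = inferCallResultType("getUsers", undefined, knownCalls);
```

With the receiver known to be `Repo`, `withReceiver` is `"List<User>"`. Without receiver information two targets remain, so `withoutReceiver` is `undefined` and no binding is created.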
This phase would make the type system feel much closer to a static-analysis substrate rather than a set of local heuristics.
It will especially improve codebases that rely heavily on:
- service-returned collections
- builder APIs
- repository methods
- chain-heavy fluent interfaces
Medium to High
The conceptual basis already exists, but generalising it without introducing false bindings requires careful ambiguity rules.
Current support remains relatively minimal.
Missing or weak areas include:
- for-loop element binding
- pattern binding
- assignment-chain propagation
- broader expression-based inference
Priority: Medium
Reason: It matters for parity, but the biggest global analysis gains are elsewhere.
Key remaining gap:
- iterable call expressions in range loops: ✓ shipped in Phase 7.3
Priority: Medium (chained property access remains for Phase 8)
Key remaining gaps:
- file/class-scope iterable propagation: ✓ shipped in Phase 7.4 (Strategy C)
- chained property access

Priority: High
Reason: PHP heavily benefits from doc-comment-aware field and property modelling.
Key remaining gap:
- struct-pattern field destructuring
Priority: Medium
Reason: Important for completeness, but field-type infrastructure is the real prerequisite.
Shared missing capabilities:
- field / property type resolution
- generalised return-type-aware binding in `TypeEnv`
Priority: Very High
Reason: These are the biggest remaining blockers to deeper static analysis.
This is the best cost-to-value step.
Deliverables:
- iterable call-expression support
- wider access to return-type maps
- file-scope binding visibility where needed
This unlocks the next class of analysis depth.
Deliverables:
- per-type field metadata
- chained property resolution
- better destructuring support
This converts existing downstream validation into a broader inference capability.
Deliverables:
- call-result variable binding
- loop inference from call results
- broader chain propagation
After the structural work lands, selective branch refinement becomes more valuable and easier to reason about.
For GitNexus, production-grade does not mean replacing a language compiler.
A realistic target is:
- strong receiver-constrained call resolution across common language idioms
- reliable handling of typed loops, constructor-like initializers, and common patterns
- useful return-type propagation for service/repository style code
- enough field/property knowledge to support chained-member analysis
- conservative behavior under ambiguity
- predictable performance during indexing
That would be sufficient for:
- better call graphs
- more accurate impact analysis
- stronger context assembly for AI workflows
- more trustworthy graph traversal features
Success looks like:
- loop inference works for identifier iterables and common call-expression iterables
- simple call-result assignments benefit from return types more broadly
- no major regression in ambiguity handling
Success looks like:
- field/property maps exist for class-like types
- chained access can resolve at least one segment beyond the base receiver
- field-aware member-call resolution works in the most important languages
Success looks like:
- return-type-aware variable binding is a first-class part of environment construction
- chains, loops, and assignments share a coherent propagation model
- downstream graph features can rely on more than local receiver heuristics
These should be resolved before or during implementation of the later phases.
- Where should field-type metadata live?
  In `TypeEnv`, in `SymbolTable`, or in a dedicated side structure?
- How should ambiguity be represented?
  Is `undefined` sufficient, or do later phases need a richer "known ambiguous" state?
- How much receiver context should return-type inference require?
  Some methods only become meaningful once the receiver type is already partially known.
- How much branch sensitivity is worth the complexity?
  Some narrowing gives clear value; full control-flow typing likely does not.
- Should field typing and chain typing be one phase or two?
  Keeping them separate may reduce risk and make regressions easier to isolate.
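
To ground the ambiguity question, the sketch below shows one possible richer state that separates "nothing known" from "several candidates known". It is an illustration of the trade-off, not a recommendation; the `Resolution` type and `classifyCandidates` function are hypothetical.

```typescript
// Hypothetical alternative to returning plain undefined for every failure.
type Resolution =
  | { status: "resolved"; type: string }
  | { status: "ambiguous"; candidates: string[] }
  | { status: "unknown" };

// Collapse duplicate candidate types, then classify what is left.
function classifyCandidates(candidateTypes: string[]): Resolution {
  const unique: string[] = [];
  for (const t of candidateTypes) {
    if (unique.indexOf(t) === -1) unique.push(t);
  }

  if (unique.length === 0) return { status: "unknown" };
  if (unique.length === 1) return { status: "resolved", type: unique[0] };
  return { status: "ambiguous", candidates: unique };
}
```

A later phase could use the `ambiguous` state to retry resolution once more receiver context is available, which a bare `undefined` cannot express.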
The next stage of the type system should focus on generalising what already works before attempting compiler-like sophistication.
The most important path is:
- extend return-type and iterable inference
- add field/property type knowledge
- promote return-type-aware inference into `TypeEnv`
That path preserves the current strengths of the system while moving GitNexus materially closer to a robust, production-grade static-analysis foundation.