fix: Fix graph path lookup for review and impact tools#469
Open
hy2850 wants to merge 1 commit into
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR fixes graph lookups for review and impact tools when the file paths passed by the user do not exactly match the file paths stored in the graph database.
The issue showed up while comparing
git diff <sha>^ <sha>withget_review_context(changed_files=...)for eval test commits. Even when the graph DB was populated,get_review_contextoften returned:That made the graph payload look empty and caused review context to fall back mostly to source snippets.
Root Cause
get_review_contextand related tools converted changed files into absolute paths before graph lookup:However, existing graph DBs may store file paths in different formats depending on how they were built, for example:
evaluate/test_repos/fastapi/fastapi/routing.pyBecause
get_nodes_by_file()requires an exact file path match, a valid changed file like:could fail to match both:
and the stored graph path:
So the graph store had nodes, but the lookup path format prevented them from being found.
Fix
This PR adds path resolution before graph lookup.
The new resolver maps user-facing changed file paths to the actual path format stored in the graph by checking:
The resolver is then used by:
get_review_contextget_impact_radiusquery_graph(..., pattern="file_summary")This makes the tools robust to graph DBs built with different path styles.
Evidence
Before this fix, graph lookup returned zero graph payload for many eval commits:
f41d09a30/0/0209/473/510cec5780d0/0/0200/472/425223815580/0/0162/500/494749cefde0/0/0161/500/527c2705ffd0/0/0386/403/11060ec7f7130/0/0195/388/7410a192fb00/0/0877/252/1964ac0ad2fe0/0/0577/255/14158e36f2bc0/0/0484/482/1284ee37a7620/0/0119/500/701d81d5ab70/0/01192/500/3067d86e19770/0/01154/500/2469Format:
changed nodes / impacted nodes / edges.(One commit,
gin 0feaf8cb, still resolves to0/0/0because it mostlyremoves or splits example files that are not represented as current graph nodes)
Why This Matters
Without this fix,
get_review_contextcan incorrectly report that a change has no graph impact even when the graph contains matching nodes.This affects:
After this fix, graph-based review tools can use the existing graph data instead of silently falling back to empty graph payloads.
Test Plan
get_review_contextresolves repo-relative changed files to storedgraph paths.
get_impact_radiusresolves changed file paths before graph lookup.query_graph(..., pattern="file_summary")resolves file pathsconsistently.