Commit ce83931

Merge pull request #8 from obiwancenobi/fix/line-number-highlighting
fix/line number highlighting
2 parents 9104d1f + dbd460b commit ce83931

14 files changed: +1287 -42 lines changed

.github/marketplace/metadata.yml

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ long_description: |
     review:
       runs-on: ubuntu-latest
       steps:
-        - uses: obiwancenovi/ai-code-reviewer@v1.0.20
+        - uses: obiwancenovi/ai-code-reviewer@v1.0.21
           with:
             pr-number: ${{ github.event.pull_request.number }}
             repository: ${{ github.repository }}

.github/workflows/ai-review.yml

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ jobs:
           fi

       - name: AI Code Review
-        uses: obiwancenobi/ai-code-reviewer@v1.0.20
+        uses: obiwancenobi/ai-code-reviewer@v1.0.21
         with:
           pr-number: ${{ inputs.pr-number || github.event.pull_request.number }}
           repository: ${{ inputs.repository || github.repository }}

README.md

Lines changed: 12 additions & 8 deletions
@@ -1,5 +1,9 @@
+<div align="center">
+
 # BugBeaver Code Reviewer

+</div>
+
 <div align="center">
   <img src=".github/marketplace//bugbeaver.png" width="200" alt="BugBeaver Logo">
 </div>
@@ -106,7 +110,7 @@ Add AI code review to any repository with one simple step:

     steps:
       - name: AI Code Review
-        uses: obiwancenobi/ai-code-reviewer@v1.0.20
+        uses: obiwancenobi/ai-code-reviewer@v1.0.21
         with:
           pr-number: ${{ github.event.pull_request.number }}
           repository: ${{ github.repository }}
@@ -142,7 +146,7 @@ Use repository variables for organization-wide settings:

 ```yaml
 - name: AI Code Review
-  uses: obiwancenobi/ai-code-reviewer@v1.0.20
+  uses: obiwancenobi/ai-code-reviewer@v1.0.21
   with:
     pr-number: ${{ github.event.pull_request.number }}
     repository: ${{ github.repository }}
@@ -488,7 +492,7 @@ Settings are applied in this priority order (highest to lowest):

 **Workflow sets:**
 ```yaml
-- uses: obiwancenobi/ai-code-reviewer@v1.0.20
+- uses: obiwancenobi/ai-code-reviewer@v1.0.21
   with:
     ai-provider: ${{ vars.AI_PROVIDER || 'anthropic' }}
     ai-model: ${{ vars.AI_MODEL || 'claude-3-sonnet' }}
@@ -568,7 +572,7 @@ jobs:

     steps:
       - name: AI Code Review
-        uses: obiwancenobi/ai-code-reviewer@v1.0.20
+        uses: obiwancenobi/ai-code-reviewer@v1.0.21
         with:
           pr-number: ${{ github.event.pull_request.number }}
           repository: ${{ github.repository }}
@@ -598,7 +602,7 @@ jobs:

     steps:
       - name: AI Code Review
-        uses: obiwancenobi/ai-code-reviewer@v1.0.20
+        uses: obiwancenobi/ai-code-reviewer@v1.0.21
         with:
           pr-number: ${{ github.event.pull_request.number }}
           repository: ${{ github.repository }}
@@ -611,7 +615,7 @@ jobs:
 #### Python Projects
 ```yaml
 - name: AI Code Review
-  uses: obiwancenobi/ai-code-reviewer@v1.0.20
+  uses: obiwancenobi/ai-code-reviewer@v1.0.21
   with:
     pr-number: ${{ github.event.pull_request.number }}
     repository: ${{ github.repository }}
@@ -624,7 +628,7 @@ jobs:
 #### Java/.NET Projects
 ```yaml
 - name: AI Code Review
-  uses: obiwancenobi/ai-code-reviewer@v1.0.20
+  uses: obiwancenobi/ai-code-reviewer@v1.0.21
   with:
     pr-number: ${{ github.event.pull_request.number }}
     repository: ${{ github.repository }}
@@ -646,7 +650,7 @@ Set these in repository Settings → Actions → Variables:

 ```yaml
 - name: AI Code Review
-  uses: obiwancenobi/ai-code-reviewer@v1.0.20
+  uses: obiwancenobi/ai-code-reviewer@v1.0.21
   with:
     pr-number: ${{ github.event.pull_request.number }}
     repository: ${{ github.repository }}

index.js

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ const program = new Command();
 program
   .name('ai-code-reviewer')
   .description('AI-powered code review for GitHub pull requests')
-  .version('1.0.20');
+  .version('1.0.21');

 // Review command for GitHub Actions
 program

line-number-analysis.md

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
# AI Code Review Line Number Mismatch Analysis

## Problem Description
The AI code review system sometimes attaches comments to the wrong line numbers. For example, the change is on line 4 and the review context is correct, but the highlighting appears on line 3.

## Root Causes Identified

### 1. Content Type Inconsistency
- **File Source**: In `aiReviewService.js:138`, the system uses `content = file.patch || ''`
- **Issue**: When `file.patch` exists, it contains GitHub diff format with hunks and line prefixes
- **Impact**: The AI reviews patch content, but line numbers are calculated as if it were full file content

### 2. Simplified Chunking Logic
- **Current Implementation**: `aiReviewService.js:160`
  ```javascript
  line_number: comment.line_number ? comment.line_number + (i * Math.floor(content.split('\n').length / chunks.length)) : null
  ```
- **Issues**:
  - Assumes uniform chunk sizes (doesn't account for actual content boundaries)
  - Doesn't track which specific lines are included in each chunk
  - Cannot handle variable-length hunks in patch content
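
To make the uniform-size assumption concrete, here is a small worked sketch. The chunk sizes (3, 5 and 2 lines) and the helper names `uniformOffset` and `actualOffset` are hypothetical and chosen only for illustration; they are not taken from the repository.

```javascript
// Offset that the formula above adds for chunk i: every chunk is assumed
// to hold floor(totalLines / chunkCount) lines.
function uniformOffset(totalLines, chunkCount, i) {
  return i * Math.floor(totalLines / chunkCount);
}

// Offset derived from the number of lines each earlier chunk really holds.
function actualOffset(chunkLineCounts, i) {
  return chunkLineCounts.slice(0, i).reduce((sum, n) => sum + n, 0);
}

// Hypothetical split of a 10-line file into chunks of 3, 5 and 2 lines.
const chunkLineCounts = [3, 5, 2];
const totalLines = 10;

// A comment on line 1 of the third chunk (i = 2):
const estimated = 1 + uniformOffset(totalLines, chunkLineCounts.length, 2); // 1 + 2 * 3 = 7
const correct = 1 + actualOffset(chunkLineCounts, 2); // 1 + (3 + 5) = 9

console.log({ estimated, correct }); // { estimated: 7, correct: 9 }
```

Under these assumed sizes the formula lands two lines early, which is the kind of drift described in the problem statement.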

### 3. Dual Chunking Systems
- **FileChunker** (`fileChunker.js`): Sophisticated chunking with proper line tracking
- **FileProcessor** (`fileProcessor.js`): Simple string-based chunking without line awareness
- **Current Usage**: System uses FileProcessor's `splitIntoChunks` which lacks line number tracking

### 4. Patch Format Misunderstanding
- **GitHub Patch Format**: Contains hunks with headers like `@@ -1,3 +1,3 @@`
- **AI Processing**: AI sees content with `+`/`-` prefixes but produces line numbers for original file
- **Gap**: No conversion between patch line numbers and original file line numbers
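
As a rough illustration of what a hunk header encodes, the sketch below reads the start line and line count for the old and new side of a header such as `@@ -1,3 +1,3 @@`. The function name `parseHunkHeader` and the field names are assumptions made for this example, not the project's API. In unified diff format an omitted count means 1.

```javascript
// Matches "@@ -oldStart[,oldLines] +newStart[,newLines] @@", optionally
// followed by section text such as a function name.
const HUNK_HEADER = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/;

function parseHunkHeader(line) {
  const m = HUNK_HEADER.exec(line);
  if (!m) return null; // not a hunk header
  return {
    oldStart: Number(m[1]),
    oldLines: m[2] === undefined ? 1 : Number(m[2]),
    newStart: Number(m[3]),
    newLines: m[4] === undefined ? 1 : Number(m[4]),
  };
}

console.log(parseHunkHeader('@@ -1,3 +1,3 @@'));
// { oldStart: 1, oldLines: 3, newStart: 1, newLines: 3 }
```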

### 5. Line Number Validation Issues
- **GitHubClient** (`client.js:73-76`): Validates line numbers but uses different calculation
- **Mismatch**: Review service calculates differently than GitHub API expects

## Detailed Analysis

### Patch Content Handling
```javascript
// aiReviewService.js:138
content = file.patch || '';
```

When `file.patch` exists:
1. AI reviews diff hunks with line prefixes (`+`, `-`, ` `)
2. AI produces line numbers relative to patch content
3. System tries to map these to original file lines (incorrect approach)

### Chunking Line Number Calculation
```javascript
// aiReviewService.js:160
line_number: comment.line_number ?
  comment.line_number + (i * Math.floor(content.split('\n').length / chunks.length)) :
  null
```

**Problems**:
- Assumes chunks have equal line counts
- Doesn't consider actual content structure
- No awareness of where chunks start/end in original file

### FileChunker Usage Gap
The sophisticated `FileChunker` class exists but isn't used by the main review flow. It has:
- Proper line number tracking (`startLine`, `endLine`)
- Chunk overlap management
- Content validation

## Solution Recommendations

### 1. Content Strategy Selection
- **When full content available**: Use complete file content for AI review
- **When only patch available**: Parse patch to extract changed lines and context
- **Fallback**: If neither available, skip line-specific comments
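
One way to express this three-way choice is sketched below. The `file.content` property, the `selectReviewContent` name and the shape of the returned object are assumptions for illustration; only `file.patch` appears in the analysis above.

```javascript
// Decide what to send to the AI and whether line-specific comments
// can be trusted for this file.
function selectReviewContent(file) {
  if (typeof file.content === 'string' && file.content.length > 0) {
    // Full file available: line numbers refer directly to the file.
    return { kind: 'full', text: file.content, allowLineComments: true };
  }
  if (typeof file.patch === 'string' && file.patch.length > 0) {
    // Only a patch: line numbers must be mapped through the hunks first.
    return { kind: 'patch', text: file.patch, allowLineComments: true };
  }
  // Nothing to anchor a line number to: general comments only.
  return { kind: 'none', text: '', allowLineComments: false };
}

console.log(selectReviewContent({ patch: '@@ -1 +1 @@\n-a\n+b' }).kind); // patch
```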

### 2. Unified Chunking System
- Use `FileChunker` throughout the application
- Maintain proper line number mapping for all chunks
- Remove duplicate chunking logic from `FileProcessor`

### 3. Patch Content Parser
Implement proper patch parsing to:
- Extract hunks and their original line ranges
- Map patch line numbers to original file line numbers
- Handle context lines in hunks correctly
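
The following is a minimal sketch of such a mapping, not the parser that ships with the project. It assumes the AI reports 1-based line numbers counted over the raw patch text, and it returns the file line plus a `side` value (`LEFT` for removed lines in the old file, `RIGHT` for added and context lines in the new file), a naming borrowed from GitHub's review comment terminology. The function name `mapPatchLines` is hypothetical.

```javascript
function mapPatchLines(patch) {
  const header = /^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@/;
  const mapping = new Map(); // patch line (1-based) -> { side, line }
  let oldLine = 0;
  let newLine = 0;
  let inHunk = false;

  patch.split('\n').forEach((text, index) => {
    const m = header.exec(text);
    if (m) {
      // A new hunk resets both counters to the starts given in the header.
      oldLine = Number(m[1]);
      newLine = Number(m[2]);
      inHunk = true;
      return;
    }
    if (!inHunk) return;
    if (text.startsWith('+')) {
      mapping.set(index + 1, { side: 'RIGHT', line: newLine++ });
    } else if (text.startsWith('-')) {
      mapping.set(index + 1, { side: 'LEFT', line: oldLine++ });
    } else if (text.startsWith(' ')) {
      // Context lines exist in both versions, so both counters advance.
      mapping.set(index + 1, { side: 'RIGHT', line: newLine++ });
      oldLine++;
    }
    // Other lines (for example "\ No newline at end of file") map to nothing.
  });
  return mapping;
}

const samplePatch = [
  '@@ -5,3 +5,3 @@',
  ' const a = 1;',
  '-const b = 2;',
  '+const b = 3;',
  ' const c = 4;',
].join('\n');

console.log(mapPatchLines(samplePatch).get(4)); // { side: 'RIGHT', line: 6 }
```

Hunk headers themselves have no entry in the map, so a comment that points at one can be detected and downgraded rather than placed on an arbitrary line.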

### 4. Line Number Validation
- Validate AI line numbers against file content bounds
- Add logging for line number calculation steps
- Fall back to general comments when line numbers are invalid
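
A bounds check with a general-comment fallback could look roughly like this. The sketch assumes that a comment whose `line_number` is `null` is posted as a general comment, in line with the `null` branch of the existing formula; the function name `validateLine` and the `body` field are hypothetical.

```javascript
// Keep the comment as-is when its line number is a valid 1-based line of
// the file; otherwise strip the line number so it becomes a general comment.
function validateLine(comment, totalLines) {
  const n = comment.line_number;
  const valid = Number.isInteger(n) && n >= 1 && n <= totalLines;
  if (!valid) {
    console.warn(`Line ${n} is outside 1..${totalLines}; posting as a general comment`);
    return { ...comment, line_number: null };
  }
  return comment;
}

console.log(validateLine({ body: 'Unused variable', line_number: 42 }, 30).line_number); // null
```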

### 5. Testing and Verification
- Add unit tests for line number calculation
- Create integration tests with sample PR diffs
- Add logging to trace line number adjustments

## Impact Assessment

**High Impact Issues**:
- Incorrect line highlighting confuses developers
- Reduces trust in AI review comments
- May cause important issues to be missed

**Medium Impact Issues**:
- Inconsistent behavior across different file types
- Increased support requests

**Technical Debt**:
- Duplicate chunking implementations
- Lack of comprehensive testing
- Poor error handling for edge cases

## Next Steps

1. **Immediate Fix**: Use FileChunker instead of FileProcessor for chunking
2. **Patch Parser**: Implement GitHub patch format parsing
3. **Line Number Validation**: Add bounds checking and validation
4. **Testing**: Create comprehensive test suite
5. **Documentation**: Update code comments explaining line number handling

line-number-fix-summary.md

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
# Line Number Highlighting Fix - Implementation Summary

## Problem Overview
The AI code review system was highlighting comments on incorrect line numbers. For example, when changes were made on line 4, the review context was correct but the highlighting appeared on line 3.

## Root Causes Identified and Fixed

### 1. Patch Content vs Full Content Mismatch
**Problem**: The system used `content = file.patch || ''` but treated patch content as if it were full file content.
**Solution**: Implemented proper patch parsing with a `PatchParser` class that:
- Parses GitHub unified diff format
- Maps patch line numbers to original file line numbers
- Extracts reviewable content with proper context

### 2. Inadequate Chunking System
**Problem**: Used `fileProcessor.splitIntoChunks()`, which lacked line number tracking.
**Solution**: Replaced with `FileChunker`, which provides:
- Proper line number tracking (`startLine`, `endLine`)
- Chunk overlap management
- Accurate line number adjustment for chunked content

### 3. Simplified Line Number Calculation
**Problem**: Line numbers were calculated with an oversimplified formula:
```javascript
line_number + (i * Math.floor(content.split('\n').length / chunks.length))
```
**Solution**: Implemented proper mapping:
- Use `FileChunker.adjustCommentLineNumbers()` for chunk-relative line numbers
- Use `PatchParser.mapPatchLineToOriginalFile()` for patch-to-original mapping
- Add validation with `validateCommentLineNumber()`
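
The chunk-relative adjustment can be pictured with the sketch below. It is not the body of `FileChunker.adjustCommentLineNumbers()`, whose signature is not shown in this commit view; it only assumes that each chunk records the 1-based, inclusive `startLine` and `endLine` it covers, as the summary states. The function name `toFileLine` and the sample chunk ranges are hypothetical.

```javascript
// Convert a 1-based line number inside a chunk into a file line number,
// using the chunk's recorded start instead of an estimated chunk size.
function toFileLine(chunk, chunkLine) {
  if (!Number.isInteger(chunkLine) || chunkLine < 1) return null;
  const fileLine = chunk.startLine + chunkLine - 1;
  // Reject positions that fall past the end of the chunk.
  return fileLine <= chunk.endLine ? fileLine : null;
}

// Two overlapping chunks: the second starts 5 lines before the first ends.
const chunks = [
  { startLine: 1, endLine: 50 },
  { startLine: 46, endLine: 95 },
];

console.log(toFileLine(chunks[1], 3)); // 48
```

Because the offset comes from `startLine`, overlapping or unevenly sized chunks do not shift the result.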

## Key Components Implemented

### 1. PatchParser (`src/utils/patchParser.js`)
- **parsePatch()**: Parses GitHub patch format into structured hunks
- **mapPatchLineToOriginalFile()**: Maps AI line numbers from patch to original file
- **extractReviewContent()**: Creates AI-reviewable content from patch
- **isValidLineNumber()**: Validates line numbers against file boundaries

### 2. Enhanced AIReviewService (`src/services/aiReviewService.js`)
- **reviewFile()**: Now properly handles both patch and full content
- **reviewCodeChunk()**: Enhanced validation and context
- **validateCommentLineNumber()**: New validation method
- Uses FileChunker for all chunking operations
- Implements proper line number mapping for patch content

### 3. Comprehensive Testing
- **Unit Tests**: `tests/unit/utils/patchParser.test.js` - Tests patch parsing logic
- **Integration Tests**: `tests/integration/lineNumberAccuracy.test.js` - Tests end-to-end line number accuracy

## How the Fix Works

### Before (Problematic Flow):
```
GitHub Patch → AI Review → Incorrect Line Numbers → Wrong Highlighting
```

### After (Fixed Flow):
```
GitHub Patch → Parse with PatchParser → AI Review → Map Lines → Correct Highlighting
```

### Step-by-Step Process:
1. **Content Detection**: System detects if content is patch or full file
2. **Patch Parsing**: If patch, `PatchParser` extracts hunks and mappings
3. **Content Extraction**: Reviewable content is prepared for AI
4. **AI Review**: AI analyzes content and provides line numbers
5. **Line Mapping**: Patch line numbers are mapped to original file lines
6. **Validation**: Line numbers are validated against file boundaries
7. **Comment Creation**: Final comments have accurate line numbers

## Testing the Fix

### Run Unit Tests:
```bash
npm test tests/unit/utils/patchParser.test.js
```

### Run Integration Tests:
```bash
npm test tests/integration/lineNumberAccuracy.test.js
```

### Manual Testing Scenarios:
1. **Single Hunk Patch**: Test with `@@ -5,3 +5,3 @@`
2. **Multiple Hunks**: Test with complex patches
3. **Edge Cases**: Test with empty patches, large files, etc.

## Key Benefits

### 1. Accurate Line Highlighting
- Comments now appear on the exact lines where issues were identified
- Developers can quickly locate the problematic code

### 2. Robust Error Handling
- Graceful fallback when patch parsing fails
- Validation prevents invalid line numbers

### 3. Better AI Context
- AI receives properly formatted content for analysis
- Enhanced context includes chunk metadata

### 4. Comprehensive Testing
- Unit tests ensure individual component correctness
- Integration tests verify end-to-end functionality

## Backward Compatibility

The fix maintains backward compatibility:
- Existing configurations continue to work
- Full file content handling is unchanged
- Error handling gracefully degrades to old behavior if needed

## Performance Impact

- **Minimal Overhead**: Patch parsing is lightweight
- **Efficient Chunking**: FileChunker is optimized for large files
- **Caching Opportunities**: Parsed patches can be cached for multiple reviews
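
The caching idea might be sketched as a small memoizing wrapper keyed by the patch text. This is an illustration only: `memoizeByPatch` and the stand-in parser below are hypothetical, and the summary does not say that caching is implemented.

```javascript
// Wrap any patch-parsing function so that identical patch text is parsed once.
function memoizeByPatch(parse) {
  const cache = new Map(); // patch text -> parsed result
  return (patch) => {
    if (!cache.has(patch)) cache.set(patch, parse(patch));
    return cache.get(patch);
  };
}

// Stand-in parser that only counts hunk headers, used to show the effect.
let parseCalls = 0;
const countHunks = (patch) => {
  parseCalls += 1;
  return patch.split('\n').filter((line) => line.startsWith('@@ ')).length;
};

const cachedCountHunks = memoizeByPatch(countHunks);
cachedCountHunks('@@ -1,1 +1,1 @@\n-a\n+b');
cachedCountHunks('@@ -1,1 +1,1 @@\n-a\n+b');
console.log(parseCalls); // 1
```

An unbounded `Map` keeps every patch in memory, so a long-running process would likely want a size limit or per-review scope.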

## Monitoring and Debugging

The implementation includes extensive logging:
```javascript
logger.debug(`Parsed patch with ${result.hunks.length} hunks...`);
logger.warn(`Could not map patch line ${aiLineNumber} to original file line`);
```

This helps troubleshoot any remaining edge cases in production.

## Future Enhancements

Potential improvements for future versions:
1. **Enhanced Context**: Include more file context around changes
2. **Smart Chunking**: Adaptive chunk sizes based on code structure
3. **Caching**: Cache parsed patches for repeated reviews
4. **Machine Learning**: Use historical data to improve line number accuracy

## Conclusion

The line number highlighting issue has been comprehensively addressed through:
- Proper patch parsing and line number mapping
- Replacement of inadequate chunking with FileChunker
- Extensive validation and error handling
- Comprehensive testing to prevent regressions

This fix ensures that AI code review comments appear on the correct lines, improving developer trust and reducing confusion during code reviews.

package-lock.json

Lines changed: 2 additions & 2 deletions
