vortex-data
diff --git a/‎.github/workflows/README-fuzzer-analysis.md‎
Lines changed: 166 additions & 24 deletions b/‎.github/workflows/README-fuzzer-analysis.md‎
Lines changed: 166 additions & 24 deletions
@@ -1,10 +1,13 @@
 # Automated Fuzzer Crash Analysis with Claude Code
 
-This directory contains workflows for automated fuzzer crash detection, analysis, and issue creation using the Claude Code bot.
+This directory contains workflows for automated fuzzer crash detection, analysis, issue creation, and fix automation using the Claude Code bot.
 
 ## Overview
 
-The fuzzing infrastructure automatically detects crashes and uses Claude to analyze them and create/update GitHub issues with duplicate detection.
+The fuzzing infrastructure has **two stages**:
+
+1. **Stage 1 (fuzz.yml)**: Detects crashes and uses Claude to analyze and create/update GitHub issues with smart duplicate detection
+2. **Stage 2 (fuzzer-fix-automation.yml)**: When a fuzzer issue is created, automatically attempts to fix it, create regression tests, and post findings
 
 ## How It Works
 
@@ -161,17 +164,78 @@ Claude has access to:
 - ✅ Source code analysis at crash locations
 - ✅ Cargo fuzz for crash reproduction (optional)
 
-Claude uses:
+**Stage 1 (Issue Creation)** uses:
 - **Model**: Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`)
 - **Cost**: ~$0.03-0.05 per crash analysis
 - **Max turns**: 25 (for complex analysis)
 
+**Stage 2 (Fix Automation)** uses:
+- **Model**: Claude Opus 4 (`claude-opus-4-20250514`) - more capable for code generation
+- **Cost**: ~$0.15-0.25 per fix attempt
+- **Max turns**: 40 (allows for iterative fixing and testing)
+
+### 3. Automated Fix Attempt (fuzzer-fix-automation.yml)
+
+When a fuzzer issue is created (labeled with `fuzzer`), Claude automatically:
+
+1. **Extracts Crash Details** - Parses the issue body for:
+   - Target name
+   - Crash file name
+   - Artifact download URL
+   - Stack trace and error message
+
+2. **Downloads and Reproduces** - Attempts to:
+   - Download the crash artifact
+   - Reproduce the crash locally with the fuzzer
+   - Verify the panic/error occurs
+
+3. **Analyzes Root Cause** - Deep analysis of:
+   - Source code at crash location
+   - Stack trace to understand call path
+   - Debug output to see problematic input
+   - Determines the underlying bug
+
+4. **Assesses Fixability** - Decides if this is fixable automatically:
+   - **CAN FIX**: Missing bounds check, validation, edge case handling, simple panics
+   - **CANNOT FIX**: Architectural issues, complex logic, requires domain knowledge
+
+5. **Creates Fix (if straightforward)**:
+   - Modifies source code with minimal changes
+   - Adds validation or bounds checks
+   - Handles the edge case properly
+   - Follows project code style guidelines
+
+6. **Writes Regression Tests**:
+   - Creates test using the actual fuzzer input that triggered the crash
+   - Test fails before the fix, passes after
+   - Placed in appropriate test module
+   - Named clearly (e.g., `test_fuzzer_crash_issue_123`)
+
+7. **Verifies the Fix**:
+   - Runs regression test
+   - Runs fuzzer with crash file (should not panic)
+   - Runs related tests
+   - Checks with clippy
+   - Formats code
+
+8. **Posts Findings** - Comments on the issue with:
+   - Root cause analysis
+   - Fix description (if created)
+   - Regression test details
+   - Verification results
+   - OR explanation of why it can't be fixed automatically
+
 ## Workflow Structure
 
 ```
+┌─────────────────────────────────────────┐
+│ STAGE 1: Detection & Issue Creation    │
+│ (fuzz.yml)                              │
+└─────────────────────────────────────────┘
+
 ┌─────────────────────────────────────────┐
 │ io_fuzz / ops_fuzz                      │
-│ - Run fuzzing target                    │
+│ - Run fuzzing target (2 hours)          │
 │ - Check for crashes                     │
 │ - Archive artifacts + logs              │
 │ - Output: crashes_found, first_crash   │
@@ -184,6 +248,29 @@ Claude uses:
 │ - Download fuzzer logs                  │
 │ - Run Claude with analysis prompt       │
 │ - Claude creates/updates issues         │
+│   • Smart duplicate detection           │
+│   • Occurrence tracking                 │
+│   • Detailed crash analysis             │
+└──────────────┬──────────────────────────┘
+               │
+               │ Issue created with 'fuzzer' label
+               │
+               ▼
+┌─────────────────────────────────────────┐
+│ STAGE 2: Automated Fix Attempt         │
+│ (fuzzer-fix-automation.yml)             │
+└─────────────────────────────────────────┘
+
+┌─────────────────────────────────────────┐
+│ attempt-fix                             │
+│ - Triggered by issue with 'fuzzer' label│
+│ - Download crash artifact               │
+│ - Reproduce the crash                   │
+│ - Analyze root cause                    │
+│ - Create fix if straightforward         │
+│ - Write regression tests                │
+│ - Verify fix works                      │
+│ - Post analysis comment                 │
 └─────────────────────────────────────────┘
 ```
 
@@ -211,6 +298,12 @@ Claude uses:
 - `id-token: write` - OIDC token for authentication
 - `pull-requests: read` - Read PR context if needed
 
+**Fix automation job (fuzzer-fix-automation.yml):**
+- `contents: write` - Modify source files to create fixes
+- `pull-requests: write` - Create PRs if requested
+- `issues: write` - Comment on issues with findings
+- `id-token: write` - OIDC token for authentication
+
 ## Monitoring
 
 ### View Fuzzing Runs
@@ -305,40 +398,75 @@ Check:
 
 ## Examples
 
-### Example 1: New Crash Detected
+### Example 1: New Crash Detected and Auto-Fixed
 
-1. Fuzzer detects crash in `file_io` target
+**Stage 1 - Issue Creation:**
+1. Fuzzer detects crash in `file_io` target: "index out of bounds"
 2. `io_fuzz` job archives crash files and logs
 3. `report-io-fuzz-failures` job triggered
-4. Claude analyzes fuzzer log, identifies panic in `vortex_io::read_header`
+4. Claude analyzes fuzzer log, identifies panic in `vortex_io::read_header` at line 45
 5. Claude searches existing issues, finds no match
-6. Claude creates new issue with detailed analysis
-7. Issue includes stack trace, root cause, reproduction steps
-
-### Example 2: Duplicate Crash
-
+6. Claude creates issue #789 with detailed analysis, labels it `bug,fuzzer`
+
+**Stage 2 - Fix Automation:**
+7. `fuzzer-fix-automation` workflow triggers on issue #789
+8. Claude extracts crash details from issue body
+9. Claude downloads crash artifact
+10. Claude reproduces the crash locally
+11. Claude analyzes source code at `vortex_io::read_header:45`
+12. Root cause: Missing bounds check before indexing into buffer
+13. Claude creates fix: Adds validation `if index >= buffer.len() { return Err(...) }`
+14. Claude writes regression test: `test_fuzzer_crash_issue_789()`
+15. Claude verifies: test passes, fuzzer doesn't crash, clippy passes
+16. Claude comments on issue #789 with full analysis and fix details
+17. Human reviews and merges the fix
+
+### Example 2: Duplicate Crash (No Fix Attempted)
+
+**Stage 1 - Duplicate Detection:**
 1. Fuzzer detects crash (same as issue #123)
 2. Claude analyzes, recognizes same crash location and error pattern
 3. Claude finds existing issue #123
 4. Claude updates tracking comment: "Crash seen 5 time(s)"
 5. No new issue created, keeping issue list clean
 
-### Example 3: Similar but Different
+**Stage 2 - No trigger:**
+6. Fix automation doesn't trigger (no new issue created)
+7. Human can manually trigger on issue #123 if desired
+
+### Example 3: Complex Crash (Analysis Only)
 
-1. Fuzzer detects crash in same function as issue #456
-2. Claude analyzes, sees same location but different error pattern
-3. Claude determines it's SIMILAR (medium confidence)
-4. Claude adds comment to issue #456 explaining the similarity
-5. Human reviews and decides if it's truly the same or needs new issue
+**Stage 1 - Issue Creation:**
+1. Fuzzer detects crash in `array_ops` target
+2. Claude creates issue #790 with analysis
+
+**Stage 2 - Cannot Auto-Fix:**
+3. `fuzzer-fix-automation` triggers on issue #790
+4. Claude analyzes the crash
+5. Determines it's an architectural issue requiring refactoring
+6. Claude comments: "This requires human intervention" with detailed analysis
+7. Provides suggestions for how to approach the fix
+8. Human developer takes over from Claude's analysis
 
 ## Best Practices
 
+### For Stage 1 (Issue Creation)
+
 1. **Review Claude's Classifications** - Especially "similar" cases
 2. **Close True Duplicates** - If Claude missed one, close and reference the original
 3. **Add Labels** - Tag issues with severity (`P0`, `P1`, etc.)
 4. **Track Frequency** - High occurrence counts indicate priority bugs
-5. **Minimize Test Cases** - Use `cargo fuzz tmin` to create minimal reproducers
-6. **Update Corpus** - Add interesting crashes to corpus after fixing
+
+### For Stage 2 (Fix Automation)
+
+1. **Review All Fixes** - Claude's fixes are suggestions, always review before merging
+2. **Test Thoroughly** - Run the regression test and broader test suite
+3. **Check Edge Cases** - Verify Claude considered all edge cases, not just the crash
+4. **Assess Test Quality** - Ensure regression tests actually catch the bug
+5. **Consider Broader Impact** - Check if the same issue exists elsewhere in the codebase
+6. **Minimize Test Cases** - Use `cargo fuzz tmin` to create minimal reproducers
+7. **Update Corpus** - Add interesting crashes to corpus after fixing
+8. **Close Issues** - Once merged, close the issue and reference the PR
 
 ## Limitations
 
@@ -356,23 +484,37 @@ In these cases:
 
 ## Cost and Performance
 
+### Stage 1 (Issue Creation)
 - **Analysis Time**: 1-3 minutes per crash
 - **Cost**: ~$0.03-0.05 per crash (using Sonnet 4.5)
 - **Accuracy**: High for duplicate detection (based on source code analysis)
 - **False Positives**: Low (conservative by default)
 
-## Future Enhancements
+### Stage 2 (Fix Automation)
+- **Analysis Time**: 5-15 minutes per crash (includes reproduction, analysis, fixing, testing)
+- **Cost**: ~$0.15-0.25 per fix attempt (using Opus 4)
+- **Success Rate**: Depends on crash complexity
+  - Simple bugs (bounds checks, validation): ~70-80% fix rate
+  - Medium complexity: ~30-50% fix rate
+  - Complex bugs: Analysis only, human intervention needed
+- **False Fixes**: Very low (Claude is conservative about committing changes)
 
-Potential improvements:
+## Future Enhancements
 
+### Stage 1 Enhancements
 - [ ] Automatic crash minimization before reporting
-- [ ] Severity classification (security vs stability)
-- [ ] Automatic PR creation for simple fixes
 - [ ] Integration with coverage reports
 - [ ] Historical crash trend analysis
 - [ ] Cross-target duplicate detection
 - [ ] Automatic corpus optimization
 
+### Stage 2 Enhancements
+- [ ] Automatic PR creation (currently just posts fix)
+- [ ] Severity classification (security vs stability)
+- [ ] Suggest fixes to similar code patterns across codebase
+- [ ] Batch fix multiple similar crashes
+- [ ] Learn from accepted/rejected fixes to improve future attempts
+
 ## Support
 
 For issues with the fuzzing infrastructure: