feat(ci): add investigation-only claim backstop (ADR-034 Phase 3) #1214
rjmurillo-bot wants to merge 6 commits into main
Add CI workflow to validate investigation-only QA skip claims. Detects commits claiming "SKIPPED: investigation-only" but containing files outside the allowed paths.
Files:
- validate_investigation_claims.py: Script to check claims
- test_validate_investigation_claims.py: 22 pytest tests
- investigation-claim-backstop.yml: CI workflow for PR validation
Partial fix for #652: Implements P2 CI backstop acceptance criterion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PR Validation Report
Caution: ❌ Status: FAIL
Description Validation
QA Validation
Code Review
The pull request introduces a CI backstop for 'investigation-only' QA skip claims, ensuring that commits with such claims only modify allowed files. The changes include a Python script for validation, comprehensive unit tests, and a GitHub Actions workflow to trigger the validation. The implementation appears robust, with good test coverage for various scenarios. The use of subprocess.run for git commands is appropriate, and error handling for file parsing is present. The solution aligns with the ADR-034 Phase 3 requirements. Two security review comments regarding command injection prevention in subprocess.run calls have been retained.
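The command injection concern raised in this review can be illustrated with a minimal sketch. This is not the PR's code: `validate_sha`, `build_show_command`, and `changed_files` are hypothetical names, and the assumption that commit identifiers are plain hexadecimal SHAs is ours. The idea is to pass arguments as a list (no shell) and to reject anything that does not look like a SHA before it reaches git.

```python
import re
import subprocess

# Assumption: commit identifiers are plain hexadecimal SHAs (7 to 40 characters).
_SHA_RE = re.compile(r"[0-9a-fA-F]{7,40}")


def validate_sha(value: str) -> str:
    """Return the SHA unchanged, or raise ValueError if it is not plain hex."""
    if not _SHA_RE.fullmatch(value):
        raise ValueError(f"not a commit SHA: {value!r}")
    return value


def build_show_command(commit: str) -> list[str]:
    """Build the argument list for listing the files changed in a commit."""
    # --format= suppresses the commit header so only file names are printed.
    return ["git", "show", "--name-only", "--format=", validate_sha(commit)]


def changed_files(commit: str) -> list[str]:
    """Run git without a shell and return the changed file paths."""
    result = subprocess.run(
        build_show_command(commit),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

Because the arguments are a list and `shell=True` is never used, a value such as `HEAD; rm -rf /` is never interpreted by a shell, and the SHA check additionally stops option-like values (for example one starting with `--`) from being passed to git.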
Caution: Review failed. Failed to post review comments.
📝 Walkthrough
Adds a CI backstop and validator for ADR-034 "investigation-only" skip claims: new script scans
Changes
Sequence Diagram
sequenceDiagram
participant PR as Pull Request
participant GA as GitHub Actions
participant VAL as Validator
participant SESS as Session Files
participant GIT as Git CLI
participant API as GitHub Comment API
PR->>GA: Trigger workflow for changed session JSONs
GA->>VAL: Run validator targeting session-dir
VAL->>SESS: Read session JSON files
VAL->>VAL: Detect "SKIPPED: investigation-only" claims (regex)
VAL->>GIT: git log -1 -- <session_file> -> get commit SHA
GIT-->>VAL: Return commit SHA
VAL->>GIT: git show --name-only <commit> -> list changed files
GIT-->>VAL: Return changed files
VAL->>VAL: Compare changed files to allowlist -> collect violations
VAL-->>GA: Emit verdict, violation_count, violations (stdout / outputs)
alt Violations found
GA->>API: Post INVESTIGATION-CLAIM-BACKSTOP comment with details
else No violations
GA-->>PR: Workflow completes without comment
end
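The validator steps in the diagram above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script from the PR: the regex, the `--format=%H` and `--format=` options, the three allowlist prefixes (taken from the PR description, which ends its list with "etc."), and the names `find_violations` and `run_git` are all hypothetical. Git access is injected as a callable so the logic can be exercised without a repository.

```python
import re
from typing import Callable

# Assumption: the claim is matched case-insensitively with flexible spacing.
CLAIM_RE = re.compile(r"SKIPPED:\s*investigation-only", re.IGNORECASE)

# Assumption: only the directories named in the PR description; the real list may be longer.
ALLOWED_PREFIXES = (".agents/sessions/", ".agents/analysis/", ".serena/memories/")

# A callable that takes a git argument list and returns the command's stdout.
GitRunner = Callable[[list[str]], str]


def has_claim(session_text: str) -> bool:
    """Detect an investigation-only skip claim in raw session file text."""
    return CLAIM_RE.search(session_text) is not None


def find_violations(session_file: str, session_text: str, run_git: GitRunner) -> list[str]:
    """Return changed files outside the allowlist for a claiming session."""
    if not has_claim(session_text):
        return []
    # Step 1: find the most recent commit that touched the session file.
    commit = run_git(["git", "log", "-1", "--format=%H", "--", session_file]).strip()
    if not commit:
        return []
    # Step 2: list every file changed in that commit.
    changed = run_git(["git", "show", "--name-only", "--format=", commit]).splitlines()
    # Step 3: anything outside the allowed prefixes is a violation.
    return [p for p in changed if p.strip() and not p.startswith(ALLOWED_PREFIXES)]
```

In a workflow, `run_git` would wrap `subprocess.run`, and a non-empty result would drive the failing verdict and the PR comment.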
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Remove <code> tag inside <details><summary> that violated markdownlint MD033 rule (no-inline-html). Plain text is sufficient in this context.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AI Quality Gate Review
Warning
Walkthrough
This PR was reviewed by six AI agents in parallel, analyzing different aspects of the changes:
Review Summary
💡 Quick Access: Click on individual agent jobs (e.g., "🔒 security Review", "🧪 qa Review") in the workflow run to see detailed findings and step summaries.
Security Review Details
Now I have enough context to complete the security review.
Security Review: PR #1214
PR Type Classification
Primary categories: CODE, WORKFLOW - Full security review required.
Findings
Security Controls Implemented [PASS]
Low-Severity Observation (Non-Blocking)
The
Recommendation: No action required. Current validation is appropriate for the threat model.
Recommendations
None required. The implementation follows secure coding practices.
Verdict
QA Review Details
Now let me check the README changes and verify test execution details:
Let me use view instead:
Based on my analysis, I have sufficient information to provide the QA verdict.
QA Review: PR #1214
Test Coverage Assessment
Quality Concerns
Regression Risk Assessment
Code Quality Review
Workflow Compliance (ADR-006)
Pre-executed Test Results
Analyst Review Details
Let me check the local repository for these files directly.
Now I have enough context to complete the analysis.
Let me also check the
I now have sufficient context to complete the PR analysis. Let me produce the final report.
PR #1214 Analysis: Investigation-Only Claim Backstop (ADR-034 Phase 3)
Code Quality Score
Overall: 4.5/5
Impact Assessment
Findings
Recommendations
Architectural Alignment
Verdict
Architect Review Details
I have sufficient information. Let me provide the architectural review.
Design Quality Assessment
Overall Design Score: 4/5
Architectural Concerns
Allowlist Consistency Analysis
The PR introduces a local
Local patterns (validate_investigation_claims.py):
Shared module (investigation_allowlist.py):
The script imports
ADR-034 Compliance Check
ADR-034 states
However,
Breaking Change Assessment
Technical Debt Analysis
ADR Assessment
Recommendations
Verdict
DevOps Review Details
Let me also check the README changes and verify any shared module imports.
Now I have all the information I need to complete the DevOps review. Let me compile my analysis:
DevOps Review: PR #1214
PR Scope Detection
Review Focus: Full CI/CD and script quality review for WORKFLOW and SCRIPT files.
Pipeline Impact Assessment
CI/CD Quality Checks
Findings
Note: The script imports from
Template Assessment
Automation Opportunities
Recommendations
Verdict
Details:
Roadmap Review Details
Based on my review of the PR, ADR-034, the product roadmap, and the implementation details, I can now provide the strategic alignment assessment.
Roadmap Review: PR #1214
Strategic Alignment Assessment
Feature Completeness
Impact Analysis
Concerns
Recommendations
Verdict
Run Details
Powered by AI Quality Gate workflow
Re-triggering CI after description fix
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Combine both the branch's session-log-based validation (dataclasses, allowlist patterns, session file parsing) and main's shared-allowlist diff-based validation into a unified script with --mode flag.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix mypy type error (variable shadowing ClaimViolation with str), guard against bool values in session JSON fields, extract validation modes into separate functions, and default to diff mode for backward compatibility with existing tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
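A rough sketch of two ideas from the commits above: a `--mode` flag that defaults to diff mode, and a guard against bool values in session JSON fields. The mode names (`diff`, `session`) and the helper `read_count` are hypothetical, and the commit does not say which fields are guarded. One common reason for such a guard in Python is that `bool` is a subclass of `int`, so a plain `isinstance(value, int)` check would accept `True`.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with a validation mode switch."""
    parser = argparse.ArgumentParser(description="Validate investigation-only claims")
    # Assumption: the mode names are illustrative; "diff" is the default so that
    # callers that pass no flag keep the earlier behaviour.
    parser.add_argument("--mode", choices=("diff", "session"), default="diff")
    return parser


def read_count(session: dict, key: str) -> Optional[int]:
    """Return an integer field, treating bools and non-integers as missing."""
    value = session.get(key)
    # bool is a subclass of int, so it must be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return None
    return value
```

For example, `build_parser().parse_args([]).mode` yields `"diff"`, and `read_count({"n": True}, "n")` yields `None` rather than `1`.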
Session Protocol Compliance Report
Tip: ✅ Overall Verdict: PASS
All session protocol requirements satisfied.
What is Session Protocol?
Session logs document agent work sessions and must comply with RFC 2119 requirements:
See
Compliance Summary
Detailed Validation Results
Click each session to see the complete validation report with specific requirement failures.
📄 sessions-2026-02-25-session-1
=== Session Validation ===
[PASS] Session log is valid
✨ Zero-Token Validation
This validation uses deterministic script analysis instead of AI:
Powered by
📊 Run Details
Powered by Session Protocol Validator workflow
Review Triage Required
Note: Priority: NORMAL - Human approval required before bot responds
Review Summary
Next Steps
Powered by PR Maintenance workflow - Add triage:approved label
Summary
Implements CI backstop for investigation-only QA skip claims per ADR-034 Phase 3 (Issue #652).
Changes
- .github/scripts/validate_investigation_claims.py: Python script that validates commits claiming "SKIPPED: investigation-only" have only allowed files (.agents/sessions/, .agents/analysis/, .serena/memories/, etc.)
- .github/scripts/tests/test_validate_investigation_claims.py: 22 pytest tests covering allowlist matching, claim detection, and validation logic
- .github/workflows/investigation-claim-backstop.yml: GitHub Actions workflow triggered on session log changes
- README.md: Updated project documentation
How It Works
.agents/sessions/*.json
Test Plan
Fixes #652 (P2: CI backstop acceptance criterion)
Note: P1 (agent self-service skill) was already completed in PR #811.
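For readers unfamiliar with the allowlist matching that the tests cover, here is a minimal sketch with pytest-style tests. It is an assumption-laden illustration, not the PR's implementation: `ALLOWED_DIRS` contains only the three directories named in the description (which ends its list with "etc."), `is_allowed` is a hypothetical name, and the path normalization step is our addition.

```python
import posixpath

# Assumption: only the directories named in the PR description; the real list may be longer.
ALLOWED_DIRS = (".agents/sessions", ".agents/analysis", ".serena/memories")


def is_allowed(path: str) -> bool:
    """Return True when the path resolves inside one of the allowed directories."""
    # Normalize separators and collapse "./" and "../" segments before matching,
    # so ".agents/sessions/../../src/app.py" is not mistaken for a session file.
    normalized = posixpath.normpath(path.replace("\\", "/"))
    return any(normalized.startswith(d + "/") for d in ALLOWED_DIRS)


def test_allows_session_log():
    assert is_allowed(".agents/sessions/example.json")


def test_allows_leading_dot_slash():
    assert is_allowed("./.serena/memories/note.md")


def test_rejects_source_file():
    assert not is_allowed("src/app.py")


def test_rejects_traversal():
    assert not is_allowed(".agents/sessions/../../src/app.py")
```

Matching on `d + "/"` rather than the bare directory name avoids accepting a sibling such as `.agents/sessions-old/`.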