diff --git a/docs/ADR/0027-ai-assisted-review.md b/docs/ADR/0027-ai-assisted-review.md
new file mode 100644
index 0000000..45ee940
--- /dev/null
+++ b/docs/ADR/0027-ai-assisted-review.md
@@ -0,0 +1,424 @@

# ADR_0027: Automated Code Review Solution Selection

## Date

Decision date:
Last status update: 2025-11-03

## Status

- [x] Proposed
- [ ] Accepted
- [ ] Deprecated
- [ ] Superseded

### Implementation Status

- [x] Planned
- [ ] In Development
- [ ] Implemented
- [ ] Verified
- [ ] Discontinued

## People

### Author/Decision Owner (Single Point of Accountability)

- Alexei Karyagin

### Consulted (Subject Matter Experts)

- Sviatoslav Alekseev

### Informed (Affected Parties)

[People/teams affected by this decision who should be aware]

- [ ] [Person 1]
- [ ] [Person 2]
- [ ] [Person 3]

## Decision

We will evaluate and select an automated code review solution that combines static code analysis with AI-powered insights to enforce quality standards, detect security vulnerabilities, and provide intelligent remediation guidance in our CI/CD pipeline.

## Context

Manual code reviews face significant challenges in modern software development environments:

### Current Challenges

* **Inconsistent quality standards** across different reviewers and projects
* **Technical debt accumulation** due to missed code smells and anti-patterns
* **Security vulnerabilities** that slip through manual inspection
* **Limited scalability** as codebases and teams grow
* **Delayed feedback loops** requiring human availability for reviews
* **Knowledge silos** where code quality depends on specific reviewers
* **Review fatigue** leading to cursory inspections of large PRs

### Business Drivers

* Need to maintain code quality while accelerating delivery velocity
* Regulatory requirements for security and compliance in code
* Growing team size requiring scalable review processes
* Desire to automate repetitive aspects of code review
* Goal to provide learning opportunities for junior developers through automated feedback

### Technical Context

* Using GitHub for version control with a PR-based workflow
* GitHub Actions available for CI/CD automation
* Multiple programming languages in use across projects
* Existing test automation but limited static analysis
* Team distributed across time zones

### Decision Criteria

The selected solution should meet these criteria:

1. **Quality Coverage**: Detect bugs, security vulnerabilities, code smells, and technical debt
2. **Intelligence**: Provide contextual explanations and remediation guidance, not just issue detection
3. **Integration**: Seamlessly integrate with GitHub Actions and the PR workflow
4. **Scalability**: Handle growing codebase and team size
5. **Learning**: Support developer education and knowledge transfer
6. **Cost-effectiveness**: Reasonable ROI considering team size and project complexity
7. **Automation**: Minimize manual configuration and maintenance

## Options

We have identified five distinct approaches to automated code review:

1. **SonarQube + Claude AI** - Static analysis with AI-powered interpretation
2. **GitHub Copilot** - Inline AI code generation during development
3. **OpenAI Codex/ChatGPT** - Conversational AI for code assistance
4. **CodeRabbit** - Automated AI-powered PR review
5. **Status Quo** - Continue with manual code reviews only
## Consequences

### Comprehensive Comparison

| Aspect | SonarQube + Claude | GitHub Copilot | OpenAI Codex/ChatGPT | CodeRabbit | Status Quo |
| :---- | :---- | :---- | :---- | :---- | :---- |
| **Primary Purpose** | Quality analysis & review | Code generation | Conversational coding help | Automated PR review | Human review |
| **Development Phase** | Post-commit (CI/CD) | During coding (inline) | During coding (chat) | Post-commit (PR) | Post-commit |
| **Feedback Type** | Reactive analysis | Proactive suggestions | On-demand assistance | Reactive analysis | Reactive human |
| **Quality Focus** | Comprehensive | Limited | Variable | Moderate | Depends on reviewer |
| **Security Analysis** | OWASP mapped | None | None | Basic | Depends on reviewer |
| **Technical Debt** | Quantified | None | None | Identified | Inconsistent |
| **AI Insights** | Deep explanations | None | Conversational | PR comments | None |
| **Integration Point** | CI/CD pipeline | IDE | External chat | GitHub PR | Manual process |
| **Learning Support** | Educational feedback | By example | Interactive teaching | Code suggestions | Depends on reviewer |
| **Automation Level** | Fully automated | Semi-automated | Manual queries | Fully automated | Manual |
| **Cost Model** | Infrastructure + API | Per-developer subscription | Per-developer or usage | Per-repository subscription | Developer time |
| **Setup Complexity** | Moderate | Easy | Easy | Easy | None |
| **Best For** | Quality gates & standards | Fast coding | Learning & debugging | Quick PR feedback | Small teams |

### Option 1: SonarQube + Claude AI (Under Evaluation)

**Description:**
Combines SonarQube's comprehensive static code analysis with Claude AI's natural language understanding to provide detailed quality analysis and intelligent remediation guidance through GitHub Actions.

**Workflow:**

Developer → Commit → GitHub Actions → SonarQube Scan →
Claude Analysis → PR Comments → Quality Gate → Merge
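To make this workflow concrete, below is a minimal sketch of the "Claude Analysis → PR Comments" step, written as a Python helper the Actions job could run after the SonarQube scan. The environment variables (`SONAR_URL`, `SONAR_PROJECT_KEY`, `PR_NUMBER`, and so on), the model name, and the script shape are illustrative assumptions, not an existing implementation:

```python
"""Sketch: summarize SonarQube findings with Claude and post them to the PR.

Assumes SONAR_URL, SONAR_TOKEN, ANTHROPIC_API_KEY, GITHUB_TOKEN and the PR
coordinates are provided by the workflow; all names here are illustrative.
"""
import os
import requests

SONAR_URL = os.environ["SONAR_URL"]            # e.g. https://sonar.example.com
PROJECT_KEY = os.environ["SONAR_PROJECT_KEY"]

def fetch_open_issues() -> list[dict]:
    """Pull unresolved issues for the project from SonarQube's Web API."""
    resp = requests.get(
        f"{SONAR_URL}/api/issues/search",
        params={"componentKeys": PROJECT_KEY, "resolved": "false", "ps": 100},
        auth=(os.environ["SONAR_TOKEN"], ""),  # token as username, empty password
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["issues"]

def explain_with_claude(issues: list[dict]) -> str:
    """Ask Claude to turn raw findings into actionable review feedback."""
    findings = "\n".join(
        f"- [{i['severity']}] {i['component']}: {i['message']}" for i in issues
    )
    resp = requests.post(
        "https://api.anthropic.com/v1/messages",
        headers={
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        json={
            "model": "claude-sonnet-4-20250514",  # illustrative model choice
            "max_tokens": 1024,
            "messages": [{
                "role": "user",
                "content": "Explain these SonarQube findings for a PR review, "
                           "with concrete remediation steps:\n" + findings,
            }],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["content"][0]["text"]

def post_pr_comment(body: str) -> None:
    """Leave the summary as a regular PR comment via the GitHub REST API."""
    repo = os.environ["GITHUB_REPOSITORY"]  # owner/name, set by Actions
    pr_number = os.environ["PR_NUMBER"]
    resp = requests.post(
        f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments",
        headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
        json={"body": body},
        timeout=30,
    )
    resp.raise_for_status()

if __name__ == "__main__":
    issues = fetch_open_issues()
    if issues:
        post_pr_comment(explain_with_claude(issues))
```

Summarizing findings in a single comment keeps Claude API cost roughly proportional to the number of findings rather than to PR size.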
**Strengths:**

* **Comprehensive Quality Coverage**: Detects bugs, vulnerabilities, code smells, duplications, and complexity issues
* **Security-First**: Maps findings to OWASP Top 10 and CWE standards
* **Technical Debt Quantification**: Provides metrics and trends over time
* **Intelligent Explanations**: Claude translates technical issues into actionable insights
* **Educational Value**: Developers learn why issues matter and how to fix them
* **Enforced Standards**: Quality gates prevent merging substandard code
* **Language Agnostic**: Supports 30+ programming languages
* **Historical Analytics**: Track code quality trends across releases
* **Customizable Rules**: Adapt to team-specific coding standards

**Limitations:**

* **Infrastructure Required**: Needs a SonarQube server deployment (or a SonarQube Cloud subscription)
* **Initial Setup**: Requires configuration of quality profiles and gates
* **API Costs**: Claude API usage costs per analysis
* **Learning Curve**: Team needs to understand SonarQube metrics and concepts
* **Post-Commit Only**: Analysis happens after code is written

**Cost Considerations:**

* SonarQube Community Edition: free (self-hosted)
* SonarQube Cloud: $10/month per 100K lines of code
* Claude API: ~$0.01-0.03 per analysis (depending on issue count)
* Infrastructure: server hosting costs for the self-hosted option

**Implementation Complexity:** Moderate (requires SonarQube setup and Claude API integration)

**Best For:**

* Teams prioritizing quality and security standards
* Organizations with compliance requirements
* Teams wanting to reduce technical debt systematically
* Projects requiring detailed quality metrics and trends

**Risks and Mitigations:**

* **Risk**: High initial setup time
  * **Mitigation**: Use SonarQube Cloud for a faster start; migrate to self-hosted later if needed
* **Risk**: Claude API costs scaling with team size
  * **Mitigation**: Implement caching, only analyze changed files (see the sketch below), set usage limits
* **Risk**: Alert fatigue from too many findings
  * **Mitigation**: Start with critical issues only, gradually expand rules
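To make the "only analyze changed files" mitigation concrete, a small helper could filter SonarQube findings down to the files a PR actually touches before anything is sent to Claude. This is a sketch assuming the Actions checkout has enough git history for the diff; the function names are illustrative:

```python
"""Sketch: restrict analysis to files changed in the PR to cap Claude API cost."""
import subprocess

def changed_files(base_ref: str = "origin/main") -> set[str]:
    """List files touched on this branch relative to the PR's target branch."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return {line.strip() for line in out.stdout.splitlines() if line.strip()}

def filter_issues(issues: list[dict], files: set[str]) -> list[dict]:
    """Keep only issues whose file path is in the PR diff.

    SonarQube component keys look like 'project-key:src/path/to/file.py'.
    """
    return [i for i in issues if i["component"].split(":", 1)[-1] in files]
```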
### Option 2: GitHub Copilot

**Description:**
AI-powered code completion tool integrated directly into IDEs, suggesting code as developers type based on context and patterns learned from billions of lines of public code.

**Workflow:**

Developer types → Copilot suggests → Developer accepts/modifies → Continue coding

**Strengths:**

* **Developer Velocity**: Significantly speeds up coding with real-time suggestions
* **Boilerplate Reduction**: Automatically generates common patterns and repetitive code
* **Multi-Language**: Supports dozens of programming languages
* **IDE Integration**: Works directly in VS Code, JetBrains IDEs, and Neovim
* **Context-Aware**: Understands function signatures, comments, and surrounding code
* **Learning Tool**: Helps developers discover language idioms and libraries
* **Easy Setup**: Simple installation and activation
* **No Infrastructure**: Fully cloud-based, no servers to manage

**Limitations:**

* **No Quality Analysis**: Doesn't evaluate existing code or enforce standards
* **No Security Scanning**: Can't detect vulnerabilities in generated or existing code
* **Suggestion Quality**: May suggest insecure or inefficient patterns
* **License Concerns**: Generated code may resemble copyrighted code
* **No Enforcement**: Can't prevent poor-quality code from being committed
* **Development Phase Only**: Doesn't help with code review or post-commit analysis
* **Dependency on Network**: Requires an internet connection for suggestions

**Cost Considerations:**

* $10/month per developer (Individual)
* $19/month per developer (Business)
* Must be paid for every developer who uses it

**Implementation Complexity:** Very easy (IDE plugin installation)

**Best For:**

* Developers writing new code from scratch
* Teams prioritizing development speed
* Learning new frameworks or languages
* Generating boilerplate and test code

**Why Not Sufficient Alone:**

While Copilot accelerates development, it doesn't provide the quality assurance and review capabilities needed to meet our decision criteria. It generates code but doesn't analyze, review, or enforce standards. (Note, however, the advice below: Copilot can also be requested as a PR reviewer, so it is not strictly IDE-only.)

---
### Option 3: OpenAI Codex/ChatGPT

**Description:**
Conversational AI interface accessed via chat that can explain code, generate functions, debug issues, and provide coding assistance through natural language interaction.

**Workflow:**

Developer writes question → ChatGPT responds → Developer applies suggestions → Manual process

**Strengths:**

* **Conversational Interface**: Natural language interaction for coding questions
* **Explanation Capability**: Excellent at explaining complex code and concepts
* **Multi-Purpose**: Handles coding, debugging, architecture questions, documentation
* **No Integration Required**: Accessible via web browser
* **Code Generation**: Can generate complete functions or classes from descriptions
* **Learning Resource**: Helps developers understand unfamiliar code patterns
* **Flexible Usage**: Pay-as-you-go or subscription model

**Limitations:**

* **No Automation**: Requires manual copy-paste and interaction
* **No CI/CD Integration**: Can't be part of an automated pipeline (though see the advice below on Codex's cloud code review feature)
* **Inconsistent Quality**: Responses vary based on prompt quality
* **No Codebase Context**: Doesn't have access to the full project context
* **No Enforcement**: Can't prevent issues from reaching production
* **Time-Consuming**: Developers must manually query for each concern
* **No Quality Metrics**: Doesn't provide quantitative technical debt measures
* **Context Window Limits**: Can't analyze large codebases at once

**Cost Considerations:**

* ChatGPT Plus: $20/month per developer
* API usage: $0.002-0.06 per 1K tokens (varies by model)
* No infrastructure costs

**Implementation Complexity:** None (direct web access) to easy (API integration; sketched below)

**Best For:**

* Individual developers learning and debugging
* Ad-hoc code explanation and generation
* Architecture brainstorming
* Documentation writing

**Why Not Sufficient Alone:**

ChatGPT is a powerful learning and development aid but lacks the automation, integration, and systematic analysis needed for consistent code quality enforcement across a team.
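As an illustration of the "easy (API integration)" end of the spectrum, the sketch below sends a diff to the OpenAI API for an on-demand review. The model choice and script shape are assumptions, not a recommendation:

```python
"""Sketch: ad-hoc, developer-initiated review via the OpenAI API.

Run by hand, e.g. `git diff main | python adhoc_review.py`; it is
deliberately *not* wired into CI, mirroring the limitation above.
"""
import os
import sys
import requests

def ask_about_diff(diff_text: str) -> str:
    """Send a diff to the chat completions endpoint and return the review."""
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-4o-mini",  # illustrative model choice
            "messages": [
                {"role": "system",
                 "content": "You are a code reviewer. Point out bugs, "
                            "security issues, and style problems."},
                {"role": "user", "content": diff_text},
            ],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask_about_diff(sys.stdin.read()))
```

Each run still depends on a developer remembering to invoke it, which is exactly the manual step that keeps this option from satisfying the automation criterion.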
### Option 4: CodeRabbit

**Description:**
AI-powered automated code review tool that analyzes pull requests line by line, providing suggestions, catching issues, and offering improvements directly in GitHub PRs.

**Workflow:**

Developer creates PR → CodeRabbit analyzes → AI comments on PR →
Developer addresses → CodeRabbit re-reviews → Merge

**Strengths:**

* **Automated PR Review**: Automatically reviews every pull request
* **Line-by-Line Analysis**: Provides specific feedback on code changes
* **GitHub Integration**: Native integration with GitHub PRs
* **Quick Setup**: Easy to install as a GitHub App
* **Learning Feedback**: Explains suggestions and best practices
* **Incremental Review**: Only reviews changed code, not the entire codebase
* **Multiple Languages**: Supports major programming languages
* **Conversational**: Can discuss suggestions via PR comments
* **No Infrastructure**: Fully SaaS, no servers to manage

**Limitations:**

* **Limited Security Analysis**: Basic security checks, not comprehensive OWASP coverage
* **No Technical Debt Metrics**: Doesn't quantify debt or provide trends
* **No Quality Gates**: Can't enforce hard blockers before merge
* **AI-Only Analysis**: No established rule sets like SonarQube's quality profiles
* **Inconsistent Depth**: Analysis depth varies with code complexity
* **Cost Scaling**: Costs increase with the number of repositories
* **Newer Tool**: Less mature than SonarQube, smaller rule base
* **Limited Customization**: Less ability to adapt rules to team standards

**Cost Considerations:**

* Free tier: limited repositories
* Pro: $15/month per repository
* Enterprise: custom pricing
* Scales with the number of active repositories

**Implementation Complexity:** Very easy (GitHub App installation)

**Best For:**

* Teams wanting quick PR feedback
* Projects needing line-level code suggestions
* Teams already using GitHub PRs extensively
* Organizations wanting easy setup

**Comparison to SonarQube + Claude:**

* **Breadth**: CodeRabbit provides quicker, lighter feedback; SonarQube provides deeper, more comprehensive analysis
* **Security**: SonarQube has more mature security rules with OWASP mapping
* **Metrics**: SonarQube provides quantitative quality metrics; CodeRabbit is more qualitative
* **Enforcement**: SonarQube quality gates can block merges (see the sketch below); CodeRabbit is advisory
* **Learning**: Both provide educational feedback, but Claude can provide deeper context
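To illustrate the enforcement difference, the sketch below shows how a CI step could hard-fail on SonarQube's quality gate, which CodeRabbit's advisory comments cannot do. It reuses the illustrative environment variables from the Option 1 sketch; a production version would first wait for the scan's background processing to finish, which is omitted here:

```python
"""Sketch: fail the CI job when the SonarQube quality gate is not passed."""
import os
import sys
import requests

def quality_gate_status(sonar_url: str, project_key: str, token: str) -> str:
    """Query the project's current quality gate status ('OK' or 'ERROR')."""
    resp = requests.get(
        f"{sonar_url}/api/qualitygates/project_status",
        params={"projectKey": project_key},
        auth=(token, ""),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["projectStatus"]["status"]

if __name__ == "__main__":
    status = quality_gate_status(
        os.environ["SONAR_URL"],
        os.environ["SONAR_PROJECT_KEY"],
        os.environ["SONAR_TOKEN"],
    )
    if status != "OK":
        # A non-zero exit fails the Actions job, which blocks the merge
        # when the check is marked as required on the branch.
        sys.exit(f"Quality gate failed: {status}")
```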
### Option 5: Status Quo (Manual Code Reviews Only)

**Description:**
Continue with the current practice of human-only code reviews through GitHub PRs, without automated analysis tools.

**Current State:**

* Developers create PRs and request reviews from team members
* Reviewers manually inspect code for quality, bugs, and standards
* No automated quality gates or security scanning
* Review quality depends on reviewer expertise and availability

**Strengths:**

* **No Additional Costs**: No tool subscriptions or infrastructure
* **Human Judgment**: Reviewers can assess business logic and architecture
* **Contextual Understanding**: Humans understand business requirements and domain logic
* **No Learning Curve**: Team already familiar with the current process

**Limitations:**

* **Inconsistent Quality**: Review depth varies by reviewer and workload
* **Knowledge Silos**: Quality depends on who reviews
* **Not Scalable**: Review time grows linearly with team size
* **Security Gaps**: Easy to miss vulnerabilities without expertise
* **No Metrics**: Can't quantify technical debt or track trends
* **Delayed Feedback**: Requires synchronous human availability
* **Fatigue Factor**: Large PRs receive cursory reviews
* **No Automation**: Manual process for every PR

**Cost Considerations:**

* Direct costs: $0
* Indirect costs: significant developer time spent on manual reviews
* Estimated: 2-4 hours per developer per week on code reviews (see the calculation below)
* Risk costs: production bugs from missed issues
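A back-of-the-envelope calculation makes the indirect cost concrete. The team size and hourly rate below are illustrative assumptions; only the 2-4 hours/week figure comes from the estimate above:

```python
# Back-of-the-envelope: manual review time vs. an example tooling budget.
developers = 10             # assumed team size
hourly_rate = 75            # assumed loaded cost per developer-hour, USD
review_hours_per_week = 3   # midpoint of the 2-4 h/week estimate above
weeks_per_month = 4.33

manual_monthly = developers * review_hours_per_week * hourly_rate * weeks_per_month
print(f"Manual review time: ~${manual_monthly:,.0f}/month")  # roughly $9.7k/month

tooling_monthly = (
    10                  # SonarQube Cloud tier (per 100K LOC)
    + 20                # Claude API at ~$0.02/analysis for ~1,000 analyses, assumed
    + developers * 19   # Copilot Business, if adopted alongside
)
print(f"Example tooling budget: ~${tooling_monthly}/month")  # $220/month here
```

This overstates the savings, since human review does not disappear entirely, but it bounds how little review time a tool needs to save to pay for itself.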
**Why Change Is Needed:**

The status quo doesn't meet our scaling needs or provide the consistency, security analysis, and quality metrics required for our growing team and compliance requirements.

## Advice

- "Have you considered the Codex? ... I also used the Codex CLI as well. ... Basically, you could pass it some task... or it will create a pull request or it also could participate in a discussion" (Sviatoslav Alekseev, 2025-10-29)
- "About the alternatives. I think [it's worth] mention[ing] that there [is] also CodeRabbit. ... I've seen Andrea use that in the pull request. It will also check automatically for the fixes" (Sviatoslav Alekseev, 2025-10-29)
- "I believe we need to collect the context for our repositories. ... What's called the style guides and pallet what we developing or it will analyze" (Sviatoslav Alekseev, 2025-10-29)
- "Does it... catch our project structure? So we have [a] separate `spec` repository and it ha[s] a project. Does it... understand that if I submit a pull request to [the] `qf-solochain` repository it should catch... context from... this project, from the issue" (Sviatoslav Alekseev, 2025-10-29)
- "Did you measure time for review and then compare tools in your usage or ... compare tools [with] manual review" (Alexander Lygin, 2025-10-29)
- "I think [Codex] ha[s an] auto code review function. So it's not like manual things... and you could have code review guidelines and automatically code review the new pull requests from the set of repositories ... Yeah, it obviously requ[ires] at least [a] ChatGPT Plus subscription" (Sviatoslav Alekseev, 2025-11-05)
- "Same about Copilot... you can add it as your reviewer. At least in an open pull request, I can see that when I request reviews, I can request review from a Copilot added to [the] reviewers list, and so it's probably not IDE only" (Alisher Khassanov, 2025-11-05)
- "What if could I enable ... Codex agent into our repository? ... For example, if I am suggesting the pull request here. ... Is there admin privileges required or what? ... I mean, if we don't want to buy another subscription, I could just share mine and use the Codex inside the pull request ... if you want to test it and check how it works" (Sviatoslav Alekseev, 2025-11-05)

## Glossary

* **Static Code Analysis**: Analyzing source code without executing it to find bugs, security vulnerabilities, and code quality issues
* **Technical Debt**: The implied cost of additional rework caused by choosing quick solutions over better approaches
* **Quality Gate**: Automated threshold that must be met before code can be merged (e.g., test coverage >80%, zero critical bugs)
* **OWASP Top 10**: The Open Web Application Security Project's list of the 10 most critical web application security risks
* **CWE**: Common Weakness Enumeration, a community-developed list of software and hardware weakness types
* **Code Smell**: Surface indication of a deeper problem in the code (e.g., duplicated code, long methods)
* **Cyclomatic Complexity**: Measure of code complexity based on the number of independent paths through the code
* **SonarQube**: Open-source platform for continuous inspection of code quality
* **Claude**: Large language model developed by Anthropic with strong code understanding capabilities
* **GitHub Actions**: CI/CD platform integrated with GitHub for automating workflows
* **CodeRabbit**: AI-powered automated code review service
* **GitHub Copilot**: AI pair programmer that suggests code completions in real time

## References

### Tool Documentation

* **SonarQube Server**: [https://docs.sonarsource.com/sonarqube-server/](https://docs.sonarsource.com/sonarqube-server/)
* **SonarQube Cloud**: [https://docs.sonarsource.com/sonarqube-cloud/](https://docs.sonarsource.com/sonarqube-cloud/)
* **Claude API**: [https://docs.claude.com/en/api/overview](https://docs.claude.com/en/api/overview)
* **GitHub Actions**: [https://docs.github.com/en/actions](https://docs.github.com/en/actions)
* **CodeRabbit**: [https://coderabbit.ai/docs](https://coderabbit.ai/docs)
* **GitHub Copilot**: [https://docs.github.com/en/copilot](https://docs.github.com/en/copilot). [How to request a code review from Copilot](https://docs.github.com/en/copilot/how-tos/use-copilot-agents/request-a-code-review/use-code-review)
* **OpenAI Codex/ChatGPT**: [https://developers.openai.com/codex/cloud/code-review/](https://developers.openai.com/codex/cloud/code-review/)

### Standards and Best Practices

* **OWASP Top 10**: [https://owasp.org/www-project-top-ten/](https://owasp.org/www-project-top-ten/)
* **CWE**: [https://cwe.mitre.org/](https://cwe.mitre.org/)
* **Clean Code Principles**: [https://clean-code-developer.com/](https://clean-code-developer.com/)
* **SonarQube Quality Profiles**: [https://docs.sonarsource.com/sonarqube-server/quality-standards-administration/managing-quality-profiles/introduction](https://docs.sonarsource.com/sonarqube-server/quality-standards-administration/managing-quality-profiles/introduction)

## ADR Relationships

[Fill in the section if applicable or leave blank for further filling.]

### Supersedes

- ADR #[X]: [Brief description of superseded decision]

### Superseded By

- ADR #[X]: [Brief description of superseding decision]

### Related ADRs

- ADR #[X]: [Brief description of relationship]