Add learnings from code review analysis#2280
Conversation
Distilled patterns from recent PR reviews into: - New skill: .openhands/skills/eval-risk-assessment/SKILL.md - AGENTS.md guideline updates (Code Review Learnings section) Analysis summary: - PRs analyzed: 15 merged PRs from the past 30 days - Comments processed: 25+ review comments - Patterns identified: - Eval Risk Assessment (3+ occurrences) - Testing new behavior - Input validation - API documentation Co-authored-by: openhands <openhands@all-hands.dev>
API breakage checks (Griffe)Result: Failed Log excerpt (first 1000 characters) |
Agent server REST API breakage checks (OpenAPI)Result: Passed |
Focus on actionable guidelines rather than implementation details of how the skills were created. Co-authored-by: openhands <openhands@all-hands.dev>
The AGENTS.md file itself is coding guidelines - no need for a separate wrapper section. Co-authored-by: openhands <openhands@all-hands.dev>
|
@OpenHands for eval-risk-assessment could we instead add a single line to AGENTS.md explaining the cases where integration tests should be run? So this whole PR can be one line. |
|
I'm on it! neubig can track my progress at all-hands.dev |
Remove the separate eval-risk-assessment skill and consolidate guidance into a single line in the TESTING section of AGENTS.md. Co-authored-by: openhands <openhands@all-hands.dev>
SummaryI addressed the reviewer's request to simplify the PR from a separate skill file to a single line in AGENTS.md. Checklist✅ Request addressed: Removed the 50-line ✅ Concise changes: The PR is now exactly one line added to the
✅ Already pushed: Changes were committed and pushed to Final Diff- .openhands/skills/eval-risk-assessment/SKILL.md (deleted, 50 lines)
+ AGENTS.md: 1 line added to TESTING sectionThe PR now matches the reviewer's request for "one line" capturing when integration tests should be run. |
The integration tests can be triggered by adding the 'integration-test' label to a PR, which the agent can do. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Good Taste - Pragmatic Documentation
The AGENTS.md guideline addition is clear and actionable. Flagging eval-risk changes for integration testing solves a real problem (preventing benchmark regressions) without over-engineering.
🟡 PR Description Mismatch
The description claims to add .openhands/skills/eval-risk-assessment/SKILL.md, but the commit history shows you simplified it to a single line in AGENTS.md instead (see "Simplify eval-risk to single AGENTS.md line"). The PR description should be updated to reflect this pragmatic decision.
Verdict: Content is solid. Just update the description to match what shipped, then this is ready to merge.
Summary
Adds code review learnings distilled from PR feedback analysis to AGENTS.md and creates a new eval-risk-assessment skill.
Generated using the
learn-from-code-reviewskill.Details
Analysis Summary
New Skill:
.openhands/skills/eval-risk-assessment/SKILL.mdGuidelines for identifying and flagging changes that could affect benchmark/evaluation performance:
AGENTS.md Updates
Added general guidelines distilled from review feedback:
Testing
Content-only PR (documentation/skills). No unit tests applicable.
Verified:
Evidence
Content-only PR - no live run required.
The learnings were validated by cross-referencing with the source PR reviews (links included in the PR files).
Checklist