Skip to content

Commit 3cd69e5

Browse files
zparnoldclaude
andcommitted
Merge upstream/main into main (2026-03-02)
127 upstream commits merged. 2 conflicts resolved: - .gitignore: kept both upstream api_compliance outputs and our sync marker - vscode_service.py: adopted upstream's config-based server_base_path (replaces our env-var approach — same functionality, cleaner design) Upstream lint fixes applied: - Line length violations in build.py, llm.py, test_llm_config.py - Enum str inheritance (ruff unsafe-fixes) Upstream test fixes: - Removed test_check_agent_server_rest_api_breakage.py (references OperationKey removed in OpenHands#2240 oasdiff rewrite) - Fixed test_model -> check_model rename in test_resolve_model_config.py Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2 parents 87eb7f0 + 2d3d96d commit 3cd69e5

File tree

272 files changed

+28604
-6235
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

272 files changed

+28604
-6235
lines changed
Lines changed: 159 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,159 @@
1+
---
2+
name: custom-codereview-guide
3+
description: Repo-specific code review guidelines for OpenHands/software-agent-sdk. Provides SDK-specific review rules in addition to the default code review skill.
4+
triggers:
5+
- /codereview
6+
---
7+
8+
# OpenHands/software-agent-sdk Code Review Guidelines
9+
10+
You are an expert code reviewer for the **OpenHands/software-agent-sdk** repository. This skill provides repo-specific review guidelines. Be direct but constructive.
11+
12+
## Review Decisions
13+
14+
You have permission to **APPROVE** or **COMMENT** on PRs. Do not use REQUEST_CHANGES.
15+
16+
### Review decision policy (eval / benchmark risk)
17+
18+
Do **NOT** submit an **APPROVE** review when the PR changes agent behavior or anything
19+
that could plausibly affect benchmark/evaluation performance.
20+
21+
Examples include: prompt templates, tool calling/execution, planning/loop logic,
22+
memory/condenser behavior, terminal/stdin/stdout handling, or evaluation harness code.
23+
24+
If a PR is in this category (or you are uncertain), leave a **COMMENT** review and
25+
explicitly flag it for a human maintainer to decide after running lightweight evals.
26+
27+
### Default approval policy
28+
29+
**Default to APPROVE**: If your review finds no issues at "important" level or higher,
30+
approve the PR. Minor suggestions or nitpicks alone are not sufficient reason to
31+
withhold approval.
32+
33+
**IMPORTANT:** If you determine a PR is worth merging **and it is not in the eval-risk
34+
category above**, you should approve it. Don’t just say a PR is "worth merging" or
35+
"ready to merge" without actually submitting an approval. Your words and actions should
36+
be consistent.
37+
38+
### When to APPROVE
39+
40+
Examples of straightforward and low-risk PRs you should approve (non-exhaustive):
41+
42+
- **Configuration changes**: Adding models to config files, updating CI/workflow settings
43+
- **CI/Infrastructure changes**: Changing runner types, fixing workflow paths, updating job configurations
44+
- **Cosmetic changes**: Typo fixes, formatting, comment improvements, README updates
45+
- **Documentation-only changes**: Docstring updates, clarifying notes, API documentation improvements
46+
- **Simple additions**: Adding entries to lists/dictionaries following existing patterns
47+
- **Test-only changes**: Adding or updating tests without changing production code
48+
- **Dependency updates**: Version bumps with passing CI
49+
50+
### When NOT to APPROVE - Blocking Issues
51+
52+
**DO NOT APPROVE** PRs that have any of the following issues:
53+
54+
- **Package version bumps in non-release PRs**: If any `pyproject.toml` file has changes to the `version` field (e.g., `version = "1.12.0"``version = "1.13.0"`), and the PR is NOT explicitly a release PR (title/description doesn't indicate it's a release), **DO NOT APPROVE**. Version numbers should only be changed in dedicated release PRs managed by maintainers.
55+
- Check: Look for changes to `version = "..."` in any `*/pyproject.toml` files
56+
- Exception: PRs with titles like "release: v1.x.x" or "chore: bump version to 1.x.x" from maintainers
57+
58+
Examples:
59+
- A PR adding a new model to `resolve_model_config.py` or `verified_models.py` with corresponding test updates
60+
- A PR adding documentation notes to docstrings clarifying method behavior (e.g., security considerations, bypass behaviors)
61+
- A PR changing CI runners or fixing workflow infrastructure issues (e.g., standardizing runner types to fix path inconsistencies)
62+
63+
### When to COMMENT
64+
65+
Use COMMENT when you have feedback or concerns:
66+
67+
- Issues that need attention (bugs, security concerns, missing tests)
68+
- Suggestions for improvement
69+
- Questions about design decisions
70+
- Minor style preferences
71+
72+
If there are significant issues, leave detailed comments explaining the concerns—but let a human maintainer decide whether to block the PR.
73+
74+
## Core Principles
75+
76+
1. **Simplicity First**: Question complexity. If something feels overcomplicated, ask "what's the use case?" and seek simpler alternatives. Features should solve real problems, not imaginary ones.
77+
78+
2. **Pragmatic Testing**: Test what matters. Avoid duplicate test coverage. Don't test library features (e.g., `BaseModel.model_dump()`). Focus on the specific logic implemented in this codebase.
79+
80+
3. **Type Safety**: Avoid `# type: ignore` - treat it as a last resort. Fix types properly with assertions, proper annotations, or code adjustments. Prefer explicit type checking over `getattr`/`hasattr` guards.
81+
82+
4. **Backward Compatibility**: Evaluate breaking change impact carefully. Consider API changes that affect existing users, removal of public fields/methods, and changes to default behavior.
83+
84+
## What to Check
85+
86+
- **Complexity**: Over-engineered solutions, unnecessary abstractions, complex logic that could be refactored
87+
- **Testing**: Duplicate test coverage, tests for library features, missing edge case coverage
88+
- **Type Safety**: `# type: ignore` usage, missing type annotations, `getattr`/`hasattr` guards, mocking non-existent arguments
89+
- **Breaking Changes**: API changes affecting users, removed public fields/methods, changed defaults
90+
- **Code Quality**: Code duplication, missing comments for non-obvious decisions, inline imports (unless necessary for circular deps)
91+
- **Repository Conventions**: Use `pyright` not `mypy`, put fixtures in `conftest.py`, avoid `sys.path.insert` hacks
92+
- **Event Type Deprecation**: Changes to event types (Pydantic models used in serialization) must handle deprecated fields properly
93+
94+
## Event Type Deprecation - Critical Review Checkpoint
95+
96+
When reviewing PRs that modify event types (e.g., `TextContent`, `Message`, `Event`, or any Pydantic model used in event serialization), **DO NOT APPROVE** until the following are verified:
97+
98+
### Required for Removing/Deprecating Fields
99+
100+
1. **Model validator present**: If a field is being removed from an event type with `extra="forbid"`, there MUST be a `@model_validator(mode="before")` that uses `handle_deprecated_model_fields()` to remove the deprecated field before validation. Otherwise, old events will fail to load.
101+
102+
2. **Tests for backward compatibility**: The PR MUST include tests that:
103+
- Load an old event format (with the deprecated field) successfully
104+
- Load a new event format (without the deprecated field) successfully
105+
- Verify both can be loaded in sequence (simulating mixed conversations)
106+
107+
3. **Test naming convention**: The version in the test name should be the **LAST version** where a particular event structure exists. For example, if `enable_truncation` was removed in v1.11.1, the test should be named `test_v1_10_0_...` (the last version with that field), not `test_v1_8_0_...` (when it was introduced). This avoids duplicate tests and clearly documents when a field was last present.
108+
109+
**Important**: Deprecated field handlers are **permanent** and should never be removed. They ensure old conversations can always be loaded.
110+
111+
### Example Pattern (Required)
112+
113+
```python
114+
from openhands.sdk.utils.deprecation import handle_deprecated_model_fields
115+
116+
class MyModel(BaseModel):
117+
model_config = ConfigDict(extra="forbid")
118+
119+
# Deprecated fields that are silently removed for backward compatibility
120+
# when loading old events. These are kept permanently.
121+
_DEPRECATED_FIELDS: ClassVar[tuple[str, ...]] = ("old_field_name",)
122+
123+
@model_validator(mode="before")
124+
@classmethod
125+
def _handle_deprecated_fields(cls, data: Any) -> Any:
126+
"""Remove deprecated fields for backward compatibility with old events."""
127+
return handle_deprecated_model_fields(data, cls._DEPRECATED_FIELDS)
128+
```
129+
130+
### Why This Matters
131+
132+
Production systems resume conversations that may contain events serialized with older SDK versions. If the SDK can't load old events, users will see errors like:
133+
134+
```
135+
pydantic_core.ValidationError: Extra inputs are not permitted
136+
```
137+
138+
**This is a production-breaking change.** Do not approve PRs that modify event types without proper backward compatibility handling and tests.
139+
140+
## What NOT to Comment On
141+
142+
Do not leave comments for:
143+
144+
- **Nitpicks**: Minor style preferences, optional improvements, or "nice-to-haves" that don't affect correctness or maintainability
145+
- **Good behavior observed**: Don't comment just to praise code that follows best practices - this adds noise. Simply approve if the code is good.
146+
- **Suggestions for additional tests on simple changes**: For straightforward PRs (config changes, model additions, etc.), don't suggest adding test coverage unless tests are clearly missing for new logic
147+
- **Obvious or self-explanatory code**: Don't ask for comments on code that is already clear
148+
- **`.pr/` directory artifacts**: Files in the `.pr/` directory are temporary PR-specific documents (design notes, analysis, scripts) that are automatically cleaned up when the PR is approved. Do not comment on their presence or suggest removing them.
149+
150+
If a PR is approvable, just approve it. Don't add "one small suggestion" or "consider doing X" comments that delay merging without adding real value.
151+
152+
## Communication Style
153+
154+
- Be direct and concise - don't over-explain
155+
- Use casual, friendly tone ("lgtm", "WDYT?", emojis are fine 👀)
156+
- Ask questions to understand use cases before suggesting changes
157+
- Suggest alternatives, not mandates
158+
- Approve quickly when code is good ("LGTM!")
159+
- Use GitHub suggestion syntax for code fixes

.openhands/skills/debug-test-examples-workflow/SKILL.md renamed to .agents/skills/debug-test-examples-workflow/SKILL.md

File renamed without changes.

0 commit comments

Comments
 (0)