
Add new capabilities abstraction + make agents serializable #4640

Draft
DouweM wants to merge 121 commits into main from capabilities

Conversation

Collaborator

@DouweM DouweM commented Mar 13, 2026

Introduces ExecutionEnvironment ABC and three implementations
(LocalEnvironment, DockerEnvironment, MemoryEnvironment) along with
ExecutionEnvironmentToolset for exposing coding-agent-style tools
(ls, shell, read_file, write_file, replace_str, glob, grep).

This is the foundation for building coding agents and other agents
that need shell and filesystem access, split out from the broader
code-mode work for independent review and merge.
When multiple agent.run() calls execute concurrently, a shared environment
means they all operate on the same filesystem and processes. The new
environment_factory parameter creates a fresh, isolated environment per
async-with entry using ContextVar-scoped state.

Also renames environment → shared_environment to make concurrency semantics
explicit (positional arg, so existing callers still work).
Mark huggingface and outlines-vllm-offline extras as conflicting in uv,
and exclude outlines-vllm-offline from --all-extras in CI and Makefile.
- Fix _recv_stream EOF check to distinguish zero-size frames from actual EOF
- Make MemoryEnvironment.capabilities dynamic: include 'shell' when command_handler is set
- Fix LocalEnvironment.grep to use rglob for recursive file search with glob_pattern
- Fix glob_match to use regex for all patterns (fnmatch incorrectly matches '/' with '*')
- Fix build_glob_cmd: add parentheses for correct find operator precedence, fix ./ prefix for -path
- Add double-enter guard in DockerEnvironment._setup to prevent container leak
- Add DockerEnvironment.hardened() convenience constructor for security best practices
- Rename docker-sandbox optional dependency to docker-environment
- Rename 'env' variable to 'environment' in docs to avoid confusion with env vars
- Add lifecycle tip about pre-starting the toolset in docs
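For illustration, a toy `glob_match` along the lines described above, where `*` stays within one path segment while `fnmatch` does not (a sketch under those assumptions, not the PR's implementation):

```python
import fnmatch
import re

def glob_match(pattern: str, path: str) -> bool:
    """Toy glob matcher: '*' and '?' never cross '/', while '**/' spans levels."""
    regex = ''
    i = 0
    while i < len(pattern):
        if pattern[i:i + 3] == '**/':
            regex += '(?:.*/)?'   # zero or more whole directory levels
            i += 3
        elif pattern[i] == '*':
            regex += '[^/]*'      # stay within one path segment
            i += 1
        elif pattern[i] == '?':
            regex += '[^/]'
            i += 1
        else:
            regex += re.escape(pattern[i])
            i += 1
    return re.fullmatch(regex, path) is not None

# fnmatch happily lets '*' cross '/', which is the bug described above:
assert fnmatch.fnmatch('a/b.py', '*.py')
assert not glob_match('*.py', 'a/b.py')
assert glob_match('**/b.py', 'a/b.py') and glob_match('**/b.py', 'b.py')
```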
Tools are now registered unconditionally at init time and filtered in
get_tools() based on the current environment's capabilities. This fixes
the issue where environment_factory or use_environment() could expose
tools unsupported by the runtime environment.

Also unifies the Capability type — removes the toolset-level Capability
(with edit_file) and EditStrategy types, using the environment-level
Capability (with replace_str/apply_patch) everywhere.
- Add `ToolName` literal type for tool-level names exposed to the model
  (`edit_file` instead of `edit_file:replace_str`/`edit_file:apply_patch`)
- `include`/`exclude` now accept `ToolName` values (e.g. `edit_file`)
  instead of env-level `Capability` values
- Rename `_resolve_capabilities` → `_resolve_tool_names`, which maps env
  capabilities to tool names then applies include/exclude filtering
- Rename `replace_str` tool → `edit_file` (the function exposed to models)
- Update `Capability` values: `replace_str` → `edit_file:replace_str`,
  `apply_patch` → `edit_file:apply_patch` in all environments
- Update docs and tests
…rep glob filtering

- Rename `Capability` to `EnvCapability` for clarity
- Remove unused `instructions()` method from base class
- Fix `_resolve_edit_tool` to fall back to auto-detection when env doesn't support the explicit strategy
- Fix `MemoryEnvironment.grep` to skip glob filtering for exact file paths, matching `LocalEnvironment` behavior
- Rename `Capability` → `EnvCapability` to free up the name for other use
- `_resolve_edit_tool` now falls back to auto-detection when the explicit
  `edit_strategy` isn't supported by the environment
- Remove `instructions` method from base class and DockerEnvironment,
  along with associated tests
- Update all imports and type annotations across environments and tests
Collapse the two separate Literal types (EnvCapability for what environments
can do, ToolName for what's exposed to models) into a single EnvToolName,
since they now map 1:1. Remove the premature apply_patch method, the
edit_strategy parameter, and the _resolve_edit_tool() machinery.
- Move shell_escape, build_read_file_cmd, build_grep_cmd, build_glob_cmd,
  filter_grep_count_output, parse_glob_output from _base.py to docker.py
  as private helpers (_shell_escape, etc.)
- Fix grep skipping explicitly-specified hidden files in LocalEnvironment
  and MemoryEnvironment (e.g. grep(pattern, path='.env') now works)
Docker's grep defaults to BRE where |, +, ? are literal characters.
Local/Memory environments use Python's re.compile() which is closer to
ERE. Adding -E makes Docker grep behavior consistent.
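The discrepancy is easy to see from Python, whose `re` module treats these metacharacters the way ERE (`grep -E`) does (a rough illustration of the behavior gap, not the PR's code):

```python
import re

# Python's re treats |, + and ? as metacharacters, much like POSIX ERE
# (grep -E). Plain grep defaults to BRE, where these are literal.
assert re.search(r'cat|dog', 'hotdog') is not None   # ERE-style alternation
assert re.search(r'ab+c', 'abbbc') is not None       # ERE-style repetition

# The BRE reading of the same pattern is the literal string 'cat|dog',
# which does not occur in 'hotdog':
assert re.search(re.escape('cat|dog'), 'hotdog') is None
```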
- Add tests for Docker process wait polling, recv_stderr, stream
  buffering, hardened constructor, setup early return, is_alive,
  read_file binary fallback, ls edge cases
- Add tests for Local recv without timeout, EndOfStream, binary
  read_file, grep truncation
- Add tests for Memory ls dedup, grep truncation
- Mark defensive Docker branches with # pragma: no cover
- Mark Docker __aenter__/__aexit__ with # pragma: lax no cover
Aligns edit_file exception handling with read_file and write_file,
which already catch these errors for path traversal and OS-level failures.
- Raise ValueError when offset exceeds file length, matching Local/Memory
- Catch docker.errors.NotFound in _read_file_bytes_sync, convert to FileNotFoundError
- Update MockContainer awk handler to simulate offset/limit behavior
ExecutionEnvironmentToolset.get_tools() now pulls tool descriptions from
the active environment's method docstrings when present, replacing the
generic defaults. This lets each environment document its specific behavior
for the LLM (e.g. regex syntax for grep).

- DockerEnvironment.grep: documents POSIX ERE (grep -E) limitations
- LocalEnvironment.grep / MemoryEnvironment.grep: notes Python re syntax
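A hedged sketch of the docstring-preference idea (class and helper names here are hypothetical, and the generic default is invented for illustration):

```python
import inspect

GENERIC_DESCRIPTIONS = {'grep': 'Search file contents for a pattern.'}

class LocalEnv:
    def grep(self, pattern: str, path: str) -> str:
        """Search files using Python `re` syntax (not POSIX ERE)."""
        return ''

class BareEnv:
    pass  # no grep method defined, so the generic description applies

def tool_description(env: object, name: str) -> str:
    """Prefer the active environment's method docstring, else the generic default."""
    method = getattr(env, name, None)
    doc = inspect.getdoc(method) if method is not None else None
    return doc or GENERIC_DESCRIPTIONS[name]

assert 'Python `re` syntax' in tool_description(LocalEnv(), 'grep')
assert tool_description(BareEnv(), 'grep') == GENERIC_DESCRIPTIONS['grep']
```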
Avoids a subtle interaction where use_environment() override could be
entered into the shared exit stack instead of the actual shared
environment.
find's -path treats the literal / in **/ as requiring at least one
directory level. Generalize the existing startswith('**/') handling
to cover **/ appearing anywhere in the pattern by generating all
collapsed variants.
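The collapsed-variants idea can be sketched as follows (a hypothetical helper, not the PR's code): each `**/` in the pattern is either kept or dropped, so `find -path` gets one variant per combination, including the zero-directory-level reading that `-path` alone cannot express.

```python
def collapsed_variants(pattern: str) -> list[str]:
    """All patterns obtained by optionally collapsing each '**/' to nothing.

    find's -path treats the '/' in '**/' as requiring at least one directory
    level, so 'src/**/x.py' must also be tried as 'src/x.py'.
    """
    idx = pattern.find('**/')
    if idx == -1:
        return [pattern]
    rest_variants = collapsed_variants(pattern[idx + 3:])
    kept = [pattern[:idx] + '**/' + r for r in rest_variants]
    dropped = [pattern[:idx] + r for r in rest_variants]
    return kept + dropped

assert set(collapsed_variants('src/**/x.py')) == {'src/**/x.py', 'src/x.py'}
assert len(collapsed_variants('a/**/b/**/c')) == 4  # 2 choices per '**/'
```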
…, API docs

- Mock DockerEnvironment with LocalEnvironment in test harness so 11 of 15
  environment doc examples now run in CI (up from 2)
- Add public `files` property to MemoryEnvironment for test assertions
- Add EnvToolName to API reference members list
… handling

- Rename DockerEnvironmentProcess → _DockerEnvironmentProcess (internal impl detail)
- Rename LocalEnvironmentProcess → _LocalEnvironmentProcess (internal impl detail)
- Rename .container → ._required_container (avoid coupling users to docker-py)
- Narrow except Exception → except (DockerException, OSError) in teardown/is_alive
- Remove unnecessary r-prefix from ExecutionProcess docstring
# thinking settings. Cast needed because ModelSettings is a TypedDict and
# these provider-specific keys aren't in the base type.
# Providers covered: OpenAI, Anthropic, Google (google.genai SDK), Gemini (direct API)
super().__init__(
Contributor

Thinking extends ModelSettings (a @dataclass) but bypasses the dataclass-generated __init__ with a custom __init__ that calls super().__init__(cast(..., {...})). This is fragile: if ModelSettings gains additional fields in the future, this __init__ won't set them.

A cleaner approach would be to just call the dataclass __init__ directly: super().__init__(settings=cast(_ModelSettings, {...})). That way the dataclass machinery handles field initialization properly.

Collaborator Author

Fixed: now uses super().__init__(settings=cast(...)) with the keyword arg.

'Thinking() does not accept arguments yet — configurable parameters will be available once'
' #3894 lands. Use ModelSettings capability for custom thinking settings.'
)
return cls()
Contributor

The from_spec error message references a GitHub issue number (#3894) which is opaque to users who encounter this error. Consider replacing it with a user-friendly message that just says configurable thinking parameters aren't supported yet, and suggests using ModelSettings as a workaround, without the issue reference.

Collaborator Author

Fixed: removed issue reference from user-facing error message.

assert r is not None
return r

_wrap_task = asyncio.create_task(run_capability.wrap_run(run_ctx, handler=_do_run))
Contributor

The asyncio.create_task + asyncio.Event cooperative hand-off pattern here (and in the streaming wrap_model_request path in _agent_graph.py) is quite subtle. A comment block explaining the protocol would help future readers:

# wrap_run cooperative hand-off:
# 1. _do_run runs before_run, signals readiness, waits for completion
# 2. wrap_run wraps _do_run via capability middleware chain
# 3. Caller waits for readiness, yields agent_run, then signals completion
# 4. On error: cancels the wrap task; on success: awaits wrap result + after_run
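A runnable miniature of this hand-off on the success path (simplified, with hypothetical names; the real code also handles short-circuits, errors, and cancellation):

```python
import asyncio

async def demo() -> list[str]:
    log: list[str] = []
    run_ready = asyncio.Event()   # handler has started
    run_done = asyncio.Event()    # caller finished yielding

    async def handler() -> str:          # plays the role of _do_run
        log.append('before_run')
        run_ready.set()                  # 1. signal readiness
        await run_done.wait()            # ...then park until the caller is done
        return 'result'

    async def wrap_run(h) -> str:        # 2. middleware wrapping the handler
        log.append('wrap:enter')
        out = await h()
        log.append('wrap:exit')
        return out

    wrap_task = asyncio.create_task(wrap_run(handler))
    await run_ready.wait()               # 3. caller waits for readiness...
    log.append('caller:yield')           # ...here the real code yields agent_run
    run_done.set()                       # 4. success path: signal completion...
    log.append(await wrap_task)          # ...and await the wrapped result
    return log

assert asyncio.run(demo()) == ['wrap:enter', 'before_run', 'caller:yield', 'wrap:exit', 'result']
```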

Collaborator Author

Comments already added in a previous round.


By default, a capability instance is shared across all runs of an agent. If your capability accumulates mutable state that should not leak between runs, override [`for_run`][pydantic_ai.capabilities.AbstractCapability.for_run] to return a fresh instance:

```python {title="per_run_state.py" test="skip"}
Contributor

The per_run_state.py example uses test="skip" but there's nothing here that requires skipping — it's a pure in-memory example with no external dependencies. Per docs guidelines, test="skip" should be avoided unless unavoidable.

Collaborator Author

Fixed in 0881700.


# Short name is intentional — passing a dict is enough to get type checking,
# and users rarely need both this and settings.ModelSettings in the same scope.
from .model_settings import ModelSettings
Contributor

This has been discussed before and the short name is kept intentionally, but I want to note one concrete risk: from pydantic_ai.capabilities import ModelSettings and from pydantic_ai import ModelSettings resolve to different types (a @dataclass capability class vs a TypedDict). A user who does from pydantic_ai.capabilities import * alongside from pydantic_ai import ModelSettings (or vice versa) will silently get the wrong one. At minimum, consider adding a # noqa comment that explains the shadowing is intentional, and ensure the capabilities docs page explicitly warns about this when showing the import.

Collaborator Author

Intentional — discussed with maintainer. The short name is kept for ergonomics.

DouweM and others added 2 commits March 21, 2026 14:44
…able

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
agent = Agent(
'anthropic:claude-sonnet-4-20250514',
capabilities=[
Instructions('You are a research assistant. Be thorough and cite sources.'),
Contributor

What do we expect to happen if someone passes in two Instructions capabilities that contradict each other? This example is a little non-deterministic, but the broader point applies to capabilities that interact with the internals of the code rather than just the prompt, yet enforce different constraints.

Collaborator Author

We just concatenate all instructions with \n\n, so I guess it's up to the user not to be contradictory, and up to capabilities to only add instructions that relate to them and are unlikely to affect other capabilities' instructions and user instructions. Of course the user could add 2 capabilities that are fundamentally incompatible -- is that what you're thinking of?

Contributor

Of course the user could add 2 capabilities that are fundamentally incompatible -- is that what you're thinking of?

Yes, we should detect incompatible capabilities at from_spec() time. Capabilities could have a field marking which ones they are incompatible with; I'm not sure how that would scale, but I'd rather the agent crash up front than do weird stuff in prod. Throwing it out there.
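Purely as illustration of the idea floated in this thread, such a check could look roughly like this (`incompatible_with`, `check_compatible`, and the capability classes are all hypothetical, not part of the PR):

```python
class Capability:
    # Hypothetical field: names of capabilities this one cannot coexist with.
    incompatible_with: frozenset[str] = frozenset()

class StrictJson(Capability):
    incompatible_with = frozenset({'FreeformProse'})

class FreeformProse(Capability):
    pass

def check_compatible(caps: list[Capability]) -> None:
    """Fail fast at construction time instead of misbehaving at run time."""
    names = {type(c).__name__ for c in caps}
    for cap in caps:
        clash = cap.incompatible_with & names
        if clash:
            raise ValueError(
                f'{type(cap).__name__} is incompatible with {sorted(clash)}'
            )

check_compatible([StrictJson()])  # fine on its own
try:
    check_compatible([StrictJson(), FreeformProse()])
except ValueError as e:
    assert 'incompatible' in str(e)
```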

Contributor

@devin-ai-integration bot left a comment

Devin Review found 2 new potential issues.

View 10 additional findings in Devin Review.


Comment on lines +56 to +62
def resolve(ctx: RunContext[AgentDepsT]) -> ModelSettings:
merged = static_settings
for func in dynamic_settings:
merged = merge_model_settings(merged, func(ctx))
return merged if merged is not None else ModelSettings()

return resolve
Contributor

🚩 CombinedCapability dynamic model settings don't update ctx.model_settings between capability callables

In CombinedCapability.get_model_settings() (combined.py:56-60), when multiple capabilities provide dynamic (callable) settings, the resolve() closure calls each function sequentially but does NOT update ctx.model_settings between calls. Compare with the agent-level resolver (agent/__init__.py:990-1013) which explicitly sets run_context.model_settings = merged between each layer.

This means if capabilities A and B both provide dynamic settings within the same CombinedCapability, B's callable will see ctx.model_settings from the layer before the entire capability group — it won't see A's contribution. The docs at docs/capabilities.md:176 say the callable sees "the merged result of all layers resolved before this capability", which is ambiguous about whether "this capability" means the individual capability or the combined group. Current behavior treats all capabilities as a single layer, which is internally consistent but could surprise users.


Comment on lines +1121 to +1166
async def _do_run() -> AgentRunResult[Any]:
await run_capability.before_run(run_ctx)
_run_ready.set()
await _run_done.wait()
if _run_error is not None:
raise _run_error
r = agent_run.result
assert r is not None
return r

_wrap_task = asyncio.create_task(run_capability.wrap_run(run_ctx, handler=_do_run))

# Wait for handler to start or wrap_run to complete (short-circuit)
_ready_waiter = asyncio.create_task(_run_ready.wait())
await asyncio.wait({_ready_waiter, _wrap_task}, return_when=asyncio.FIRST_COMPLETED)
_ready_waiter.cancel()

_short_circuited = _wrap_task.done() and not _run_ready.is_set()
if _short_circuited:
_result = _wrap_task.result()
_result = await run_capability.after_run(run_ctx, result=_result)
agent_run._result_override = _result # pyright: ignore[reportPrivateUsage]

try:
yield agent_run
except BaseException as _exc:
_run_error = _exc
raise
finally:
if agent_run.result is not None:
run_metadata = self._resolve_and_store_metadata(agent_run.ctx, metadata)
else:
run_metadata = graph_run.state.metadata

if not _short_circuited:
_run_done.set()
if _run_error is None and agent_run.result is not None:
_result = await _wrap_task
_result = await run_capability.after_run(run_ctx, result=_result)
agent_run._result_override = _result # pyright: ignore[reportPrivateUsage]
elif not _wrap_task.done():
_wrap_task.cancel()
try:
await _wrap_task
except (asyncio.CancelledError, BaseException):
pass
Contributor

🚩 wrap_run task coordination in iter() handles error propagation correctly but doesn't support wrap_run error recovery

The iter() method's _do_run / _wrap_task coordination (agent/__init__.py:1115-1166) correctly propagates user exceptions: when the user's code raises inside async with agent.iter(...) as agent_run:, _run_error is set and re-raised, then _wrap_task is cancelled in the finally block.

However, if a wrap_run implementation catches the error from handler() and returns a recovery result, that result is silently discarded — the user's original exception always propagates. This is because the finally block only awaits _wrap_task for its result when _run_error is None. Whether this is intentional depends on whether wrap_run error recovery is a supported use case. The docs don't mention it, and the test suite doesn't test for it. Worth documenting this limitation if wrap_run is intended to support try/catch patterns around handler().


DouweM and others added 8 commits March 21, 2026 20:17
- HistoryProcessorCapability → HistoryProcessor (brevity)
- _instructions.Instructions → AgentInstructions (like AgentModelSettings, AgentMetadata)
- BeforeModelRequestContext → ModelRequestContext (used in wrap too, not just before)
- wrap_run_step → wrap_node_run (distinguishes from ctx.run_step which counts model requests)
- Add AgentToolset type alias (AbstractToolset | ToolsetFunc, like AgentModelSettings pattern)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Port serializes_as_string_keyed_dict guard from pydantic_evals to
  _spec.py so NamedSpec.serialize() doesn't misinterpret a dict with
  all-string keys as kwargs on round-trip (affects ModelSettings etc.)
- Add PrepareTools capability that wraps a ToolsPrepareFunc callable,
  like Toolset wraps AbstractToolset. Not spec-serializable.
- Deduplicate the helper: pydantic_evals now imports from pydantic_ai._spec.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace MathTools with pre-built toolset (not dynamically generated)
- Make template_instructions.py testable (no test=skip)
- Replace AdaptiveTokenLimit with ThinkingOnRetry (more realistic)
- Clearer hook tables with full type signatures and validation timing
- Add wrap_node_run example (NodeLogger)
- Add wrap_run_event_stream example (StreamLogger), reference UI docs
- Replace tool approval guardrail with PII redaction guardrail
- Move Skip exceptions to their own section before hook tables
- Replace cost tracker with logging middleware example using wrap_*
- Fix AgentSpec instructions field to show TemplateStr type
- Remove unnecessary from_spec override in RateLimit example
- Add PrepareTools to built-in capabilities table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When multiple capabilities provide dynamic (callable) model settings,
update ctx.model_settings between each callable so later capabilities
can see earlier capabilities' contributions, matching the agent-level
resolver behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
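A toy model of this fix (all names are stand-ins for the real `merge_model_settings`/resolver machinery, and `Ctx` is a fabricated context object):

```python
from typing import TypedDict

class ModelSettings(TypedDict, total=False):
    temperature: float
    max_tokens: int

class Ctx:
    def __init__(self):
        self.model_settings = None  # settings visible to dynamic callables

def merge(a, b):
    """Later layer wins on key conflicts; None is a no-op merge target."""
    if a is None:
        return b
    if b is None:
        return a
    return {**a, **b}

def resolve(ctx, static_settings, dynamic_settings):
    merged = static_settings
    for func in dynamic_settings:
        # The fix: publish the layers resolved so far before calling the
        # next callable, so later capabilities see earlier contributions.
        ctx.model_settings = merge(ctx.model_settings, merged)
        merged = merge(merged, func(ctx))
    return merged

seen = []

def cap_a(ctx):
    seen.append(dict(ctx.model_settings or {}))
    return ModelSettings(temperature=0.2)

def cap_b(ctx):
    seen.append(dict(ctx.model_settings or {}))
    return ModelSettings(max_tokens=100)

ctx = Ctx()
out = resolve(ctx, ModelSettings(temperature=1.0), [cap_a, cap_b])
assert seen[1] == {'temperature': 0.2}  # cap_b sees cap_a's contribution
assert out == {'temperature': 0.2, 'max_tokens': 100}
```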
Change GraphRun._run_tracked_task to catch exceptions from node
execution and send them through the memory stream as error results,
instead of letting them propagate into the anyio TaskGroup (which
transforms them into CancelledError/ExceptionGroup). The original
exception is re-raised in iter_graph on the caller's side.

This preserves the original exception through the entire chain, allowing
Agent.iter()'s wrap_run hook to catch and recover from errors. If
wrap_run catches the error from handler() and returns a recovery result,
the exception is suppressed. If not, it propagates normally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
instrument: bool | None = None
metadata: dict[str, Any] | None = None
if capability_schema_types: # pragma: no branch
capabilities: list[Union[tuple(capability_schema_types)]] = [] # pyright: ignore # noqa: UP007
Contributor

_AgentSpecSchema manually duplicates every field from AgentSpec, and even drops TemplateStr from several fields (e.g. description is str | None here but TemplateStr | str | None in AgentSpec, and same for instructions). This has been flagged in multiple previous review rounds but remains unaddressed.

The schema divergence means the generated JSON schema doesn't reflect the actual validation behavior — users who rely on the schema for IDE autocompletion or validation will get incorrect type information, and any future field additions to AgentSpec that aren't mirrored here will silently produce an incomplete schema.

Consider deriving _AgentSpecSchema programmatically from AgentSpec (e.g. AgentSpec.model_fields), or at minimum add a test that asserts the field names match between the two classes to catch drift. @DouweM

@description.setter
@abstractmethod
def description(self, value: str | None) -> None:
def description(self, value: TemplateStr[AgentDepsT] | str | None) -> None:
Contributor

The abstract setter declares str | None but the concrete Agent.description.setter at agent/__init__.py:844 accepts TemplateStr[AgentDepsT] | str | None. This means a WrapperAgent or other AbstractAgent subclass that follows the abstract contract would reject TemplateStr values, and Pyright will flag the override as incompatible.

The abstract setter should be updated to match: TemplateStr[AgentDepsT] | str | None.

ctx.model_settings = merge_model_settings(ctx.model_settings, merged)
resolved = func(ctx)
merged = merge_model_settings(merged, resolved)
return merged if merged is not None else ModelSettings()
Contributor

Regarding `merged if merged is not None else ModelSettings()`: ModelSettings is a TypedDict, so `ModelSettings()` creates an empty dict, which is falsy. This is fine for the `is not None` check, but the returned callable's return type annotation is ModelSettings while the overall get_model_settings return type includes `| None`. When there are no dynamic settings and static_settings is None, the method correctly returns None from line 54.

However, inside this closure, if every dynamic callable returns None-ish settings and static_settings is also None, merged will be None, and this produces an empty `ModelSettings()` dict. That empty dict is then merged at the agent level as a no-op, which is fine functionally, but it means the callable never returns None: it always returns a dict. This is subtly different from static_settings returning None on line 54. Consider using `return merged or None` to be consistent (a no-op merge target is indistinguishable from None).

ModelSettings,
Thinking,
WebSearch,
)
Contributor

DEFAULT_CAPABILITY_TYPES (used for spec schema generation) includes only Instructions, ModelSettings, Thinking, and WebSearch, while CAPABILITY_TYPES (used for the registry) also includes HistoryProcessor and Toolset. Since DEFAULT_CAPABILITY_TYPES drives the JSON schema output in model_json_schema_with_capabilities, HistoryProcessor and Toolset won't appear in the generated schema unless passed as custom_capability_types.

If the intent is that these two are not useful in YAML/JSON specs (since they take non-serializable callables/objects), that's reasonable but should be documented. A comment explaining why DEFAULT_CAPABILITY_TYPES is a subset of CAPABILITY_TYPES would help future readers.

)
return cls()

def __init__(self):
Contributor

Defining __init__ on a @dataclass subclass bypasses the dataclass-generated __init__, which means Thinking doesn't participate in the standard dataclass field protocol. This works because super().__init__(settings=cast(...)) calls ModelSettings.__init__ directly, but it's fragile — if ModelSettings gains additional fields, Thinking.__init__ won't pass them.

Since Thinking is essentially ModelSettings with hardcoded settings, consider making it a classmethod factory instead, or using field(default_factory=...) on the settings field to avoid the manual __init__ override entirely:

@dataclass
class Thinking(ModelSettings[AgentDepsT]):
    settings: _ModelSettings = field(default_factory=lambda: cast(_ModelSettings, {
        'openai_reasoning_effort': 'high',
        ...
    }))

assert r is not None
return r

_wrap_task = asyncio.create_task(run_capability.wrap_run(run_ctx, handler=_do_run))
Contributor

The asyncio.create_task + Event cooperative hand-off pattern for wrap_run is quite complex and has been flagged repeatedly. The pattern has at least one subtle edge case: if wrap_run raises before calling handler() (e.g. a precondition check), _run_ready is never set, and _wrap_task.done() becomes True with _run_ready.is_set() False, triggering the short-circuit path that calls _wrap_task.result() — which will re-raise the exception from wrap_run. That exception isn't caught here, so it'll propagate out of the iter() context manager.

This seems correct but is very hard to reason about. A thorough comment block explaining the state machine (which combinations of _run_ready, _run_done, _run_error, _short_circuited, and _wrap_task.done() are possible and what each means) would make this much more maintainable.

for template compilation, schema validation, and rendering.

Example:
```python {test="skip"}
Contributor

The docstring example uses test="skip" which is understandable since pydantic-handlebars is an optional dependency. However, the entire TemplateStr class (including __init__, render, __call__, and the Pydantic schema hook) calls _import_pydantic_handlebars() or uses compiled templates that require it. This means these code paths have zero test coverage in CI unless pydantic-handlebars is installed in the test environment.

Looking at pyproject.toml, handlebars = ["pydantic-handlebars>=0.1.0"] is added as an optional dependency group but I don't see it included in the test dependencies. The test_template.py tests presumably need this. Can you confirm that pydantic-handlebars is installed in CI for the test runs that exercise TemplateStr?

from pydantic_ai.capabilities import Instructions, ModelSettings, Thinking, WebSearch

agent = Agent(
'anthropic:claude-sonnet-4-20250514',
Contributor

This example uses 'anthropic:claude-sonnet-4-20250514' which is not a frontier model. Per the docs guidelines, use the latest frontier model — e.g. 'anthropic:claude-opus-4-6' or 'openai:gpt-5.2'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Labels

feature: New feature request, or PR implementing a feature (enhancement)
size: XL: Extra large PR (>1500 weighted lines)

Projects

None yet

3 participants