Draft
121 commits
fbdfd8d
Add execution environments abstraction and toolset
dmontagu Feb 21, 2026
68b4036
Add environment_factory for per-run environment isolation
dmontagu Feb 21, 2026
c847585
Remove unused variable in doc example
dmontagu Feb 21, 2026
00be4ca
Fix type errors: use lists instead of sets for include/exclude args
dmontagu Feb 21, 2026
33604e6
Work around huggingface/vllm dependency conflict
dmontagu Feb 21, 2026
01a94c3
Address PR review feedback
dmontagu Feb 21, 2026
a735b73
Filter tools at runtime based on environment capabilities
dmontagu Feb 21, 2026
9001b1a
Add ToolName type, rename edit tool to edit_file, filter by tool names
dmontagu Feb 21, 2026
c66ea20
Rename Capability to EnvCapability, fix edit strategy fallback, fix g…
dmontagu Feb 21, 2026
9460ea5
Merge branch 'main' into execution-environments
dmontagu Feb 21, 2026
a7c70d6
Rename Capability to EnvCapability, improve _resolve_edit_tool fallback
dmontagu Feb 21, 2026
282819c
Unify EnvCapability + ToolName into single EnvToolName type
dmontagu Feb 22, 2026
b635b48
Move Docker shell builders to docker.py, fix grep on hidden files
dmontagu Feb 22, 2026
9e28bc1
Use extended regex (-E) in Docker grep for consistency with Local/Memory
dmontagu Feb 22, 2026
94d0329
Add tests for full coverage, add pragmas for defensive branches
dmontagu Feb 22, 2026
2b30c3a
Catch PermissionError/OSError in edit_file tool for ModelRetry
dmontagu Feb 22, 2026
3d8b3a6
Fix Docker read_file offset validation and image NotFound handling
dmontagu Feb 22, 2026
d8d3aba
Fix coverage
dmontagu Feb 22, 2026
598a8cf
Address feedback
dmontagu Feb 22, 2026
37e4d1f
Simplify docker coverage stuff
dmontagu Feb 22, 2026
b91be81
Use anyio.Lock
dmontagu Feb 22, 2026
8392293
Fix Docker glob missing root-level files for **/ patterns
dmontagu Feb 23, 2026
a539b4e
Add environment-specific tool descriptions (regex flavor docs for grep)
dmontagu Feb 23, 2026
e567e98
Use _shared_environment directly in shared lifecycle __aenter__
dmontagu Feb 23, 2026
df4fe1a
Add test for __aenter__ with no environment configured
dmontagu Feb 23, 2026
4e01113
Strip ./ prefix from Docker glob and grep output for consistency with…
dmontagu Feb 23, 2026
99ae9c9
Handle mid-pattern **/ in Docker glob to match zero directories
dmontagu Feb 23, 2026
32a0ce7
Remove unreachable guard in _globstar_zero_dir_variants to fix coverage
dmontagu Feb 23, 2026
561eeda
Address review feedback: testable doc examples, public files accessor…
dmontagu Feb 23, 2026
2ac4084
Make process classes and container property private, narrow exception…
dmontagu Feb 24, 2026
84cfac2
wip
DouweM Feb 24, 2026
425c3e4
wip
DouweM Feb 24, 2026
f8a8e9b
Merge remote-tracking branch 'origin/execution-environments' into cap…
DouweM Feb 24, 2026
f56d4fb
wip
DouweM Feb 25, 2026
819f73b
Agent.from_spec
DouweM Feb 25, 2026
e2c7d72
cli
DouweM Feb 25, 2026
00379df
Add dynamic model_settings support (callable model_settings)
DouweM Mar 9, 2026
b731b91
Merge remote-tracking branch 'origin/main' into dynamic-settings
DouweM Mar 9, 2026
d2cbedd
Fix type annotation, consolidate tests, add docs for dynamic model_se…
DouweM Mar 9, 2026
4b48e77
Revert model_settings type to Any to fix Pydantic schema generation
DouweM Mar 10, 2026
058f3c7
Remove unused RunContext import from doc example
DouweM Mar 10, 2026
4531455
Fix pre-commit failures: undefined Instructions type, missing docstri…
DouweM Mar 13, 2026
c7ee11e
Add top-level AgentSpec fields and full Agent kwargs to from_spec
DouweM Mar 13, 2026
8587253
Merge main into capabilities, fix conflicts and test issues
DouweM Mar 13, 2026
b2ffa12
Fix CI test failures: VCR cassette for test_agent, fix reporting snap…
DouweM Mar 13, 2026
0f1ec96
Merge remote-tracking branch 'origin/capabilities' into spec-name-etc
DouweM Mar 13, 2026
5af0a9c
Merge branch 'dynamic-settings' into capabilities-dynamic-settings
DouweM Mar 16, 2026
e962092
Integrate dynamic model_settings with capabilities branch
DouweM Mar 16, 2026
b8b639d
Add ToolsetFunc support to capabilities and JSON schema generation fo…
DouweM Mar 16, 2026
e45c512
Rename ModelSettingsCapability to ModelSettings
DouweM Mar 16, 2026
f7ef02e
Fix AgentSpec model_settings to use dict for Pydantic schema compat
DouweM Mar 16, 2026
b083d87
Merge branch 'capabilities-dynamic-settings' into capabilities
DouweM Mar 16, 2026
a3a92c9
Merge remote-tracking branch 'origin/main' into spec-name-etc
DouweM Mar 16, 2026
e9542e2
Add description to AgentSpec, add tests for all spec fields
DouweM Mar 16, 2026
dd4d870
Fix CI failures: resolve Instructions type errors, fix run_id on resu…
DouweM Mar 16, 2026
eec5bf2
Merge origin/capabilities into spec-name-etc, fix Instructions type
DouweM Mar 16, 2026
485ec13
Merge branch 'spec-name-etc' into capabilities
DouweM Mar 16, 2026
c4ae7fe
Add from_file/to_file to AgentSpec with $schema support for editor DX
DouweM Mar 16, 2026
c192ecc
Add for_run and for_run_step lifecycle hooks to AbstractToolset
DouweM Mar 16, 2026
1e2b243
Merge branch 'capabilities' into toolset-state
DouweM Mar 16, 2026
ca6d0c4
Add capabilities abstraction and agent spec serialization
DouweM Mar 17, 2026
4fb43b0
Address review feedback: clean up comments, fix duplication, improve …
DouweM Mar 18, 2026
60f2bdb
Address review feedback round 2
DouweM Mar 18, 2026
999a1e0
Add tests and coverage pragmas to reach 100% coverage
DouweM Mar 18, 2026
b845907
Merge main into capabilities, resolve _agent_graph conflict
DouweM Mar 18, 2026
a58d1c6
Omit test_template.py from coverage (requires optional pydantic-handl…
DouweM Mar 18, 2026
40ae5fe
Address review round 3 + coverage
DouweM Mar 19, 2026
8317926
Merge branch 'capabilities' into toolset-state
DouweM Mar 19, 2026
f19618d
Address review round 4
DouweM Mar 19, 2026
6fa2dd9
Fix capability model_settings bug + address review round 5
DouweM Mar 20, 2026
afdadd0
Skip CLI load_agent tests when CLI deps not installed
DouweM Mar 20, 2026
7ec47c2
Support dynamic (callable) model_settings on capabilities
DouweM Mar 20, 2026
2058046
Add before/after/wrap lifecycle hooks and streaming hooks to capabili…
DouweM Mar 20, 2026
371e57d
Remove CLI load_agent tests, restore pragma (CLI deps unavailable in …
DouweM Mar 20, 2026
cd71167
Merge capabilities into hooks: dynamic model_settings, _root_capabili…
DouweM Mar 20, 2026
2247334
Fix temporal test snapshots after removing 'running tools' span
DouweM Mar 20, 2026
19b93b8
Fix dbos test snapshot for removed 'running tools' span
DouweM Mar 20, 2026
6782caa
Add for_run lifecycle hook to AbstractCapability with get_* re-extrac…
DouweM Mar 20, 2026
2fae249
Address review round 6
DouweM Mar 20, 2026
804d001
Merge capabilities into hooks: docstring improvements, tuple return type
DouweM Mar 20, 2026
b237c36
Merge branch 'capabilities' into toolset-state
DouweM Mar 20, 2026
367d5ba
Address review round 7
DouweM Mar 20, 2026
e89d1f1
Merge branch 'capabilities' into toolset-state
DouweM Mar 20, 2026
ba58a0b
Merge capabilities into hooks: review round 7
DouweM Mar 20, 2026
2ee74f1
Fix type errors in capability for_run integration
DouweM Mar 20, 2026
b7192c1
Reset unrelated files to main (stale diffs from branch rebuild)
DouweM Mar 20, 2026
4ed9df0
Fix logfire span snapshots for tool execution hook changes
DouweM Mar 20, 2026
6e703ef
Remove unrelated files from execution-environments branch
DouweM Mar 20, 2026
5c693a7
Merge branch 'capabilities' into toolset-state
DouweM Mar 20, 2026
5e2128d
Merge remote-tracking branch 'origin/capabilities' into hooks
DouweM Mar 20, 2026
312fac2
Merge capabilities, fix logfire snapshots, use BeforeModelRequestCont…
DouweM Mar 20, 2026
eebc4e4
Fix test_template and temporal dynamic toolset after _cap_* separation
DouweM Mar 20, 2026
4d85cf0
Fix model_settings None preservation in wrap_model_request paths
DouweM Mar 20, 2026
5d772e7
Fix temporal dynamic toolset: call_tool on per-activity instance
DouweM Mar 20, 2026
37041dd
Remove accidentally reintroduced 'running tools' span wrapper
DouweM Mar 20, 2026
a052174
Address PR review feedback
DouweM Mar 20, 2026
310c3ce
Address PR review comments
DouweM Mar 20, 2026
580a729
Fix unnecessary type: ignore comment (now pyright: ignore)
DouweM Mar 20, 2026
2c6dfcf
Address review round 8: cache capability settings, clean up toolsets
DouweM Mar 20, 2026
6daf13f
Remove stray analysis file
DouweM Mar 20, 2026
8c1c2a2
Remove unnecessary type ignore comment
DouweM Mar 20, 2026
96123be
Merge branch 'capabilities' into toolset-state
DouweM Mar 20, 2026
6088c22
Merge remote-tracking branch 'origin/capabilities' into hooks
DouweM Mar 21, 2026
761526f
Merge branch 'capabilities' (with hooks) into toolset-state
DouweM Mar 21, 2026
aa61789
Address review round 2
DouweM Mar 21, 2026
c04a93b
Add prepare_tools and wrap_run_step hooks to AbstractCapability
DouweM Mar 21, 2026
479cdd1
Address review round 9
DouweM Mar 21, 2026
4fe4afe
Remove stray analysis file
DouweM Mar 21, 2026
d912f1d
Update docstring examples for prepare_tools/wrap_run_step hook changes
DouweM Mar 21, 2026
abadc4d
Add capabilities documentation
DouweM Mar 21, 2026
0881700
Fix docs review feedback: use public imports, make per_run_state test…
DouweM Mar 21, 2026
973193a
Minor: use keyword arg in Thinking.__init__, simplify error message
DouweM Mar 21, 2026
a36c7f5
Rename capabilities APIs for consistency
DouweM Mar 21, 2026
b745ef0
Fix CapabilitySpec dict-as-first-arg bug, add PrepareTools capability
DouweM Mar 21, 2026
e2a48f8
Rewrite capabilities docs per review feedback
DouweM Mar 21, 2026
8cf33c4
Fix CombinedCapability dynamic settings to update ctx between callables
DouweM Mar 21, 2026
d7a9aa0
Support wrap_run error recovery by routing task errors through stream
DouweM Mar 21, 2026
a97b5d9
Document wrap_run error recovery support
DouweM Mar 21, 2026
023f3cd
Add Agent.from_file convenience method for loading agents from YAML/JSON
DouweM Mar 21, 2026
9a727e7
Merge branch 'agent-from-file' into capabilities
DouweM Mar 21, 2026
0aecad0
Use frontier model in docs, add comment on wrap_run hand-off protocol
DouweM Mar 21, 2026
2 changes: 1 addition & 1 deletion clai/README.md
@@ -72,7 +72,7 @@ options:
   -m MODEL, --model MODEL
                         Model to use, in format "<provider>:<model>" e.g. "openai:gpt-5" or "anthropic:claude-sonnet-4-6". Defaults to "openai:gpt-5".
   -a AGENT, --agent AGENT
-                        Custom Agent to use, in format "module:variable", e.g. "mymodule.submodule:my_agent"
+                        Custom Agent to use: a module path like "module:variable" or a YAML/JSON spec file like "agent.yml"
   -t CODE_THEME, --code-theme CODE_THEME
                         Which colors to use for code, can be "dark", "light" or any theme from pygments.org/styles/. Defaults to "dark" which works well on dark terminals.
   --no-stream           Disable streaming from the model
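The updated `--agent` help text accepts two forms: an import path (`module:variable`) or a spec file. As a rough illustration only, and not clai's actual implementation, a CLI could tell the two apart by file extension. The function name `classify_agent_arg` and the `SPEC_SUFFIXES` set are hypothetical and assumed here.

```python
from pathlib import Path

# Assumption: spec files are recognised purely by these suffixes.
SPEC_SUFFIXES = {'.yml', '.yaml', '.json'}


def classify_agent_arg(value: str) -> tuple[str, ...]:
    """Return ('spec', path) for spec files, else ('import', module, variable)."""
    # Check the suffix first so a path containing ':' is still treated as a file.
    if Path(value).suffix.lower() in SPEC_SUFFIXES:
        return ('spec', value)
    module, sep, variable = value.partition(':')
    if not sep or not module or not variable:
        raise ValueError(f'expected "module:variable" or a spec file, got {value!r}')
    return ('import', module, variable)


print(classify_agent_arg('agent.yml'))
print(classify_agent_arg('mymodule.submodule:my_agent'))
```

Checking the suffix before splitting on `:` is a design choice of this sketch; the real CLI may resolve the ambiguity differently.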
35 changes: 27 additions & 8 deletions docs/agent.md
@@ -309,13 +309,6 @@ async def main():
     print(nodes)
     """
     [
-        UserPromptNode(
-            user_prompt='What is the capital of France?',
-            instructions_functions=[],
-            system_prompts=(),
-            system_prompt_functions=[],
-            system_prompt_dynamic_functions={},
-        ),
         ModelRequestNode(
             request=ModelRequest(
                 parts=[
@@ -545,7 +538,6 @@ if __name__ == '__main__':
     print(output_messages)
     """
     [
-        '=== UserPromptNode: What will the weather be like in Paris on Tuesday? ===',
         '=== ModelRequestNode: streaming partial request tokens ===',
         "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')",
         '[Request] Part 0 args delta: {"location":"Pa',
@@ -712,6 +704,33 @@ print(result_sync.output)

 The final request uses `temperature=0.0` (run-time), `max_tokens=500` (from model), demonstrating how settings merge with run-time taking precedence.

+##### Dynamic model settings
+
+Both agent-level and run-level `model_settings` accept a callable that receives a
+[`RunContext`][pydantic_ai.tools.RunContext] and returns [`ModelSettings`][pydantic_ai.settings.ModelSettings].
+The callable is invoked before each model request, so settings can vary per step.
+The current resolved settings so far are available via `ctx.model_settings` inside the callable.
Contributor

The docs say "The current resolved settings so far are available via ctx.model_settings inside the callable" — but the semantics of ctx.model_settings depend on where in the resolution chain the callable is. An agent-level callable sees only model defaults, while a run-level callable sees model defaults merged with agent settings. This should be documented more precisely, e.g.:

"When called at the agent level, ctx.model_settings contains the model's default settings. When called at the run level, it contains the model defaults merged with the agent-level settings."

Collaborator Author

Updated docs to explain the full resolution order: model defaults → agent-level → capability-level → run-level, with what ctx.model_settings contains at each stage.

Contributor

The statement "The current resolved settings so far are available via ctx.model_settings inside the callable" is ambiguous — what "resolved so far" means depends on which layer's callable is running:

  • In an agent-level callable, ctx.model_settings contains only model defaults.
  • In a capability-level callable, it contains model defaults + agent-level.
  • In a run-level callable, it contains all previous layers.

The numbered list below (lines 724-727) is good, but the prose on line 720 should explicitly note this position-dependence, e.g. "Inside a callable at a given layer, ctx.model_settings contains the merged result of all previous layers." The current wording on line 729 partially covers this but is easy to miss since it reads like it's about overriding rather than about what's visible.

Collaborator Author

Updated prose to explicitly note position-dependence of ctx.model_settings inside callables.


+Settings are resolved in layers, each merged on top of the previous:
+
+1. **Model defaults** (`model.settings`)
+2. **Agent-level** (`Agent(model_settings=...)`)
+3. **Capability-level** (e.g. from `Thinking()`, `ModelSettings(...)` capabilities)
Contributor

The capabilities abstraction is the headline feature of this PR, but docs/agent.md only mentions it in passing (in the model settings resolution list). There's no dedicated section explaining what capabilities are, how to use the capabilities parameter, what built-in capabilities are available, or how to create custom ones.

Even for a draft PR, adding at least a basic "Capabilities" section in docs/agent.md (or a standalone docs/capabilities.md) would make the feature more discoverable and testable by reviewers. The AbstractCapability docstring has good content that could form the basis.

Collaborator Author

Same — docs page coming with hooks/toolset-state merge.

Contributor

The capabilities abstraction is the headline feature of this PR, but it's only mentioned here in passing (in the model settings resolution list). There's no dedicated section explaining what capabilities are, how to use the capabilities parameter, what built-in capabilities are available, or how to create custom ones.

Even for a draft PR, I'd recommend adding at least a stub "Capabilities" section (or a standalone docs/capabilities.md) before this leaves draft. The AbstractCapability docstring already has good content for this. Without docs, this feature will be hard for reviewers to evaluate from the user's perspective, and easy to forget to add later.

Collaborator Author

Dedicated capabilities docs page will come once hooks and toolset-state are merged.

+4. **Run-level** (`agent.run(model_settings=...)`)
+
+Inside a callable, `ctx.model_settings` contains the merged result of all *previous* layers (position-dependent). For example, an agent-level callable sees only model defaults, while a run-level callable sees model defaults + agent-level + capability-level settings. To reset a field set by a previous layer, set it explicitly (e.g. `{'temperature': None}`).
+
+```python
+from pydantic_ai import Agent, ModelSettings
+
+agent = Agent(
+    'test',
+    model_settings=lambda ctx: ModelSettings(
+        temperature=0.0 if ctx.run_step <= 1 else 0.7,
+    ),
+)
+```
+
 !!! note "Model Settings Support"
     Model-level settings are supported by all concrete model implementations (OpenAI, Anthropic, Google, etc.). Wrapper models like [`FallbackModel`](models/overview.md#fallback-model), [`WrapperModel`][pydantic_ai.models.wrapper.WrapperModel], and [`InstrumentedModel`][pydantic_ai.models.instrumented.InstrumentedModel] don't have their own settings - they use the settings of their underlying models.

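The layered resolution described in the "Dynamic model settings" hunk above can be sketched in plain Python. This is an illustrative model of the documented behaviour, not the library's implementation: `Ctx` is a hypothetical stand-in for `RunContext`, `resolve_settings` is a hypothetical helper, and settings are modelled as plain dicts merged with later layers overriding earlier ones.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Union

Settings = dict[str, Any]


@dataclass
class Ctx:
    """Hypothetical stand-in for RunContext, with only the fields used here."""

    run_step: int = 1
    model_settings: Settings = field(default_factory=dict)


Layer = Union[Settings, Callable[[Ctx], Settings], None]


def resolve_settings(layers: list[Layer], run_step: int = 1) -> Settings:
    """Merge layers in order; a callable layer sees all previous layers merged."""
    merged: Settings = {}
    for layer in layers:
        if layer is None:
            continue
        if callable(layer):
            layer = layer(Ctx(run_step=run_step, model_settings=dict(merged)))
        merged = {**merged, **layer}  # later layers override earlier ones
    return merged


seen: list[Settings] = []  # records what each callable observed


def agent_level(ctx: Ctx) -> Settings:
    seen.append(ctx.model_settings)
    return {'temperature': 0.0 if ctx.run_step <= 1 else 0.7}


def run_level(ctx: Ctx) -> Settings:
    seen.append(ctx.model_settings)
    return {'temperature': None}  # explicitly reset a field from an earlier layer


resolved = resolve_settings(
    [
        {'max_tokens': 500},  # 1. model defaults
        agent_level,          # 2. agent-level
        {'top_p': 0.9},       # 3. capability-level
        run_level,            # 4. run-level
    ]
)
print(seen[0])   # {'max_tokens': 500}
print(seen[1])   # {'max_tokens': 500, 'temperature': 0.0, 'top_p': 0.9}
print(resolved)  # {'max_tokens': 500, 'temperature': None, 'top_p': 0.9}
```

In this sketch the agent-level callable sees only the model defaults, while the run-level callable sees the first three layers merged, which matches the position-dependence the prose describes.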
601 changes: 601 additions & 0 deletions docs/capabilities.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions mkdocs.yml
@@ -23,6 +23,7 @@ nav:
 - tools.md
 - output.md
 - message-history.md
+- capabilities.md
 - direct.md
 - Models & Providers:
 - Overview: models/overview.md
11 changes: 11 additions & 0 deletions pydantic_ai_slim/pydantic_ai/__init__.py
@@ -1,5 +1,6 @@
 from importlib.metadata import version as _metadata_version

+from ._template import TemplateStr
 from .agent import (
     Agent,
     CallToolsNode,
@@ -42,6 +43,9 @@
     ModelAPIError,
     ModelHTTPError,
     ModelRetry,
+    SkipModelRequest,
+    SkipToolExecution,
+    SkipToolValidation,
     UnexpectedModelBehavior,
     UsageLimitExceeded,
     UserError,
@@ -115,6 +119,7 @@
 from .tools import DeferredToolRequests, DeferredToolResults, RunContext, Tool, ToolApproved, ToolDefinition, ToolDenied
 from .toolsets import (
     AbstractToolset,
+    AgentToolset,
     ApprovalRequiredToolset,
     CombinedToolset,
     ExternalToolset,
@@ -161,6 +166,9 @@
     'ModelHTTPError',
     'FallbackExceptionGroup',
     'IncompleteToolCall',
+    'SkipModelRequest',
+    'SkipToolExecution',
+    'SkipToolValidation',
     'UnexpectedModelBehavior',
     'UsageLimitExceeded',
     'UserError',
@@ -233,6 +241,7 @@
     'ToolDenied',
     # toolsets
     'AbstractToolset',
+    'AgentToolset',
     'ApprovalRequiredToolset',
     'CombinedToolset',
     'ExternalToolset',
@@ -260,6 +269,8 @@
     'PromptedOutput',
     'TextOutput',
     'StructuredDict',
+    # template
+    'TemplateStr',
Contributor

AgentModelSettings (the new ModelSettings | Callable[[RunContext], ModelSettings] union type alias) is exported from pydantic_ai.agent but not from the top-level pydantic_ai package. Users who want to type-annotate variables or function parameters that accept callable model settings will need to reach into pydantic_ai.agent to import it, which is inconsistent with ModelSettings being available at the top level.

Consider adding AgentModelSettings to this __all__ and the imports at the top of the file.

Collaborator Author

Consistent with AgentMetadata — not exported from top-level.
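
To make the type under discussion concrete, here is a self-contained sketch of a "static or callable" settings alias shaped like the one the comment describes. The `ModelSettings` and `RunContext` classes below are simplified stand-ins defined locally, not the library's classes, and `settings_for_step` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Callable, TypedDict, Union


class ModelSettings(TypedDict, total=False):
    """Simplified stand-in for the real settings TypedDict."""

    temperature: float
    max_tokens: int


@dataclass
class RunContext:
    """Simplified stand-in exposing only run_step."""

    run_step: int


# Same shape as the alias described in the review comment.
AgentModelSettings = Union[ModelSettings, Callable[[RunContext], ModelSettings]]


def settings_for_step(settings: AgentModelSettings, ctx: RunContext) -> ModelSettings:
    """Return static settings unchanged; call dynamic settings with the context."""
    return settings(ctx) if callable(settings) else settings


static: AgentModelSettings = {'temperature': 0.2}
dynamic: AgentModelSettings = lambda ctx: {'temperature': 0.0 if ctx.run_step <= 1 else 0.7}

print(settings_for_step(static, RunContext(run_step=3)))
print(settings_for_step(dynamic, RunContext(run_step=3)))
```

With the alias kept out of the top-level package, as the reply indicates, user code that wants this annotation would import it from `pydantic_ai.agent`, per the reviewer's description.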

     # format_prompt
     'format_as_xml',
     # settings