Add spec parameter to agent.override() and agent.run() #4769

Draft
DouweM wants to merge 7 commits into capabilities from agent-run-spec

@DouweM DouweM commented Mar 21, 2026

Summary

  • Add spec: dict[str, Any] | AgentSpec | None parameter to agent.override() and all agent.run*() / agent.iter() methods
  • Override semantics: spec values serve as defaults; explicit params always win. Root capability and builtin tools from spec capabilities are set via ContextVars.
  • Run semantics: additive merge — model as fallback, instructions added, model_settings merged, metadata merged, capabilities composed via CombinedCapability
  • Make AgentSpec.model optional (str | None = None) so partial specs work at run/override time
  • Add _resolve_spec() helper sharing capability instantiation logic with from_spec()
  • Warn for unsupported fields (retries, end_strategy, etc.) at run/override time
  • Thread spec through WrapperAgent and durable agents (Temporal, DBOS, Prefect)
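
The two merge modes described above can be sketched with plain dicts. This is a toy model and not the code in this PR: `UNSET`, `resolve_override`, and `merge_for_run` are hypothetical names, and the precedence used for conflicting `model_settings` keys is an assumption.

```python
from typing import Any

# Hypothetical sentinel standing in for "parameter was not passed".
UNSET: Any = object()


def resolve_override(explicit: Any, spec_value: Any, agent_value: Any) -> Any:
    """Override-style: explicit param first, then the spec value, then the agent's own value."""
    if explicit is not UNSET:
        return explicit
    if spec_value is not None:
        return spec_value
    return agent_value


def merge_for_run(agent: dict[str, Any], spec: dict[str, Any]) -> dict[str, Any]:
    """Run-style: additive merge of the spec into the agent's configuration."""
    return {
        # The spec model is only used when the agent has none.
        'model': agent.get('model') or spec.get('model'),
        # Spec instructions are appended rather than substituted.
        'instructions': [*agent.get('instructions', []), *spec.get('instructions', [])],
        # Merged key by key; letting the spec win on conflicts is an assumption.
        'model_settings': {**agent.get('model_settings', {}), **spec.get('model_settings', {})},
    }
```

With this model, `resolve_override(UNSET, 'spec-model', 'agent-model')` picks the spec value, while passing any explicit value as the first argument always wins.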

Test plan

  • test_override_with_spec_instructions_and_model — spec instructions replace agent's via override
  • test_override_with_spec_explicit_param_wins — explicit override param beats spec value
  • test_override_with_spec_capabilities — override with spec capabilities works
  • test_run_with_spec_instructions_added — spec instructions added additively at run time
  • test_run_with_spec_model_as_fallback — spec model used when agent has none
  • test_run_with_spec_model_settings_merged — spec model_settings merged with run
  • test_run_with_spec_partial_no_model — partial spec without model works
  • test_run_with_spec_capabilities — combined root_capability from spec + agent
  • test_run_with_spec_metadata_merged — metadata merge with precedence
  • test_spec_unsupported_fields_warns — warning for non-default unsupported fields
  • All 384 existing tests pass

🤖 Generated with Claude Code

DouweM and others added 2 commits March 21, 2026 22:07
Allows passing an AgentSpec (dict or object) at override time (full
replacement semantics) and run time (additive merge semantics), enabling
agent optimization workflows where candidate specs can be serialized,
loaded, and tried without reconstructing the agent.

Key changes:
- Make AgentSpec.model optional so partial specs work
- Add _resolve_spec() helper that validates spec, instantiates
  capabilities, and extracts contributions
- override(spec=...): spec values as defaults, explicit params win,
  ContextVars for root_capability and builtin_tools
- iter(spec=...): additive — model as fallback, instructions/metadata
  merged, capabilities combined via CombinedCapability
- spec param threaded through all run methods, WrapperAgent, and
  durable agents (Temporal, DBOS, Prefect)
- Unsupported spec fields (retries, end_strategy, etc.) warn at
  run/override time

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the size: M Medium PR (101-500 weighted lines) label Mar 21, 2026
github-actions bot commented Mar 21, 2026

Docs Preview

commit: 1cda25d
Preview URL: https://f68592f8-pydantic-ai-previews.pydantic.workers.dev

devin-ai-integration[bot]

This comment was marked as resolved.

The durable agents (Temporal, DBOS, Prefect) explicitly re-declare
run/run_sync/run_stream/run_stream_events/iter with overloads. Adding
`spec` to AbstractAgent's signatures without also adding it to these
overloads caused pyright "incompatible method override" errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
def _merged_meta(ctx: RunContext[AgentDepsT]) -> dict[str, Any]:
    return {**(_spec_meta or {}), **metadata(ctx)}  # type: ignore[operator]

metadata = _merged_meta

Devin correctly identified this: _merged_meta looks up metadata at call time rather than capturing its value at definition time, so once metadata = _merged_meta rebinds the name on this line, the closure calls itself and recurses without bound. The fix is to bind the original callable to a separate local variable before defining the closure:

_orig_metadata = metadata

def _merged_meta(ctx: RunContext[AgentDepsT]) -> dict[str, Any]:
    return {**(_spec_meta or {}), **_orig_metadata(ctx)}  # type: ignore[operator]

metadata = _merged_meta
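
The failure mode and the fix can be reproduced outside the agent code. The sketch below is a minimal standalone illustration: `merge_buggy`, `merge_fixed`, and the `str` context type are hypothetical stand-ins for the real closure and `RunContext`.

```python
from typing import Any, Callable

Metadata = Callable[[str], dict[str, Any]]


def merge_buggy(metadata: Metadata, spec_meta: dict[str, Any]) -> Metadata:
    def _merged(ctx: str) -> dict[str, Any]:
        # `metadata` is resolved when _merged runs; by then it names _merged itself.
        return {**spec_meta, **metadata(ctx)}

    metadata = _merged
    return metadata


def merge_fixed(metadata: Metadata, spec_meta: dict[str, Any]) -> Metadata:
    orig = metadata  # keep a reference to the original callable before rebinding

    def _merged(ctx: str) -> dict[str, Any]:
        # Values from the original callable take precedence over spec metadata.
        return {**spec_meta, **orig(ctx)}

    metadata = _merged
    return metadata
```

Calling the function returned by `merge_buggy` raises `RecursionError`; the one returned by `merge_fixed` returns the merged dict.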

self,
spec: dict[str, Any] | AgentSpec | None,
custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (),
) -> _ResolvedSpec | None:

There's significant code duplication between _resolve_spec() and from_spec() — specifically the registry building, capability instantiation, and template context setup. The coding guidelines call for extracting duplicated logic into shared helpers after 2+ occurrences. Consider extracting the shared parts (registry building + _instantiate_cap + capability loading loop) into a helper that both from_spec() and _resolve_spec() can call.


# Set capability and builtin_tools from spec
if resolved is not None and resolved.capability is not None:
    cap_token = self._override_root_capability.set(_utils.Some(resolved.capability))

When override(spec=...) provides capabilities, _override_root_capability is set to ONLY the spec's CombinedCapability, which completely replaces the agent's original _root_capability (see line 1130: base_capability = override_cap.value if override_cap is not None else self._root_capability). The PR description says override uses "defaults" semantics, but for capabilities this is a full replacement — an agent constructed with capabilities=[Thinking()] loses Thinking when override(spec={'capabilities': ['Instructions']}) is used.

By contrast, the iter() path at lines 1133-1134 merges additively via CombinedCapability([base_capability, resolved.capability]). The two code paths should have consistent semantics — probably the override should also combine the agent's base capability with the spec capability, rather than replacing.

@DouweM is the replacement behavior intentional for override(), or should this be additive like at run time?
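
To make the difference concrete, here is a toy model of the two paths as this comment describes them. It is not the agent implementation: capabilities are reduced to names, and `_override_caps`, `AGENT_CAPS`, `override`, and `effective_caps` are hypothetical.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator, Optional

# Hypothetical stand-ins: a capability is a name, a combined capability a tuple of names.
_override_caps: ContextVar[Optional[tuple[str, ...]]] = ContextVar('_override_caps', default=None)

AGENT_CAPS: tuple[str, ...] = ('Thinking',)


@contextmanager
def override(spec_caps: tuple[str, ...]) -> Iterator[None]:
    """Set the override for the duration of the block, then restore the previous value."""
    token = _override_caps.set(spec_caps)
    try:
        yield
    finally:
        _override_caps.reset(token)


def effective_caps(run_spec_caps: tuple[str, ...] = ()) -> tuple[str, ...]:
    override_caps = _override_caps.get()
    # Override path: the override value replaces the agent's own capabilities outright.
    base = override_caps if override_caps is not None else AGENT_CAPS
    # Run path: spec capabilities are appended to whichever base is in effect.
    return (*base, *run_spec_caps)
```

Inside `with override(('Instructions',))`, `effective_caps()` no longer contains `'Thinking'`, whereas `effective_caps(('Instructions',))` outside the block keeps it.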

builtin_tools=[
*self._builtin_tools,
*cap_builtin_tools,
*(override_bt.value if (override_bt := self._override_builtin_tools.get()) is not None else []),

When override(spec=...) provides capabilities, the builtin tools from those capabilities will be included twice:

  1. cap_builtin_tools at line 1230: extracted from effective_capability at line 1150, which is the override capability (since override_cap is not None)
  2. override_bt.value at line 1231: set from the same override capability's get_builtin_tools() at line 1677

The _override_builtin_tools context var seems redundant with the existing cap_builtin_tools extraction that already handles the override case at lines 1147-1153. The simplest fix would be to remove _override_builtin_tools entirely and rely on the capability-based extraction.

infer_name: bool = True,
toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None,
spec: dict[str, Any] | AgentSpec | None = None,

PrefectAgent.iter() accepts spec but doesn't forward it in the super().iter() call at lines 840-854. This means spec is silently dropped for all PrefectAgent runs.

The same issue affects run() (line ~268, doesn't forward spec to super(WrapperAgent, self).run()), run_sync() (line ~403), run_stream() (line ~535), and run_stream_events() (line ~679). All accept spec in their signature but never pass it through.

Either forward spec=spec in all these super calls, or (if spec is intentionally not supported for durable agents) don't add the parameter and let it be handled through the **_deprecated_kwargs mechanism like DBOS/Temporal do.
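
A reduced illustration of the pattern, using hypothetical classes rather than the Prefect agent itself: a wrapper that accepts `spec` without forwarding it behaves as if the caller never passed it.

```python
from typing import Any, Optional


class BaseAgent:
    def run(self, prompt: str, *, spec: Optional[dict[str, Any]] = None) -> dict[str, Any]:
        # Echo back what was received so the effect of forwarding is visible.
        return {'prompt': prompt, 'spec': spec}


class DroppingWrapper(BaseAgent):
    def run(self, prompt: str, *, spec: Optional[dict[str, Any]] = None) -> dict[str, Any]:
        # Accepts spec but never passes it on, so the caller's value is silently lost.
        return super().run(prompt)


class ForwardingWrapper(BaseAgent):
    def run(self, prompt: str, *, spec: Optional[dict[str, Any]] = None) -> dict[str, Any]:
        # Forwards spec explicitly to the wrapped implementation.
        return super().run(prompt, spec=spec)
```

No type checker error is raised for `DroppingWrapper`, which is why this kind of omission tends to be caught only by a test that asserts on the spec's effect.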


spec: AgentSpec
capability: CombinedCapability[Any] | None
instructions: list[Any]

instructions: list[Any] is too loose — the type should match what _instructions.normalize_instructions() returns. Per the coding guidelines, avoid Any type annotations; use the actual type for precision.

with agent.override(spec={'capabilities': []}, model='test'):
    # Override with empty caps - just make sure it works
    result = await agent.run('hello')
    assert result.output is not None

Several tests in both TestOverrideWithSpec and TestRunWithSpec only assert result.output is not None (here, and at lines 2433, 2458, 2480, 2494). The test guidelines say to assert meaningful behavior, not just execution or type checks. At minimum, these should verify that the capabilities/spec actually took effect (e.g. check that instructions were applied, that the model was used, etc.), and ideally snapshot result.all_messages() to validate the complete execution trace.

The existing test_override_with_spec_instructions_and_model and test_run_with_spec_instructions_added are good examples of meaningful assertions — the other tests should follow that pattern.


  json_schema_path: str | None = Field(default=None, alias='$schema')
- model: str
+ model: str | None = None

Making model optional changes the contract of AgentSpec: the type previously guaranteed that spec.model was a str, and it now admits None. While the from_spec() method has a runtime check (lines 697-700), this is a breaking change for existing code that relies on spec.model being a str. Any serialized spec without a model field will now silently validate with model=None rather than failing at parse time.

Is this intentional? If model should only be optional at run/override time (not for from_spec()/from_file()), consider keeping model: str on AgentSpec and using a separate type or a model: str | None only in the _resolve_spec path.
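
One way the "separate type" option could look, sketched with dataclasses instead of the real pydantic model. `FullSpec`, `PartialSpec`, `parse_full`, and `parse_partial` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PartialSpec:
    """Hypothetical run/override-time spec: model may be omitted."""

    model: Optional[str] = None
    instructions: Optional[str] = None


@dataclass
class FullSpec:
    """Hypothetical construction-time spec: model is required."""

    model: str
    instructions: Optional[str] = None


def parse_full(data: dict[str, Any]) -> FullSpec:
    # Fail at parse time, not later, when the model is missing.
    if not isinstance(data.get('model'), str):
        raise ValueError('model is required when building an agent from a spec')
    return FullSpec(**data)


def parse_partial(data: dict[str, Any]) -> PartialSpec:
    # A partial spec is valid with or without a model.
    return PartialSpec(**data)
```

The trade-off is a second type to maintain in exchange for keeping `model: str` on the construction path.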

toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None,
event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
spec: dict[str, Any] | AgentSpec | None = None,

Same issue as flagged on PrefectAgent: spec is accepted as a parameter but never forwarded in the self.dbos_wrapped_run_workflow() call at lines 366-383. The same applies to run_sync(), run_stream(), run_stream_events(), iter(), and override() throughout this file and the Temporal agent file — spec is silently dropped in all cases.

devin-ai-integration[bot]

This comment was marked as resolved.

…t class

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 3 new potential issues.

View 5 additional findings in Devin Review.

Comment on lines +1619 to +1630
# Apply spec values as defaults where explicit params are not set
if resolved is not None:
    if not _utils.is_set(name) and resolved.name is not None:
        name = resolved.name
    if not _utils.is_set(model) and resolved.model is not None:
        model = resolved.model
    if not _utils.is_set(instructions) and resolved.instructions:
        instructions = resolved.instructions
    if not _utils.is_set(model_settings) and resolved.model_settings is not None:
        model_settings = resolved.model_settings
    if not _utils.is_set(metadata) and resolved.metadata is not None:
        metadata = resolved.metadata
@devin-ai-integration devin-ai-integration bot Mar 21, 2026

🚩 Semantic difference between iter(spec=...) and override(spec=...)

There's a notable design asymmetry in how spec is handled:

  • In iter() (pydantic_ai_slim/pydantic_ai/agent/__init__.py:1041-1048): instructions from spec are additive — they extend the existing instructions.
  • In override() (pydantic_ai_slim/pydantic_ai/agent/__init__.py:1625-1626): instructions from spec replace the agent's instructions (they're only applied if the explicit instructions param is unset, but when applied, they become the sole instructions override).

This is a meaningful behavioral difference that could confuse users. The same asymmetry applies to model settings and metadata. This may be intentional (override = replace, run = extend), but it's worth documenting explicitly.

DouweM and others added 3 commits March 21, 2026 23:50
- Fix infinite recursion when merging callable metadata with spec
  metadata (bind original callable before closure)
- Remove _override_builtin_tools ContextVar (was duplicating builtin
  tools from capabilities already extracted via cap_builtin_tools)
- Make override(spec=...) combine capabilities additively with agent's
  root capability instead of replacing it
- Fix _ResolvedSpec.instructions type from list[Any] to proper type
- Forward spec=spec in all durable agent super() calls (Temporal,
  DBOS, Prefect) so spec is not silently dropped
- Strengthen tests to assert meaningful behavior instead of just
  `is not None`

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pec/resolve_spec

Both from_spec() and _resolve_spec() had duplicated logic for building
the capability registry, defining _instantiate_cap, and loading
capabilities from the registry. Extracted into a shared module-level
helper function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
override(spec=...) should replace the agent's root capability, not
combine with it — that's the distinction from iter(spec=...) which is
additive. Updated test to verify replacement behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@DouweM DouweM marked this pull request as draft March 22, 2026 00:06
@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 2 new potential issues.

View 7 additional findings in Devin Review.

Comment on lines 877 to 881
metadata=metadata,
infer_name=infer_name,
toolsets=toolsets,
spec=spec,
) as run:

🔴 PrefectAgent.iter() silently drops builtin_tools parameter

The PrefectAgent.iter() method accepts builtin_tools in its signature (line 781) but does not pass it through to super().iter() in the call at lines 866–881. This means any builtin_tools provided to PrefectAgent.iter() are silently ignored. This is a pre-existing bug, but the PR modified this exact call site (adding spec=spec) while following the same pattern of forwarding all parameters — making it a directly related omission. For comparison, DBOSAgent.iter() (pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py:961) and TemporalAgent.iter() (pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py:1000) both correctly pass builtin_tools=builtin_tools.

(Refers to lines 866-881)

Comment on lines +1496 to +1503
_unsupported_fields = {
    'end_strategy': 'early',
    'retries': 1,
    'output_retries': None,
    'tool_timeout': None,
    'output_schema': None,
    'deps_schema': None,
}

🚩 Spec fields description and instrument are silently ignored at run/override time

The _resolve_spec() method (pydantic_ai_slim/pydantic_ai/agent/__init__.py:1468-1525) captures model, name, instructions, model_settings, metadata, and capability from the spec into _ResolvedSpec. However, the description and instrument fields from AgentSpec are neither captured nor included in the _unsupported_fields warning dict (lines 1496-1503). If a user passes spec={'description': 'foo', 'instrument': True} at run/override time, these values are silently ignored without any warning. Other unsupported fields like end_strategy, retries, etc. correctly produce UserWarning. This inconsistency may confuse users.
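
A sketch of what covering those two fields could look like, assuming the check compares each field against a default. `warn_unsupported` is a hypothetical helper, and the `None` defaults for `description` and `instrument` are assumptions, not values taken from AgentSpec.

```python
import warnings
from typing import Any

# Field -> default value. The first six entries mirror the diff above; `description`
# and `instrument` are added here, and their None defaults are assumptions.
_UNSUPPORTED_DEFAULTS: dict[str, Any] = {
    'end_strategy': 'early',
    'retries': 1,
    'output_retries': None,
    'tool_timeout': None,
    'output_schema': None,
    'deps_schema': None,
    'description': None,
    'instrument': None,
}


def warn_unsupported(spec: dict[str, Any]) -> list[str]:
    """Warn for each unsupported field set to a non-default value and return their names."""
    changed = [
        name
        for name, default in _UNSUPPORTED_DEFAULTS.items()
        if name in spec and spec[name] != default
    ]
    for name in changed:
        warnings.warn(f'spec field {name!r} is ignored at run/override time', UserWarning, stacklevel=2)
    return changed
```

Fields left at their default, such as `retries=1`, produce no warning, so only values the user actually changed are reported.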
