feat(realtime): add input guardrails for RealtimeAgent and RealtimeRunConfig#3721

Open
Skyline-9 wants to merge 4 commits into openai:main from Skyline-9:feat/realtime-input-guardrails

@Skyline-9

Summary

Adds input guardrails to the realtime API, bringing it closer to parity with the non-realtime Agent/Runner, which already supports input_guardrails. Realtime today only supports output guardrails (RealtimeAgent.output_guardrails / RealtimeRunConfig["output_guardrails"]); there is no first-class way to screen the user's transcribed input.

What changed:

  • RealtimeAgent.input_guardrails (appended at the end of the dataclass, default_factory=list) and RealtimeRunConfig["input_guardrails"] (NotRequired TypedDict key).
  • New RealtimeInputGuardrailTripped session event (appended at the end of the RealtimeSessionEvent union), mirroring RealtimeGuardrailTripped field-for-field but typed to InputGuardrailResult.
  • RealtimeSession runs the combined agent + run-config input guardrails on the completed user transcript (input_audio_transcription_completed), de-duped by id(). It reuses the existing output-guardrail machinery (shared _guardrail_tasks set, _on_guardrail_task_done, _cleanup_guardrail_tasks), so close() cancels in-flight tasks. On a trip it emits input_guardrail_tripped, forces response.cancel, and sends a follow-up user message naming the guardrail.
  • Exported from agents.realtime.__init__ (__all__) with an import regression test.
  • Docs: docs/ref/realtime/events.md renders the new event; docs/realtime/guide.md documents the feature and disambiguates it from the existing tool-level "input guardrails on function-tool calls".

The design deliberately mirrors _run_output_guardrails (argument order verified against InputGuardrail.run(self, agent, input, context)) so the behavior and lifecycle are consistent with what maintainers already review.
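
The identity-based de-duplication of agent-level and run-config guardrails described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `combine_guardrails` is a hypothetical helper name, and plain objects stand in for guardrail instances.

```python
from typing import Any


def combine_guardrails(agent_level: list[Any], run_level: list[Any]) -> list[Any]:
    """Merge both lists, keeping the first occurrence of each object.

    De-duplication is by identity (id()), not equality, so the same guardrail
    object registered on both the agent and the run config runs only once.
    """
    seen: set[int] = set()
    combined: list[Any] = []
    for guardrail in [*agent_level, *run_level]:
        if id(guardrail) in seen:
            continue
        seen.add(id(guardrail))
        combined.append(guardrail)
    return combined
```

Using identity rather than equality means two separately constructed guardrails with identical configuration would both run; only a literally shared object is collapsed.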

Known limitation (documented, not hidden)

The forced cancel reliably interrupts a response that is already in flight. If a guardrail resolves in the narrow window before any response has been created for the tripped turn, the cancel is a no-op and that response may proceed. Eliminating this window cleanly requires response<->user-item correlation at the model layer (for example a response_id on turn-started / response-created) so the session can cancel only the tripped turn's response without also cancelling the intentional guardrail-notification response. This limitation is documented in the RealtimeInputGuardrailTripped docstring, RealtimeAgent.input_guardrails, and the guide rather than papered over with a heuristic that would cancel the wrong response. Scope is also documented: input guardrails run on transcribed audio only; text sent via send_message is not screened. Happy to pursue the model-layer correlation as a follow-up if maintainers prefer.

Test plan

  • Added tests/realtime/test_session.py::TestInputGuardrailFunctionality, including edge cases:
    • a raising guardrail is skipped and does not crash the shared guardrail task,
    • raising + tripping guardrails together still produce exactly one interrupt with the tripping guardrail named,
    • a second transcription for an already-tripped item is de-duplicated,
    • no guardrail task is created when none are configured.
  • Ran the standard verification stack from the repo root:
    • make format, make lint, make typecheck — pass
    • make tests (full) — pass (4797 passed, 2 skipped; serial 27 passed, 5 skipped)
    • make build-docs — pass (new RealtimeInputGuardrailTripped reference resolves clean)

Issue number

Realtime parity with the non-realtime input-guardrail support. Happy to link the relevant tracking issue.

Checks

  • I've added new tests, if relevant
  • I've run .agents/skills/code-change-verification/scripts/run.sh
  • I've confirmed all verification steps pass (ran make format, make lint, make typecheck, make tests, and make build-docs)
  • If using Codex, I've run /review before submitting this PR

Compatibility notes

Additive. The new RealtimeAgent field is appended at the end of the dataclass (preserving positional compatibility), the run-config entry is a NotRequired key, and the new event is appended at the end of the RealtimeSessionEvent union. Sessions with no input guardrails configured create no extra tasks per utterance.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1b2c6fc6b1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/realtime/session.py Outdated
Comment on lines +1294 to +1303
for guardrail in input_guardrails:
    try:
        result = await guardrail.run(
            # TODO (rm) Remove this cast, it's wrong
            cast(Agent[Any], self._current_agent),
            text,
            self._context_wrapper,
        )
        if result.output.tripwire_triggered:
            triggered_results.append(result)

P1 Badge Run realtime input guardrails concurrently

When more than one input guardrail is configured, this loop awaits them serially and only cancels after all earlier guardrails have completed. If a slow/model-backed guardrail comes before one that would trip, the unsafe user transcript can continue generating for that guardrail's entire latency, which largely defeats the forced response cancellation. Please run the input guardrails concurrently or interrupt as soon as the first tripwire result is available.

Comment thread src/agents/realtime/session.py Outdated
Comment on lines +1277 to +1279
combined_guardrails = self._current_agent.input_guardrails + self._run_config.get(
    "input_guardrails", []
)

P2 Badge Snapshot the agent for queued input guardrails

Because this background task re-reads self._current_agent when it eventually runs, a session that calls update_agent() or completes a handoff before the task gets CPU can check agent A's transcript using agent B's input guardrails, or no agent-level guardrails at all. That silently bypasses the guardrails configured on the agent that received the transcribed input; capture the agent/guardrail list when handling the transcription event and pass that snapshot into the task.
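
The snapshot requested here can be illustrated with a small standalone sketch. `FakeAgent`, `FakeSession`, and `on_transcript` are hypothetical stand-ins, not the SDK's classes; the only point is that the agent and its guardrail list are captured when the event is handled, not when the background task eventually runs.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeAgent:
    name: str
    input_guardrails: list[str] = field(default_factory=list)


class FakeSession:
    def __init__(self, agent: FakeAgent) -> None:
        self.current_agent = agent
        self.checked_with: list[tuple[str, list[str]]] = []

    def on_transcript(self, text: str) -> asyncio.Task:
        # Capture the agent and a copy of its guardrails now, at event time.
        agent = self.current_agent
        guardrails = list(agent.input_guardrails)
        return asyncio.create_task(self._run(agent, guardrails, text))

    async def _run(self, agent: FakeAgent, guardrails: list[str], text: str) -> None:
        await asyncio.sleep(0)  # yield, so a handoff can happen before we run
        # Record which agent's guardrails were actually applied to the text.
        self.checked_with.append((agent.name, guardrails))


async def demo() -> list[tuple[str, list[str]]]:
    session = FakeSession(FakeAgent("A", ["no_pii"]))
    task = session.on_transcript("hello")
    session.current_agent = FakeAgent("B")  # simulated handoff before the task runs
    await task
    return session.checked_with
```

In this sketch the transcript received by agent A is still checked with A's guardrails even though the session has switched to agent B by the time the task gets scheduled.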

@Skyline-9

Author

@codex review

@Skyline-9 Skyline-9 force-pushed the feat/realtime-input-guardrails branch from 1b2c6fc to bedd7a9 Compare July 1, 2026 23:32

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bedd7a9c37

Comment thread src/agents/realtime/session.py Outdated

# Run the guardrails concurrently so a slow guardrail cannot delay the forced cancel behind
# unrelated guardrails, which would let the unsafe turn keep generating.
results = await asyncio.gather(*(_run_one(guardrail) for guardrail in input_guardrails))

P1 Badge Cancel realtime input on the first tripped guardrail

Fresh evidence: this version now uses asyncio.gather, but gather still waits for every input guardrail to finish before inspecting triggered_results. When one guardrail trips quickly but another model-backed guardrail is slow, the unsafe audio turn keeps generating until the slowest guardrail returns, delaying the forced response.cancel that this feature relies on. Use asyncio.as_completed/task cancellation like the streamed input guardrail path so the session interrupts as soon as the first tripwire result is available.
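
The difference between waiting for every guardrail and reacting to the first trip can be sketched with the standard library alone. This is an assumption-laden illustration, not the PR's implementation: `first_trip`, `fast_trip`, and `slow_pass` are hypothetical names, and each check simply returns a `(name, tripped)` tuple.

```python
import asyncio


async def first_trip(checks):
    """Return the name of the first check that trips, or None if none do."""
    tasks = [asyncio.create_task(check()) for check in checks]
    try:
        # as_completed yields results in completion order, so a fast trip
        # is seen without waiting for slower checks.
        for finished in asyncio.as_completed(tasks):
            name, tripped = await finished
            if tripped:
                return name
        return None
    finally:
        # Whatever happens, do not leave unfinished checks running.
        for task in tasks:
            if not task.done():
                task.cancel()


async def fast_trip():
    await asyncio.sleep(0.01)
    return "fast", True


async def slow_pass():
    await asyncio.sleep(5)
    return "slow", False


async def demo():
    loop = asyncio.get_running_loop()
    start = loop.time()
    name = await first_trip([fast_trip, slow_pass])
    return name, loop.time() - start
```

With `asyncio.gather` in place of the loop, `demo()` would take about as long as `slow_pass`; here it returns shortly after `fast_trip` completes. Error handling for a raising check is omitted to keep the sketch short.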

@Skyline-9

Author

@codex review

@Skyline-9 Skyline-9 force-pushed the feat/realtime-input-guardrails branch from bedd7a9 to 7e2676a Compare July 2, 2026 18:52

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7e2676af1d

)

# Interrupt the model, forcing a cancel of any in-progress response.
await self._model.send_event(RealtimeModelSendInterrupt(force_response_cancel=True))

P2 Badge Avoid cancelling later turns for stale input guardrails

When a model-backed input guardrail finishes after the guarded audio turn has already ended and a later user turn is active, this sends an untargeted forced response.cancel; _send_interrupt cancels whatever response is currently ongoing, and these input guardrail tasks are not cancelled on turn_ended. In that latency scenario a trip for item A can interrupt item B's unrelated response and enqueue the guardrail notification into the wrong turn, so the trip should be ignored/cancelled once the guarded turn is over or correlated to the response for this item_id before cancelling.
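
One way to read the "ignore the trip once the guarded turn is over" suggestion is sketched below. It is purely illustrative and is not what this PR does: `TurnTracker` and its methods are hypothetical, and a real session would need to decide what counts as the end of a turn.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TurnTracker:
    active_item_id: Optional[str] = None
    interrupts: list[str] = field(default_factory=list)

    def start_turn(self, item_id: str) -> None:
        self.active_item_id = item_id

    def end_turn(self, item_id: str) -> None:
        # Only clear the marker if this is still the active turn.
        if self.active_item_id == item_id:
            self.active_item_id = None

    def on_guardrail_trip(self, item_id: str) -> bool:
        """Interrupt only if the trip belongs to the turn that is still active."""
        if item_id != self.active_item_id:
            return False  # stale trip: the guarded turn is already over
        self.interrupts.append(item_id)
        return True
```

A trip for item A that arrives after item B's turn has started is dropped instead of cancelling B's response. The trade-off is that a late trip for A then produces no interrupt at all.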

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7e2676af1d

)

# Interrupt the model, forcing a cancel of any in-progress response.
await self._model.send_event(RealtimeModelSendInterrupt(force_response_cancel=True))

P2 Badge Avoid cancelling a later response from stale guardrails

When an input guardrail is slower than the response it is checking (for example, the guarded turn reaches turn_ended and the user starts another turn before the guardrail finishes), this unscoped forced interrupt cancels whatever response is active at completion time; the model interrupt path is not tied to the guarded item_id. A trip for an earlier transcript can therefore interrupt an unrelated later answer and enqueue the guardrail follow-up into the wrong turn, so stale input-guardrail tasks should be ignored/cancelled after their turn ends or correlated to the response they are meant to cancel.

Author

Fair point, but this is how the output guardrail path behaves too. For the sake of parity, I will acknowledge this as a drawback but not change it here; if a fix is desired, it should be raised in a separate PR.

Skyline-9 added 4 commits July 3, 2026 00:32
Snapshot the active agent and its input guardrails when the transcription event is handled so a concurrent update_agent()/handoff cannot run a different agent's guardrails, and run the input guardrails concurrently so a slow guardrail cannot delay the forced response cancel.
Address Codex review: replace asyncio.gather (which waits for every
guardrail before inspecting results) with asyncio.as_completed so the
forced response cancel fires as soon as the first tripwire is available,
then cancel the remaining guardrail tasks. Mirrors the streamed
input-guardrail path so a slow guardrail can no longer delay the
interrupt.
@Skyline-9 Skyline-9 force-pushed the feat/realtime-input-guardrails branch from 7e2676a to 058565e Compare July 3, 2026 07:39

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 058565e6fd

for task in guardrail_tasks:
    if not task.done():
        task.cancel()
await asyncio.gather(*guardrail_tasks, return_exceptions=True)

P1 Badge Interrupt before awaiting cancelled guardrails

Fresh evidence: this version now uses as_completed, but this await asyncio.gather(...) still runs before the forced cancel at line 1348. When one input guardrail trips quickly and another model-backed guardrail is slow to acknowledge cancellation or does cleanup, the session waits here before sending response.cancel, so the unsafe realtime response can continue generating for that latency; request the interrupt and mark the item interrupted before awaiting sibling task cleanup.
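
The ordering being requested can be shown with a small standalone sketch. The names `slow_to_cancel` and `handle_trip` are hypothetical, and appending to a log stands in for sending the interrupt to the model; the sketch only demonstrates that the interrupt is issued before the session waits on sibling cleanup.

```python
import asyncio


async def slow_to_cancel(log: list[str]) -> None:
    """Stand-in for a guardrail that does cleanup work when cancelled."""
    try:
        await asyncio.sleep(5)
    except asyncio.CancelledError:
        await asyncio.sleep(0.05)  # simulated cleanup latency
        log.append("sibling cleaned up")
        raise


async def handle_trip(siblings: list[asyncio.Task], log: list[str]) -> None:
    # 1. Request the interrupt first (a real session would send it to the model).
    log.append("interrupt sent")
    # 2. Only then cancel the remaining guardrail tasks and wait for them.
    for task in siblings:
        if not task.done():
            task.cancel()
    await asyncio.gather(*siblings, return_exceptions=True)


async def demo() -> list[str]:
    log: list[str] = []
    sibling = asyncio.create_task(slow_to_cancel(log))
    await asyncio.sleep(0)  # let the sibling start running
    await handle_trip([sibling], log)
    return log
```

If the two steps in `handle_trip` were swapped, the interrupt would be delayed by however long the slowest sibling takes to acknowledge cancellation.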
