feat(think): host bridge, permissions, sandboxed hook dispatch (Phase 3+4) #1284

Merged
threepointone merged 16 commits into main from now-more-think
Apr 10, 2026

@threepointone threepointone commented Apr 9, 2026

Summary

Completes the Think extension system (Phases 3 and 4). Sandboxed extension Workers can now participate in lifecycle hooks, access context blocks and messages via host RPC, and have their capabilities gated by a granular permission model.

Builds on #1278 (Phase 1+2: lifecycle hooks, dynamic context, extension manifest).

Phase 3 — Host bridge + permissions

Host bridge methods on Think — 9 new _host* methods that HostBridgeLoopback calls via DO RPC:

  • Workspace: _hostReadFile, _hostWriteFile, _hostDeleteFile, _hostListFiles (delegate to this.workspace)
  • Context: _hostGetContext, _hostSetContext (delegate to this.session)
  • Messages: _hostGetMessages (serialized history), _hostSendMessage (routes through saveMessages/TurnQueue)
  • Session: _hostGetSessionInfo ({ messageCount })

Expanded permissions: ExtensionPermissions now includes:

  • context: { read: string[] | "all"; write: string[] | "own" } — per-label context access
  • messages: "none" | "read" — conversation history access
  • session: { sendMessage: boolean; metadata: boolean } — session capabilities

Expanded HostBridgeLoopback: 5 new permission-gated methods (getContext, setContext, getMessages, sendMessage, getSessionInfo). Each has a dedicated permission check with descriptive errors.
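
The permission shape and per-method gating described above can be sketched roughly as follows. This is an illustration only, not the HostBridgeLoopback code: the optionality of the fields, the helper names (canReadContext, requireMessagesRead), and the error wording are assumptions.

```typescript
// Shape taken from the PR description. Workspace permissions are omitted,
// and marking every field optional is an assumption.
type ExtensionPermissionsSketch = {
  context?: { read?: string[] | "all"; write?: string[] | "own" };
  messages?: "none" | "read";
  session?: { sendMessage?: boolean; metadata?: boolean };
};

// Hypothetical helper: may the extension read the context block `label`?
function canReadContext(perms: ExtensionPermissionsSketch, label: string): boolean {
  const read = perms.context?.read;
  if (read === undefined) return false;
  return read === "all" || read.includes(label);
}

// Hypothetical helper in the style of a dedicated check with a descriptive error.
function requireMessagesRead(perms: ExtensionPermissionsSketch): void {
  if (perms.messages !== "read") {
    throw new Error('Extension lacks permission: messages must be "read"');
  }
}
```

Keeping one small check per capability makes each denial message specific to the method that was called.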

Re-entrancy guard: the _insideInferenceLoop flag tracks stream consumption in _streamResult and chat(). _hostSendMessage routes through saveMessages, which queues via TurnQueue, so it is safe during inference (the message executes after the current turn completes).

Phase 4 — Sandboxed hook dispatch

Extension source format — structured { tools, hooks }:

({
  tools: {
    greet: {
      description: "Greet someone",
      parameters: { name: { type: "string" } },
      execute: async (args) => "Hello, " + args.name
    }
  },
  hooks: {
    beforeTurn: async (ctx) => {
      if (ctx.messageCount > 50) return { maxSteps: 3 };
    }
  }
})

Hook discovery: manifest() RPC returns { hooks: [...] }. Think discovers hooks at load time. getHookSubscribers(hookName) returns extensions in load order.

Pipeline dispatch: _pipelineExtensionBeforeTurn runs after the subclass beforeTurn:

  1. Creates TurnContextSnapshot (plain serializable data: system, toolNames, messageCount, continuation, body, modelId)
  2. For each subscriber: calls entrypoint.hook("beforeTurn", snapshot) with timeout
  3. Parses result, merges into accumulated TurnConfig (scalars last-wins, providerOptions deep-merge)
  4. model and tools skipped — not serializable across RPC. Use activeTools instead.
  5. Logs warnings on timeout/error, continues pipeline
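
The merge step in the list above can be sketched like this. It is a minimal illustration under assumptions: SandboxTurnConfig is a trimmed-down stand-in for TurnConfig, and mergeHookResult and deepMergeOptions are hypothetical names, not functions from the PR.

```typescript
// Hypothetical, trimmed-down config: only fields named in the PR text.
type PlainObject = { [key: string]: unknown };
type SandboxTurnConfig = {
  system?: string;
  activeTools?: string[];
  maxSteps?: number;
  providerOptions?: PlainObject;
};

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge b into a; non-object values in b replace those in a.
function deepMergeOptions(a: PlainObject, b: PlainObject): PlainObject {
  const out: PlainObject = { ...a };
  for (const [key, value] of Object.entries(b)) {
    const prev = out[key];
    out[key] = isPlainObject(prev) && isPlainObject(value) ? deepMergeOptions(prev, value) : value;
  }
  return out;
}

// Fold one extension's hook result into the accumulated config.
function mergeHookResult(acc: SandboxTurnConfig, result: unknown): SandboxTurnConfig {
  const r: PlainObject = isPlainObject(result) ? result : {};
  const next: SandboxTurnConfig = { ...acc };
  if (typeof r.system === "string") next.system = r.system; // scalar: last wins
  if (typeof r.maxSteps === "number") next.maxSteps = r.maxSteps; // 0 is kept
  if (Array.isArray(r.activeTools)) {
    next.activeTools = r.activeTools.filter((t): t is string => typeof t === "string");
  }
  if (isPlainObject(r.providerOptions)) {
    next.providerOptions = deepMergeOptions(acc.providerOptions ?? {}, r.providerOptions);
  }
  // `model` and `tools` are deliberately never copied from the result.
  return next;
}
```

Copying only known fields, instead of spreading the whole result, is one way to make sure model and tools can never slip through from a sandboxed extension.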

Timeouts: hookTimeout property (default 5s). Timer properly cleared on success.

load_extension tool — updated description documents { tools, hooks } format and host capabilities.

Test plan

  • 217 tests pass across 8 files
  • Phase 3: 9 host bridge tests (workspace read/write, messages, session info, context set/get, re-entrancy flag)
  • Phase 4: 9 integration tests with real WorkerLoader (structured format, hook discovery, hook invocation, snapshot data flow, error handling, timeout behavior, tools-only extensions)
  • All existing tests still pass (200 from Phase 1+2)
  • Build succeeds
  • No other packages affected

Design decisions

  • Plain data snapshots, not RPC proxies: TurnContextSnapshot is a plain object that survives structured clone. We originally planned RpcTarget proxies with lazy methods, but class instances lose their methods during Workers RPC serialization. Snapshots are simpler and correct.
  • Sandboxed extensions can't override model or tools — these types aren't serializable across the RPC boundary. Extensions control tool availability via activeTools (string array). Only subclass hooks can swap the model or add AI SDK tool objects.
  • result ?? {} not result || {} — prevents falsy coercion when extensions return { maxSteps: 0 }.
  • Snapshot created once per pipeline — all extensions see Think's original assembled context, not each other's modifications. Acceptable trade-off for simplicity.
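
To illustrate the first design decision, here is a small sketch that uses the standard structuredClone function as a stand-in for the RPC boundary. That substitution is an assumption (Workers RPC serialization is not literally this call), and LazyTurnContext and the snapshot values are made up for the example.

```typescript
// Hypothetical class standing in for a context object with lazy methods.
class LazyTurnContext {
  system: string;
  messageCount: number;
  constructor(system: string, messageCount: number) {
    this.system = system;
    this.messageCount = messageCount;
  }
  summary(): string {
    return `${this.messageCount} messages`;
  }
}

// structuredClone copies own data properties but not the prototype,
// so the clone no longer has a summary() method.
const clonedInstance: unknown = structuredClone(new LazyTurnContext("You are helpful.", 3));
const methodSurvived =
  typeof (clonedInstance as { summary?: unknown }).summary === "function";

// Field names come from the PR text; the values are invented.
const turnSnapshot = {
  system: "You are helpful.",
  toolNames: ["greet"],
  messageCount: 3,
  continuation: false,
  body: {},
  modelId: "example-model"
};
const clonedSnapshot = structuredClone(turnSnapshot);
```

The plain snapshot round-trips with all of its fields intact, while the class instance keeps its data but loses its behavior.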

Made with Cursor



…uard

Phase 3 of the Think extension system redesign. Think now implements
the _host* methods that HostBridgeLoopback calls via DO RPC,
permissions are expanded with context/messages/session capabilities,
and the inference loop tracks re-entrancy for safe sendMessage.

HOST BRIDGE METHODS ON THINK:

Workspace (delegating to this.workspace):
  - _hostReadFile(path) → readFile
  - _hostWriteFile(path, content) → writeFile
  - _hostDeleteFile(path) → rm (returns false on error)
  - _hostListFiles(dir) → readDir (maps to {name, type, size, path})

Context (delegating to this.session):
  - _hostGetContext(label) → getContextBlock content
  - _hostSetContext(label, content) → replaceContextBlock

Messages:
  - _hostGetMessages(limit?) → serialized history as {id, role, content}
    (extracts text parts, respects limit via slice)
  - _hostSendMessage(content) → creates user message, routes through
    saveMessages which queues via TurnQueue (safe during inference —
    message executes after the current turn completes)

Session metadata:
  - _hostGetSessionInfo() → { messageCount }

EXPANDED PERMISSIONS:

ExtensionPermissions now includes:
  - context: { read: string[] | "all"; write: string[] | "own" }
    Per-label context block access. "own" trusts the extension to
    only write its namespaced labels (not validated against manifest;
    namespace prefixing makes cross-extension writes unlikely).

  - messages: "none" | "read"
    Conversation history access.

  - session: { sendMessage: boolean; metadata: boolean }
    Session-level capabilities.

EXPANDED HOSTBRIDGELOOPBACK:

5 new permission-gated methods:
  - getContext(label) — gated by context.read
  - setContext(label, content) — gated by context.write
  - getMessages(limit?) — gated by messages
  - sendMessage(content) — gated by session.sendMessage
  - getSessionInfo() — gated by session.metadata

Each has a dedicated #require* permission check with descriptive
error messages. The existing workspace check was renamed from
#requirePermission to #requireWorkspace for clarity.

RE-ENTRANCY GUARD:

_insideInferenceLoop flag tracks whether a stream is being consumed:
  - Set to true at start of _streamResult (WebSocket/saveMessages/
    continuation) and chat() stream iteration
  - Cleared in finally blocks after stream consumption completes
  - _hostSendMessage uses saveMessages which routes through TurnQueue
    — naturally queues behind the active turn when flag is true

TESTS:

9 new tests:
  - _hostWriteFile + _hostReadFile delegate to workspace
  - _hostReadFile returns null for missing file
  - _hostGetMessages returns conversation history
  - _hostGetMessages respects limit
  - _hostGetSessionInfo returns message count
  - _insideInferenceLoop false outside a turn
  - _insideInferenceLoop false after completed turn
  - _hostSetContext writes to a context block
  - _hostGetContext returns null for non-existent block

209 total tests pass across 8 files.

Made-with: Cursor
…cycle

Phase 4 of the Think extension system redesign. Sandboxed extension
Workers can now participate in lifecycle hooks alongside subclass
overrides. Extensions declare hooks in their source, Think discovers
them via manifest() RPC, and dispatches them in a pipeline during
the inference loop.

EXTENSION SOURCE FORMAT:

Extensions use a structured { tools, hooks } format:

  ({
    tools: {
      greet: {
        description: "Greet someone",
        parameters: { name: { type: "string" } },
        execute: async (args) => "Hello, " + args.name
      }
    },
    hooks: {
      beforeTurn: async (ctx) => {
        if (ctx.messageCount > 50) return { maxSteps: 3 };
      }
    }
  })

Both tools and hooks are optional. The generated Worker class
exposes describe(), manifest(), execute(), and hook() RPC methods.

HOOK DISCOVERY:

- manifest() RPC returns { hooks: [...] } — discovered at load time
- LoadedExtension stores hooks[] alongside tools[]
- getHookSubscribers(hookName) returns extensions in load order
- Legacy manifest() gracefully degraded (try/catch, no hooks)
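
A rough sketch of the discovery and lookup steps above, under assumptions: LoadedExtensionSketch, hooksFromManifest, and hookSubscribers are hypothetical names, and shape checking stands in for the try/catch around the real manifest() RPC.

```typescript
// Hypothetical record kept per loaded extension.
type LoadedExtensionSketch = { name: string; tools: string[]; hooks: string[] };

// Turn a manifest() result into a hooks list, tolerating legacy shapes
// that have no `hooks` field at all.
function hooksFromManifest(manifest: unknown): string[] {
  if (typeof manifest !== "object" || manifest === null) return [];
  const hooks = (manifest as { hooks?: unknown }).hooks;
  return Array.isArray(hooks)
    ? hooks.filter((h): h is string => typeof h === "string")
    : [];
}

// Array.prototype.filter preserves order, so subscribers come back in load order.
function hookSubscribers(
  loaded: readonly LoadedExtensionSketch[],
  hookName: string
): LoadedExtensionSketch[] {
  return loaded.filter((ext) => ext.hooks.includes(hookName));
}
```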

PIPELINE DISPATCH:

_pipelineExtensionBeforeTurn runs after the subclass beforeTurn hook:
1. Creates TurnContextSnapshot (plain serializable data)
2. For each subscriber: calls entrypoint.hook("beforeTurn", snapshot)
3. Races against hookTimeout (default 5s)
4. Parses result, merges into accumulated TurnConfig
5. Logs warnings on timeout/error, continues pipeline

Merge semantics for sandboxed extensions:
- system, messages, activeTools, toolChoice, maxSteps — last wins
- providerOptions — deep merge
- model and tools — skipped (not serializable across RPC boundary;
  use activeTools to control tool availability)

SERIALIZABLE SNAPSHOTS:

TurnContextSnapshot is a plain data interface (no methods/classes):
  { system, toolNames, messageCount, continuation, body, modelId }

Created by createTurnContextSnapshot(). Survives Workers RPC
structured clone. Extensions read ctx.toolNames, ctx.system, etc.
and return a plain TurnConfig object.

TIMEOUT:

hookTimeout property on Think (default 5000ms). Each extension hook
invocation uses Promise.race with a timer that is properly cleared
on success (no dangling timers).

load_extension TOOL:

Updated description to document the { tools, hooks } format, host
capabilities (getContext, setContext, getMessages, sendMessage), and
hook support.

TESTS (9 new, real WorkerLoader):

- Structured { tools, hooks } format loads correctly
- Hook discovery via manifest() RPC
- Tool execution with structured format
- Tools-only extension has no hooks
- Hook invocation via entrypoint.hook()
- Hook receives snapshot data and computes from it
- Skipped result for unsubscribed hooks
- Error handling for failing hooks
- Network isolation with structured format

217 total tests pass across 8 files.

Made-with: Cursor
@changeset-bot
changeset-bot bot commented Apr 9, 2026

⚠️ No Changeset found

Latest commit: 3f38d3b

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

devin-ai-integration[bot]

This comment was marked as resolved.

@pkg-pr-new
pkg-pr-new bot commented Apr 9, 2026

agents

npm i https://pkg.pr.new/agents@1284

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1284

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1284

hono-agents

npm i https://pkg.pr.new/hono-agents@1284

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1284

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1284

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1284

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1284

commit: 3f38d3b

…rkspace

The host binding (env.host) was only injected into extension Workers
when workspace permissions were declared. Extensions with only context,
messages, or session permissions would get no host binding, making
host.getContext(), host.sendMessage(), etc. inaccessible.

Now checks all permission types: workspace, context.read, context.write,
messages, session.sendMessage, session.metadata.
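
The widened check might look something like the sketch below. The helper names are hypothetical, and the workspace values ("none", "read", "read-write") are an assumption, since this page does not show the workspace permission type.

```typescript
type HostPermissionsSketch = {
  workspace?: "none" | "read" | "read-write"; // assumed values, not from the PR
  context?: { read?: string[] | "all"; write?: string[] | "own" };
  messages?: "none" | "read";
  session?: { sendMessage?: boolean; metadata?: boolean };
};

// A label permission grants something if it is "all"/"own" or a non-empty list.
function grantsLabels(value: string[] | "all" | "own" | undefined): boolean {
  if (value === undefined) return false;
  return typeof value === "string" || value.length > 0;
}

// Hypothetical: inject env.host when ANY host capability is declared,
// not only workspace access.
function needsHostBinding(p: HostPermissionsSketch): boolean {
  return (
    (p.workspace !== undefined && p.workspace !== "none") ||
    grantsLabels(p.context?.read) ||
    grantsLabels(p.context?.write) ||
    p.messages === "read" ||
    p.session?.sendMessage === true ||
    p.session?.metadata === true
  );
}
```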

Made-with: Cursor
…p flag

The flag was set at the top of _streamResult and cleared at the
bottom, but without try/finally. If anything threw before the
clear (e.g., _resumableStream.start(), _continuation.activatePending()),
the flag would stay stuck as true permanently.

Now matches the chat() path which already uses try/finally.

Made-with: Cursor
7 new tests covering previously untested functionality:

Host bridge:
- _hostDeleteFile removes a file + returns false for missing
- _hostListFiles lists directory contents
- _hostSendMessage injects a user message into history

beforeTurn TurnConfig overrides:
- maxSteps override applied per-turn
- system prompt override reaches the model
- activeTools override limits tool availability

Also adds setTurnConfigOverride() to ThinkTestAgent for configurable
beforeTurn return values, and hostDeleteFile/hostListFiles/hostSendMessage
test helpers.

224 total tests pass across 8 files.

Made-with: Cursor
limit=0 was treated as 'no limit' due to falsy check (0 is falsy).
Also, slice(-0) returns the full array since -0 === 0 in JS.
Now explicitly returns [] for limit=0.

Made-with: Cursor
…ently losing tools

Previously persisted extensions using the flat format ({ greet: {...} })
would silently load with zero tools after the format change. Now throws
a clear error message telling the author to wrap tools in a 'tools' key.
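
One way such a check could work is sketched below. The detection rule (no tools or hooks key, but other keys present) and the names normalizeExtensionSource and asRecord are assumptions. A flat-format tool that happens to be named "tools" or "hooks" would slip past this heuristic.

```typescript
type ExtensionSourceSketch = {
  tools: Record<string, unknown>;
  hooks: Record<string, unknown>;
};

function asRecord(value: unknown): Record<string, unknown> {
  return typeof value === "object" && value !== null
    ? (value as Record<string, unknown>)
    : {};
}

// Hypothetical: reject the old flat format instead of silently loading zero tools.
function normalizeExtensionSource(source: Record<string, unknown>): ExtensionSourceSketch {
  const structured = "tools" in source || "hooks" in source;
  if (!structured && Object.keys(source).length > 0) {
    throw new Error(
      'Extension source uses the flat format; wrap tool definitions in a "tools" key: { tools: { ... } }'
    );
  }
  // Both keys are optional, so default each to an empty record.
  return { tools: asRecord(source.tools), hooks: asRecord(source.hooks) };
}
```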

Made-with: Cursor
_hostSendMessage called saveMessages() which enqueues through TurnQueue.
During an active turn (tool execution → host.sendMessage), this creates
a circular wait: the current turn can't complete because it awaits the
tool result, but saveMessages awaits the current turn via TurnQueue.

Now appends directly to session.appendMessage() — the injected message
is visible in the next turn's history without blocking the current turn.

Made-with: Cursor
The test claimed 'system prompt override reaches the model' but only
verified beforeTurn sees the original prompt (correct behavior — the
hook receives assembled context before overrides are applied). Renamed
to accurately describe what it tests.

Made-with: Cursor
…abels

#requireContextWrite with write:'own' previously allowed writing to
ANY label. Now validates against ownContextLabels carried in
HostBridgeLoopbackProps — only the extension's manifest-declared
namespaced labels are permitted.

- Added ownContextLabels to HostBridgeLoopbackProps
- ExtensionManager passes namespaced labels when creating host binding
- createHostBinding signature expanded to (permissions, ownContextLabels)
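
A minimal sketch of the tightened write check, assuming the own labels arrive as a plain string array. assertContextWrite and the label strings in the usage note are hypothetical; the real check is the private #requireContextWrite method.

```typescript
// Hypothetical stand-in for the write check: with write: "own", only labels
// listed in ownContextLabels pass; with an explicit list, only listed labels pass.
function assertContextWrite(
  write: readonly string[] | "own" | undefined,
  ownContextLabels: readonly string[],
  label: string
): void {
  const allowed =
    write === "own"
      ? ownContextLabels.includes(label)
      : write !== undefined && write.includes(label);
  if (!allowed) {
    throw new Error(`Extension may not write context block "${label}"`);
  }
}
```

For example, with write set to "own" and own labels ["myext_notes"], a write to "myext_notes" passes while a write to "other_notes" throws.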

Made-with: Cursor
…ch chat()

The flag wrapped the entire _streamResult method body, including
post-stream operations (persist, broadcast, response hook). Now
wraps only the for-await stream consumption loop — matching chat()'s
pattern. This ensures _insideInferenceLoop is false during persistence
and hooks, consistent across WebSocket and RPC entry paths.
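
The scoping described here, combined with the try/finally from the earlier fix, can be sketched as follows. InferenceLoopSketch and failingStream are hypothetical. The point is only that the flag covers the for-await loop and is reset even when the stream throws.

```typescript
class InferenceLoopSketch {
  insideInferenceLoop = false;

  async consume(stream: AsyncIterable<string>): Promise<string[]> {
    const chunks: string[] = [];
    this.insideInferenceLoop = true;
    try {
      // Only the stream consumption runs with the flag set.
      for await (const chunk of stream) chunks.push(chunk);
    } finally {
      // Runs on success and on error, so the flag cannot get stuck.
      this.insideInferenceLoop = false;
    }
    // Post-stream work (persist, broadcast, hooks) would happen here,
    // with the flag already false.
    return chunks;
  }
}

// Hypothetical stream that fails midway, to exercise the finally block.
async function* failingStream(): AsyncGenerator<string> {
  yield "partial";
  throw new Error("stream broke");
}
```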

Made-with: Cursor
…line comment

Timer was only cleared on the success path — if the hook RPC rejected
with its own error, the timer leaked until it fired a no-op reject.
Now cleared in a finally block.
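
A sketch of the pattern, assuming a generic helper rather than the actual Think code (withHookTimeout is a hypothetical name):

```typescript
// Race a hook promise against a timer, and clear the timer whether the
// hook resolves, rejects with its own error, or loses the race.
function withHookTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`hook timed out after ${ms}ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Because the cleanup sits in finally, a hook that rejects early no longer leaves a pending timer behind.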

Also fixed the pipeline JSDoc: it said 'each extension sees the
accumulated config from prior extensions' but the implementation
passes the same static snapshot to all extensions. Updated to
accurately describe the last-write-wins merge behavior.

Made-with: Cursor
_initializeExtensions created ExtensionManager without createHostBinding,
so extensions loaded via the automatic path (extensionLoader) never
received env.host — making host bridge methods inaccessible.

Now detects if HostBridgeLoopback is re-exported from the worker entry
point (via ctx.exports) and auto-wires createHostBinding with the
agent class name, agent ID, permissions, and ownContextLabels. Falls
back gracefully if HostBridgeLoopback is not exported.

Made-with: Cursor
- _hostGetMessages(0) returns empty array (not full history)
- Flat-format extension source throws clear error message

226 total tests pass.

Made-with: Cursor
Negative limit caused slice(-(-N)) = slice(N), returning messages
from index N instead of the last N. Now treats any limit <= 0 as
empty result, matching limit=0 behavior.
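
The limit handling described across these fixes can be sketched as a small helper. lastMessages is a hypothetical name; the real logic lives inside _hostGetMessages.

```typescript
// Return the last `limit` items. An omitted limit means "everything";
// zero or a negative limit means "nothing".
function lastMessages<T>(history: readonly T[], limit?: number): T[] {
  if (limit === undefined) return [...history];
  if (limit <= 0) return []; // avoids slice(-0) and slice(N) surprises
  return history.slice(-limit);
}
```

Checking for undefined explicitly, rather than for falsiness, is what keeps a limit of 0 distinct from no limit at all.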

Made-with: Cursor
@threepointone threepointone merged commit b8450b2 into main Apr 10, 2026
3 checks passed
@threepointone threepointone deleted the now-more-think branch April 10, 2026 07:07