fix(routines): resolve message tool channel/target from per-job metadata#708
fix(routines): resolve message tool channel/target from per-job metadata#708ilblackdragon wants to merge 4 commits intomainfrom
Conversation
When a routine's notify.channel is None, the message tool had no way to resolve channel/target for full-job workers, causing "No target specified" errors. The previous approach mutated shared global state via set_message_tool_context(), which also raced with concurrent jobs. Now the routine's notify config (channel + user) is carried in the job's metadata JSON, and MessageTool::execute falls back to ctx.metadata when neither explicit params nor conversation defaults are available. This eliminates both the None-channel bug and the concurrent-job race. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes a bug in full-job routine workers where the MessageTool failed at execution time because no channel/target context was available. The old approach mutated shared global state on MessageTool (Arc<RwLock>) via set_message_tool_context(), but this call was guarded behind an if let Some(channel) check — meaning it was skipped entirely when notify.channel was None. The fix moves the notify config into per-job JobContext metadata, which is threadsafe and eliminates the race condition with concurrent routine jobs.
Changes:
MessageTool::executegains a 3-tier fallback for both channel and target: explicit params → conversation defaults (default_channel/default_target) →JobContext.metadatakeys.execute_full_jobinroutine_engine.rsnow unconditionally writesnotify_userto job metadata, and conditionally writesnotify_channelwhen the routine'snotify.channelisSome.- Two regression tests are added to
message.rsto verify the metadata fallback and the clean error path.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/tools/builtin/message.rs |
Adds JobContext metadata as third tier in channel/target resolution; adds two new tests |
src/agent/routine_engine.rs |
Replaces global-state mutation with per-job metadata population; always writes notify_user, conditionally writes notify_channel |
Issues found:
-
Moderate bug —
notify.channel == Nonecase still fails for channel resolution: The PR description states it fixes the case wherenotify.channelisNone, but the fix only unconditionally addsnotify_userto metadata (line 500).notify_channelis still guarded byif let Some(channel) = &routine.notify.channel(lines 497–498). As a result, whennotify.channelisNoneand the LLM omits thechannelparameter in the message tool call, all three tiers of channel resolution inMessageTool::executefail and the "No channel specified" error is still returned. TheNotifyConfigdoc comment sayschannel: Nonemeans "broadcast to all channels", suggesting the tool should work in this case. The fix needs to either always populate a sentinel value (e.g.,"default"or"broadcast") in metadata when channel isNone, or handle the channel-is-None case explicitly inMessageTool::execute. -
Nit — stale "see above" reference in comment: The comment at line 491–492 says "Notify config is carried in job metadata (see above)" but there is nothing above this point explaining this mechanism; the metadata is populated immediately below (lines 494–500). The "see above" reference is a dangling relic and should be removed.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/agent/routine_engine.rs
Outdated
| // so the LLM can send results without triggering cross-channel approval. | ||
| // TODO: This mutates shared global state and can race with concurrent jobs. | ||
| // Move notify config into JobContext metadata and apply per-job instead. | ||
| // Notify config is carried in job metadata (see above) and read by |
There was a problem hiding this comment.
The comment "Notify config is carried in job metadata (see above) and read by MessageTool::execute from JobContext" references "see above" but there is nothing above this function explaining this mechanism. The metadata population happens immediately below (lines 494-500), not above. The comment is self-contradictory: it says the config is carried in metadata, then immediately populates the metadata. The "see above" reference should be removed or replaced with "see below".
| // Notify config is carried in job metadata (see above) and read by | |
| // Notify config is carried in job metadata (populated below) and read by |
| .tools() | ||
| .set_message_tool_context(Some(channel.clone()), Some(routine.notify.user.clone())) | ||
| .await; | ||
| metadata["notify_channel"] = serde_json::json!(channel); |
There was a problem hiding this comment.
When routine.notify.channel is None, notify_channel is intentionally omitted from the job metadata. However, the PR description states this fix resolves the case where notify.channel is None — yet the message tool's channel resolution (in message.rs lines 115–135) will still fall through all three tiers and return "No channel specified" in that case. Only the notify_user / target resolution is always populated (line 500).
If a full-job routine is expected to work when notify.channel is None (meaning "broadcast to all channels"), the metadata key should still be set unconditionally (e.g., as an empty string or a sentinel value), or the broadcast/default-channel behavior needs to be handled differently in MessageTool::execute. Currently, the channel case is treated symmetrically with target but the population logic is asymmetric: notify_user is always written, notify_channel is only written when non-None.
| metadata["notify_channel"] = serde_json::json!(channel); | |
| metadata["notify_channel"] = serde_json::json!(channel); | |
| } else { | |
| // When no explicit channel is configured, write an empty string so | |
| // MessageTool::execute can treat this as "no specific channel" (e.g., | |
| // broadcast or default-channel behavior) instead of seeing the key | |
| // as entirely absent and reporting "No channel specified". | |
| metadata["notify_channel"] = serde_json::json!(""); |
Address review feedback: - Fix stale "see above" comment → "populated below" - When notify.channel is None, use broadcast_all instead of erroring with "No channel specified". This matches NotifyConfig semantics where channel=None means "broadcast to all channels" - Channel resolution is now Option<String>: param → default → metadata → None - When None, MessageTool uses ChannelManager::broadcast_all(target, response) and reports which channels succeeded/failed - Add regression test for broadcast-all behavior Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/tools/builtin/message.rs
Outdated
| let mut failed = Vec::new(); | ||
| for (ch, result) in &results { | ||
| match result { | ||
| Ok(()) => succeeded.push(ch.as_str()), | ||
| Err(e) => { | ||
| tracing::warn!( | ||
| channel = %ch, | ||
| target = %target, | ||
| "broadcast_all: channel failed: {}", e | ||
| ); | ||
| failed.push(ch.as_str()); |
There was a problem hiding this comment.
The failed vector is declared and populated but never read or used anywhere after collection. The error handling only checks whether succeeded.is_empty() and uses channel_manager.channel_names() for the error message, making failed dead code. Either remove failed entirely, or use it to include the names of failed channels in the error message (e.g., "Channels that failed: ...") to give the caller more useful diagnostic information.
src/agent/routine_engine.rs
Outdated
| // Notify config is carried in job metadata (populated below) and read by | ||
| // MessageTool::execute from JobContext — no global state mutation needed. | ||
|
|
There was a problem hiding this comment.
Lines 491-492 and 495-496 are two redundant comments explaining the same thing (that notify config is carried in job metadata to avoid global state mutation). The floating comment block at 491-492 is now orphaned from the actual code it describes, while the inline comment at 495-496 sits right above the relevant code. The first comment should be removed to avoid repeating the same explanation twice.
| // Notify config is carried in job metadata (populated below) and read by | |
| // MessageTool::execute from JobContext — no global state mutation needed. |
Address review feedback: - Use `failed` vec in error message instead of re-querying channel_names - Remove redundant orphaned comment block in routine_engine.rs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
notify.channelisNone, the message tool in full-job workers failed with "No target specified and no active conversation" becauseset_message_tool_context()was only called when channel wasSomeArc<RwLock>onMessageTool) which raced with concurrent routine jobs (flagged by an existing TODO)MessageTool::executeuses a 3-tier fallback: explicit params → conversation defaults →JobContext.metadataTest plan
message_tool_falls_back_to_job_metadata— verifies metadata fallback resolves channel/target without conversation contextmessage_tool_no_metadata_still_errors— verifies clean error when nothing is configuredcargo clippy --all --all-features— zero warningsnotify.channel = Noneand verify the message tool no longer errors with "No target specified"🤖 Generated with Claude Code