[FEATURE] Graph - Cancel Node - Do Not Raise Exception

### Problem Statement

Currently, users can cancel a node execution by setting `event.cancel_node = <STR_MSG|True>` within a `BeforeNodeCallEvent` hook. Unlike in Swarm ([docs](https://github.com/strands-agents/docs/blob/main/docs/user-guide/concepts/interrupts.md#swarm), [test](https://github.com/strands-agents/sdk-python/blob/main/tests_integ/interrupts/test_hook.py)), this raises an exception that stops the entire graph execution. I would like to explore whether this is necessary. To help figure this out, we should look at what happens when the [`should_continue`](https://github.com/strands-agents/sdk-python/blob/main/src/strands/multiagent/graph.py#L644) call returns `False`.

If it is appropriate to raise an exception, we should raise an explicit node cancel exception instead of a `RuntimeError`.
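
If raising does turn out to be appropriate, a dedicated exception type would let callers tell an intentional cancellation apart from other runtime failures. The following is a minimal sketch only: `NodeCancelledError` and `run_node` are hypothetical names, not part of the SDK. Subclassing `RuntimeError` is one possible choice that would keep existing `except RuntimeError` handlers working.

```python
from typing import Union


class NodeCancelledError(RuntimeError):
    """Hypothetical exception signalling an intentional node cancellation."""

    def __init__(self, node_id: str, message: str = "node cancelled by user") -> None:
        super().__init__(message)
        self.node_id = node_id
        self.message = message


def run_node(node_id: str, cancel_node: Union[str, bool]) -> str:
    """Stand-in for node execution; raises if a hook requested cancellation."""
    if cancel_node:
        # Mirror the existing convention: a string is the message, True uses a default.
        message = cancel_node if isinstance(cancel_node, str) else "node cancelled by user"
        raise NodeCancelledError(node_id, message)
    return f"{node_id} completed"
```

With this shape, a caller could write `except NodeCancelledError as e:` and read `e.node_id` without matching on the message text.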

### Proposed Solution

_No response_

### Use Case

Cleanly exit a graph without having to catch a `RuntimeError` when I intentionally cancel a node execution.

### Alternative Solutions

_No response_

### Additional Context

_No response_

---

## Implementation Requirements

Based on clarification discussion and repository analysis:

### Summary
Align Graph `cancel_node` behavior with Swarm: don't raise an exception; instead, set the status to `FAILED` and return gracefully. This follows the same pattern as `should_continue` returning `False`.

### Technical Approach

#### Current Behavior Comparison

**Swarm** (swarm.py lines 750-759) - Graceful exit:
```python
if before_event.cancel_node:
    yield MultiAgentNodeCancelEvent(current_node.node_id, cancel_message)
    self.state.completion_status = Status.FAILED
    break  # No exception
```

**Graph** (graph.py lines 864-871) - Raises exception:
```python
if before_event.cancel_node:
    yield MultiAgentNodeCancelEvent(node.node_id, cancel_message)
    raise RuntimeError(cancel_message)  # This needs to change
```

#### Reference: `should_continue` Graceful Exit Pattern (lines 648-651)
When `should_continue` returns `False`:
1. Sets `self.state.status = Status.FAILED`
2. Returns gracefully (no exception)
3. Downstream nodes don't execute
4. `GraphResult` is still built and yielded normally via `MultiAgentResultEvent`
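
The four steps above can be illustrated with a toy model. This is a sketch under stated assumptions, not the SDK code: `ToyState`, `execute_graph`, `stream`, and the dict-shaped events are hypothetical stand-ins, and the real `should_continue` may have a different signature and return type.

```python
import asyncio
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Stand-in for the SDK's Status enum; only the members used here.
    EXECUTING = "executing"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class ToyState:
    status: Status = Status.EXECUTING
    executed: list = field(default_factory=list)


async def execute_graph(state, batches, should_continue):
    """Toy loop: stop without raising when should_continue() is False."""
    for batch in batches:
        if not should_continue(state):
            state.status = Status.FAILED
            return  # graceful exit; later batches never run
        for node_id in batch:
            state.executed.append(node_id)
            yield {"type": "node_stop", "node_id": node_id}
    state.status = Status.COMPLETED


async def stream(state, batches, should_continue):
    """The result event is yielded whether or not the loop exited early."""
    async for event in execute_graph(state, batches, should_continue):
        yield event
    yield {"type": "result", "status": state.status}


async def collect(state, batches, should_continue):
    return [event async for event in stream(state, batches, should_continue)]


state = ToyState()
# Allow only one node execution, so the second batch triggers the early exit.
events = asyncio.run(collect(state, [["a"], ["b"], ["c"]], lambda s: len(s.executed) < 1))
```

In this model only node `a` runs, no exception propagates, and the final event still carries the `FAILED` status.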

### Implementation Details

#### 1. Modify `_execute_node` in `graph.py` (lines 864-871)

Replace the `raise RuntimeError` with graceful handling:

```python
if before_event.cancel_node:
    cancel_message = (
        before_event.cancel_node if isinstance(before_event.cancel_node, str) else "node cancelled by user"
    )
    logger.debug("reason=<%s> | cancelling execution", cancel_message)
    yield MultiAgentNodeCancelEvent(node.node_id, cancel_message)
    
    # Create NodeResult for cancelled node (similar to failure handling)
    node_result = NodeResult(
        result=cancel_message,
        execution_time=0,
        status=Status.FAILED,
        accumulated_usage=Usage(inputTokens=0, outputTokens=0, totalTokens=0),
        accumulated_metrics=Metrics(latencyMs=0),
        execution_count=1,
    )
    
    node.execution_status = Status.FAILED
    node.result = node_result
    self.state.failed_nodes.add(node)
    self.state.results[node.node_id] = node_result
    
    yield MultiAgentNodeStopEvent(node_id=node.node_id, node_result=node_result)
    return  # Graceful exit, no exception
```

#### 2. Add `failed_nodes` check in `_execute_graph` (after line 658)

The comment on lines 669-670 notes: *"a failure would throw exception and code would not make it here"*. Since we're removing the exception, execution now reaches this point after a cancellation, so add a check:

```python
async for event in self._execute_nodes_parallel(current_batch, invocation_state):
    yield event

# Check if any nodes failed (including cancelled) - stop execution gracefully
if self.state.failed_nodes:
    self.state.status = Status.FAILED
    return

if self.state.status == Status.INTERRUPTED:
    # ... existing interrupt handling
```

### Files to Modify

| File | Changes |
|------|---------|
| `src/strands/multiagent/graph.py` | Modify `_execute_node` to not raise, add `failed_nodes` check in `_execute_graph` |
| `tests/strands/multiagent/test_graph.py` | Update `test_graph_cancel_node` - remove `pytest.raises(RuntimeError)`, verify result is yielded |
| `tests_integ/hooks/multiagent/test_cancel.py` | Update `test_graph_cancel_node` - remove `pytest.raises(RuntimeError)`, verify result accessible |

### Acceptance Criteria

- [ ] Setting `cancel_node` in a `BeforeNodeCallEvent` hook does NOT raise a `RuntimeError`
- [ ] Graph execution stops gracefully when a node is cancelled
- [ ] `GraphResult` is yielded normally with `status=Status.FAILED`
- [ ] `MultiAgentNodeCancelEvent` is still emitted
- [ ] `MultiAgentNodeStopEvent` is emitted for the cancelled node
- [ ] Downstream nodes do not execute (same as `should_continue` returning `False`)
- [ ] Behavior is consistent with Swarm `cancel_node` handling
- [ ] Unit tests pass without expecting `RuntimeError`
- [ ] Integration tests pass without expecting `RuntimeError`
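
The event-ordering criteria above can be sketched with a self-contained toy model. Everything here is a hypothetical stand-in (`ToyBeforeNodeCallEvent`, `cancel_b`, `run_graph`, and the tuple-shaped events); the real tests would use the SDK's `Graph`, hook registration, and event classes instead.

```python
import asyncio
from dataclasses import dataclass
from typing import Union


@dataclass
class ToyBeforeNodeCallEvent:
    # Stand-in for BeforeNodeCallEvent; only the fields this sketch needs.
    node_id: str
    cancel_node: Union[str, bool] = False


def cancel_b(event: ToyBeforeNodeCallEvent) -> None:
    """Hypothetical hook callback that cancels node 'b'."""
    if event.node_id == "b":
        event.cancel_node = "skip b"


async def run_graph(node_ids, hook):
    """Toy sequential graph: cancellation stops the run without raising."""
    status = "completed"
    for node_id in node_ids:
        before = ToyBeforeNodeCallEvent(node_id)
        hook(before)
        if before.cancel_node:
            message = before.cancel_node if isinstance(before.cancel_node, str) else "node cancelled by user"
            yield ("cancel", node_id, message)  # cancel event is still emitted
            yield ("stop", node_id, "failed")  # stop event for the cancelled node
            status = "failed"
            break  # downstream nodes are skipped
        yield ("stop", node_id, "completed")
    yield ("result", None, status)  # result is yielded in every case


async def collect(node_ids, hook):
    return [event async for event in run_graph(node_ids, hook)]


events = asyncio.run(collect(["a", "b", "c"], cancel_b))
```

In this model, node `c` never appears in the stream and the last event is the result with a failed status, which is the shape the updated tests would assert against the real event types.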

### Breaking Change Notice

This is a **breaking change** for any code that catches `RuntimeError` during graph node cancellation. The current behavior is considered a bug since it's inconsistent with Swarm behavior and the existing `should_continue` graceful exit pattern.
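
For callers, the migration amounts to inspecting the result instead of catching an exception. A minimal sketch, assuming the cancel message ends up in the per-node result as proposed in this issue; `ToyNodeResult`, `ToyGraphResult`, and `describe` are hypothetical stand-ins rather than SDK types:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Status(Enum):
    # Stand-in for the SDK's Status enum; only the members used here.
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class ToyNodeResult:
    result: str
    status: Status


@dataclass
class ToyGraphResult:
    status: Status
    results: Dict[str, ToyNodeResult] = field(default_factory=dict)


def describe(result: ToyGraphResult) -> str:
    """Branch on the result status rather than wrapping the call in try/except RuntimeError."""
    if result.status is not Status.FAILED:
        return "graph completed"
    # Collect the messages of failed (including cancelled) nodes.
    reasons = [f"{node_id}: {r.result}" for node_id, r in result.results.items() if r.status is Status.FAILED]
    return "graph stopped early (" + "; ".join(reasons) + ")"
```

Code that previously relied on `except RuntimeError` to detect cancellation would need a check like this, since the call would now return normally.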

### Related Links
- Original PR introducing cancel_node: https://github.com/strands-agents/sdk-python/pull/1203
- Swarm interrupt docs: https://github.com/strands-agents/docs/blob/main/docs/user-guide/concepts/interrupts.md#swarm
