[Feature Request] Real-time Streaming Events During Context Reduction #1511

@tahitimoon

Description

Problem Statement

When using SummarizingConversationManager with stream_async(), there's no way to notify users in real-time when context compression is happening.

Current Behavior

# In agent.py (lines 716-726)
except ContextWindowOverflowException as e:
    self.conversation_manager.reduce_context(self, e=e)  # Blocks for 30+ seconds
    # ... retry event loop

The reduce_context() call is synchronous and blocking. During this time:

  • No events are yielded to the async iterator
  • The event loop is blocked
  • Users see a frozen UI with no feedback

Expected Behavior

Users should receive streaming events during context compression:

  • conversation_compacting - when compression starts
  • conversation_compacted - when compression completes

This allows frontend applications to show "Compacting conversation..." feedback to users during the 30+ second wait.
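A consumer-side sketch of what this could look like. Everything below is illustrative: `fake_stream()` stands in for `Agent.stream_async()`, and the event key names follow this proposal rather than any existing SDK API.

```python
import asyncio

# Stand-in for Agent.stream_async(): yields normal data chunks, plus the two
# proposed compaction events around a (simulated) context reduction.
async def fake_stream():
    yield {"data": "partial answer..."}
    yield {"conversation_compacting": True, "message_count": 47}
    await asyncio.sleep(0)  # the 30+ second summarization would happen here
    yield {"conversation_compacted": True, "message_count": 25}
    yield {"data": "final answer"}

async def consume():
    lines = []
    async for event in fake_stream():
        if "conversation_compacting" in event:
            lines.append("Compacting conversation...")
        elif "conversation_compacted" in event:
            lines.append(f"Compaction complete ({event['message_count']} messages)")
        else:
            lines.append(event["data"])
    return lines

print(asyncio.run(consume()))
```

With events interleaved into the stream, the frontend can render status lines in the same loop it already uses for token chunks.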

Proposed Solutions

Option 1: Hook-based Events (Minimal Change)

Add new hook events for context reduction:

# New hook events (sketch)
from dataclasses import dataclass

@dataclass
class BeforeContextReductionEvent:
    agent: Agent
    exception: ContextWindowOverflowException

@dataclass
class AfterContextReductionEvent:
    agent: Agent
    original_message_count: int
    compressed_message_count: int
    removed_count: int

This allows users to subscribe to these events via HookProvider.
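A minimal, self-contained sketch of the hook pattern. The `HookRegistry` class, its `add_callback`/`invoke` methods, and the trimmed event fields are all illustrative stand-ins, not strands' actual `HookProvider` API.

```python
from dataclasses import dataclass

# Simplified stand-ins for the proposed events (fields trimmed for brevity)
@dataclass
class BeforeContextReductionEvent:
    message_count: int

@dataclass
class AfterContextReductionEvent:
    original_message_count: int
    compressed_message_count: int

class HookRegistry:
    """Dispatches events to callbacks registered per event type."""
    def __init__(self):
        self._callbacks = {}

    def add_callback(self, event_type, fn):
        self._callbacks.setdefault(event_type, []).append(fn)

    def invoke(self, event):
        for fn in self._callbacks.get(type(event), []):
            fn(event)

# A subscriber that turns the events into user-facing status lines
registry = HookRegistry()
status = []
registry.add_callback(BeforeContextReductionEvent,
                      lambda e: status.append(f"Compacting {e.message_count} messages..."))
registry.add_callback(AfterContextReductionEvent,
                      lambda e: status.append(f"Compacted to {e.compressed_message_count} messages"))

registry.invoke(BeforeContextReductionEvent(message_count=47))
registry.invoke(AfterContextReductionEvent(original_message_count=47,
                                           compressed_message_count=25))
print(status)
```

Note that callbacks fired this way still run on the blocked thread; hooks solve *notification*, but pairing them with Option 3 would be needed to actually deliver the events mid-reduction.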

Option 2: Yield Events During Exception Handling

Modify _execute_event_loop_cycle to yield events during context reduction:

except ContextWindowOverflowException as e:
    # Yield event before reduction
    yield {"context_reduction": "starting", "message_count": len(self.messages)}
    
    self.conversation_manager.reduce_context(self, e=e)
    
    # Yield event after reduction
    yield {"context_reduction": "completed", "new_message_count": len(self.messages)}
    
    # Retry
    async for event in self._execute_event_loop_cycle(...):
        yield event
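
The yield-around-reduction flow above can be sketched as a self-contained async generator. Names and event shapes are illustrative, and `del messages[:-2]` stands in for the real `reduce_context()`:

```python
import asyncio

async def event_loop_cycle(messages):
    # Stand-in for catching ContextWindowOverflowException: treat more
    # than 3 messages as an overflow that needs reduction.
    if len(messages) > 3:
        yield {"context_reduction": "starting", "message_count": len(messages)}
        del messages[:-2]  # stand-in for reduce_context()
        yield {"context_reduction": "completed", "new_message_count": len(messages)}
        async for event in event_loop_cycle(messages):  # retry
            yield event
        return
    yield {"data": "model response"}

async def run():
    messages = ["m1", "m2", "m3", "m4", "m5"]
    return [event async for event in event_loop_cycle(messages)]

print(asyncio.run(run()))
```

The caller sees the "starting" and "completed" events in order, followed by the retried cycle's output, all through the one async iterator it was already consuming.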

Option 3: Async reduce_context

Make reduce_context async and run summarization in a thread pool:

async def reduce_context_async(self, agent: Agent, e: Exception | None = None):
    # get_running_loop() is preferred over the deprecated get_event_loop()
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, super().reduce_context, agent, e)

This would unblock the event loop, but requires significant API changes.
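A self-contained demonstration that `run_in_executor` keeps the loop responsive; the 0.2 s `time.sleep` stands in for the 30+ second summarization call, and the heartbeat task stands in for the events that should keep streaming.

```python
import asyncio
import time

def blocking_reduce():
    time.sleep(0.2)  # stand-in for the 30+ second summarization LLM call
    return "summary"

async def heartbeat(ticks):
    # Simulates streaming events that should keep flowing during reduction
    while True:
        ticks.append("tick")
        await asyncio.sleep(0.05)

async def main():
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    loop = asyncio.get_running_loop()
    # The blocking call runs in a worker thread; the event loop keeps ticking
    result = await loop.run_in_executor(None, blocking_reduce)
    hb.cancel()
    return result, len(ticks)

result, tick_count = asyncio.run(main())
print(result, tick_count)  # ticks accumulated while the reduction ran
```

Because the heartbeat keeps firing while the reduction runs, intermediate events (like `conversation_compacting`) could be yielded on time instead of after the fact.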

Environment

  • Python 3.13
  • strands-agents SDK 1.22.0
  • strands-agents-tools 0.2.19

Related Code

class NotifyingSummarizingConversationManager(SummarizingConversationManager):
    """Custom manager that emits events - but they can't be consumed during blocking."""

    def __init__(self, event_queue: asyncio.Queue, **kwargs):
        super().__init__(**kwargs)
        self._event_queue = event_queue

    def reduce_context(self, agent, e=None, **kwargs):
        self._event_queue.put_nowait({"type": "conversation_compacting"})
        super().reduce_context(agent, e=e, **kwargs)  # Blocks 30s+
        self._event_queue.put_nowait({"type": "conversation_compacted"})
        # Problem: both events are only consumed AFTER the blocking call returns
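This failure mode can be reproduced without the SDK: an event queued with `put_nowait()` before a blocking call is only consumed after the call returns, because the consumer coroutine cannot be scheduled while the loop's thread is blocked. In the sketch below, the 0.2 s `time.sleep` stands in for the summarization.

```python
import asyncio
import time

async def demo():
    queue = asyncio.Queue()
    consumed = []  # (event, seconds since start when it was consumed)
    start = time.monotonic()

    async def consumer():
        while True:
            event = await queue.get()
            consumed.append((event, time.monotonic() - start))

    task = asyncio.create_task(consumer())
    await asyncio.sleep(0)  # let the consumer start waiting on the queue

    queue.put_nowait("conversation_compacting")
    time.sleep(0.2)  # blocking reduce_context(): the whole event loop is frozen
    queue.put_nowait("conversation_compacted")

    await asyncio.sleep(0.01)  # only now can the consumer run and drain the queue
    task.cancel()
    return consumed

events = asyncio.run(demo())
print(events)  # both events arrive only after the blocking call ends
```

Both events are timestamped after the 0.2 s block, which is exactly why the queue-based workaround cannot show "Compacting conversation..." while compaction is in progress.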

Use Case

Scenario: Financial Research Assistant with Long Conversations
We are building a financial research assistant that helps users analyze stocks, retrieve financial data, and generate investment insights. The conversations often become very long because:

  1. Multiple tool calls: Each query may trigger 5-10 tool calls (fetching income statements, analyst estimates, news, etc.)
  2. Rich context: Users ask follow-up questions that require understanding previous analysis
  3. Extended sessions: A single research session can last 20+ minutes with dozens of messages

The Problem:
When context overflow occurs and SummarizingConversationManager.reduce_context() is triggered:

  • The summarization process takes 30+ seconds (calling the LLM to generate a summary)
  • During this time, the UI appears frozen with no feedback
  • Users may think the application crashed and refresh the page

What We Need:
A way to emit streaming events during context reduction so we can:

  • Show "Compacting conversation..." message when reduction starts
  • Display "Compaction complete (47 → 25 messages)" when it finishes

Impact:
This would significantly improve UX for any application using SummarizingConversationManager with stream_async(), especially chatbot interfaces where users expect real-time feedback.


Metadata

Labels: enhancement (New feature or request)