Chat Completions streaming fallback tool calls can reuse output_index #3104

@Aphroq

Description

Describe the bug

Chat Completions streaming can emit duplicate output_index values for multiple function calls that never enter the real-time streaming-start path.

In src/agents/models/chatcmpl_stream_handler.py, the fallback branch for non-streamed function calls recomputes fallback_starting_index for each call. It offsets text/refusal/reasoning items and already-started streaming function calls, but it does not offset prior fallback function calls emitted in the same loop. As a result, two fallback function calls can both emit response.output_item.added, response.function_call_arguments.delta, and response.output_item.done with the same output_index.

This makes stream consumers that reconcile items by output_index unable to distinguish fallback tool calls reliably.
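To illustrate why this matters downstream (this is not SDK code, just a minimal stand-in for any consumer that keys items by output_index), a dict-based reconciler silently drops one of the two tool calls when both arrive with the same index:

```python
# Minimal illustration of the failure mode: a consumer that reconciles
# output items by output_index overwrites the first fallback tool call
# with the second one when the indexes collide.
def reconcile(events: list[tuple[str, int, str]]) -> dict[int, str]:
    """Collect added output items keyed by output_index."""
    items: dict[int, str] = {}
    for event_type, output_index, name in events:
        if event_type == "response.output_item.added":
            items[output_index] = name  # a duplicate index clobbers the prior entry
    return items


# With the bug, both fallback calls are emitted with output_index 0,
# so first_tool is lost:
buggy = reconcile(
    [
        ("response.output_item.added", 0, "first_tool"),
        ("response.output_item.added", 0, "second_tool"),
    ]
)
print(buggy)  # {0: 'second_tool'} -- only one of the two calls survives
```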

Debug information

  • Agents SDK version: main at 3854c124cb8e3e51fb660f5714405ee39ee86c5e
  • Python version: Python 3.12

Repro steps

Minimal reproducer:

from collections.abc import AsyncIterator

import pytest
from openai.types.chat.chat_completion_chunk import (
    ChatCompletionChunk,
    Choice,
    ChoiceDelta,
    ChoiceDeltaToolCall,
    ChoiceDeltaToolCallFunction,
)
from openai.types.completion_usage import CompletionUsage
from openai.types.responses import Response

from agents.model_settings import ModelSettings
from agents.models.interface import ModelTracing
from agents.models.openai_chatcompletions import OpenAIChatCompletionsModel
from agents.models.openai_provider import OpenAIProvider


@pytest.mark.asyncio
async def test_fallback_tool_call_indexes(monkeypatch):
    chunks = [
        ChatCompletionChunk(
            id="chunk-id",
            created=1,
            model="fake",
            object="chat.completion.chunk",
            choices=[
                Choice(
                    index=0,
                    delta=ChoiceDelta(
                        tool_calls=[
                            ChoiceDeltaToolCall(
                                index=0,
                                function=ChoiceDeltaToolCallFunction(
                                    name="first_tool",
                                    arguments='{"a": 1}',
                                ),
                                type="function",
                            )
                        ]
                    ),
                )
            ],
        ),
        ChatCompletionChunk(
            id="chunk-id",
            created=1,
            model="fake",
            object="chat.completion.chunk",
            choices=[
                Choice(
                    index=0,
                    delta=ChoiceDelta(
                        tool_calls=[
                            ChoiceDeltaToolCall(
                                index=1,
                                function=ChoiceDeltaToolCallFunction(
                                    name="second_tool",
                                    arguments='{"b": 2}',
                                ),
                                type="function",
                            )
                        ]
                    ),
                )
            ],
            usage=CompletionUsage(completion_tokens=1, prompt_tokens=1, total_tokens=2),
        ),
    ]

    async def fake_stream() -> AsyncIterator[ChatCompletionChunk]:
        for chunk in chunks:
            yield chunk

    async def patched_fetch_response(self, *args, **kwargs):
        response = Response(
            id="resp-id",
            created_at=0,
            model="fake-model",
            object="response",
            output=[],
            tool_choice="none",
            tools=[],
            parallel_tool_calls=False,
        )
        return response, fake_stream()

    monkeypatch.setattr(OpenAIChatCompletionsModel, "_fetch_response", patched_fetch_response)

    model = OpenAIProvider(use_responses=False).get_model("gpt-4")
    events = []
    async for event in model.stream_response(
        system_instructions=None,
        input="",
        model_settings=ModelSettings(),
        tools=[],
        output_schema=None,
        handoffs=[],
        tracing=ModelTracing.DISABLED,
        previous_response_id=None,
        conversation_id=None,
        prompt=None,
    ):
        events.append(event)

    added_indexes = [
        event.output_index for event in events if event.type == "response.output_item.added"
    ]
    done_indexes = [
        event.output_index for event in events if event.type == "response.output_item.done"
    ]

    print("added_indexes=", added_indexes)
    print("done_indexes=", done_indexes)

Current output on main:

added_indexes= [0, 0]
done_indexes= [0, 0]

Expected behavior

Each fallback function call should receive a unique output_index.

Expected output:

added_indexes= [0, 1]
done_indexes= [0, 1]

The fallback finalization loop should maintain a fallback-emitted count and increment the index after each non-streamed function call.
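As a rough sketch of that fix (simplified stand-ins, not the actual code in chatcmpl_stream_handler.py; `num_preceding_items` here represents whatever `fallback_starting_index` already accounts for):

```python
# Hypothetical sketch: assign each non-streamed (fallback) function call a
# unique output_index by also counting fallback calls emitted so far,
# instead of recomputing the same starting index for every call.
def assign_fallback_indexes(num_preceding_items: int, fallback_calls: list[str]) -> list[int]:
    indexes: list[int] = []
    fallback_emitted = 0  # fallback calls already emitted in this loop
    for _call in fallback_calls:
        # Offset by prior output items AND by prior fallback calls.
        indexes.append(num_preceding_items + fallback_emitted)
        fallback_emitted += 1
    return indexes


print(assign_fallback_indexes(0, ["first_tool", "second_tool"]))  # [0, 1]
```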
