
Commit 12b9599

[D5] Verify provider streaming specs + docs/tests
1 parent db3307e commit 12b9599

File tree

4 files changed: +288 −45 lines changed

docs/ai/streaming_providers.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# Streaming providers (SSE): OpenAI / Anthropic / Gemini

Last verified: 2026-02-10

This project implements streaming via Server-Sent Events (SSE) and extracts incremental text via provider strategies.

## Common SSE notes

- SSE frames are separated by a blank line (`\n\n`). Each frame may include `event:` and one or more `data:` lines.
- Networks/proxies can split or coalesce frames arbitrarily at the byte level, so the client must buffer and parse by newline boundaries (not by "chunk" boundaries).

Implementation notes:

- `src/core/ai_client.py`: `_iter_sse_event_data()` parses SSE from arbitrary byte chunks and yields `(event, data)` tuples.
- Provider-specific text extraction lives in `src/core/ai_providers/*` via `extract_stream_content()`.
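As a worked illustration of that buffering rule, here is a minimal standalone sketch (the helper name `parse_sse_bytes` is hypothetical and separate from this repo's `_iter_sse_event_data()`):

```python
def parse_sse_bytes(chunks):
    """Yield (event, data) tuples from an iterable of arbitrary byte chunks."""
    buffer = b""
    event, data_lines = None, []
    for chunk in chunks:
        buffer += chunk
        # Parse only complete lines; a frame ends at a blank line.
        while (i := buffer.find(b"\n")) >= 0:
            line = buffer[:i].rstrip(b"\r").decode("utf-8", "replace")
            buffer = buffer[i + 1:]
            if line == "":
                if data_lines:
                    yield event, "\n".join(data_lines)
                event, data_lines = None, []
            elif line.startswith("event:"):
                event = line[6:].lstrip()
            elif line.startswith("data:"):
                data_lines.append(line[5:].lstrip())

# A chunk boundary may fall mid-line; the parser still reassembles the frame.
frames = list(parse_sse_bytes([b"event: ping\nda", b"ta: {}\n\n"]))
# frames == [("ping", "{}")]
```

The real implementation additionally skips `:` comment lines and flushes an unterminated trailing buffer.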
## OpenAI (Chat Completions API)

- Endpoint: `POST https://api.openai.com/v1/chat/completions`
- Headers: `Authorization: Bearer <api_key>`, `Content-Type: application/json`
- Request: `model`, `messages`, `stream: true`
  - Standard models: `max_tokens`
  - Reasoning models (as used by this app for `o1/o3/o4*`): `max_completion_tokens`
  - Optional: `stream_options: { include_usage: true }`
- Response: `text/event-stream`
  - Each SSE frame is `data: { ...json... }`
  - Stream terminates with `data: [DONE]`
- Incremental text extraction used by this app: `choices[0].delta.content`
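That extraction path can be sketched as a small helper (hypothetical name, not this repo's `extract_stream_content()` strategy):

```python
import json

def extract_openai_delta(sse_data: str) -> str:
    """Return the incremental text from one Chat Completions SSE frame."""
    if sse_data == "[DONE]":  # terminal sentinel carries no JSON payload
        return ""
    chunk = json.loads(sse_data)
    choices = chunk.get("choices") or []
    if not choices:  # e.g. a usage-only frame when include_usage is set
        return ""
    return choices[0].get("delta", {}).get("content") or ""

frame = '{"choices":[{"delta":{"content":"Hel"},"index":0}]}'
assert extract_openai_delta(frame) == "Hel"
assert extract_openai_delta("[DONE]") == ""
```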
## Anthropic (Claude Messages API)

- Endpoint: `POST https://api.anthropic.com/v1/messages`
- Headers: `x-api-key: <api_key>`, `anthropic-version: <date>` (required)
- Request: `model`, `max_tokens`, `stream: true`, `messages`; optional `system`, `temperature`, `tools`
- Response: `text/event-stream` with explicit `event:` names.
  - Typical event sequence: `message_start` → `content_block_start` → `content_block_delta` → … → `message_stop`
  - Content deltas are sent as JSON objects with a top-level `type` field, e.g. `{"type":"content_block_delta","delta":{"type":"text_delta","text":"..."}}`
- Incremental text extraction used by this app: when `type == "content_block_delta"`, use `delta.text` (ignoring non-text deltas).
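The Claude extraction rule above can likewise be sketched as a standalone helper (hypothetical name):

```python
import json

def extract_claude_text(sse_data: str) -> str:
    """Return incremental text from one Messages API SSE frame, else ''."""
    payload = json.loads(sse_data)
    if payload.get("type") != "content_block_delta":
        return ""  # message_start, ping, message_stop, etc. carry no text
    delta = payload.get("delta", {})
    # Ignore non-text deltas such as input_json_delta (tool use).
    if delta.get("type") != "text_delta":
        return ""
    return delta.get("text", "")

frame = '{"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hi"}}'
assert extract_claude_text(frame) == "Hi"
assert extract_claude_text('{"type":"message_start"}') == ""
```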
## Gemini (Google AI for Developers — Generative Language API v1beta)

- Endpoint: `POST https://generativelanguage.googleapis.com/v1beta/models/{model}:streamGenerateContent?alt=sse`
  - `alt=sse` is required to receive SSE responses.
- Auth: `x-goog-api-key: <api_key>` header (or `key=<api_key>` query parameter)
- Request: `contents` (role + parts), `generationConfig` (`temperature`, `topP`, `maxOutputTokens`, etc.)
- Response: `text/event-stream` with `data: { ...json... }` frames (no `[DONE]` sentinel; the stream ends when the server closes the connection).
- Incremental text extraction used by this app: `candidates[0].content.parts[].text` (ignoring parts marked `thought: true`).
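And the Gemini extraction path, including the `thought` filter, as a hedged sketch (hypothetical helper name):

```python
import json

def extract_gemini_text(sse_data: str) -> str:
    """Concatenate text parts from one streamGenerateContent SSE frame."""
    payload = json.loads(sse_data)
    candidates = payload.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts") or []
    # Skip "thought" parts emitted by thinking-capable models.
    return "".join(p.get("text", "") for p in parts if not p.get("thought"))

frame = '{"candidates":[{"content":{"parts":[{"text":"Hello"},{"thought":true,"text":"..."}]}}]}'
assert extract_gemini_text(frame) == "Hello"
```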
## References (official)

- OpenAI — Chat Completions API: https://platform.openai.com/docs/api-reference/chat/create-chat-completion
- OpenAI — Streaming responses (Responses API guide): https://platform.openai.com/docs/guides/streaming-responses
- Anthropic — Streaming: https://docs.anthropic.com/en/docs/build-with-claude/streaming
- Google AI for Developers — `models.streamGenerateContent` (SSE example): https://ai.google.dev/api/rest/v1beta/models/streamGenerateContent
Lines changed: 5 additions & 5 deletions
@@ -1,6 +1,6 @@
 ID,Title,Description,Acceptance,Test_Method,Tools,Dev_Status,Review1_Status,Regression_Status,Files,Dependencies,Notes
-D1,Refactor Ghost Text preview to non-destructive (fix streaming overwrite/caret),"Replace default document-mutating Ghost Text preview with a non-destructive overlay (DeepIntegratedGhostText). Add editor paint overlay rendering and make accept/cancel stable. Ensure streaming chunk rendering never modifies document text or moves caret; clear preview on doc switch/save. Add pytest-qt coverage for overwrite/caret chaos regressions.","Streaming/AI preview does not change editor.toPlainText() until user accepts; caret stays stable during streaming; Tab accept inserts completion once at correct position; Esc cancels and stops updates; stale chunks ignored; tests pass.",".tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_streaming_ghost_no_overwrite.py",none,DONE,DONE,TODO,"src/gui/editor/text_editor.py | src/gui/editor/deep_integrated_ghost_text.py | src/gui/editor/optimal_ghost_text.py | src/gui/editor/smart_completion_manager.py | tests/test_streaming_ghost_no_overwrite.py",none,"user_report: streaming overwrites existing text + caret chaos | test:.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_streaming_ghost_no_overwrite.py PASS | done_at:2026-02-10"
-D2,Scratch autosave/restore + flush on close,"Persist unbound scratch editor content to a recovery file on autosave and on app close; restore it on startup when no project is restored. Also flush pending autosave for project-backed editors on app close so quick exits don’t lose content. Add tests + manual restart checklist.", "Typing in scratch tab persists after restart; closing app quickly still saves latest edits (scratch + project docs); tests pass (or limited validation recorded with risk).",".tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_scratch_autosave_restore.py",none,DONE,DONE,TODO,"src/gui/editor/editor_panel.py | src/gui/editor/text_editor.py | src/gui/main_window_parts/state.py | src/main.py | src/core/config.py | tests/test_scratch_autosave_restore.py",none,"manual_checklist: 1) Start app, type in scratch tab, close immediately. 2) Reopen; expect content restored. 3) Open project, edit scene, close immediately; reopen project; expect content present. | test:.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_scratch_autosave_restore.py PASS | done_at:2026-02-10"
-D3,Bottom status bar separator/overlap cleanup,"Reduce stacked separator lines and clipping in EnhancedStatusBar. Use a single separator strategy (either borders or VLines) with consistent heights/margins and remove duplicate/overlapping lines. Update status bar layout test accordingly.","Bottom bar shows a single clean row; no stacked/thick separator lines; labels are not clipped; test passes.",".tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_status_bar_layout.py",none,DONE,DONE,TODO,"src/gui/status/status_bar.py | tests/test_status_bar_layout.py",none,"user_report: bottom bar lines stacked/overlapping | test:.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_status_bar_layout.py PASS | done_at:2026-02-10"
-D4,"Max tokens UI supports >=1,000,000","Ensure the max-tokens input in the AI config center accepts very large values (>=1,000,000) without clamping/truncating as soon as the first digit is typed; keep config save/load consistent. Add a pytest-qt regression test.","Max tokens can be set to 1,000,000+ and persists after reopening the AI config center; regression test passes.",".tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_ai_config_max_tokens_range.py",none,DONE,DONE,TODO,"src/gui/ai/unified_ai_config_dialog.py | tests/test_ai_config_max_tokens_range.py",none,"user_report: max_tokens max=3999 and typing '4' clamps; verify and prevent regression | test:.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_ai_config_max_tokens_range.py PASS | done_at:2026-02-10"
-D5,Verify provider streaming specs + docs/tests,"Use official docs (OpenAI/Anthropic/Gemini) to verify streaming endpoints/headers/params and SSE parsing requirements; adjust provider strategies and AIClient stream parsing if needed; add a short doc summary and tests for OpenAI+Claude streaming fixtures.","Provider streaming endpoints/params match official docs as of 2026-02-10; OpenAI/Claude streaming extraction tests pass; doc added.",".tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_provider_streaming_openai_claude.py",manual,TODO,TODO,TODO,"src/core/ai_client.py | src/core/ai_providers/openai.py | src/core/ai_providers/claude.py | src/core/ai_providers/gemini.py | docs/ai/streaming_providers.md | tests/test_provider_streaming_openai_claude.py",none,"requires web research for up-to-date streaming specs; keep tests offline via fixtures"
+D1,Refactor Ghost Text preview to non-destructive (fix streaming overwrite/caret),Replace default document-mutating Ghost Text preview with a non-destructive overlay (DeepIntegratedGhostText). Add editor paint overlay rendering and make accept/cancel stable. Ensure streaming chunk rendering never modifies document text or moves caret; clear preview on doc switch/save. Add pytest-qt coverage for overwrite/caret chaos regressions.,Streaming/AI preview does not change editor.toPlainText() until user accepts; caret stays stable during streaming; Tab accept inserts completion once at correct position; Esc cancels and stops updates; stale chunks ignored; tests pass.,.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_streaming_ghost_no_overwrite.py,none,DONE,DONE,TODO,src/gui/editor/text_editor.py | src/gui/editor/deep_integrated_ghost_text.py | src/gui/editor/optimal_ghost_text.py | src/gui/editor/smart_completion_manager.py | tests/test_streaming_ghost_no_overwrite.py,none,user_report: streaming overwrites existing text + caret chaos | test:.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_streaming_ghost_no_overwrite.py PASS | done_at:2026-02-10
+D2,Scratch autosave/restore + flush on close,Persist unbound scratch editor content to a recovery file on autosave and on app close; restore it on startup when no project is restored. Also flush pending autosave for project-backed editors on app close so quick exits don’t lose content. Add tests + manual restart checklist.," ""Typing in scratch tab persists after restart; closing app quickly still saves latest edits (scratch + project docs); tests pass (or limited validation recorded with risk).""",.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_scratch_autosave_restore.py,none,DONE,DONE,TODO,src/gui/editor/editor_panel.py | src/gui/editor/text_editor.py | src/gui/main_window_parts/state.py | src/main.py | src/core/config.py | tests/test_scratch_autosave_restore.py,none,"manual_checklist: 1) Start app, type in scratch tab, close immediately. 2) Reopen; expect content restored. 3) Open project, edit scene, close immediately; reopen project; expect content present. | test:.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_scratch_autosave_restore.py PASS | done_at:2026-02-10"
+D3,Bottom status bar separator/overlap cleanup,Reduce stacked separator lines and clipping in EnhancedStatusBar. Use a single separator strategy (either borders or VLines) with consistent heights/margins and remove duplicate/overlapping lines. Update status bar layout test accordingly.,Bottom bar shows a single clean row; no stacked/thick separator lines; labels are not clipped; test passes.,.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_status_bar_layout.py,none,DONE,DONE,TODO,src/gui/status/status_bar.py | tests/test_status_bar_layout.py,none,user_report: bottom bar lines stacked/overlapping | test:.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_status_bar_layout.py PASS | done_at:2026-02-10
+D4,"Max tokens UI supports >=1,000,000","Ensure the max-tokens input in the AI config center accepts very large values (>=1,000,000) without clamping/truncating as soon as the first digit is typed; keep config save/load consistent. Add a pytest-qt regression test.","Max tokens can be set to 1,000,000+ and persists after reopening the AI config center; regression test passes.",.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_ai_config_max_tokens_range.py,none,DONE,DONE,TODO,src/gui/ai/unified_ai_config_dialog.py | tests/test_ai_config_max_tokens_range.py,none,user_report: max_tokens max=3999 and typing '4' clamps; verify and prevent regression | test:.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_ai_config_max_tokens_range.py PASS | done_at:2026-02-10
+D5,Verify provider streaming specs + docs/tests,Use official docs (OpenAI/Anthropic/Gemini) to verify streaming endpoints/headers/params and SSE parsing requirements; adjust provider strategies and AIClient stream parsing if needed; add a short doc summary and tests for OpenAI+Claude streaming fixtures.,Provider streaming endpoints/params match official docs as of 2026-02-10; OpenAI/Claude streaming extraction tests pass; doc added.,.tmp\\ane0305-venv-311\\Scripts\\python.exe -m pytest -q tests/test_provider_streaming_openai_claude.py,manual,DONE,DONE,TODO,src/core/ai_client.py | src/core/ai_providers/openai.py | src/core/ai_providers/claude.py | src/core/ai_providers/gemini.py | docs/ai/streaming_providers.md | tests/test_provider_streaming_openai_claude.py,none,requires web research for up-to-date streaming specs; keep tests offline via fixtures | test:.tmp\ane0305-venv-311\Scripts\python.exe -m pytest -q tests/test_provider_streaming_openai_claude.py PASS | doc:docs/ai/streaming_providers.md | done_at:2026-02-10

src/core/ai_client.py

Lines changed: 78 additions & 40 deletions
@@ -214,6 +214,68 @@ def _get_stream_endpoint_url(self) -> str:
             return self._provider_strategy.get_endpoint_url()
         except Exception as e:
             raise AIClientError(str(e))
+
+    async def _iter_sse_event_data(self, byte_iter) -> AsyncGenerator[tuple[str | None, str], None]:
+        """Parse SSE stream and yield (event, data) tuples.
+
+        Notes:
+        - aiohttp may yield arbitrary byte chunks, not line-delimited data.
+        - This parser tolerates chunk boundaries splitting SSE lines and multiple SSE events per chunk.
+        - For non-SSE servers that return raw JSON lines, it will yield them as (None, line).
+        """
+        buffer = b""
+        current_event: str | None = None
+        data_lines: List[str] = []
+
+        async for chunk in byte_iter:
+            if not chunk:
+                continue
+            if not isinstance(chunk, (bytes, bytearray)):
+                continue
+
+            buffer += bytes(chunk)
+            while True:
+                newline_index = buffer.find(b"\n")
+                if newline_index < 0:
+                    break
+                line_bytes = buffer[:newline_index]
+                buffer = buffer[newline_index + 1 :]
+
+                if line_bytes.endswith(b"\r"):
+                    line_bytes = line_bytes[:-1]
+
+                line = line_bytes.decode("utf-8", errors="replace")
+                if line == "":
+                    if data_lines:
+                        yield current_event, "\n".join(data_lines)
+                    current_event = None
+                    data_lines = []
+                    continue
+
+                if line.startswith(":"):
+                    continue
+                if line.startswith("event:"):
+                    current_event = line[6:].lstrip() or None
+                    continue
+                if line.startswith("data:"):
+                    data_lines.append(line[5:].lstrip())
+                    continue
+
+                # Fallback: some providers/proxies may return JSON lines without SSE framing.
+                stripped = line.strip()
+                if stripped.startswith("{") or stripped.startswith("["):
+                    yield None, stripped
+
+        # Flush any remaining buffered content.
+        if buffer:
+            tail = buffer.decode("utf-8", errors="replace").strip()
+            if tail.startswith("data:"):
+                data_lines.append(tail[5:].lstrip())
+            elif tail.startswith("{") or tail.startswith("["):
+                yield None, tail
+
+        if data_lines:
+            yield current_event, "\n".join(data_lines)
 
     def _build_messages(self, prompt: Union[str, List[MultimodalMessage]], system_prompt: Optional[str] = None) -> List[Dict[str, Any]]:
         """Build the message list - supports multimodal content"""
@@ -747,31 +809,17 @@ async def complete_multimodal_stream(self, messages: List[MultimodalMessage], sy
                 headers=headers,
                 json=data,
                 timeout=aiohttp.ClientTimeout(total=self.config.timeout)
-                ) as response:
+            ) as response:
                 if response.status != 200:
                     error_text = await response.text()
                     self._raise_http_error(response.status, error_text)
 
                 self.logger.debug("Start receiving multimodal stream data")
 
-                async for line in response.content:
-                    if not line:
-                        continue
-
-                    line_str = line.decode("utf-8", errors="replace").strip()
-                    if not line_str:
-                        continue
-
-                    # Handle Server-Sent Events framing
-                    if line_str.startswith("event:"):
-                        continue
-
-                    data_str: str | None = None
-                    if line_str.startswith("data:"):
-                        data_str = line_str[5:].lstrip()
-                    elif line_str.startswith("{") or line_str.startswith("["):
-                        data_str = line_str
-
+                async for event_name, data_str in self._iter_sse_event_data(
+                    response.content.iter_any()
+                ):
+                    data_str = (data_str or "").strip()
                     if not data_str:
                         continue
                     if data_str == "[DONE]":
@@ -781,7 +829,9 @@ async def complete_multimodal_stream(self, messages: List[MultimodalMessage], sy
                     try:
                         chunk_data = json.loads(data_str)
                     except json.JSONDecodeError as e:
-                        self.logger.warning(f"Failed to parse multimodal stream data: {e}, data: {data_str}")
+                        self.logger.warning(
+                            f"Failed to parse multimodal stream data: {e}, event: {event_name}, data: {data_str}"
+                        )
                         continue
 
                     content = self._extract_stream_content(chunk_data)
@@ -926,31 +976,17 @@ async def complete_stream(self, prompt: str, system_prompt: Optional[str] = None
                 headers=headers,
                 json=data,
                 timeout=aiohttp.ClientTimeout(total=self.config.timeout)
-                ) as response:
+            ) as response:
                 if response.status != 200:
                     error_text = await response.text()
                     self._raise_http_error(response.status, error_text)
 
                 self.logger.debug("Start receiving stream data")
 
-                async for line in response.content:
-                    if not line:
-                        continue
-
-                    line_str = line.decode("utf-8", errors="replace").strip()
-                    if not line_str:
-                        continue
-
-                    # Handle Server-Sent Events framing
-                    if line_str.startswith("event:"):
-                        continue
-
-                    data_str: str | None = None
-                    if line_str.startswith("data:"):
-                        data_str = line_str[5:].lstrip()
-                    elif line_str.startswith("{") or line_str.startswith("["):
-                        data_str = line_str
-
+                async for event_name, data_str in self._iter_sse_event_data(
+                    response.content.iter_any()
+                ):
+                    data_str = (data_str or "").strip()
                     if not data_str:
                         continue
                     if data_str == "[DONE]":
@@ -960,7 +996,9 @@ async def complete_stream(self, prompt: str, system_prompt: Optional[str] = None
                     try:
                         chunk_data = json.loads(data_str)
                     except json.JSONDecodeError as e:
-                        self.logger.warning(f"Failed to parse stream data: {e}, data: {data_str}")
+                        self.logger.warning(
+                            f"Failed to parse stream data: {e}, event: {event_name}, data: {data_str}"
+                        )
                         continue
 
                     content = self._extract_stream_content(chunk_data)
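Both rewritten loops consume `(event, data)` pairs from the new parser. An offline fixture test of the kind this task calls for might replay a transcript split at hostile byte boundaries; a simplified sketch follows, where `iter_sse` is a stand-alone stand-in for the `AIClient` method (the real one also tracks `event:` names and multi-line `data:`):

```python
import asyncio
import json

async def fake_stream():
    # An OpenAI-style transcript split mid-line, as a proxy might deliver it.
    for chunk in (b'data: {"choices":[{"delta":{"con',
                  b'tent":"Hi"}}]}\n\ndata: [DONE]\n\n'):
        yield chunk

async def iter_sse(byte_iter):
    # Minimal stand-in: buffer bytes, yield (event, data) per complete data line.
    buffer = b""
    async for chunk in byte_iter:
        buffer += chunk
        while (i := buffer.find(b"\n")) >= 0:
            line, buffer = buffer[:i].decode("utf-8", "replace"), buffer[i + 1:]
            if line.startswith("data:"):
                yield None, line[5:].lstrip()

async def main():
    texts = []
    async for _event, data in iter_sse(fake_stream()):
        if data == "[DONE]":
            break
        texts.append(json.loads(data)["choices"][0]["delta"]["content"])
    return "".join(texts)

print(asyncio.run(main()))  # Hi
```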

0 commit comments
