
Streaming Response + Progress Indicators for Long Docs #25

@hoangsonww

Description


Summary

Enable streaming of AI responses (partial tokens over SSE/WebSockets) for faster perceived speed when querying large PDFs. Add progress indicators for document ingestion and query execution.


Why

  • Current UX feels “stalled” on large documents (users wait with no feedback).
  • Token streaming makes answers feel instantaneous.
  • Progress indicators build trust for heavy PDFs (e.g. “Parsing 200 pages…”).

Scope

  1. Backend

    • Add SSE endpoint for AI completions.
    • Stream tokens from provider (OpenAI / local LLM) to client.
    • Add progress events for doc ingestion (chunking, embedding, indexing).
    • Update query pipeline to emit checkpoints: retrieval start, N chunks retrieved, response start.
  2. Frontend

    • Update React hooks to handle EventSource/WebSocket streaming.
    • Render the live token stream (ChatGPT-style).
    • Add progress UI:

      • “Uploading PDF” → percent.
      • “Embedding & indexing” → percent or step counter.
      • “Fetching context…” → spinner.
      • Then the live streamed answer.
    • Provide a cancel/stop button.

  3. Infra

    • SSE route under /api/query/stream.
    • Ensure Nginx/Next.js proxy passes streaming responses.
    • Handle disconnect/resume gracefully.
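
The backend, frontend, and infra pieces above all hinge on agreeing on an event format. A minimal sketch of that framing — the `StreamEvent` union and `formatSSE` helper are illustrative names, not an existing API:

```typescript
// Shape of events the stream endpoint could emit (names are illustrative).
type StreamEvent =
  | { type: "progress"; stage: "chunking" | "embedding" | "indexing" | "retrieval"; done: number; total: number }
  | { type: "token"; value: string }
  | { type: "done" };

// Serialize one event as an SSE frame: a `data:` line followed by a blank line.
function formatSSE(event: StreamEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Example: a progress checkpoint emitted mid-embedding.
formatSSE({ type: "progress", stage: "embedding", done: 40, total: 200 });
// -> 'data: {"type":"progress","stage":"embedding","done":40,"total":200}\n\n'
```

Keeping every checkpoint (ingestion progress, retrieval, tokens, done) in one tagged union lets the client switch on `type` with exhaustiveness checking.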

Acceptance Criteria

  • Querying a large doc (>100 pages) → user sees immediate progress (upload → index → retrieve).
  • AI responses stream word-by-word with no blank screen delay.
  • Cancelling query works mid-stream.
  • Works across Chrome/Edge/Firefox/Safari.
  • No regression for small/fast queries.

Example API (SSE)

// /api/query/stream.ts (Next.js API route, Pages Router)
import type { NextApiRequest, NextApiResponse } from "next";

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache, no-transform");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  // Stop streaming if the client disconnects (e.g. the cancel button).
  let closed = false;
  req.on("close", () => { closed = true; });

  // `ai` is the provider wrapper (see Backend tasks).
  const stream = ai.stream({ prompt: req.body.prompt, docId: req.body.docId });
  for await (const chunk of stream) {
    if (closed) break;
    res.write(`data: ${JSON.stringify({ type: "token", value: chunk })}\n\n`);
  }
  res.write(`data: ${JSON.stringify({ type: "done" })}\n\n`);
  res.end();
}

Tasks

Backend

  • Add SSE/WS endpoints for query streaming.
  • Add progress events to ingestion + retrieval pipeline.
  • Update OpenAI wrapper to forward token stream.
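
For the wrapper task, one way to keep it provider-agnostic is to forward any token stream as an async generator. `forwardTokens` and its `provider` shape are assumptions standing in for the OpenAI/local-LLM wrapper, not existing code:

```typescript
// Forward a provider's streamed completion as an async generator of tokens.
// `provider` is anything with a stream() yielding string chunks — an assumed
// interface standing in for the OpenAI / local-LLM wrapper.
async function* forwardTokens(
  provider: { stream: (prompt: string) => AsyncIterable<string> },
  prompt: string
): AsyncGenerator<string> {
  for await (const chunk of provider.stream(prompt)) {
    if (chunk.length > 0) yield chunk; // skip empty deltas some providers emit
  }
}
```

The SSE handler can then `for await` over `forwardTokens(...)` regardless of which backend produced the tokens.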

Frontend

  • Add useStreamQuery() hook (handles SSE).
  • Implement live token rendering.
  • Add progress bars for ingestion + retrieval.
  • Add cancel/stop button.
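
The framework-free core of a `useStreamQuery()` hook is an incremental SSE parser; the hook would feed it raw chunks from `fetch` and re-render on each result. A sketch, assuming the event shapes from the backend example:

```typescript
// Incremental SSE parser a useStreamQuery() hook could wrap (sketch; the
// `token`/`done` event shapes mirror the backend example and are assumptions).
type Parsed = { tokens: string[]; done: boolean };

function createSSEParser() {
  let buffer = "";
  const state: Parsed = { tokens: [], done: false };
  return {
    // Feed raw text chunks as they arrive; returns everything parsed so far.
    feed(chunk: string): Parsed {
      buffer += chunk;
      // SSE frames are separated by a blank line.
      const frames = buffer.split("\n\n");
      buffer = frames.pop() ?? ""; // keep any incomplete trailing frame
      for (const frame of frames) {
        const line = frame.split("\n").find((l) => l.startsWith("data: "));
        if (!line) continue;
        const event = JSON.parse(line.slice("data: ".length));
        if (event.type === "token") state.tokens.push(event.value);
        if (event.type === "done") state.done = true;
      }
      return state;
    },
  };
}
```

For the cancel button, the hook would pair this with an `AbortController` passed to `fetch`, so aborting tears down the connection mid-stream.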

Infra

  • Update Next.js API route config for streaming.
  • Verify proxy/server supports chunked responses.
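
If Nginx fronts the app, a location block along these lines keeps it from buffering the stream so frames reach the client immediately (upstream name and timeout are illustrative):

```nginx
location /api/query/stream {
    proxy_pass http://app_upstream;   # upstream name is illustrative
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_buffering off;              # flush each SSE frame immediately
    proxy_cache off;
    proxy_read_timeout 3600s;         # allow long-lived streams
}
```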

Docs

  • Add “Streaming Responses” section to README.
  • Document SSE API + frontend usage.
