
Streaming Response + Progress Indicators for Long Docs #25

@hoangsonww

Description


Summary

Enable streaming of AI responses (partial tokens over SSE/WebSockets) for faster perceived speed when querying large PDFs. Add progress indicators for document ingestion and query execution.


Why

  • Current UX feels “stalled” on large documents (users wait with no feedback).
  • Token streaming makes answers feel instantaneous.
  • Progress indicators build trust for heavy PDFs (e.g. “Parsing 200 pages…”).

Scope

  1. Backend

    • Add SSE endpoint for AI completions.
    • Stream tokens from provider (OpenAI / local LLM) to client.
    • Add progress events for doc ingestion (chunking, embedding, indexing).
    • Update query pipeline to emit checkpoints: retrieval start, N chunks retrieved, response start.
  2. Frontend

    • Update React hooks to handle EventSource/WebSocket streaming.
    • Render the live token stream (ChatGPT-style).
    • Add progress UI:

      • “Uploading PDF” → percent.
      • “Embedding & indexing” → percent or step counter.
      • “Fetching context…” → spinner.
      • Then the live streamed answer.
    • Provide a cancel/stop button.

  3. Infra

    • SSE route under /api/query/stream.
    • Ensure Nginx/Next.js proxy passes streaming responses.
    • Handle disconnect/resume gracefully.
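
The backend, frontend, and infra pieces above all hinge on agreeing on an event format. A minimal sketch of that framing — the `StreamEvent` union and `formatSSE` helper are illustrative names, not an existing API:

```typescript
// Shape of events the stream endpoint could emit (names are illustrative).
type StreamEvent =
  | { type: "progress"; stage: "chunking" | "embedding" | "indexing" | "retrieval"; done: number; total: number }
  | { type: "token"; value: string }
  | { type: "done" };

// Serialize one event as an SSE frame: a `data:` line followed by a blank line.
function formatSSE(event: StreamEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Example: a progress checkpoint emitted mid-embedding.
formatSSE({ type: "progress", stage: "embedding", done: 40, total: 200 });
// -> 'data: {"type":"progress","stage":"embedding","done":40,"total":200}\n\n'
```

Keeping every checkpoint (ingestion progress, retrieval, tokens, done) in one tagged union lets the client switch on `type` with exhaustiveness checking.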

Acceptance Criteria

  • Querying a large doc (>100 pages) → user sees immediate progress (upload → index → retrieve).
  • AI responses stream word-by-word with no blank screen delay.
  • Cancelling query works mid-stream.
  • Works across Chrome/Edge/Firefox/Safari.
  • No regression for small/fast queries.

Example API (SSE)

// /api/query/stream.ts (Next.js API route, Pages Router)
import type { NextApiRequest, NextApiResponse } from "next";

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache, no-transform");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  // Stop streaming if the client disconnects (e.g. the cancel button).
  let closed = false;
  req.on("close", () => { closed = true; });

  // `ai` is the provider wrapper (see Backend tasks).
  const stream = ai.stream({ prompt: req.body.prompt, docId: req.body.docId });
  for await (const chunk of stream) {
    if (closed) break;
    res.write(`data: ${JSON.stringify({ type: "token", value: chunk })}\n\n`);
  }
  res.write(`data: ${JSON.stringify({ type: "done" })}\n\n`);
  res.end();
}

Tasks

Backend

  • Add SSE/WS endpoints for query streaming.
  • Add progress events to ingestion + retrieval pipeline.
  • Update OpenAI wrapper to forward token stream.
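
For the wrapper task, one way to keep it provider-agnostic is to forward any token stream as an async generator. `forwardTokens` and its `provider` shape are assumptions standing in for the OpenAI/local-LLM wrapper, not existing code:

```typescript
// Forward a provider's streamed completion as an async generator of tokens.
// `provider` is anything with a stream() yielding string chunks — an assumed
// interface standing in for the OpenAI / local-LLM wrapper.
async function* forwardTokens(
  provider: { stream: (prompt: string) => AsyncIterable<string> },
  prompt: string
): AsyncGenerator<string> {
  for await (const chunk of provider.stream(prompt)) {
    if (chunk.length > 0) yield chunk; // skip empty deltas some providers emit
  }
}
```

The SSE handler can then `for await` over `forwardTokens(...)` regardless of which backend produced the tokens.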

Frontend

  • Add useStreamQuery() hook (handles SSE).
  • Implement live token rendering.
  • Add progress bars for ingestion + retrieval.
  • Add cancel/stop button.
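
The framework-free core of a `useStreamQuery()` hook is an incremental SSE parser; the hook would feed it raw chunks from `fetch` and re-render on each result. A sketch, assuming the event shapes from the backend example:

```typescript
// Incremental SSE parser a useStreamQuery() hook could wrap (sketch; the
// `token`/`done` event shapes mirror the backend example and are assumptions).
type Parsed = { tokens: string[]; done: boolean };

function createSSEParser() {
  let buffer = "";
  const state: Parsed = { tokens: [], done: false };
  return {
    // Feed raw text chunks as they arrive; returns everything parsed so far.
    feed(chunk: string): Parsed {
      buffer += chunk;
      // SSE frames are separated by a blank line.
      const frames = buffer.split("\n\n");
      buffer = frames.pop() ?? ""; // keep any incomplete trailing frame
      for (const frame of frames) {
        const line = frame.split("\n").find((l) => l.startsWith("data: "));
        if (!line) continue;
        const event = JSON.parse(line.slice("data: ".length));
        if (event.type === "token") state.tokens.push(event.value);
        if (event.type === "done") state.done = true;
      }
      return state;
    },
  };
}
```

For the cancel button, the hook would pair this with an `AbortController` passed to `fetch`, so aborting tears down the connection mid-stream.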

Infra

  • Update Next.js API route config for streaming.
  • Verify proxy/server supports chunked responses.
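
If Nginx fronts the app, a location block along these lines keeps it from buffering the stream so frames reach the client immediately (upstream name and timeout are illustrative):

```nginx
location /api/query/stream {
    proxy_pass http://app_upstream;   # upstream name is illustrative
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_buffering off;              # flush each SSE frame immediately
    proxy_cache off;
    proxy_read_timeout 3600s;         # allow long-lived streams
}
```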

Docs

  • Add “Streaming Responses” section to README.
  • Document SSE API + frontend usage.
