Labels: bug, documentation, enhancement, good first issue, help wanted, question
Description
Summary
Enable streaming AI responses (partial tokens, SSE/websockets) for faster perceived speed when querying large PDFs. Add progress indicators for document ingestion + query execution.
Why
- Current UX feels “stalled” on large documents (users wait with no feedback).
- Token streaming makes answers feel instantaneous.
- Progress indicators build trust for heavy PDFs (e.g. “Parsing 200 pages…”).
Scope

Backend
- Add SSE endpoint for AI completions.
- Stream tokens from provider (OpenAI / local LLM) to client.
- Add progress events for doc ingestion (chunking, embedding, indexing).
- Update query pipeline to emit checkpoints: retrieval start, N chunks retrieved, response start.
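The progress and token events above can share one wire format. A minimal sketch, assuming illustrative event names and a hypothetical `sseFrame` helper (neither exists in the codebase yet):

```typescript
// Hypothetical event shapes for ingestion progress and token streaming.
type StreamEvent =
  | { type: "progress"; stage: "chunking" | "embedding" | "indexing"; percent: number }
  | { type: "token"; value: string }
  | { type: "done" };

// Serialize one event as an SSE frame: a "data:" line terminated by a blank line.
function sseFrame(event: StreamEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}
```

Keeping serialization in one helper means the ingestion pipeline and the query pipeline emit identical frames, so the client needs a single parser.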
Frontend
- Update React hooks to handle EventSource/WebSocket streaming.
- Render live token stream (like ChatGPT).
- Add progress UI:
  - “Uploading PDF” → percent.
  - “Embedding & indexing” → percent or step counter.
  - “Fetching context…” → spinner.
  - Then live streamed answer.
- Provide cancel/stop button.
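On the client, the hook can fold incoming events into UI state. A sketch of the pure part, where the event shapes are assumptions mirroring the backend and the `EventSource` wiring is shown only in the comment:

```typescript
// Assumed event shapes sent by the streaming endpoint.
type StreamEvent =
  | { type: "token"; value: string }
  | { type: "progress"; stage: string; percent?: number }
  | { type: "done" };

// Fold one event into the accumulated answer text; "progress" and "done"
// events would drive progress bars and spinners instead of the text.
function reduceAnswer(answer: string, event: StreamEvent): string {
  return event.type === "token" ? answer + event.value : answer;
}

// Usage inside a hook (sketch):
//   const es = new EventSource("/api/query/stream?...");
//   es.onmessage = (e) => setAnswer((a) => reduceAnswer(a, JSON.parse(e.data)));
//   es.close();  // backs the cancel/stop button
```

Keeping the reducer pure makes the token-rendering logic testable without a browser or a live stream.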
Infra
- SSE route under /api/query/stream.
- Ensure Nginx/Next.js proxy passes streaming responses.
- Handle disconnect/resume gracefully.
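For the proxy, response buffering must be disabled or events arrive in bursts instead of streaming. A hedged sketch of the relevant Nginx directives (the location path matches the route above; the upstream name is a placeholder):

```nginx
location /api/query/stream {
    proxy_pass http://app;            # placeholder upstream name
    proxy_http_version 1.1;
    proxy_set_header Connection "";   # keep-alive to the upstream
    proxy_buffering off;              # flush each SSE event immediately
    proxy_cache off;
    proxy_read_timeout 1h;            # tolerate long-lived streams
}
```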
Acceptance Criteria
- Querying a large doc (>100 pages) → user sees immediate progress (upload → index → retrieve).
- AI responses stream word-by-word with no blank screen delay.
- Cancelling query works mid-stream.
- Works across Chrome/Edge/Firefox/Safari.
- No regression for small/fast queries.
Example API (SSE)
```ts
// /api/query/stream.ts
export default async function handler(req, res) {
  // SSE headers: keep the connection open, disable caching.
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

  // Forward each provider token to the client as an SSE event.
  const stream = ai.stream({ prompt: req.body.prompt, docId: req.body.docId });
  for await (const chunk of stream) {
    res.write(`data: ${JSON.stringify({ type: "token", value: chunk })}\n\n`);
  }
  res.write(`data: ${JSON.stringify({ type: "done" })}\n\n`);
  res.end();
}
```

Tasks
Backend
- Add SSE/WS endpoints for query streaming.
- Add progress events to ingestion + retrieval pipeline.
- Update OpenAI wrapper to forward token stream.
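The wrapper can stay provider-agnostic by accepting any async iterable of chunks. A sketch assuming OpenAI-style `choices[].delta.content` chunks (the `forwardTokens` name and `ProviderChunk` shape are illustrative, not existing code):

```typescript
// Chunk shape loosely modeled on OpenAI's streaming chat completions.
interface ProviderChunk {
  choices: { delta: { content?: string } }[];
}

// Yield only the non-empty token deltas, ready to write as SSE events.
async function* forwardTokens(
  provider: AsyncIterable<ProviderChunk>
): AsyncGenerator<string> {
  for await (const chunk of provider) {
    const token = chunk.choices[0]?.delta?.content;
    if (token) yield token;
  }
}
```

Because the function only depends on the chunk shape, a local LLM backend can be plugged in by adapting its output to the same interface.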
Frontend
- Add useStreamQuery() hook (handles SSE).
- Implement live token rendering.
- Add progress bars for ingestion + retrieval.
- Add cancel/stop button.
Infra
- Update Next.js API route config for streaming.
- Verify proxy/server supports chunked responses.
Docs
- Add “Streaming Responses” section to README.
- Document SSE API + frontend usage.