Implement hybrid fusion + reranker and staged status UI #4

@hammertoe

Summary

Implement robust hybrid retrieval with RRF fusion + Gemini reranker, and replace typing dots with staged, user-friendly progress statuses (SSE-backed with fallback).

Retrieval: Robust Fusion + Reranker

  • Split seed retrieval into separate channels (vector, full-text, alias) and keep ranked lists distinct.
  • Add RRF fusion helper (reciprocal rank fusion) with configurable .
  • Add intent boosts (topic term coverage, recency intent) and generic-term penalties.
  • Add Gemini rerank step over top N fused candidates with strict JSON schema; fallback to fused order on failure/timeout.
  • Feed reranked seeds into edge + citation retrieval; enforce topical evidence minimum for topical queries.
  • Add feature flags in config: , , , , .
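
As a rough illustration of the fusion step above, here is a minimal sketch of reciprocal rank fusion over separate channel lists. The function name `rrf_fuse`, the parameter name `k`, and the default of 60 (a commonly used RRF constant) are assumptions for illustration; the issue does not state the actual helper or config names. Intent boosts and generic-term penalties are not shown.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(ranked_lists: Dict[str, Sequence[str]], k: int = 60) -> List[str]:
    """Fuse per-channel ranked ID lists with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the channels that returned
    it, with rank starting at 1. `k` is a hypothetical name for the
    configurable constant mentioned in the issue.
    """
    scores: Dict[str, float] = defaultdict(float)
    for _channel, doc_ids in ranked_lists.items():
        for rank, doc_id in enumerate(doc_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Channels stay distinct until fusion (IDs here are made up).
channels = {
    "vector": ["a", "b", "c"],
    "fulltext": ["b", "d"],
    "alias": ["b", "a"],
}
fused = rrf_fuse(channels)
```

Because RRF only looks at ranks, a channel with inflated raw scores (for example, full-text hits on generic terms) cannot outweigh the others by score magnitude alone.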

Progress Events (Backend)

  • Add stage callbacks around chat pipeline in :
    • received
    • retrieving_sources
    • ranking_sources
    • reading_evidence
    • drafting_answer
    • finalizing_response
  • Add SSE endpoint in for streaming stage events + final response.
  • Keep existing POST endpoint for compatibility.
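
One possible shape for the stage callbacks and the SSE stream is sketched below, independent of any web framework. `run_pipeline`, `stream_chat`, and `sse_frame` are hypothetical names, and the pipeline body is a placeholder; only the stage identifiers and the event types (stage, final, error) come from this issue.

```python
import json
import queue
import threading
from typing import Callable, Iterator, Tuple

STAGES = [
    "received",
    "retrieving_sources",
    "ranking_sources",
    "reading_evidence",
    "drafting_answer",
    "finalizing_response",
]


def sse_frame(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame (event name plus JSON data)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def run_pipeline(question: str, on_stage: Callable[[str], None]) -> dict:
    """Placeholder for the chat pipeline: reports each stage, then returns."""
    for stage in STAGES:
        on_stage(stage)
    return {"answer": f"(placeholder answer for: {question})"}


def stream_chat(question: str) -> Iterator[str]:
    """Run the pipeline in a worker thread and yield SSE frames as they occur."""
    events: "queue.Queue[Tuple[str, dict]]" = queue.Queue()

    def worker() -> None:
        try:
            result = run_pipeline(
                question, lambda stage: events.put(("stage", {"stage": stage}))
            )
            events.put(("final", result))
        except Exception as exc:  # surface pipeline failures to the client
            events.put(("error", {"message": str(exc)}))

    threading.Thread(target=worker, daemon=True).start()
    while True:
        event, data = events.get()
        yield sse_frame(event, data)
        if event in ("final", "error"):
            break


frames = list(stream_chat("hello"))
```

A streaming endpoint could return this generator with a `text/event-stream` content type, while the existing POST endpoint calls the same pipeline with a no-op callback.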

UI: Staged Status Messages

  • Add SSE client in (event types: stage, final, error).
  • Replace bouncing dots with staged status card in .
  • Friendly stage copy (non-technical):
    • Got it — checking your question.
    • Looking through recent debates and documents.
    • Picking the most relevant receipts from Parliament.
    • Reading the clips and bill excerpts.
    • Drafting your answer.
    • Adding sources and final checks.
  • Add fallback staged progression if SSE unavailable.
  • Update styles in .

Tests

  • Add unit tests for fusion behavior (generic lexical dominance doesn’t suppress topical vector hits).
  • Add tests for reranker fallback on malformed JSON/timeouts.
  • Add tests for topical evidence fallback response.
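
The reranker fallback cases could be exercised against something like the sketch below. It assumes the model is asked to return a JSON object of the form `{"ranked_ids": [...]}`; that schema, the names `parse_ranking` and `rerank_with_fallback`, and the `call_model` callable (a stand-in for the Gemini call) are all hypothetical, not taken from the codebase.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def parse_ranking(raw: str, candidates: List[str]) -> List[str]:
    """Strictly validate the model output against the assumed schema."""
    ids = json.loads(raw)["ranked_ids"]
    if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
        raise ValueError("ranked_ids must be a list of strings")
    if len(set(ids)) != len(ids) or not set(ids) <= set(candidates):
        raise ValueError("ranked_ids must be unique, known candidate IDs")
    # Candidates the model omitted keep their fused order after the ranked ones.
    return ids + [c for c in candidates if c not in ids]


def rerank_with_fallback(
    fused: List[str],
    call_model: Callable[[List[str]], str],
    top_n: int = 20,
    timeout_s: float = 5.0,
) -> List[str]:
    """Rerank the top N fused candidates; return fused order on any failure."""
    head, tail = fused[:top_n], fused[top_n:]
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        raw = pool.submit(call_model, head).result(timeout=timeout_s)
        return parse_ranking(raw, head) + tail
    except Exception:
        # Timeout, malformed JSON, schema violation, or model error.
        return list(fused)
    finally:
        pool.shutdown(wait=False)


fused_order = ["a", "b", "c", "d"]
ok = rerank_with_fallback(fused_order, lambda ids: '{"ranked_ids": ["c", "a"]}', top_n=3)
bad_json = rerank_with_fallback(fused_order, lambda ids: "not json")
unknown_id = rerank_with_fallback(fused_order, lambda ids: '{"ranked_ids": ["zzz"]}')
```

Passing the model call in as a parameter lets tests substitute stubs that return malformed JSON or sleep past the timeout, without touching the network.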

Rollout

  • Feature-flagged rollout on staging.
  • Evaluate: topical source precision, off-topic drift rate, latency impact, user reformulation rate.
