Search History & Evidence: shared ledger #9

@functor-flow

full spec at spec

flowchart TD
  Task["SearchTask"]
  SH["SearchHistory"]

  Serper["Serper"]
  Crawler["Crawler"]
  Emb["Embeddings"]
  Nav["Nav"]
  Dec["Decision (read-only)"]

  Task --> Serper
  Serper -->|add SerpQuery| SH

  Task --> Crawler
  Crawler -->|add/update PageVisit| SH

  Task --> Emb
  Emb -->|add EvidenceHit + aggregates + page summary when needed| SH

  Task --> Nav
  Nav -->|add nav PageVisit| SH

  SH --> Dec

Status: Not implemented.

Role. A single per-task state object that flows through the loop (Serper → Crawler → Embeddings → Nav → Decision) and records what has been tried and what evidence exists so far.

Initial shape (sketch).

interface SearchHistory {
  pages: PageVisit[];        // what we crawled / tried
  evidence: EvidenceHit[];   // normalized snippets across pages
  aggregates: {
    totalSupport: number;
    totalRefute: number;
    domainsSeen: number;
  };
  serpQueries: SerpQuery[];  // queries used in each loop iteration
}

interface PageVisit {
  url: string;
  domain: string | null;
  source: "serp" | "nav";
  status: "pending" | "fetched" | "blocked" | "no_content" | "skipped";
  summary?: string | null;   // short TLDR of page, optional
}

interface EvidenceHit {
  url: string;
  pageIndex: number;         // index into `pages`
  snippet: string;
  stance: "support" | "refute" | "unclear" | null;
  weight: number;            // 1.0 primary, 0.8 wire, 0.6 trade, ≤0.4 other
  dateUtc: string | null;
}

interface SerpQuery {
  query: string;
  iteration: number;
  candidateUrls: string[];
  outcome: "no_results" | "insufficient" | "enough_evidence";
}

(Types are an initial sketch for this issue; they can be refined during implementation, but the idea is to keep everything needed for refinement and decision in one place.)
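
One way the `aggregates` field could be derived from `pages` and `evidence` is sketched below. The function name `computeAggregates` and the `Aggregates` interface are hypothetical, and the sketch assumes that `totalSupport` / `totalRefute` are sums of hit weights and that `domainsSeen` counts distinct non-null domains; the issue does not pin down either definition.

```typescript
interface PageVisit {
  url: string;
  domain: string | null;
  source: "serp" | "nav";
  status: "pending" | "fetched" | "blocked" | "no_content" | "skipped";
  summary?: string | null;
}

interface EvidenceHit {
  url: string;
  pageIndex: number;
  snippet: string;
  stance: "support" | "refute" | "unclear" | null;
  weight: number;
  dateUtc: string | null;
}

// Hypothetical name for the shape of SearchHistory["aggregates"].
interface Aggregates {
  totalSupport: number;
  totalRefute: number;
  domainsSeen: number;
}

// Assumption: totals are weight sums; "unclear" and null stances are ignored.
// Assumption: domainsSeen counts distinct non-null domains across all pages,
// regardless of page status.
function computeAggregates(
  pages: readonly PageVisit[],
  evidence: readonly EvidenceHit[],
): Aggregates {
  let totalSupport = 0;
  let totalRefute = 0;
  for (const hit of evidence) {
    if (hit.stance === "support") totalSupport += hit.weight;
    else if (hit.stance === "refute") totalRefute += hit.weight;
  }
  const domains = new Set<string>();
  for (const page of pages) {
    if (page.domain !== null) domains.add(page.domain);
  }
  return { totalSupport, totalRefute, domainsSeen: domains.size };
}
```

Recomputing the aggregates from the two lists, rather than incrementing counters in several places, keeps them consistent with the ledger by construction; whether that is cheap enough per iteration is a call for implementation time.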

Responsibilities.

  • Track visited pages (URL, domain, status, short summary), evidence hits per page, and simple support/refute aggregates. Even when a page yields no usable evidence, keep a short 'what we saw here' summary so future SERP/Nav decisions treat it as already explored.
  • Record all SERP queries used for the task, including which URLs were proposed in each iteration.
  • Offer simple read-only helpers for Serper and Decision (e.g. "already visited URLs", "top supporting hits", "which time ranges or domains we have not touched yet").
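
The read-only helpers described above might look roughly like the following. `HistoryView` and `filterUnvisited` are hypothetical names, and the sketch assumes a page counts as "visited" once its status is anything other than `"pending"`, so that blocked, empty, and skipped pages are still treated as explored.

```typescript
interface PageVisit {
  url: string;
  domain: string | null;
  source: "serp" | "nav";
  status: "pending" | "fetched" | "blocked" | "no_content" | "skipped";
  summary?: string | null;
}

interface EvidenceHit {
  url: string;
  pageIndex: number;
  snippet: string;
  stance: "support" | "refute" | "unclear" | null;
  weight: number;
  dateUtc: string | null;
}

// Hypothetical read-only view: just the fields these helpers need.
interface HistoryView {
  readonly pages: readonly PageVisit[];
  readonly evidence: readonly EvidenceHit[];
}

// Assumption: any status other than "pending" means the URL was already tried.
function getVisitedUrls(history: HistoryView): Set<string> {
  return new Set(
    history.pages.filter((p) => p.status !== "pending").map((p) => p.url),
  );
}

// Highest-weight hits for one stance; filter() copies, so sort() does not
// touch the ledger's own array.
function getTopEvidence(
  history: HistoryView,
  stance: "support" | "refute",
  limit = 3,
): EvidenceHit[] {
  return history.evidence
    .filter((hit) => hit.stance === stance)
    .sort((a, b) => b.weight - a.weight)
    .slice(0, limit);
}

// Candidate URLs from a new SERP query that the ledger has not tried yet.
function filterUnvisited(history: HistoryView, candidateUrls: string[]): string[] {
  const visited = getVisitedUrls(history);
  return candidateUrls.filter((url) => !visited.has(url));
}
```

Typing the parameter as a read-only view lets the compiler reject accidental writes from Serper or Decision while still accepting a full `SearchHistory` object.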

TODO

  • Implement an initial SearchHistory type aligned with the sketch above and use it as the state threaded through Serper → Crawler → Embeddings → Nav → Decision.
  • Add pure helper functions that return an updated SearchHistory (add SERP query, mark page visited, attach evidence hit, update aggregates) so each module calls helpers instead of reaching into the structure directly.
  • Add small read-only helpers used by Serper and Decision (e.g. getVisitedUrls(history), getTopEvidence(history), hasEnoughCoverage(history)) to keep the code simple and bottom-up.
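
A minimal sketch of the update helpers from the TODO list, written as pure functions that return a new `SearchHistory` instead of modifying their argument. The names `createSearchHistory`, `addSerpQuery`, and `upsertPageVisit` are hypothetical, as is the choice to match pages by exact `url`.

```typescript
interface PageVisit {
  url: string;
  domain: string | null;
  source: "serp" | "nav";
  status: "pending" | "fetched" | "blocked" | "no_content" | "skipped";
  summary?: string | null;
}

interface EvidenceHit {
  url: string;
  pageIndex: number;
  snippet: string;
  stance: "support" | "refute" | "unclear" | null;
  weight: number;
  dateUtc: string | null;
}

interface SerpQuery {
  query: string;
  iteration: number;
  candidateUrls: string[];
  outcome: "no_results" | "insufficient" | "enough_evidence";
}

interface SearchHistory {
  pages: PageVisit[];
  evidence: EvidenceHit[];
  aggregates: { totalSupport: number; totalRefute: number; domainsSeen: number };
  serpQueries: SerpQuery[];
}

// Empty ledger for a new task.
function createSearchHistory(): SearchHistory {
  return {
    pages: [],
    evidence: [],
    aggregates: { totalSupport: 0, totalRefute: 0, domainsSeen: 0 },
    serpQueries: [],
  };
}

// Returns a copy with the query appended; the input is left untouched.
function addSerpQuery(history: SearchHistory, query: SerpQuery): SearchHistory {
  return { ...history, serpQueries: [...history.serpQueries, query] };
}

// Adds a page, or updates the existing entry with the same URL in place so
// that EvidenceHit.pageIndex values recorded earlier stay valid.
// Assumption: pages are matched by exact URL string, with no normalization.
function upsertPageVisit(history: SearchHistory, visit: PageVisit): SearchHistory {
  const index = history.pages.findIndex((p) => p.url === visit.url);
  const pages =
    index === -1
      ? [...history.pages, visit]
      : history.pages.map((p, i) => (i === index ? { ...p, ...visit } : p));
  const domainsSeen = new Set(
    pages.map((p) => p.domain).filter((d) => d !== null),
  ).size;
  return { ...history, pages, aggregates: { ...history.aggregates, domainsSeen } };
}
```

Under these assumptions, threading the state through the loop is a chain of reassignments, for example `history = upsertPageVisit(history, visit)` inside the Crawler step, and earlier snapshots remain available for debugging.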

Labels: enhancement
