Build a Web Search Pipeline #12

@functor-flow

Full spec: spec

Goal

Integrate the new web-search agent pipeline into the Validator, using the module boundaries defined in issues #4 through #11, and run it end-to-end on real predictions with clear evidence, cost tracking, and test coverage.

Context

The repository already contains most of the building blocks for a modular web-search agent:

  • Input (Input: Web‑search agent input layer #4) – entrypoint types and helpers that turn a prediction/claim into a SearchTask + initial SearchHistory, and expose runWebSearchAgent / handleExternalSearchRequest.
  • Serper (Serper: SERP queries & candidate URLs #5) – SERP boundary that takes a SearchTask + SearchHistory, builds initial/refined queries, and returns de-duplicated CandidateUrl[] while appending SerpQuery entries to history.
  • Crawler (Crawler: URLs to normalized pages #6) – URL→CrawledPage boundary using ScraperAPI and HTML utilities, updating SearchHistory.pages with visit status and short summaries.
  • Embeddings (Embeddings: page scoring & evidence hits #7) – page scoring and evidence extraction, turning CrawledPage + SearchTask into EvidenceHit records and page summaries attached to SearchHistory, with awareness of timeframe and other key factors.
  • Nav (Nav: guided on-site navigation #8) – guided on-site navigation (Occam + Playwright) behind a nav-gate prompt, discovering additional URLs for Crawler and recording navigation attempts in history.
  • Search History & Evidence (Search History & Evidence: shared ledger #9) – the shared ledger structure (SearchHistory) that all modules update and that Decision reads from.
  • Decision & Output (Decision & Output: outcome + proof from history #10) – rule-based decision layer that reads SearchTask + SearchHistory, applies simple evidence thresholds and weights, chooses between True / False / Invalid, and produces an AgentResult via a small LLM-backed proof writer.
  • Testing (Testing: web agent pipeline #11) – module-level and end-to-end tests that run the pipeline on real Validator data, benchmark costs, and feed findings back to the team.
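To make the module boundaries above more concrete, here is a minimal sketch of what the shared ledger and a rule-based decision over it might look like. Every field name, the stance labels, the score range, and the threshold value are assumptions for illustration only; the actual shapes are whatever issues #4 through #10 define. The sketch also reads only the history, whereas the real Decision layer reads SearchTask as well.

```typescript
// Hypothetical shapes for illustration; the real types live in issues #4-#10.
type Outcome = "True" | "False" | "Invalid";

interface EvidenceHit {
  url: string;
  stance: "supports" | "refutes"; // assumed label set
  score: number;                  // assumed relevance score in [0, 1]
}

interface SearchHistory {
  queries: string[];
  pages: { url: string; status: "ok" | "failed"; summary?: string }[];
  evidence: EvidenceHit[];
}

// Assumed rule: sum the scores per stance, then require the winning side to
// reach a threshold and to outweigh the other side. Anything else is Invalid.
function decide(history: SearchHistory, threshold = 1.0): Outcome {
  let support = 0;
  let refute = 0;
  for (const hit of history.evidence) {
    if (hit.stance === "supports") support += hit.score;
    else refute += hit.score;
  }
  if (support >= threshold && support > refute) return "True";
  if (refute >= threshold && refute > support) return "False";
  return "Invalid";
}

// Example ledger with made-up URLs and scores.
const exampleHistory: SearchHistory = {
  queries: ["example query"],
  pages: [{ url: "https://example.com/a", status: "ok" }],
  evidence: [
    { url: "https://example.com/a", stance: "supports", score: 0.6 },
    { url: "https://example.com/b", stance: "supports", score: 0.5 },
    { url: "https://example.com/c", stance: "refutes", score: 0.2 },
  ],
};
```

With these made-up numbers, the supporting evidence clears the assumed threshold and outweighs the refuting evidence, while an empty ledger falls through to Invalid.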

This issue tracks the work of wiring those pieces together inside the repo, not redesigning them. The aim is a single, coherent pipeline that:

  • starts from a Validator prediction (or external request),
  • flows through Input → Serper → Crawler → Embeddings → Nav → Decision, with SearchHistory threaded throughout,
  • returns a stable AgentResult that Validator can map into validation_result, and
  • is exercised by automated tests and small real-data runs so behavior and costs are well understood.
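The flow described above, with the history threaded through each module, could be sketched roughly as follows. This is a simplified, synchronous illustration: `Stage`, `runStages`, and the demo types are hypothetical names, not the repository's API, and the real stages would most likely be async since they call external services.

```typescript
// A stage reads the task and the current history and returns an updated history.
interface Stage<T, H> {
  name: string;
  run: (task: T, history: H) => H;
}

// Thread the history through each stage in order, then hand it to the decision step.
function runStages<T, H, R>(
  task: T,
  initial: H,
  stages: Stage<T, H>[],
  decide: (task: T, history: H) => R,
): R {
  let history = initial;
  for (const stage of stages) {
    history = stage.run(task, history);
  }
  return decide(task, history);
}

// Toy types standing in for SearchTask / SearchHistory (assumed, minimal).
interface DemoTask { claim: string }
interface DemoHistory { trail: string[] }

// Each stub stage only records that it ran; real modules would add queries,
// pages, and evidence to the ledger instead.
const mark = (name: string): Stage<DemoTask, DemoHistory> => ({
  name,
  run: (_task, history) => ({ trail: [...history.trail, name] }),
});

const demoStages = ["Serper", "Crawler", "Embeddings", "Nav"].map(mark);

const demoResult = runStages(
  { claim: "example claim" },
  { trail: ["Input"] },
  demoStages,
  (_task, history) => ({
    outcome: "Invalid" as const,
    trail: [...history.trail, "Decision"],
  }),
);
```

Keeping each stage as a function from (task, history) to history means a module can be swapped or stubbed in tests without touching the others, which matches the goal of wiring the existing pieces rather than redesigning them.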

When this issue is done, we should have:

  • a working web-search pipeline wired through the modules above,
  • a simple entrypoint that the Validator and small external adapters can call, and
  • tests and small real-data runs that make its behavior and costs understandable for the whole team.
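One possible way to keep costs understandable is a small per-module tally that each run fills in. The sketch below is only an illustration: `CostTracker`, the module names as a union type, and the amounts are assumptions, and the figures are placeholders rather than real provider prices.

```typescript
// Hypothetical module labels, mirroring the pipeline stages named in this issue.
type ModuleName = "Serper" | "Crawler" | "Embeddings" | "Nav" | "Decision";

class CostTracker {
  private readonly microUsd = new Map<ModuleName, number>();

  // Record a cost in micro-dollars; integers avoid floating-point drift when summing.
  record(module: ModuleName, amountMicroUsd: number): void {
    this.microUsd.set(module, (this.microUsd.get(module) ?? 0) + amountMicroUsd);
  }

  // Per-module totals as a plain object, convenient for logging or test assertions.
  byModule(): Record<string, number> {
    const out: Record<string, number> = {};
    this.microUsd.forEach((value, key) => {
      out[key] = value;
    });
    return out;
  }

  // Total across all modules for one run.
  totalMicroUsd(): number {
    let total = 0;
    this.microUsd.forEach((value) => {
      total += value;
    });
    return total;
  }
}

// Placeholder amounts, not real prices.
const costs = new CostTracker();
costs.record("Serper", 1000);
costs.record("Crawler", 500);
costs.record("Crawler", 500);
```

A tracker like this could be passed alongside the history or attached to it, so a real-data run can report both the outcome and what each module spent.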
