# Build a Web Search Pipeline

> Full spec: [spec](https://gist.github.com/functor-flow/e5ad7923a0853399800c46f99d35ecfe)

## Goal

Integrate the new web-search agent pipeline into the Validator, using the module boundaries defined in issues #4–#11, and run it end-to-end on real predictions with clear evidence, cost tracking, and test coverage.

## Context

The repository already contains most of the building blocks for a modular web-search agent:

- **Input (#4)** – entrypoint types and helpers that turn a prediction/claim into a `SearchTask` + initial `SearchHistory`, and expose `runWebSearchAgent` / `handleExternalSearchRequest`.
- **Serper (#5)** – SERP boundary that takes a `SearchTask` + `SearchHistory`, builds initial/refined queries, and returns de-duplicated `CandidateUrl[]` while appending `SerpQuery` entries to history.
- **Crawler (#6)** – URL→`CrawledPage` boundary using ScraperAPI and HTML utilities, updating `SearchHistory.pages` with visit status and short summaries.
- **Embeddings (#7)** – page scoring and evidence extraction, turning `CrawledPage` + `SearchTask` into `EvidenceHit` records and page summaries attached to `SearchHistory`, with awareness of timeframe and other key factors.
- **Nav (#8)** – guided on-site navigation (Occam + Playwright) behind a nav-gate prompt, discovering additional URLs for Crawler and recording navigation attempts in history.
- **Search History & Evidence (#9)** – the shared ledger structure (`SearchHistory`) that all modules update and that Decision reads from.
- **Decision & Output (#10)** – rule-based decision layer that reads `SearchTask` + `SearchHistory`, applies simple evidence thresholds and weights, chooses between True / False / Invalid, and produces an `AgentResult` via a small LLM-backed proof writer.
- **Testing (#11)** – module-level and end-to-end tests that run the pipeline on real Validator data, benchmark costs, and feed findings back to the team.
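To make the shared-ledger idea concrete, here is a minimal sketch of how a `SearchHistory` could be shaped and how a rule-based decision step might read from it. The field names, the `stance` label, the `emptyHistory` and `decide` helpers, and the threshold values are all assumptions made for illustration; the actual types live in the modules from #4–#10 and may differ.

```typescript
// Hypothetical shapes: field names are illustrative, not the repo's actual types.
type Verdict = "True" | "False" | "Invalid";

interface SerpQuery {
  query: string;
  resultCount: number;
}

interface PageRecord {
  url: string;
  status: "visited" | "failed" | "skipped";
  summary?: string;
}

interface EvidenceHit {
  url: string;
  snippet: string;
  stance: "supports" | "refutes"; // assumed label; the real hit type may differ
  score: number; // assumed relevance weight in [0, 1]
}

// The ledger every module appends to and the decision layer reads from.
interface SearchHistory {
  queries: SerpQuery[];
  pages: PageRecord[];
  evidence: EvidenceHit[];
}

function emptyHistory(): SearchHistory {
  return { queries: [], pages: [], evidence: [] };
}

// Assumed rule: sum scores per stance, then require both a minimum total
// weight and a minimum margin between the two sides; otherwise "Invalid".
function decide(history: SearchHistory, minWeight = 1.0, minMargin = 0.5): Verdict {
  let support = 0;
  let refute = 0;
  for (const hit of history.evidence) {
    if (hit.stance === "supports") support += hit.score;
    else refute += hit.score;
  }
  const margin = Math.abs(support - refute);
  if (Math.max(support, refute) < minWeight || margin < minMargin) return "Invalid";
  return support > refute ? "True" : "False";
}

// Usage: two supporting hits outweigh one weak refuting hit.
const history = emptyHistory();
history.evidence.push(
  { url: "https://example.com/a", snippet: "stub", stance: "supports", score: 0.8 },
  { url: "https://example.com/b", snippet: "stub", stance: "supports", score: 0.7 },
  { url: "https://example.com/c", snippet: "stub", stance: "refutes", score: 0.3 },
);
const verdict = decide(history);
console.log(verdict); // "True"
```

Keeping the decision a pure function of the ledger means it can be unit-tested with hand-built histories, without touching Serper, ScraperAPI, or Playwright.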

This issue tracks the work of wiring those pieces together inside the repo, not redesigning them. The aim is a single, coherent pipeline that:

- starts from a Validator prediction (or external request),
- flows through Input → Serper → Crawler → Embeddings → Nav → Decision, with `SearchHistory` threaded throughout,
- returns a stable `AgentResult` that Validator can map into `validation_result`, and
- is exercised by automated tests and small real-data runs so behavior and costs are well understood.
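The flow above can be sketched as a single function that creates one `SearchHistory` and passes it through each stage. This is only an illustration of the wiring: `Stages`, `runPipelineSketch`, `fakeStages`, and every field name are hypothetical, and the stages are written as synchronous stubs to keep the example short (the real boundaries call external services and would presumably be asynchronous).

```typescript
// Hypothetical minimal shapes; the real types come from the modules in #4–#10.
interface SearchTask { claim: string; }
interface CandidateUrl { url: string; }
interface CrawledPage { url: string; text: string; }
interface EvidenceHit { url: string; snippet: string; score: number; }
interface SearchHistory {
  queries: string[];
  pages: { url: string; status: "visited" | "failed" }[];
  evidence: EvidenceHit[];
  navAttempts: string[];
}
interface AgentResult {
  outcome: "True" | "False" | "Invalid";
  proof: string;
  evidence: EvidenceHit[];
}

// Stage boundaries are injected so tests can swap real services for fakes.
interface Stages {
  search(task: SearchTask, history: SearchHistory): CandidateUrl[];
  crawl(urls: CandidateUrl[], history: SearchHistory): CrawledPage[];
  score(pages: CrawledPage[], task: SearchTask, history: SearchHistory): void;
  navigate(task: SearchTask, history: SearchHistory): CandidateUrl[];
  decide(task: SearchTask, history: SearchHistory): AgentResult;
}

function runPipelineSketch(task: SearchTask, stages: Stages): AgentResult {
  // One ledger per run, mutated by every stage and read by the decision step.
  const history: SearchHistory = { queries: [], pages: [], evidence: [], navAttempts: [] };
  const candidates = stages.search(task, history);
  const pages = stages.crawl(candidates, history);
  stages.score(pages, task, history);
  // URLs discovered by navigation go back through crawling and scoring.
  const extra = stages.navigate(task, history);
  if (extra.length > 0) {
    stages.score(stages.crawl(extra, history), task, history);
  }
  return stages.decide(task, history);
}

// Fake stages standing in for Serper, Crawler, Embeddings, Nav, and Decision.
const fakeStages: Stages = {
  search: (task, history) => {
    history.queries.push(task.claim);
    return [{ url: "https://example.com/a" }];
  },
  crawl: (urls, history) =>
    urls.map((u) => {
      history.pages.push({ url: u.url, status: "visited" });
      return { url: u.url, text: "stub page text" };
    }),
  score: (pages, _task, history) => {
    for (const p of pages) {
      history.evidence.push({ url: p.url, snippet: p.text, score: 0.9 });
    }
  },
  navigate: (_task, history) => {
    history.navAttempts.push("https://example.com/a");
    return []; // no extra URLs discovered in this fake
  },
  decide: (_task, history) => ({
    outcome: history.evidence.length > 0 ? "True" : "Invalid",
    proof: `Based on ${history.evidence.length} evidence hit(s).`,
    evidence: history.evidence,
  }),
};

const result = runPipelineSketch({ claim: "Example claim" }, fakeStages);
console.log(result.outcome, result.proof);
```

Injecting the stages is one possible design choice: it lets end-to-end tests run the same orchestration against fakes for cheap checks and against the real modules for small real-data runs.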

When this issue is done, we should have:

- a working web-search pipeline wired through the modules above,
- a simple entrypoint that the Validator and small external adapters can call, and
- tests and small real-data runs that make its behavior and costs understandable for the whole team.



Build a Web Search Pipeline #12
