Labels
enhancement (New feature or request)
Description
Build a Web Search Pipeline
full spec at spec
Goal
Integrate the new web-search agent pipeline into the Validator, using the module boundaries defined in issues #4–#11, and run it end-to-end on real predictions with clear evidence, cost tracking, and test coverage.
Context
The repository already contains most of the building blocks for a modular web-search agent:
- Input (Input: Web‑search agent input layer #4) – entrypoint types and helpers that turn a prediction/claim into a SearchTask + initial SearchHistory, and expose runWebSearchAgent / handleExternalSearchRequest.
- Serper (Serper: SERP queries & candidate URLs #5) – SERP boundary that takes a SearchTask + SearchHistory, builds initial/refined queries, and returns de-duplicated CandidateUrl[] while appending SerpQuery entries to history.
- Crawler (Crawler: URLs to normalized pages #6) – URL → CrawledPage boundary using ScraperAPI and HTML utilities, updating SearchHistory.pages with visit status and short summaries.
- Embeddings (Embeddings: page scoring & evidence hits #7) – page scoring and evidence extraction, turning CrawledPage + SearchTask into EvidenceHit records and page summaries attached to SearchHistory, with awareness of timeframe and other key factors.
- Nav (Nav: guided on-site navigation #8) – guided on-site navigation (Occam + Playwright) behind a nav-gate prompt, discovering additional URLs for Crawler and recording navigation attempts in history.
- Search History & Evidence (Search History & Evidence: shared ledger #9) – the shared ledger structure (SearchHistory) that all modules update and that Decision reads from.
- Decision & Output (Decision & Output: outcome + proof from history #10) – rule-based decision layer that reads SearchTask + SearchHistory, applies simple evidence thresholds and weights, chooses between True / False / Invalid, and produces an AgentResult via a small LLM-backed proof writer.
- Testing (Testing: web agent pipeline #11) – module-level and end-to-end tests that run the pipeline on real Validator data, benchmark costs, and feed findings back to the team.
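To make the ledger and the Decision step more concrete, here is a minimal TypeScript sketch. Only the type names SearchHistory, SerpQuery and EvidenceHit come from this issue; every field, the PageRecord name, the decideOutcome function, and the threshold value are assumptions for illustration, and the real definitions live in #9 and #10. The sketch also leaves out SearchTask and the LLM-backed proof writer.

```typescript
// Hypothetical shapes: only the type names SearchHistory, SerpQuery and
// EvidenceHit appear in the issue; all fields below are assumed.
type Outcome = "True" | "False" | "Invalid";

interface SerpQuery {
  query: string;
}

interface PageRecord {
  url: string;
  status: "visited" | "failed";
  summary?: string;
}

interface EvidenceHit {
  url: string;
  stance: "supports" | "refutes";
  score: number; // assumed to be a non-negative weight
}

interface SearchHistory {
  queries: SerpQuery[];
  pages: PageRecord[];
  evidence: EvidenceHit[];
}

// Assumed rule: sum the weights per stance, then require the winning side
// to reach the threshold and to outweigh the other side.
function decideOutcome(history: SearchHistory, threshold = 1.0): Outcome {
  let support = 0;
  let refute = 0;
  for (const hit of history.evidence) {
    if (hit.stance === "supports") {
      support += hit.score;
    } else {
      refute += hit.score;
    }
  }
  if (support >= threshold && support > refute) return "True";
  if (refute >= threshold && refute > support) return "False";
  return "Invalid"; // not enough evidence, or a tie
}
```

With this rule, an empty or evenly split ledger falls through to Invalid, which keeps the decision layer conservative when the earlier modules find little evidence.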
This issue tracks the work of wiring those pieces together inside the repo, not redesigning them. The aim is a single, coherent pipeline that:
- starts from a Validator prediction (or external request),
- flows through Input → Serper → Crawler → Embeddings → Nav → Decision, with SearchHistory threaded throughout,
- returns a stable AgentResult that Validator can map into validation_result, and
- is exercised by automated tests and small real-data runs so behavior and costs are well understood.
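One possible way to thread a shared ledger through the stages listed above is sketched below. The Ledger type is a stand-in for SearchHistory, and Stage, stubStage and runPipeline are hypothetical names; the real modules from #4–#10 would replace the stubs and may expose different signatures.

```typescript
// Stand-in for the shared SearchHistory ledger; the `trace` field is hypothetical.
interface Ledger {
  trace: string[];
}

// Each module is modeled as an async step that takes the ledger
// and returns an updated copy.
type Stage = (ledger: Ledger) => Promise<Ledger>;

// Hypothetical helper: builds a stub stage that only records its own name.
function stubStage(name: string): Stage {
  return async (ledger) => ({ trace: [...ledger.trace, name] });
}

// Runs the stages in order, threading the ledger from one to the next.
async function runPipeline(initial: Ledger, stages: Stage[]): Promise<Ledger> {
  let ledger = initial;
  for (const stage of stages) {
    ledger = await stage(ledger);
  }
  return ledger;
}

// Stage order taken from the list above; the stages themselves are stubs.
const stages: Stage[] = [
  "Input",
  "Serper",
  "Crawler",
  "Embeddings",
  "Nav",
  "Decision",
].map(stubStage);
```

Calling runPipeline({ trace: [] }, stages) resolves to a ledger whose trace lists the six stage names in order. Modeling every module with the same ledger-in, ledger-out signature keeps the wiring trivial and lets tests swap any stage for a stub.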
When this issue is done, we should have:
- a working web-search pipeline wired through the modules above,
- a simple entrypoint that the Validator and small external adapters can call, and
- tests and small real-data runs that make its behavior and costs understandable for the whole team.