Serper: SERP queries & candidate URLs #5

@functor-flow

full spec at spec

flowchart TD
  Task["SearchTask"]
  HistBefore["SearchHistory (before)"]
  Iter{"first loop?"}

  subgraph Serper["Serper module"]
    InitQ["buildInitialSearchQuery"]
    RefineQ["buildRefinedQueryFromHistory"]
    CallApi["call SERP API"]
    Filter["filter + dedupe URLs"]
    Record["append SerpQuery entry"]
  end

  HistAfter["SearchHistory (after)"]
  Urls["CandidateUrl[]"]

  Task --> Iter
  HistBefore --> Iter

  Iter -- "yes" --> InitQ
  Iter -- "no" --> RefineQ

  InitQ --> CallApi
  RefineQ --> CallApi

  CallApi --> Filter
  Filter --> Urls
  Filter --> Record

  Record --> HistAfter
Status: Partially implemented (schemas, sample parsing, and LLM gating exist); not yet integrated into the loop (pipeline).

Role. On each loop iteration, take a SearchTask plus its current SearchHistory, generate a SERP query, and return a small set of high-value URLs to inspect next.

Responsibilities.

  • On the first iteration, turn the SearchTask produced by the input layer (Input: Web‑search agent input layer, #4) into an initial SERP query using buildInitialSearchQuery, and write both the query and the returned URLs into SearchHistory.
  • On later iterations, look at SearchHistory (visited URLs, evidence so far, past queries and their outcomes) and synthesize a refined SERP query that avoids already-exhausted pages/domains and targets missing evidence.
  • De-duplicate candidate URLs against SearchHistory so the Crawler only receives new pages to try.
  • Emit a small, typed list of candidate URLs (with title, domain, snippet, date) that the Crawler can consume.
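
The mapping and de-duplication responsibilities above could look roughly like the sketch below. It is not the existing parsing helper: RawOrganicResult (with title, link, snippet, date), normalizeUrl, extractDomain, toCandidates, and the default limit of 5 are all hypothetical names and values chosen for illustration. Only the CandidateUrl fields (url, domain, title, snippet, date) come from this issue.

```typescript
// Assumed shape of one raw organic SERP result; not confirmed by this issue.
interface RawOrganicResult {
  title?: string;
  link?: string;
  snippet?: string;
  date?: string;
}

// Fields named in the issue: url, domain, title, snippet, date.
interface CandidateUrl {
  url: string;
  domain: string;
  title: string;
  snippet: string;
  date?: string;
}

// Crude normalization: drop the fragment and trailing slashes, lowercase
// scheme + host, so trivially different spellings of one page compare equal.
function normalizeUrl(raw: string): string {
  const noFragment = raw.replace(/#.*$/, "");
  const trimmed = noFragment.replace(/\/+$/, "");
  return trimmed.replace(/^[a-z]+:\/\/[^\/?]+/i, (origin) => origin.toLowerCase());
}

// Pull the host out of an absolute URL; returns "" if none is found.
function extractDomain(url: string): string {
  const match = /^[a-z]+:\/\/([^\/?#]+)/i.exec(url);
  return (match?.[1] ?? "").toLowerCase();
}

// Map raw results to CandidateUrl objects, skipping anything already
// visited (or repeated within this batch), and cap the list at `limit`.
function toCandidates(
  raw: RawOrganicResult[],
  visitedUrls: string[],
  limit = 5,
): CandidateUrl[] {
  const seen = new Set(visitedUrls.map((u) => normalizeUrl(u)));
  const out: CandidateUrl[] = [];
  for (const r of raw) {
    if (!r.link) continue; // a result without a link cannot be crawled
    const url = normalizeUrl(r.link);
    if (seen.has(url)) continue;
    seen.add(url);
    out.push({
      url,
      domain: extractDomain(url),
      title: r.title ?? "",
      snippet: r.snippet ?? "",
      ...(r.date ? { date: r.date } : {}),
    });
    if (out.length >= limit) break;
  }
  return out;
}
```

Normalizing before comparing is a design choice, not a requirement from the spec: it keeps the Crawler from receiving the same page twice under slightly different URLs, at the cost of occasionally merging pages that differ only by fragment.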

TODO

  • Define a Serper boundary function (input: SearchTask + SearchHistory, output: { query: string; candidates: CandidateUrl[]; history: SearchHistory }) and use it as the only way the loop talks to SERP.
  • Reuse the existing Serper schemas / parsing helpers to map raw SERP responses into CandidateUrl objects (url, domain, title, snippet, date) for the Crawler. (Already partially implemented.)
  • Extend SearchHistory with minimal serpQueries entries (query string, iteration index, candidate URLs, rough outcome) and make sure each Serper call appends one.
  • When Decision says the agent is still missing evidence, build a new SERP query from SearchHistory and call Serper again (for example: skip domains that already gave us weak/no evidence, tighten or shift the timeframe, or add entities that keep showing up in snippets).
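
A minimal sketch of the boundary function described in the TODO list, under stated assumptions. The names SearchTask, SearchHistory, CandidateUrl, serpQueries, buildInitialSearchQuery, and buildRefinedQueryFromHistory come from this issue; everything else is hypothetical: the field shapes, the runSerperStep name, the injected SerpClient, the weakDomains field, and the use of `-site:` exclusions as the refinement strategy. The real spec may differ.

```typescript
// Illustrative shapes only; the actual schemas live in the spec.
interface SearchTask { question: string }
interface CandidateUrl {
  url: string; domain: string; title: string; snippet: string; date?: string;
}
interface SerpQuery {
  query: string;
  iteration: number;
  candidateUrls: string[];
  outcome: "pending" | "useful" | "weak" | "none"; // "rough outcome"
}
interface SearchHistory {
  visitedUrls: string[];
  weakDomains: string[]; // assumed: domains that gave weak/no evidence
  serpQueries: SerpQuery[];
}
interface SerperResult {
  query: string;
  candidates: CandidateUrl[];
  history: SearchHistory;
}
// The SERP call is injected so the loop never talks to the API directly.
type SerpClient = (query: string) => Promise<CandidateUrl[]>;

function buildInitialSearchQuery(task: SearchTask): string {
  return task.question.trim();
}

// One possible refinement: exclude domains already judged weak.
function buildRefinedQueryFromHistory(task: SearchTask, history: SearchHistory): string {
  const exclusions = history.weakDomains.map((d) => `-site:${d}`);
  return [task.question.trim(), ...exclusions].join(" ");
}

async function runSerperStep(
  task: SearchTask,
  history: SearchHistory,
  callSerp: SerpClient,
): Promise<SerperResult> {
  // "first loop?" in the flowchart: no SERP query recorded yet.
  const iteration = history.serpQueries.length;
  const query = iteration === 0
    ? buildInitialSearchQuery(task)
    : buildRefinedQueryFromHistory(task, history);

  const results = await callSerp(query);

  // Filter + dedupe against pages the Crawler has already tried.
  const seen = new Set(history.visitedUrls);
  const candidates: CandidateUrl[] = [];
  for (const c of results) {
    if (seen.has(c.url)) continue;
    seen.add(c.url);
    candidates.push(c);
  }

  // Append exactly one serpQueries entry per call, without mutating the input.
  const entry: SerpQuery = {
    query,
    iteration,
    candidateUrls: candidates.map((c) => c.url),
    outcome: "pending",
  };
  return {
    query,
    candidates,
    history: { ...history, serpQueries: [...history.serpQueries, entry] },
  };
}
```

In this sketch the Decision step would be responsible for later updating `outcome` and `weakDomains`. Returning a new SearchHistory rather than mutating the old one keeps the "before" and "after" states from the flowchart distinct and easy to test.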

Labels: enhancement
