Testing: web agent pipeline

# Testing: web agent pipeline

> Full spec: [spec](https://gist.github.com/functor-flow/e5ad7923a0853399800c46f99d35ecfe)

**Status:** Very limited.

**Role.** Provide module-level and end-to-end tests for the new web-search agent pipeline. This is an essential part of the system: the tests must run the pipeline on real Validator data and include cost benchmarking.

**Responsibilities.**
- Add focused tests for each boundary (Input, Serper, Crawler, Embeddings, Nav, SearchHistory, Decision).
- Add end-to-end tests hitting the agent entrypoint on real historical predictions from the Validator and checking both outcomes and evidence, while recording cost metrics.
- Make it easy to compare cost across runs (LLM tokens, SERP/Scraper usage) so budget and behavior regressions are visible.
- Document key findings from the testing process (edge cases, failure patterns, cost surprises) and share them with the team in Discord so the pipeline can be tuned collaboratively.
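
One way the cross-run cost comparison could look, as a minimal sketch. `CostMetrics`, its field names, `compare_costs`, and the 10% tolerance are all assumptions made for illustration, not names from the spec, and the numbers are made up:

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CostMetrics:
    # Hypothetical per-run counters; the real metric names may differ.
    llm_input_tokens: int
    llm_output_tokens: int
    serp_calls: int
    scraper_calls: int


def compare_costs(baseline: CostMetrics, current: CostMetrics,
                  tolerance: float = 0.10) -> dict:
    """Return {metric: (baseline, current)} for metrics that grew past the tolerance."""
    regressions = {}
    base = asdict(baseline)
    for name, cur in asdict(current).items():
        allowed = base[name] * (1 + tolerance)
        if cur > allowed:
            regressions[name] = (base[name], cur)
    return regressions


# Illustrative numbers only, not measurements from the pipeline.
baseline = CostMetrics(llm_input_tokens=12000, llm_output_tokens=1500,
                       serp_calls=8, scraper_calls=20)
current = CostMetrics(llm_input_tokens=12500, llm_output_tokens=2100,
                      serp_calls=8, scraper_calls=31)

regressions = compare_costs(baseline, current)
print(regressions)
```

Storing one such record per run (for example as JSON next to the test report) would let a later run diff against it and fail when a metric exceeds its budget.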

---

## TODO

- [ ] Add small, focused tests around each module boundary so `SearchTask` → `SearchHistory` transitions remain stable as the pipeline evolves.
- [ ] Add at least one end-to-end test that replays a small, fixed set of Validator predictions through the agent entrypoint and asserts on the resulting `AgentResult` shape, key evidence fields, and basic cost metrics (without depending on live APIs).
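
A rough sketch of what the replay test might look like, assuming a pytest-style test function. Only the name `AgentResult` comes from this issue; its fields, `Evidence`, `FakeSearchClient`, `run_agent`, and the fixture data are hypothetical stand-ins, and the real entrypoint and result shape are defined in the spec:

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    url: str
    snippet: str


@dataclass
class AgentResult:
    # Hypothetical shape; the real fields are defined in the spec.
    prediction_id: str
    outcome: str
    evidence: list
    cost: dict


class FakeSearchClient:
    """Stands in for the live SERP API: returns canned hits and counts calls."""

    def __init__(self, canned):
        self.canned = canned
        self.calls = 0

    def search(self, query):
        self.calls += 1
        return self.canned.get(query, [])


def run_agent(prediction, search_client):
    """Placeholder for the agent entrypoint; swap in the real one when wiring this up."""
    hits = search_client.search(prediction["text"])
    evidence = [Evidence(url=h["url"], snippet=h["snippet"]) for h in hits]
    outcome = "supported" if evidence else "unknown"
    return AgentResult(prediction["id"], outcome, evidence,
                       {"serp_calls": search_client.calls})


# Synthetic fixtures for illustration; the real test would load a fixed,
# checked-in export of Validator predictions and recorded API responses.
FIXTURES = [
    {"id": "p1", "text": "example prediction A"},
    {"id": "p2", "text": "example prediction B"},
]
CANNED = {
    "example prediction A": [
        {"url": "https://example.com/a", "snippet": "canned snippet"},
    ],
}


def test_replay_fixed_predictions():
    for prediction in FIXTURES:
        client = FakeSearchClient(CANNED)
        result = run_agent(prediction, client)
        # Shape checks
        assert isinstance(result, AgentResult)
        assert result.prediction_id == prediction["id"]
        assert result.outcome in {"supported", "refuted", "unknown"}
        # Key evidence fields
        assert all(e.url.startswith("https://") and e.snippet
                   for e in result.evidence)
        # Basic cost budget: at most one SERP call per prediction in this sketch
        assert result.cost["serp_calls"] <= 1
```

Injecting the search client keeps the test deterministic and free of live API calls, while the call counter gives a cheap cost signal that can be asserted on directly.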



