
docs(scrape): add map tool as fallback step for SPA/JS-heavy pages #157

Merged
leonardogrig merged 2 commits into main from feat/scrape-map-fallback-guidance
Feb 5, 2026


@firecrawl-spring
Contributor

Summary

  • Adds step 3 in the scrape tool's SPA handling guidance to use firecrawl_map with a search parameter before falling back to firecrawl_agent
  • This helps agents discover the correct page URL on large documentation sites where content is spread across multiple pages

Context

When scraping JavaScript-heavy pages like API documentation sites, the scrape tool may return empty or minimal content. Previously, the guidance jumped directly to using firecrawl_agent.

Testing showed that using firecrawl_map with a search parameter can efficiently discover the specific page URL (e.g., /reference/webhook-events instead of just /reference), allowing a subsequent scrape to succeed without the overhead of the full agent.

Example workflow that now works:

  1. Scrape https://docs.example.com/reference for "Invoice Deleted webhook parameters" → empty result
  2. Map with {"url": "https://docs.example.com/reference", "search": "Invoice Deleted webhook"} → finds /reference/webhook-events
  3. Scrape the specific URL → succeeds
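The workflow above can be sketched as a small fallback chain. This is only an illustration, not the MCP server's implementation: `scrape`, `map_urls`, and `agent` are hypothetical callables standing in for firecrawl_scrape, firecrawl_map (with its search parameter), and firecrawl_agent, and the `min_chars` threshold for "empty or minimal content" is an assumption.

```python
from typing import Callable, List, Optional


def looks_empty(content: Optional[str], min_chars: int = 200) -> bool:
    """Treat missing or very short content as a failed scrape.

    The 200-character threshold is an arbitrary assumption for this sketch.
    """
    return content is None or len(content.strip()) < min_chars


def scrape_with_map_fallback(
    url: str,
    search: str,
    scrape: Callable[[str], Optional[str]],       # hypothetical stand-in for firecrawl_scrape
    map_urls: Callable[[str, str], List[str]],    # hypothetical stand-in for firecrawl_map + search
    agent: Optional[Callable[[str, str], Optional[str]]] = None,  # stand-in for firecrawl_agent
    min_chars: int = 200,
) -> Optional[str]:
    # Step 1: try the URL the caller already has.
    content = scrape(url)
    if not looks_empty(content, min_chars):
        return content

    # Step 2: ask map for more specific pages matching the search terms,
    # then scrape each candidate until one returns real content.
    for candidate in map_urls(url, search):
        if candidate == url:
            continue  # already tried in step 1
        content = scrape(candidate)
        if not looks_empty(content, min_chars):
            return content

    # Step 3: only now fall back to the more expensive agent, if one is provided.
    return agent(url, search) if agent is not None else None
```

Passing the tools in as callables keeps the sketch independent of any particular client library; in practice each callable would wrap the corresponding tool call.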

Test plan

  • Verify the updated tool description renders correctly in MCP clients
  • Test with a JS-heavy documentation site to confirm the map→scrape workflow guidance helps agents succeed

When scrape returns empty or minimal content from JavaScript-rendered
pages, guide users to try firecrawl_map with a search parameter to
discover the specific page URL before falling back to firecrawl_agent.

This is more efficient than immediately using the agent, as many large
documentation sites spread content across multiple URLs that can be
discovered via map.

Co-Authored-By: leonardogrig <leo@sideguide.dev>

- Enhanced map tool description to highlight its role in finding
  specific page URLs when scrape returns empty results
- Added search parameter example showing the recommended workflow
- Removed waitFor from default scrape JSON example (it's optional,
  only needed when JS rendering requires extra wait time)

Co-Authored-By: leonardogrig <leo@sideguide.dev>
@leonardogrig leonardogrig merged commit 0159dda into main Feb 5, 2026
1 check passed