Labels: bug (Something isn't working)
Description
Bug Details
What operation were you trying to use?
- Search
- Map URLs
- Scrape URL
- Crawl Website
- Get Crawl Status
- Extract Data
- Get Extract Status
- Something else
What happened?
Two issues:
- Initial API validation error: the first 2-3 crawl attempts fail with a `scrapeOptions.formats` validation error (expected array, received object), even though the configuration only includes the default empty headers. After retries, the crawl starts.
- Crawl never completes: once started, a crawl with `limit: 5` stays in "running" indefinitely. The status never reaches "completed" and the crawl must be manually stopped in the Firecrawl dashboard. No pages are returned.
What did you expect to happen?
- Crawl should complete within a reasonable time (limit: 5 pages)
- Should return markdown content for each crawled page
- Should respect the prompt-generated paths and excludePaths configuration
- Status should change from "running" to "completed"
Error Message (if any)
```json
{
  "success": false,
  "code": "BAD_REQUEST",
  "error": "Bad Request",
  "details": [
    {
      "code": "invalid_type",
      "expected": "array",
      "received": "object",
      "path": ["scrapeOptions", "formats"],
      "message": "Expected array, received object"
    },
    {
      "code": "unrecognized_keys",
      "keys": ["formats"],
      "path": [],
      "message": "Unrecognized key in body -- please review the v2 API documentation for request body changes"
    }
  ]
}
```
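Reading the `path`, `expected`, and `received` fields above, the v2 API appears to want `formats` as an array of format names inside `scrapeOptions`, not an object. A minimal sketch of a body the validator should accept (field names inferred from the error details, not from Firecrawl's documentation):

```json
{
  "url": "https://example.com",
  "limit": 5,
  "scrapeOptions": {
    "formats": ["markdown"]
  }
}
```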
Environment
n8n Version
Node Version
@mendable/n8n-nodes-firecrawl v1
Configuration Used
```json
{
  "operation": "crawl",
  "url": "={{ $json.companyWebsite }}",
  "prompt": "Only extract content related to recent developments at the company that can be used as a point of relevance in an outreach message. Do not extract generic information that is not of current concern to the company. It can be things mentioned on the home page, blog, news, press, about, why etc.",
  "limit": 5,
  "delay": 1000,
  "maxConcurrency": null,
  "excludePaths": {
    "items": [
      {
        "path": "data/*"
      }
    ]
  },
  "crawlOptions": {
    "allowSubdomains": true
  },
  "scrapeOptions": {
    "options": {
      "headers": {}
    }
  },
  "requestOptions": {
    "batching": {
      "batch": {
        "batchSize": 1,
        "batchInterval": 3000
      }
    }
  }
}
```

Note: The `scrapeOptions` only contains default empty headers, yet the error suggests `formats` is being sent incorrectly by the node.
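Since the configuration above never sets `formats`, the object-valued `formats` in the rejected request presumably originates inside the node. As a hedged illustration only (this is not the node's actual code, and the object shape is assumed), a normalizer like the following would coerce an object-valued `formats` into the array the validator asks for:

```python
def normalize_scrape_options(options: dict) -> dict:
    """Coerce an object-valued 'formats' into a list of format names.

    Hypothetical helper for illustration; the real fix belongs in the
    node's request-building code. Assumes an object shape like
    {"markdown": True, "html": False} mapping format name -> enabled.
    """
    result = dict(options)
    formats = result.get("formats")
    if isinstance(formats, dict):
        # Keep only the formats that are switched on, as a flat list.
        result["formats"] = [name for name, enabled in formats.items() if enabled]
    return result
```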
Additional Context
- The error suggests the node may be adding `formats` to `scrapeOptions` automatically, or there's a mismatch between what the node sends and what the v2 API expects
- The crawl job is created successfully (returns a job ID), but status polling shows it stays in the "running" state indefinitely
- This is consistent behavior - it happens every time, not intermittently
- No workarounds found - the crawl must be manually stopped in the Firecrawl dashboard
- The node has `retryOnFail: true` and `waitBetweenTries: 5000` configured
Main blocker: The crawl starts but never completes, even for small limits (5 pages), making the node unusable for crawl operations.
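Until the hang is fixed, callers polling the crawl status outside n8n can at least fail loudly instead of waiting forever. A minimal sketch of a bounded polling loop, assuming only a zero-argument `get_status` callable that returns the crawl's status string (the Firecrawl endpoint calls themselves are left out):

```python
import time

def wait_for_crawl(get_status, timeout=300.0, interval=5.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns "completed" or the deadline passes.

    get_status: zero-argument callable returning a status string
    (e.g. "running"). Raises TimeoutError so a crawl stuck in
    "running" surfaces as an error instead of hanging indefinitely.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "completed":
            return status
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"crawl ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"crawl still {status!r} after {timeout}s")
        sleep(interval)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting; in production the defaults apply.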