feat(v2): migrate Go SDK from Firecrawl API v1 to v2 #1
Open
ArmandoHerra wants to merge 33 commits into main
- Fix monitorJobStatus retry counter starting at threshold (3→0)
- Fix defer resp.Body.Close() connection leak in retry loop
- Fix request body consumed on first attempt, retries sending empty body
- Fix ScrapeURL checking Success before unmarshal error
- Fix ScrapeOptions gate only checking Formats field
- Remove dead commented-out v0 extractor code
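The two retry-loop fixes above (the response body leak and the request body that was empty on retries) can be illustrated with a minimal sketch. `doWithRetry` and `runRetryDemo` are hypothetical names used for illustration only and are not the SDK's `makeRequest`; the retry-on-5xx policy and the fixed backoff are assumptions.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// doWithRetry is a hypothetical helper, not the SDK's makeRequest.
// It retries on transport errors and 5xx responses (an assumed policy).
func doWithRetry(ctx context.Context, client *http.Client, url string, body []byte, attempts int, backoff time.Duration) ([]byte, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		// A fresh reader per attempt, so a retry resends the full body
		// instead of an already-drained reader.
		req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
		if err != nil {
			return nil, err
		}
		req.Header.Set("Content-Type", "application/json")

		resp, err := client.Do(req)
		if err != nil {
			lastErr = err
		} else {
			data, readErr := io.ReadAll(resp.Body)
			// Close inside the loop: a defer here would keep every
			// response body open until the function returns.
			closeErr := resp.Body.Close()
			switch {
			case readErr != nil:
				lastErr = readErr
			case closeErr != nil:
				lastErr = closeErr
			case resp.StatusCode >= 500:
				lastErr = fmt.Errorf("server error: status %d", resp.StatusCode)
			default:
				return data, nil
			}
		}
		if i == attempts-1 {
			break // no point waiting after the final attempt
		}
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(backoff):
		}
	}
	return nil, fmt.Errorf("request failed after %d attempts: %w", attempts, lastErr)
}

// runRetryDemo starts a test server that fails once and then echoes the body.
func runRetryDemo() (string, int, error) {
	var hits atomic.Int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		payload, _ := io.ReadAll(r.Body)
		if hits.Add(1) == 1 {
			w.WriteHeader(http.StatusBadGateway)
			return
		}
		_, _ = w.Write(payload)
	}))
	defer srv.Close()

	body := []byte(`{"url":"https://example.com"}`)
	data, err := doWithRetry(context.Background(), srv.Client(), srv.URL, body, 3, 10*time.Millisecond)
	return string(data), int(hits.Load()), err
}

func main() {
	got, hits, err := runRetryDemo()
	fmt.Println(got, hits, err)
}
```

Passing the payload as `[]byte` and wrapping it in `bytes.NewReader` on every attempt is what makes the retry safe; this lines up with the later change that has `makeRequest` accept pre-marshaled bytes.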
- Split 838-line firecrawl.go into 9 domain-specific files
- client.go: struct, constructor, headers
- types.go: all request/response type definitions
- scrape.go, crawl.go, map.go, search.go: endpoint methods
- errors.go, helpers.go, options.go: internal utilities
- Zero logic changes; pure structural refactor

- Add Makefile with build, test, lint, fmt, vet, coverage, and check targets
- Add .golangci.yml with errcheck, govet, bodyclose, noctx, gosec linters
- Add GitHub Actions CI workflow (lint + test matrix Go 1.22/1.23 + integration)
- Add Dependabot config for gomod and github-actions ecosystems
- Add .editorconfig for consistent editor settings
- Delete legacy firecrawl_test.go_V0

- Add //go:build integration tag to gate E2E tests behind -tags=integration
- Replace init() with TestMain for graceful skip when .env is missing
- Fix gofumpt formatting in crawl.go
- Use http.NewRequestWithContext to satisfy noctx linter
- Use errors.New for dynamic format string to satisfy staticcheck SA1006
- Disable fieldalignment in govet config (structs rewritten in MIG-04)

- Skip 80% coverage gate when coverage is 0.0% (no test files ran)
- Threshold activates automatically once unit tests are added

…-1.25
- Bump actions/checkout v4 → v5, actions/setup-go v5 → v6
- Bump golangci-lint-action v6 → v7
- Add Go 1.24 and 1.25 to test matrix
- Use Go 1.25 for lint and integration jobs

…rom matrix
- Add version: "2" to .golangci.yml for golangci-lint v2.x compatibility
- Move linters-settings under linters.settings per v2 schema
- Drop Go 1.22 from test matrix (EOL, keep 1.23-1.25)

- golangci-lint v2 treats gofumpt as a formatter, not a linter
- Move from linters.enable to formatters.enable per v2 schema

- Check resp.Body.Close() return values to satisfy errcheck
- Refactor monitorJobStatus status chain to switch statement (QF1003)

…names
- Rewrite types.go with 31 v2 type definitions (ScrapeParams, CrawlParams, MapParams, SearchParams, BatchScrapeParams, ExtractParams, WebhookConfig, LocationConfig, ActionConfig, ParserConfig, MapLink, PaginationConfig, etc.)
- Rename CrawlParams fields: MaxDepth→MaxDiscoveryDepth, AllowBackwardLinks→CrawlEntireDomain, IgnoreSitemap→Sitemap enum, Webhook *string→*WebhookConfig
- Change MapResponse.Links from []string to []MapLink
- Remove ParsePDF from ScrapeParams, replace with Parsers []ParserConfig
- Add v2 scrape options: Mobile, Location, Actions, Proxy, BlockAds, etc.
- Bump go.mod minimum Go version to 1.23
BREAKING CHANGE: CrawlParams, MapParams, ScrapeParams, and MapResponse have renamed/removed/added fields per Firecrawl API v2.

…lpers
- Add ctx context.Context as first parameter to all 7 public methods
- Add ctx to makeRequest and monitorJobStatus internal helpers
- Use http.NewRequestWithContext(ctx, ...) for request creation
- Replace time.Sleep with context-aware select in polling loop
- Check ctx.Err() at loop boundaries for fast cancellation
- Update integration tests with context.Background()
BREAKING CHANGE: All public methods now require context.Context as first parameter. Callers must pass context.Background() or a derived context.
- Change ScrapeURL endpoint from /v1/scrape to /v2/scrape
- Replace map[string]any body with typed scrapeRequest struct
- Refactor makeRequest to accept pre-marshaled []byte instead of map[string]any
- Update all makeRequest callers (crawl, map) to marshal at call site
…aling
- Change CrawlURL, AsyncCrawlURL, CheckCrawlStatus, CancelCrawlJob to /v2/crawl
- Update monitorJobStatus polling path to /v2/crawl/{id}
- Replace map[string]any body with typed crawlRequest struct
- Extract shared buildCrawlRequest helper to eliminate duplication
- Replace v1 polling statuses (active, paused, pending, queued, waiting) with single v2 "scraping" status
- Add explicit "failed" case for v2 failure handling
- Change default to "unknown crawl status" for unexpected values
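The status handling described above could look roughly like the following switch. This is a sketch under assumptions: `classifyCrawlStatus` is a hypothetical helper, and the `"completed"` success status is assumed, since only `"scraping"`, `"failed"`, and the unknown-status default are named in the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// classifyCrawlStatus is a hypothetical helper. It reports whether polling
// should stop and, if the job did not succeed, why.
func classifyCrawlStatus(status string) (done bool, err error) {
	switch status {
	case "completed": // assumed success status, not named in the commit message
		return true, nil
	case "scraping": // the single v2 in-progress status
		return false, nil
	case "failed":
		return true, errors.New("crawl job failed")
	default:
		return true, fmt.Errorf("unknown crawl status: %s", status)
	}
}

func main() {
	for _, s := range []string{"scraping", "completed", "failed", "paused"} {
		done, err := classifyCrawlStatus(s)
		fmt.Println(s, done, err)
	}
}
```

Treating any unexpected value as a terminal error, rather than continuing to poll, keeps a v1-only status such as `paused` from causing an endless loop.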
- Change MapURL endpoint from /v1/map to /v2/map
- Replace map[string]any body with typed mapRequest struct
- Response uses MapLink objects per v2 API (from MIG-04)

…t structure
- Replace v1 method signatures with v2 (context.Context, renamed fields)
- Add project structure, Makefile targets, CI pipeline docs
- Add configuration, development setup, and testing sections
- Update usage examples for ScrapeURL, CrawlURL, MapURL with v2 params

- Define 8 sentinel errors for programmatic error handling
- Add APIError struct with StatusCode, Message, Action fields
- Implement Unwrap() for errors.Is/errors.As support
- Update handleError to return *APIError wrapping sentinels
- Use ErrNoAPIKey in NewFirecrawlApp constructor
…IKey
- Add security.go with SSRF-preventing pagination URL validation
- Add UUID-format job ID validation to prevent path injection
- Unexport APIKey field to apiKey, add APIKey() accessor method
- Add String() method with key redaction for safe logging
- Add HTTPS enforcement warning for non-localhost HTTP URLs
- Wire validations into CheckCrawlStatus, CancelCrawlJob, monitorJobStatus
- Add 14 unit tests for all security functions
BREAKING CHANGE: FirecrawlApp.APIKey is now unexported. Use app.APIKey() instead.
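The checks listed above might be sketched as follows. The function names `validateJobID` and `validatePaginationURL` appear in this PR's commit messages, but the signatures, the exact matching rules, the `redactKey` helper, and the `api.example.com` host are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// validateJobID rejects anything that is not UUID-shaped, so a job ID can
// never smuggle path segments such as "../" into a request URL.
func validateJobID(id string) error {
	if !uuidPattern.MatchString(id) {
		return fmt.Errorf("invalid job ID %q: expected UUID format", id)
	}
	return nil
}

// validatePaginationURL accepts a "next" URL only if its scheme and host
// match the configured API base (assumed signature).
func validatePaginationURL(apiBase, next string) error {
	base, err := url.Parse(apiBase)
	if err != nil {
		return fmt.Errorf("invalid API base URL: %w", err)
	}
	target, err := url.Parse(next)
	if err != nil {
		return fmt.Errorf("invalid pagination URL: %w", err)
	}
	if target.Scheme != base.Scheme || !strings.EqualFold(target.Host, base.Host) {
		return fmt.Errorf("pagination URL host %q does not match API host %q", target.Host, base.Host)
	}
	return nil
}

// redactKey is a hypothetical helper for log-safe output.
func redactKey(key string) string {
	if len(key) <= 4 {
		return "****"
	}
	return key[:4] + "****"
}

func main() {
	const base = "https://api.example.com" // placeholder host
	fmt.Println(validateJobID("123e4567-e89b-12d3-a456-426614174000"))
	fmt.Println(validateJobID("../../admin"))
	fmt.Println(validatePaginationURL(base, "https://api.example.com/v2/crawl/x?skip=10"))
	fmt.Println(validatePaginationURL(base, "https://evil.example/v2/crawl/x"))
	fmt.Println(redactKey("fc-1234567890"))
}
```

Validating the server-supplied `next` URL before following it is what prevents a crafted response from redirecting the client, with its API key header, to another host.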
- Create testhelpers_test.go with newMockServer, respondJSON, ptr helpers
- Add client_test.go with constructor and env fallback tests
- Add errors_test.go with handleError status code and APIError tests
- Add scrape_test.go with ScrapeURL success, params, and error tests

…tests)
- Add crawl_test.go with 11 tests for CrawlURL, AsyncCrawlURL, Check/Cancel
- Add map_test.go with 5 tests for MapURL success, params, and errors
- Add helpers_test.go with 4 tests for makeRequest retry and context
- Add types_test.go with 8 tests for StringOrStringSlice unmarshaling
- Add search_test.go with stub verification test
- Extend scrape_test.go with 6 tests (all params, errors, context cancel)
- Extend client_test.go with 4 tests (env fallback, timeout config)
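For readers unfamiliar with the `StringOrStringSlice` pattern tested above, here is one way such a type can be written. The type name comes from the commit message; the implementation, the `formats` field name, and the `decodeFormats` helper are assumptions and may differ from what types.go actually does.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StringOrStringSlice accepts either a JSON string or a JSON array of
// strings and always exposes the result as a slice.
type StringOrStringSlice []string

func (s *StringOrStringSlice) UnmarshalJSON(data []byte) error {
	// By convention, JSON null is a no-op for custom unmarshalers.
	if string(data) == "null" {
		return nil
	}
	var single string
	if err := json.Unmarshal(data, &single); err == nil {
		*s = StringOrStringSlice{single}
		return nil
	}
	var many []string
	if err := json.Unmarshal(data, &many); err != nil {
		return fmt.Errorf("expected string or array of strings: %w", err)
	}
	*s = many
	return nil
}

// decodeFormats is a hypothetical helper that decodes a payload with a
// "formats" field (assumed field name).
func decodeFormats(raw string) ([]string, error) {
	var payload struct {
		Formats StringOrStringSlice `json:"formats"`
	}
	if err := json.Unmarshal([]byte(raw), &payload); err != nil {
		return nil, err
	}
	return payload.Formats, nil
}

func main() {
	one, _ := decodeFormats(`{"formats":"markdown"}`)
	many, _ := decodeFormats(`{"formats":["markdown","html"]}`)
	fmt.Println(one, many)
}
```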
…sion
- Add SDKVersion constant and User-Agent header on all requests
- Add ClientOption functional options (WithTimeout, WithTransport, etc.)
- Add NewFirecrawlAppWithOptions constructor with configurable transport
- Clone DefaultTransport for connection pool tuning
- Add 13 unit tests for options and User-Agent behavior
- Replace stub with full POST /v2/search implementation
- Define searchRequest struct with all v2 search params
- Return typed *SearchResponse with web/images/news results
- Add 6 unit tests covering success, params, and error cases

- Add BatchScrapeURLs (sync with polling), AsyncBatchScrapeURLs, CheckBatchScrapeStatus
- Add monitorBatchScrapeStatus internal poller with context-aware polling
- Include validateJobID and validatePaginationURL security checks
- Add 21 unit tests covering all batch scrape operations

- Add AsyncExtract, Extract (sync with polling), CheckExtractStatus
- Add monitorExtractStatus with "processing" status polling
- Include validateJobID security check on status endpoints
- Add 17 unit tests covering all extract operations

- Add TestSearch_RateLimited for 429 sentinel error handling
- Add TestSearch_ContextCancelled for pre-cancelled context
- IMP-01/02/03 already shipped 44 tests exceeding the 34-test target

- Wire PaginationConfig into CheckCrawlStatus and CheckBatchScrapeStatus
- Add GetCrawlStatusPage and GetBatchScrapeStatusPage public methods
- Implement auto-pagination with MaxPages, MaxResults, MaxWaitTime limits
- Validate pagination URLs against API host (SSRF prevention)
- Add 12 unit tests for pagination behavior
…E tests
- Fix v1 field names (MaxDepth, AllowBackwardLinks, IgnoreSitemap) in E2E tests
- Update Map tests to use MapLink response objects
- Add 9 new E2E tests for Search, BatchScrape, Extract, and PaginationConfig
- Total: 32 E2E tests (was 23), all async-only for fast CI

- Document all 14 public methods across 6 endpoint groups
- Add Search, Batch Scrape, Extract usage examples
- Add Error Handling section (APIError, sentinel errors, errors.Is/errors.As)
- Add Client Options section (NewFirecrawlAppWithOptions, WithTimeout, etc.)
- Add PaginationConfig and Security sections
- Create CONTRIBUTING.md with setup, workflow, and style guide

…files
- Create CHANGELOG.md with proper Added/Changed/Fixed/Removed sections
- Delete informal changelog.md (replaced by CHANGELOG.md)
- Update README project structure reference
- Update .env.example with runtime and test variable documentation
Summary
Complete migration of the firecrawl-go SDK from Firecrawl API v1 to v2 with full Python SDK feature parity. This PR delivers 31 commits across 5 implementation phases — all existing endpoints migrated to v2, 3 new endpoint groups implemented (Search, Batch Scrape, Extract), typed error system, security hardening, 167 unit tests, 32 E2E tests, and comprehensive documentation.
Migration Phase — Complete (11/11 specs)
Phase 1: Foundation
Phase 2: Core Migration (v1 to v2)
Improvement Phase — Complete (13/16 specs, 3 P2 deferred)
Phase 3: Testing and Security
Phase 4: New v2 Endpoints
Phase 5: Documentation and E2E
Phase 6: Advanced Features (P2 — deferred)
Breaking Changes
Public API (14 methods)
Test Coverage
File Structure (27 Go files)