33 commits
dd1192d
fix(sdk): resolve 6 foundation bugs in preparation for v2 API migration
ArmandoHerra Mar 15, 2026
f6a01f2
refactor(sdk): split monolithic firecrawl.go into modular file structure
ArmandoHerra Mar 15, 2026
16cec0a
ci(sdk): add Makefile, golangci-lint, GitHub Actions CI, and Dependabot
ArmandoHerra Mar 15, 2026
10f9db9
fix(ci): resolve lint and test failures in GitHub Actions pipeline
ArmandoHerra Mar 15, 2026
c4c7af0
fix(ci): skip coverage threshold when no unit tests exist
ArmandoHerra Mar 15, 2026
41ec719
ci(sdk): bump actions to Node.js 24 and expand Go test matrix to 1.22…
ArmandoHerra Mar 15, 2026
6abad27
fix(ci): migrate golangci-lint config to v2 format and drop Go 1.22 f…
ArmandoHerra Mar 15, 2026
2b4bdb6
fix(ci): move gofumpt to formatters section for golangci-lint v2
ArmandoHerra Mar 15, 2026
0e460ee
fix(ci): remove gosimple linter (merged into staticcheck in v2)
ArmandoHerra Mar 15, 2026
681bfdc
fix(sdk): resolve errcheck and staticcheck lint issues in helpers.go
ArmandoHerra Mar 15, 2026
596670a
feat(sdk)!: define all v2 API types and update endpoints to v2 field …
ArmandoHerra Mar 15, 2026
75036b9
feat(sdk)!: add context.Context to all public methods and internal he…
ArmandoHerra Mar 15, 2026
bc240b6
refactor(sdk): migrate ScrapeURL to /v2/scrape with struct marshaling
ArmandoHerra Mar 15, 2026
0d0af9b
refactor(sdk): migrate crawl endpoints to /v2/crawl with struct marsh…
ArmandoHerra Mar 15, 2026
4837940
refactor(sdk): update monitorJobStatus to v2 crawl status values
ArmandoHerra Mar 15, 2026
3832e64
refactor(sdk): migrate MapURL to /v2/map with struct marshaling
ArmandoHerra Mar 15, 2026
0c6b094
docs(sdk): add MIG-10 and MIG-11 verification checkpoint entries to c…
ArmandoHerra Mar 15, 2026
243f601
docs(sdk): rewrite README for v2 API with updated examples and projec…
ArmandoHerra Mar 15, 2026
f030391
feat(errors): add typed error system with APIError and sentinel errors
ArmandoHerra Mar 15, 2026
838e76a
feat(security)!: add URL validation, ID sanitization, and unexport AP…
ArmandoHerra Mar 15, 2026
61fabd7
test(sdk): add unit test foundation with mock server and 17 smoke tests
ArmandoHerra Mar 15, 2026
d676b48
test(sdk): add comprehensive unit tests for all existing methods (97 …
ArmandoHerra Mar 15, 2026
48eb6fa
feat(client): add HTTP client options, User-Agent header, and SDK ver…
ArmandoHerra Mar 15, 2026
57e457d
feat(search): implement Search endpoint for v2 API
ArmandoHerra Mar 15, 2026
ecf5e37
feat(batch): implement Batch Scrape endpoints for v2 API
ArmandoHerra Mar 15, 2026
8072c94
feat(extract): implement Extract endpoints for v2 API
ArmandoHerra Mar 15, 2026
32d8441
test(sdk): add remaining coverage tests for new endpoints (155 total)
ArmandoHerra Mar 15, 2026
3107777
feat(pagination): add PaginationConfig support and manual page methods
ArmandoHerra Mar 15, 2026
3c9b853
test(e2e): modernize integration tests for v2 and add new endpoint E2…
ArmandoHerra Mar 15, 2026
0c36ebf
docs(sdk): comprehensive README rewrite and CONTRIBUTING.md for v2 SDK
ArmandoHerra Mar 15, 2026
9b89abf
docs(sdk): add CHANGELOG.md in Keep a Changelog format and update DX …
ArmandoHerra Mar 15, 2026
f3e048f
docs(sdk): fix fork link and remove duplicate license note in README
ArmandoHerra Mar 15, 2026
30a4d58
docs(sdk): remove external specs reference from CONTRIBUTING.md
ArmandoHerra Mar 15, 2026
21 changes: 21 additions & 0 deletions .editorconfig
@@ -0,0 +1,21 @@
root = true

[*]
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
charset = utf-8

[*.go]
indent_style = tab
indent_size = 4

[*.{yml,yaml}]
indent_style = space
indent_size = 2

[*.md]
trim_trailing_whitespace = false

[Makefile]
indent_style = tab
9 changes: 7 additions & 2 deletions .env.example
@@ -1,2 +1,7 @@
API_URL=http://localhost:3002
TEST_API_KEY=fc-YOUR-API-KEY
# Firecrawl SDK Runtime (used by your application)
# FIRECRAWL_API_KEY=fc-your-api-key
# FIRECRAWL_API_URL=https://api.firecrawl.dev

# Integration Tests (used by `make test-integration`)
API_URL=https://api.firecrawl.dev
TEST_API_KEY=fc-your-test-api-key
19 changes: 19 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,19 @@
version: 2
updates:
  - package-ecosystem: gomod
    directory: /
    schedule:
      interval: weekly
    open-pull-requests-limit: 5
    labels:
      - dependencies
      - go

  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
    open-pull-requests-limit: 5
    labels:
      - dependencies
      - ci
63 changes: 63 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,63 @@
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

permissions:
  contents: read

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-go@v6
        with:
          go-version: '1.25'
      - uses: golangci/golangci-lint-action@v7
        with:
          version: latest

  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        go-version: ['1.23', '1.24', '1.25']
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-go@v6
        with:
          go-version: ${{ matrix.go-version }}
      - run: go test -race -v -count=1 -coverprofile=coverage.out ./...
      - name: Check coverage
        run: |
          COVERAGE=$(go tool cover -func=coverage.out | grep total | awk '{print $3}' | sed 's/%//')
          echo "Coverage: ${COVERAGE}%"
          if [ "$COVERAGE" = "0.0" ] || [ -z "$COVERAGE" ]; then
            echo "No unit tests ran — skipping coverage check"
            exit 0
          fi
          if (( $(echo "$COVERAGE < 80" | bc -l) )); then
            echo "Coverage below 80% threshold"
            exit 1
          fi

  integration:
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    needs: [lint, test]
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-go@v6
        with:
          go-version: '1.25'
      - run: go test -race -v -count=1 -tags=integration ./...
        env:
          API_URL: https://api.firecrawl.dev
          TEST_API_KEY: ${{ secrets.FIRECRAWL_API_KEY }}
6 changes: 5 additions & 1 deletion .gitignore
@@ -1,2 +1,6 @@
.env
vendor
coverage.out
coverage.html
vendor/
*.test
*.prof
31 changes: 31 additions & 0 deletions .golangci.yml
@@ -0,0 +1,31 @@
version: "2"

run:
  timeout: 5m

formatters:
  enable:
    - gofumpt

linters:
  enable:
    - errcheck
    - govet
    - staticcheck
    - unused
    - ineffassign
    - misspell
    - bodyclose
    - noctx
    - gosec
    - prealloc

  settings:
    errcheck:
      check-type-assertions: true
    govet:
      disable:
        - fieldalignment
    gosec:
      excludes:
        - G402 # TLS InsecureSkipVerify (user controls this)
91 changes: 91 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,91 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added

- Search endpoint (`POST /v2/search`) with typed `SearchResponse` (IMP-01)
- Batch Scrape endpoints: `BatchScrapeURLs`, `AsyncBatchScrapeURLs`, `CheckBatchScrapeStatus` (IMP-02)
- Extract endpoints: `Extract`, `AsyncExtract`, `CheckExtractStatus` (IMP-03)
- Typed error system: `APIError` struct with 8 sentinel errors (`ErrUnauthorized`, `ErrRateLimited`, `ErrNoAPIKey`, `ErrPaymentRequired`, `ErrNotFound`, `ErrTimeout`, `ErrConflict`, `ErrServerError`) (IMP-04)
- Security hardening: pagination URL validation against API host, UUID job ID validation, HTTPS warning on non-localhost HTTP (IMP-05)
- Unit test foundation with `httptest.NewServer` mock server helpers (IMP-06)
- 160+ unit tests covering all methods, error paths, and security behaviors (IMP-07, IMP-08)
- HTTP client options: `NewFirecrawlAppWithOptions`, `WithTimeout`, `WithTransport`, `WithUserAgent`, `WithMaxIdleConns`, `WithMaxIdleConnsPerHost` (IMP-15)
- `PaginationConfig` support for `CheckCrawlStatus` and `CheckBatchScrapeStatus` (IMP-10)
- `GetCrawlStatusPage` and `GetBatchScrapeStatusPage` public methods for manual pagination (IMP-10)
- `SDKVersion` constant (`"2.0.0"`) and `User-Agent` header on all requests (IMP-15)
- `CONTRIBUTING.md` with development workflow, code style, and endpoint addition guide (IMP-11)
- Integration tests for Search, Batch Scrape, Extract, and PaginationConfig (IMP-09)
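
The typed error entry above can be pictured with a small sketch. It assumes, purely for illustration, that `APIError` carries a status code and unwraps to a matching sentinel; the field names, the `Unwrap` mapping, and the `classify` helper are hypothetical and are not taken from the SDK source.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Sentinel errors. The names match two of those listed in the changelog,
// but these definitions are illustrative only.
var (
	ErrUnauthorized = errors.New("unauthorized")
	ErrRateLimited  = errors.New("rate limited")
)

// APIError is a hypothetical shape; the SDK's real fields may differ.
type APIError struct {
	StatusCode int
	Message    string
}

func (e *APIError) Error() string {
	return fmt.Sprintf("api error %d: %s", e.StatusCode, e.Message)
}

// Unwrap maps the HTTP status to a sentinel so errors.Is can match it.
func (e *APIError) Unwrap() error {
	switch e.StatusCode {
	case http.StatusUnauthorized:
		return ErrUnauthorized
	case http.StatusTooManyRequests:
		return ErrRateLimited
	}
	return nil
}

// classify is a hypothetical caller-side helper that branches on sentinels.
func classify(err error) string {
	switch {
	case errors.Is(err, ErrRateLimited):
		return "rate-limited"
	case errors.Is(err, ErrUnauthorized):
		return "unauthorized"
	default:
		return "other"
	}
}

func main() {
	// Wrapping with %w keeps the chain intact for errors.Is and errors.As.
	err := fmt.Errorf("scrape: %w", &APIError{StatusCode: 429, Message: "slow down"})
	fmt.Println(classify(err))

	var apiErr *APIError
	if errors.As(err, &apiErr) {
		fmt.Println(apiErr.StatusCode)
	}
}
```

With this shape, callers can branch on `errors.Is` for coarse handling and fall back to `errors.As` when they need the status code or message.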

### Changed

- **BREAKING:** All public methods now require `context.Context` as first parameter (MIG-05)
- **BREAKING:** `CrawlParams.MaxDepth` renamed to `MaxDiscoveryDepth` (MIG-04)
- **BREAKING:** `CrawlParams.AllowBackwardLinks` renamed to `CrawlEntireDomain` (MIG-04)
- **BREAKING:** `CrawlParams.IgnoreSitemap` replaced by `Sitemap` string enum (`"include"`, `"skip"`, `"only"`) (MIG-04)
- **BREAKING:** `CrawlParams.Webhook` changed from `*string` to `*WebhookConfig` (MIG-04)
- **BREAKING:** `MapResponse.Links` changed from `[]string` to `[]MapLink` (MIG-04)
- **BREAKING:** `ScrapeParams.ParsePDF` removed, replaced by `Parsers []ParserConfig` (MIG-04)
- **BREAKING:** `FirecrawlApp.APIKey` field unexported — use `APIKey()` accessor method (IMP-05)
- **BREAKING:** `Search` method signature changed from `(ctx, query, *any) (any, error)` to `(ctx, query, *SearchParams) (*SearchResponse, error)` (IMP-01)
- All endpoints migrated from `/v1/*` to `/v2/*` (MIG-06 through MIG-09)
- `makeRequest` accepts `[]byte` body instead of `map[string]any`; callers marshal before passing (MIG-06)
- `monitorJobStatus` uses v2 status values: `"scraping"` (poll), `"completed"`, `"failed"` (MIG-08)
- Minimum Go version bumped from 1.22 to 1.23 (MIG-04)
- Split monolithic `firecrawl.go` into 16 modular files (MIG-02)
- `http.DefaultTransport` is cloned instead of referenced directly (IMP-15)
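
As a rough illustration of the last entry, the sketch below clones `http.DefaultTransport` before tuning it, so the process-wide default is left alone. `newClient` and `defaultUntouched` are hypothetical names, not the SDK's constructor or options.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newClient is a hypothetical constructor, not the SDK's.
func newClient(timeout time.Duration, maxIdle int) *http.Client {
	base, ok := http.DefaultTransport.(*http.Transport)
	if !ok {
		// Another package replaced the default; start from an empty transport.
		base = &http.Transport{}
	}
	// Clone so tuning this client never mutates the shared default.
	tr := base.Clone()
	tr.MaxIdleConns = maxIdle
	return &http.Client{Timeout: timeout, Transport: tr}
}

// defaultUntouched reports whether http.DefaultTransport kept its own
// MaxIdleConns after a client with a different value was built.
func defaultUntouched() bool {
	def, ok := http.DefaultTransport.(*http.Transport)
	if !ok {
		return true // nothing to compare against
	}
	before := def.MaxIdleConns
	_ = newClient(5*time.Second, before+7)
	return def.MaxIdleConns == before
}

func main() {
	fmt.Println(defaultUntouched())
}
```

Referencing the default transport directly and then setting fields on it would change behavior for every other HTTP client in the process, which is the problem cloning avoids.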

### Fixed

- Retry counter in `monitorJobStatus` was initialized at retry threshold — now starts at 0 so retries actually occur (MIG-01)
- `defer resp.Body.Close()` inside retry loop leaked HTTP connections; intermediate bodies now closed explicitly (MIG-01)
- Request body (`bytes.NewBuffer`) was consumed on the first attempt, so all retries sent an empty body; the body is now recreated per attempt (MIG-01)
- `ScrapeURL` checked response `Success` before checking unmarshal error — order corrected (MIG-01)
- `ScrapeOptions` gate only checked `Formats` field — gate now checks any non-zero field (MIG-01)
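
The three retry fixes above (counter starting at 0, explicit close of intermediate bodies, fresh body per attempt) can be sketched together. This is a minimal sketch under stated assumptions: `postWithRetry` and `demo` are hypothetical helpers, not the SDK's `makeRequest`, and backoff between attempts is left out for brevity.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// postWithRetry is a hypothetical helper, not the SDK's makeRequest.
func postWithRetry(ctx context.Context, c *http.Client, url string, body []byte, attempts int) (*http.Response, error) {
	var lastErr error
	for i := 0; i < attempts; i++ { // starts at 0, so retries actually run
		// Fresh reader per attempt: a reader drained by the first attempt
		// would make every retry send an empty body.
		req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
		if err != nil {
			return nil, err
		}
		resp, err := c.Do(req)
		if err != nil {
			lastErr = err
			continue
		}
		if resp.StatusCode < 500 {
			return resp, nil // the caller closes this body
		}
		// Close intermediate bodies right away; a defer inside the loop
		// would hold every connection until the function returns.
		resp.Body.Close()
		lastErr = fmt.Errorf("server returned %d", resp.StatusCode)
	}
	return nil, lastErr
}

// demo runs the helper against a mock server that fails twice, then succeeds.
// It returns the number of requests seen and the body of the last one.
func demo() (int, string, error) {
	var mu sync.Mutex
	calls, lastBody := 0, ""
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		mu.Lock()
		calls++
		n := calls
		lastBody = string(b)
		mu.Unlock()
		if n < 3 {
			w.WriteHeader(http.StatusBadGateway)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	resp, err := postWithRetry(context.Background(), srv.Client(), srv.URL, []byte(`{"url":"https://example.com"}`), 5)
	if err == nil {
		resp.Body.Close()
	}
	mu.Lock()
	defer mu.Unlock()
	return calls, lastBody, err
}

func main() {
	calls, body, err := demo()
	fmt.Println(calls, body, err)
}
```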

### Removed

- Commented-out v0 extractor code (MIG-01)
- Legacy `firecrawl_test.go_V0` test file (MIG-03)
- v1 API paths (`/v1/*`) — all replaced by `/v2/*`

## [2.0.0] — 2026-03-15

### Added

- `context.Context` on all public methods and internal helpers (MIG-05)
- 31+ v2 type definitions: `LocationConfig`, `WebhookConfig`, `ActionConfig`, `ParserConfig`, `MapLink`, `PaginationConfig`, `SearchParams`, `SearchResponse`, `BatchScrapeParams`, `BatchScrapeResponse`, `ExtractParams`, `ExtractResponse`, and more (MIG-04)
- CI/CD pipeline: `Makefile` with 9 targets, `golangci-lint` v2 config, GitHub Actions with lint + test matrix (Go 1.23/1.24/1.25) (MIG-03)
- Modular file structure: 16 Go source files split by concern (MIG-02)
- `.editorconfig` and `dependabot.yml` (MIG-03)

### Changed

- All endpoints migrated to `/v2/*` paths (MIG-06 through MIG-09)
- Request bodies use typed struct marshaling instead of `map[string]any` (MIG-11)
- `monitorJobStatus` updated for v2 status values: `"scraping"`, `"completed"`, `"failed"` (MIG-08)
- Crawl parameters updated: `MaxDepth` → `MaxDiscoveryDepth`, `IgnoreSitemap` → `Sitemap`, `AllowBackwardLinks` → `CrawlEntireDomain` (MIG-07)
- `MapResponse.Links` changed from `[]string` to `[]MapLink` (MIG-09)
- `.env.example` updated to use live API URL (MIG-03)
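
The `monitorJobStatus` entry above names three v2 status values. A polling loop over those values might look like the following sketch; `pollUntilDone`, `statusFunc`, and `runDemo` are hypothetical names, and the real function's signature and retry policy are not shown in this changelog.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// statusFunc is a hypothetical stand-in for one status request.
type statusFunc func(ctx context.Context) (string, error)

// pollUntilDone keeps polling while the job reports "scraping" and stops on
// "completed", "failed", an unknown status, or context cancellation.
func pollUntilDone(ctx context.Context, check statusFunc, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		status, err := check(ctx)
		if err != nil {
			return err
		}
		switch status {
		case "completed":
			return nil
		case "failed":
			return errors.New("job failed")
		case "scraping":
			// Still running: fall through to the wait below.
		default:
			return fmt.Errorf("unexpected status %q", status)
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// runDemo polls a fake job that reports "scraping" for the given number of
// rounds and then the final status. It returns how many checks were made.
func runDemo(scrapingRounds int, final string) (int, error) {
	calls := 0
	check := func(context.Context) (string, error) {
		calls++
		if calls <= scrapingRounds {
			return "scraping", nil
		}
		return final, nil
	}
	err := pollUntilDone(context.Background(), check, time.Millisecond)
	return calls, err
}

func main() {
	calls, err := runDemo(2, "completed")
	fmt.Println(calls, err)
}
```

Passing the context through to each check lets a caller cancel a long-running job wait without any extra plumbing.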

### Fixed

- Retry counter starting at threshold instead of 0 (MIG-01)
- `defer resp.Body.Close()` connection leak in retry loop (MIG-01)
- Request body reuse across retries sending empty body (MIG-01)
- Error handling order in `ScrapeURL` — unmarshal error checked before `Success` (MIG-01)
- `ScrapeOptions` gate that only checked `Formats` and ignored other non-zero fields (MIG-01)

### Removed

- v1 field names: `MaxDepth`, `AllowBackwardLinks`, `IgnoreSitemap` from `CrawlParams` (MIG-07)
- Dead v0 extractor code and legacy test file (MIG-01, MIG-03)

[Unreleased]: https://github.com/firecrawl/firecrawl-go/compare/v2.0.0...HEAD
[2.0.0]: https://github.com/firecrawl/firecrawl-go/releases/tag/v2.0.0
71 changes: 71 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,71 @@
# Contributing to firecrawl-go

Thank you for your interest in contributing!

## Quick Start

```bash
git clone git@github.com:firecrawl/firecrawl-go.git
cd firecrawl-go
go mod download
make check # lint + vet + test
```

## Development Workflow

1. Fork the repository and create a feature branch from `main`.
2. Make your changes following the code style below.
3. Run `make check` before committing (lint + vet + unit tests).
4. Push and open a pull request with a clear description of what changed and why.

A pre-commit hook runs `make check` automatically on every commit.

## Code Style

- Format with `gofumpt`: `make fmt`
- Lint with `golangci-lint` v2: `make lint`
- Vet with `go vet`: `make vet`
- All public methods require `context.Context` as the first parameter.
- Optional request fields use pointer types with `json:",omitempty"`.
- Use typed request structs (internal, unexported) with `json.Marshal` for POST endpoints.
- Follow conventional commit format: `feat(scope): description`, `fix(scope): description`, `docs: description`.
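
To make the pointer and `omitempty` rules above concrete, here is a minimal sketch. The `scrapeRequest` type and its field names are hypothetical and only illustrate the convention; they are not copied from `types.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// scrapeRequest is a hypothetical internal request type; the field names
// are illustrative and not taken from the SDK.
type scrapeRequest struct {
	URL             string `json:"url"`
	OnlyMainContent *bool  `json:"onlyMainContent,omitempty"`
	Timeout         *int   `json:"timeout,omitempty"`
}

// ptr returns a pointer to v, which is handy for optional fields.
func ptr[T any](v T) *T { return &v }

// encode marshals a request in which one optional field is explicitly false
// and another is left unset.
func encode() (string, error) {
	req := scrapeRequest{
		URL:             "https://example.com",
		OnlyMainContent: ptr(false), // sent, even though false is the zero value
		// Timeout stays nil, so omitempty drops it from the JSON.
	}
	b, err := json.Marshal(req)
	return string(b), err
}

func main() {
	out, err := encode()
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out) // {"url":"https://example.com","onlyMainContent":false}
}
```

A plain `bool` with `omitempty` could not distinguish "not set" from "set to false", which is the usual reason for preferring pointer types on optional request fields.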

## Testing

| Command | What It Runs | API Key? |
|---------|-------------|----------|
| `make test` | 160 unit tests (httptest mocks) | No |
| `make test-integration` | 32 E2E tests (live Firecrawl API) | Yes |
| `make coverage` | HTML coverage report | No |

Unit tests run against `httptest.NewServer` mock servers — no `.env` file or API key needed. If unit tests fail, the issue is in the code, not missing credentials.
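
The mock-server approach can be sketched as follows. Everything here is an assumption made for illustration: `getStatus`, `statusResponse`, the `/status` path, and the `test-key` value are hypothetical, and a real test would live in a `_test.go` file and report through `testing.T` instead of returning values.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusResponse and getStatus are hypothetical stand-ins for an SDK call.
type statusResponse struct {
	Status string `json:"status"`
}

func getStatus(ctx context.Context, c *http.Client, baseURL, apiKey string) (*statusResponse, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+"/status", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var out statusResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}

// runAgainstMock starts a local mock server, calls getStatus against it,
// and returns the decoded status. No network access or real key is needed.
func runAgainstMock(apiKey string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// The mock asserts on the request, as a unit test would.
		if r.Header.Get("Authorization") != "Bearer test-key" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"status":"completed"}`)
	}))
	defer srv.Close()

	res, err := getStatus(context.Background(), srv.Client(), srv.URL, apiKey)
	if err != nil {
		return "", err
	}
	return res.Status, nil
}

func main() {
	status, err := runAgainstMock("test-key")
	fmt.Println(status, err)
}
```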

For integration tests:

```bash
cp .env.example .env
# Edit .env:
# API_URL=https://api.firecrawl.dev
# TEST_API_KEY=fc-your-api-key
make test-integration
```

Integration tests consume API credits.

## Prerequisites

| Tool | Version | Installation |
|------|---------|-------------|
| Go | 1.23+ | [go.dev/dl](https://go.dev/dl/) |
| golangci-lint | v2.x | `go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@latest` |
| gofumpt | latest | `go install mvdan.cc/gofumpt@latest` |

## Adding a New Endpoint

1. Define request/response types in `types.go` with full godoc comments.
2. Create a new file `<endpoint>.go` with the public method(s).
3. Add a corresponding `<endpoint>_test.go` with unit tests using `httptest.NewServer`.
4. If the endpoint is async with polling, add E2E tests in `firecrawl_test.go` (build tag: `integration`).
5. Run `make check` to verify everything passes.

Every exported symbol must have a godoc comment. Public methods must document all parameters, return values, and any error conditions.
34 changes: 34 additions & 0 deletions Makefile
@@ -0,0 +1,34 @@
.DEFAULT_GOAL := help
.PHONY: help build test test-integration lint fmt vet coverage clean check

help: ## Show this help
	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | \
		awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-20s\033[0m %s\n", $$1, $$2}'

build: ## Compile the library
	go build ./...

test: ## Run unit tests (no API key needed)
	go test -race -v -count=1 ./...

test-integration: ## Run integration tests (requires .env with API key)
	go test -race -v -count=1 -tags=integration ./...

lint: ## Run golangci-lint
	golangci-lint run

fmt: ## Format code with gofumpt
	gofumpt -w .

vet: ## Run go vet
	go vet ./...

coverage: ## Generate HTML coverage report
	go test -coverprofile=coverage.out -covermode=atomic ./...
	go tool cover -html=coverage.out -o coverage.html
	@echo "Coverage report: coverage.html"

clean: ## Remove generated files
	rm -f coverage.out coverage.html

check: lint vet test ## Run all checks (lint + vet + test)