
Conversation

@GTFalcao
Collaborator

@GTFalcao GTFalcao commented Mar 6, 2025

Closes #15780

The Crawl and Scrape endpoints changed significantly in terms of their parameters, so the props were updated accordingly.

Summary by CodeRabbit

  • New Features
    • Enhanced URL crawling and page scraping actions with flexible configuration options and output format selections.
    • Introduced new properties for crawling and scraping, improving usability and clarity.
    • Added a new constant for output format options to enhance user experience.
  • Bug Fixes
    • Updated documentation links for crawl status and data retrieval.
  • Refactor
    • Streamlined data retrieval workflows and updated messaging with clearer descriptions.
    • Updated API version and simplified method paths for improved clarity.
  • Chore
    • Upgraded to version 1.0.0 for improved performance and overall reliability.

@vercel

vercel bot commented Mar 6, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

3 Skipped Deployments
Name                                  Status                Preview         Comments   Updated (UTC)
docs-v2                               ⬜️ Ignored (Inspect)   Visit Preview              Mar 7, 2025 6:50pm
pipedream-docs                        ⬜️ Ignored (Inspect)                              Mar 7, 2025 6:50pm
pipedream-docs-redirect-do-not-edit   ⬜️ Ignored (Inspect)                              Mar 7, 2025 6:50pm

@coderabbitai
Contributor

coderabbitai bot commented Mar 6, 2025

Walkthrough

This pull request revises several Firecrawl components by updating action configurations, removing legacy properties, and introducing new ones. Key changes include refactoring run methods to simplify parameter handling, consolidating JSON parsing logic in utility functions, and introducing new constants for output formats. Additionally, several properties have been removed or updated to reflect a new API version, and the package version has been incremented to denote major updates.

Changes

File(s) and change summary

components/firecrawl/actions/crawl-url/crawl-url.mjs
  • Updated description and version
  • Removed legacy properties (e.g., includes, excludes, generateImgAltText, etc.)
  • Added new properties (excludePaths, includePaths, ignoreQueryParameters, allowBackwardLinks, allowExternalLinks)
  • Updated maxDepth and limit properties with new descriptions
  • Refactored run method to destructure parameters and export response.id instead of response.jobId

components/firecrawl/actions/get-crawl-status/get-crawl-status.mjs
  • Renamed action from "Get Crawl Status" to "Get Crawl Data"
  • Updated documentation link in description
  • Incremented version from "0.0.1" to "0.0.2"
  • Modified success message formatting in the run method

components/firecrawl/actions/scrape-page/scrape-page.mjs
  • Updated version from "0.0.1" to "1.0.0"
  • Removed outdated extraction properties
  • Added new properties (formats, includeTags, excludeTags, headers, waitFor, mobile, timeout, additionalOptions)
  • Refactored run method for cleaner parameter passing

components/firecrawl/common/constants.mjs
  • Introduced new constant OUTPUT_FORMAT_OPTIONS listing the supported output format options, including markdown, html, rawHtml, links, screenshot, fullPageScreenshot, and json

components/firecrawl/common/utils.mjs
  • Converted parseObject from an arrow function to a standard function declaration
  • Centralized JSON error handling via a new helper function optionalParseAsJSON
  • Added new function parseObjectEntries to parse and return object entries with enhanced error handling (see the sketch after this list)

components/firecrawl/firecrawl.app.mjs
  • Removed several configuration properties from propDefinitions
  • Added additionalOptions property for extended request parameters
  • Updated crawlId property description
  • Modified _baseUrl() to return API endpoint version v1
  • Updated getCrawlStatus() method path to simplify the URL

components/firecrawl/package.json
  • Updated package version from "0.1.0" to "1.0.0", indicating major changes
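
A minimal sketch of what the new utils.mjs helpers could look like, based only on the descriptions above; the actual implementation in the PR may differ in naming and export style:

// components/firecrawl/common/utils.mjs (illustrative sketch, not the exact source)

// Returns the parsed value when the input is a valid JSON string,
// otherwise returns the input unchanged.
function optionalParseAsJSON(value) {
  try {
    return JSON.parse(value);
  } catch (e) {
    return value;
  }
}

// Parses the entries of an object, converting stringified JSON values
// into objects/arrays while leaving other values untouched.
// Note: this parses only one level deep.
export function parseObjectEntries(obj = {}) {
  return Object.fromEntries(
    Object.entries(obj).map(([key, value]) => [
      key,
      typeof value === "string"
        ? optionalParseAsJSON(value)
        : value,
    ]),
  );
}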

Sequence Diagram(s)

sequenceDiagram
    participant U as User
    participant A as Action (Crawl/Scrape)
    participant F as Firecrawl Service
    U->>A: Invoke run() with parameters
    A->>F: Call crawl/scrape method with destructured parameters
    F-->>A: Return response with identifier
    A->>U: Output summary message using response.id

Possibly related PRs

  • New Components - scrapfly #13778: The changes in the main PR are related to the modifications in the scrape-page action, as both involve significant updates to the properties and structure of the scraping functionality, including the run method and the parameters used.

Suggested labels

ai-assisted

Suggested reviewers

  • michelle0927

Poem

Hop along, through code I race,
Tweaking actions with elegant grace.
New fields bloom, old ones depart,
Refactored functions, a work of art.
Version bumped high, a joyous scene,
A rabbit's code hop, so sleek and keen!
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

components/housecall_pro/housecall_pro.app.mjs

Oops! Something went wrong! :(

ESLint: 8.57.1

Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'jsonc-eslint-parser' imported from /eslint.config.mjs
at packageResolve (node:internal/modules/esm/resolve:839:9)
at moduleResolve (node:internal/modules/esm/resolve:908:18)
at defaultResolve (node:internal/modules/esm/resolve:1038:11)
at ModuleLoader.defaultResolve (node:internal/modules/esm/loader:557:12)
at ModuleLoader.resolve (node:internal/modules/esm/loader:525:25)
at ModuleLoader.getModuleJob (node:internal/modules/esm/loader:246:38)
at ModuleJob._link (node:internal/modules/esm/module_job:126:49)

components/rabbitmq/rabbitmq.app.mjs

Oops! Something went wrong! :(

ESLint: 8.57.1

Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'jsonc-eslint-parser' imported from /eslint.config.mjs
at packageResolve (node:internal/modules/esm/resolve:839:9)
at moduleResolve (node:internal/modules/esm/resolve:908:18)
at defaultResolve (node:internal/modules/esm/resolve:1038:11)
at ModuleLoader.defaultResolve (node:internal/modules/esm/loader:557:12)
at ModuleLoader.resolve (node:internal/modules/esm/loader:525:25)
at ModuleLoader.getModuleJob (node:internal/modules/esm/loader:246:38)
at ModuleJob._link (node:internal/modules/esm/module_job:126:49)

components/servicetitan/servicetitan.app.mjs

Oops! Something went wrong! :(

ESLint: 8.57.1

Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'jsonc-eslint-parser' imported from /eslint.config.mjs
at packageResolve (node:internal/modules/esm/resolve:839:9)
at moduleResolve (node:internal/modules/esm/resolve:908:18)
at defaultResolve (node:internal/modules/esm/resolve:1038:11)
at ModuleLoader.defaultResolve (node:internal/modules/esm/loader:557:12)
at ModuleLoader.resolve (node:internal/modules/esm/loader:525:25)
at ModuleLoader.getModuleJob (node:internal/modules/esm/loader:246:38)
at ModuleJob._link (node:internal/modules/esm/module_job:126:49)


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c207f4f and 24fe641.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (3)
  • components/housecall_pro/housecall_pro.app.mjs (1 hunks)
  • components/rabbitmq/rabbitmq.app.mjs (1 hunks)
  • components/servicetitan/servicetitan.app.mjs (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • components/rabbitmq/rabbitmq.app.mjs
  • components/servicetitan/servicetitan.app.mjs
  • components/housecall_pro/housecall_pro.app.mjs
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Verify TypeScript components
  • GitHub Check: pnpm publish
  • GitHub Check: Publish TypeScript components

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (7)
components/firecrawl/common/utils.mjs (2)

17-17: Minor style preference.
A semicolon after a function definition can be optional in many JS style guides, so this is just a matter of stylistic consistency.


18-24: Centralized JSON parsing helper.
Encapsulating the try-catch in optionalParseAsJSON improves readability and ensures code reuse. However, consider whether you want to log or handle parse errors to help with debugging in some scenarios.

components/firecrawl/actions/scrape-page/scrape-page.mjs (1)

71-73: Extra scraping options.
additionalOptions is flexible; passing it through parseObjectEntries supports partial or nested JSON structures. Consider clarifying allowable keys to avoid confusion.

components/firecrawl/firecrawl.app.mjs (1)

12-16: Consider validating input types.
While allowing extra parameters via additionalOptions is flexible, ensure that any nested objects or arrays are properly validated before passing them to the API to avoid unexpected parsing errors.

components/firecrawl/actions/crawl-url/crawl-url.mjs (3)

60-63: Careful with external link expansions.
Following external sites can balloon the crawl scope. Consider adding usage hints or warnings.


66-71: Typo in the documentation link format.
The link bracket is unbalanced: [https://docs.firecrawl.dev/api-reference/endpoint/crawl-post). Add the missing bracket to ensure proper rendering.

-      description: "Additional parameters to send in the request. [https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) ...
+      description: "Additional parameters to send in the request. [https://docs.firecrawl.dev/api-reference/endpoint/crawl-post] ...

75-83: Validate merging of properties in run() method.
Destructuring this merges all user-provided fields into data, which can cause collisions in certain scenarios. Consider restricting or enumerating which fields are forwarded to ensure consistent input.
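
As a rough illustration of this suggestion, one option is to forward an explicit allowlist of fields inside run() instead of spreading this. The field names below are the props discussed in this PR; treat the snippet as a sketch, not the action's actual code:

// Inside the action's run() method: forward only the documented crawl
// parameters rather than every prop on `this`. Illustrative only.
const CRAWL_FIELDS = [
  "url",
  "excludePaths",
  "includePaths",
  "maxDepth",
  "ignoreSitemap",
  "ignoreQueryParameters",
  "limit",
  "allowBackwardLinks",
  "allowExternalLinks",
];

const data = Object.fromEntries(
  CRAWL_FIELDS
    .filter((field) => this[field] !== undefined)
    .map((field) => [field, this[field]]),
);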

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 41f2aee and 18d45f0.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (7)
  • components/firecrawl/actions/crawl-url/crawl-url.mjs (2 hunks)
  • components/firecrawl/actions/get-crawl-status/get-crawl-status.mjs (2 hunks)
  • components/firecrawl/actions/scrape-page/scrape-page.mjs (2 hunks)
  • components/firecrawl/common/constants.mjs (1 hunks)
  • components/firecrawl/common/utils.mjs (1 hunks)
  • components/firecrawl/firecrawl.app.mjs (2 hunks)
  • components/firecrawl/package.json (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • components/firecrawl/common/constants.mjs
  • components/firecrawl/package.json
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Verify TypeScript components
  • GitHub Check: pnpm publish
  • GitHub Check: Publish TypeScript components
🔇 Additional comments (35)
components/firecrawl/common/utils.mjs (4)

1-1: Refactored function declaration looks fine.
Changing from an arrow function to a standard function declaration is acceptable and can improve clarity, especially if you plan to reuse the function or hoist it.


7-7: Good use of a dedicated function for parsing JSON conditionally.
Calling optionalParseAsJSON(item) inside the array map keeps the code clean and ensures each string item is safely parsed.


13-13: Consistent JSON parsing for string inputs.
Using optionalParseAsJSON here matches the logic for arrays and ensures uniform handling of string data.


26-39: Provides flexible handling of key-value pairs.
parseObjectEntries will parse each value if it's a parseable string. This function is particularly useful when you receive partially stringified JSON objects. Keep in mind it only does one-level parsing; for deeply nested objects, you may need a recursive approach.
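
If deeply nested, partially stringified structures ever need to be supported, a recursive variant along these lines could work; deepParseEntries is a hypothetical name, and this is only a sketch of the approach hinted at above:

// Hypothetical recursive variant: parse stringified JSON at any depth.
// Assumes values are primitives, JSON strings, arrays, or plain objects.
function deepParseEntries(value) {
  if (typeof value === "string") {
    try {
      return deepParseEntries(JSON.parse(value));
    } catch (e) {
      return value; // not JSON, keep the original string
    }
  }
  if (Array.isArray(value)) {
    return value.map(deepParseEntries);
  }
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, val]) => [key, deepParseEntries(val)]),
    );
  }
  return value;
}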

components/firecrawl/actions/get-crawl-status/get-crawl-status.mjs (4)

5-5: Renamed action aligns with broader usage.
Changing the name to "Get Crawl Data" clarifies that the endpoint retrieves not just status but also data.


6-6: Documentation link updated.
Updating the URL to reflect the new endpoint path is appropriate. Ensure that the destination link remains valid in the future.


7-7: Version bump appears appropriate.
Incrementing to 0.0.2 signals minor changes. Make sure you maintain a consistent versioning strategy across related actions.


24-24: Minor improvement for summary formatting.
Including parentheses around the crawl ID is a stylistic choice that can improve readability.

components/firecrawl/actions/scrape-page/scrape-page.mjs (14)

1-2: New imports enable modular usage.
Importing OUTPUT_FORMAT_OPTIONS and parseObjectEntries helps encapsulate logic and constants outside this file, promoting code reuse.


8-9: Description updated to reflect new documentation.
Providing a direct link to the updated docs helps users quickly find relevant reference material.


10-10: Major version update.
Version 1.0.0 indicates a significant or breaking change, which is consistent with the added and removed properties in this action.


19-19: Clearer URL description.
“The URL to scrape” is concise and easily understandable.


21-25: Flexible output format handling.
Defining formats as an array of strings and referencing OUTPUT_FORMAT_OPTIONS gives users a straightforward way to customize the scraped data’s output.
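
For reference, based on the format values listed in the PR summary, the new constant is presumably something like the following sketch (the actual file may attach labels or extra metadata to each option):

// components/firecrawl/common/constants.mjs (sketch based on the values
// listed in this PR's summary)
export const OUTPUT_FORMAT_OPTIONS = [
  "markdown",
  "html",
  "rawHtml",
  "links",
  "screenshot",
  "fullPageScreenshot",
  "json",
];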


29-31: Selective content retrieval.
onlyMainContent is a handy property for focusing on key content. Ensure documentation clarifies what qualifies as “main” content.


34-39: Granular inclusion of tags.
Allowing users to specify includeTags can be extremely helpful. Thoroughly test edge cases where tags might overlap with excludeTags.


40-45: Granular exclusion of tags.
excludeTags is similarly valuable. Combining it with includeTags may raise potential conflicts; consider clarifications or priority rules in the docs.


46-51: Configurable request headers.
Providing a headers object promotes flexibility, such as setting a custom user-agent or faking cookies.


52-57: Wait time for dynamic content.
waitFor helps ensure that dynamic content loads properly before scraping. Document potential performance impacts of long wait times.


58-63: Mobile emulation.
mobile can help replicate a mobile user experience. This is especially useful for verifying responsive layouts.


65-69: Request timeout control.
timeout provides a safeguard against long-running requests, preventing indefinite hangs.


78-80: Clean destructuring of props.
This pattern keeps code manageable by separating out additionalOptions and re-bundling everything else in data.


84-85: Efficient merging of options.
Using parseObjectEntries(additionalOptions) ensures each option is conditionally parsed without cluttering your main scraping logic.
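
Putting the two previous points together, the scrape-page run() pattern described here looks roughly like the sketch below; the firecrawl.scrape method name and the summary text are assumptions drawn from this review, not verified against the source:

// Sketch of the destructure-and-merge pattern described above. Illustrative only.
async run({ $ }) {
  const {
    firecrawl, additionalOptions, ...data
  } = this;

  const response = await firecrawl.scrape({
    $,
    data: {
      ...data,
      // additionalOptions values are parsed as JSON where applicable
      ...(additionalOptions && parseObjectEntries(additionalOptions)),
    },
  });

  // Summary text is illustrative
  $.export("$summary", `Successfully scraped ${this.url}`);
  return response;
}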

components/firecrawl/firecrawl.app.mjs (4)

10-10: No issues with the updated description.
This description succinctly clarifies the purpose of the url property.


19-23: Minor improvement in property clarity.
Renaming and clarifying the crawlId description helps users better understand its usage. No further issues found.


54-54: Confirm removal of /status segment.
This new path, /crawl/${crawlId}, replaces the older /status. Make sure the upstream and downstream calls align with this updated endpoint.
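
For context, the updated app methods presumably resemble the sketch below; _makeRequest is assumed to be the app's shared request helper, and only the pieces mentioned in this review (the v1 base URL and the simplified /crawl/${crawlId} path) are taken from the PR:

// components/firecrawl/firecrawl.app.mjs (sketch of the pieces discussed here)
_baseUrl() {
  return "https://api.firecrawl.dev/v1";
},
getCrawlStatus({ crawlId, ...opts }) {
  return this._makeRequest({
    path: `/crawl/${crawlId}`,
    ...opts,
  });
},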


27-27:

❓ Verification inconclusive

Verify the new API endpoint version.
Ensure that all references to the old v0 endpoint have been removed or refactored to avoid inconsistent API requests.


🏁 Script executed:

#!/bin/bash
# Searches for any references to the old v0 endpoint in the repository
rg "https://api\.firecrawl\.dev/v0"

Length of output: 37


API Endpoint Update Verified – Manual Confirmation Recommended
The search for any lingering "https://api.firecrawl.dev/v0" reference returned no results, indicating that the codebase now consistently uses the new v1 endpoint as shown in components/firecrawl/firecrawl.app.mjs (line 27). However, because the automated search produced minimal output, please manually verify across the repository to ensure no overlooked references to the old API remain.

  • Confirm that the file components/firecrawl/firecrawl.app.mjs now exclusively returns "https://api.firecrawl.dev/v1".
  • Double-check any related modules or configuration files for hardcoded URLs referencing v0.
components/firecrawl/actions/crawl-url/crawl-url.mjs (9)

7-8: Description and version updates look good.
These changes clearly communicate the action’s purpose and version.


18-23: Exclusion logic is clearly documented.
No issues found. The excludePaths property meets typical user needs for regulating URL path crawls.


31-33: Clarity on crawl depth.
Renaming and clarifying the maxDepth property helps users control recursion levels effectively.


37-39: No concerns with ignoring sitemaps.
This feature can be useful for advanced or custom crawls.


42-45: Good optional toggle for query parameters.
Disabling repeated scraping of different query params can significantly reduce redundant requests.


48-51: Limit parameter is appropriately placed.
Capping the maximum pages at the client side is standard.


54-57: Back-link crawling logic looks fine.
Allowing backward links can be beneficial, but remain mindful of potential infinite loop scenarios if link structures are cyclical.


86-86: Summary message is helpful.
The dynamic output with response.id is clear and user-friendly.


1-1:

✅ Verification successful

Validate import path.
Confirm that parseObjectEntries is defined and exported in ../../common/utils.mjs.


🏁 Script executed:

#!/bin/bash
fd utils.mjs --exec cat {} | rg 'parseObjectEntries'

Length of output: 186


Validated Import Path: The function parseObjectEntries is confirmed to be defined and exported in ../../common/utils.mjs (as evidenced by the search results). No changes are needed regarding the import path.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
components/firecrawl/actions/crawl-url/crawl-url.mjs (1)

71-71: Fix malformed Markdown link in description.

There's a syntax issue in the description where the markdown link is not properly formatted.

-      description: "Additional parameters to send in the request. [https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable.",
+      description: "Additional parameters to send in the request. [See the documentation](https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable.",
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 18d45f0 and ed7a9a5.

📒 Files selected for processing (1)
  • components/firecrawl/actions/crawl-url/crawl-url.mjs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pnpm publish
  • GitHub Check: Verify TypeScript components
  • GitHub Check: Publish TypeScript components
🔇 Additional comments (6)
components/firecrawl/actions/crawl-url/crawl-url.mjs (6)

7-8: Version increment and documentation update look good.

The description now clearly points to the relevant documentation and the version has been properly incremented to 1.0.0 to reflect the significant changes in the Firecrawl API.


18-23: New excludePaths property is well-documented.

The property is clear and the description provides a helpful example to illustrate how the regex pattern matching works.


24-29: The includePaths label is correct.

The label "Include Paths" now properly matches the property's purpose, addressing the previous review comment.


42-47: Good addition of new crawling parameters.

The new properties (ignoreQueryParameters, allowBackwardLinks, allowExternalLinks) are well-documented with clear descriptions that explain their purpose.

Also applies to: 54-65


74-83: Good refactoring of parameter handling in run method.

The destructuring approach improves code readability and simplifies parameter passing. The use of the parseObjectEntries utility function for additionalOptions is a clean way to handle the JSON parsing.


86-86: Updated API field name in summary export.

The code now correctly references response.id instead of the previous response.jobId, matching the changes in the Firecrawl V1 API response structure.

michelle0927
michelle0927 previously approved these changes Mar 6, 2025
Collaborator

@michelle0927 michelle0927 left a comment


LGTM!

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed7a9a5 and 1516703.

📒 Files selected for processing (1)
  • components/firecrawl/actions/crawl-url/crawl-url.mjs (2 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
components/firecrawl/actions/crawl-url/crawl-url.mjs

[error] 71-71: expected , but instead found webhook

Remove webhook

(parse)


[error] 71-71: expected , but instead found ": {"

Remove ": {"

(parse)


[error] 71-71: expected : but instead found url

Remove url

(parse)


[error] 71-71: expected , but instead found ": "

Remove ": "

(parse)


[error] 71-71: expected : but instead found https

Remove https

(parse)


[error] 71-71: expected , but instead found :

Remove :

(parse)


[error] 72-72: Expected a property, a shorthand property, a getter, a setter, or a method but instead found '}'.

Expected a property, a shorthand property, a getter, a setter, or a method here.

(parse)

🪛 GitHub Actions: Pull Request Checks
components/firecrawl/actions/crawl-url/crawl-url.mjs

[error] 71-71: Parsing error: Unexpected token webhook

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pnpm publish
  • GitHub Check: Publish TypeScript components
  • GitHub Check: Verify TypeScript components
🔇 Additional comments (13)
components/firecrawl/actions/crawl-url/crawl-url.mjs (13)

1-1, 7-8, 18-21, 24-27, 31-33, 37-39, 42-45, 48-51, 54-57, 60-63, 75-78, 81-82, 86-86: No actionable feedback.

Comment on lines 66 to 72
    additionalOptions: {
      propDefinition: [
        firecrawl,
        "waitFor",
        "additionalOptions",
      ],
      optional: true,
      description: "Additional parameters to send in the request. [https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable. For example, to add the `webhook` param, use the value `{"webhook": {"url": "https://your-server-webhook-api.com","headers": {},"metadata": {},"events": ["completed"]}}`",
    },
Contributor


⚠️ Potential issue

Fix the parse error in the description.

Pipeline logs indicate a parsing error at line 71 due to bracket mismatch and unescaped backticks in the doc link and JSON example. This breaks the build. Please apply the following fix:

-      description: "Additional parameters to send in the request. [https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable. For example, to add the `webhook` param, use the value `{\"webhook\": {\"url\": \"https://your-server-webhook-api.com\",\"headers\": {},\"metadata\": {},\"events\": [\"completed\"]}}`",
+      description: "Additional parameters to send in the request. (https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable. For example, to add the 'webhook' param, use the value '{\"webhook\":{\"url\":\"https://your-server-webhook-api.com\",\"headers\":{},\"metadata\":{},\"events\":[\"completed\"]}}'.",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    additionalOptions: {
      propDefinition: [
        firecrawl,
        "waitFor",
        "additionalOptions",
      ],
      optional: true,
      description: "Additional parameters to send in the request. [https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable. For example, to add the `webhook` param, use the value `{"webhook": {"url": "https://your-server-webhook-api.com","headers": {},"metadata": {},"events": ["completed"]}}`",
    },
    additionalOptions: {
      propDefinition: [
        firecrawl,
        "additionalOptions",
      ],
      description: "Additional parameters to send in the request. (https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable. For example, to add the 'webhook' param, use the value '{\"webhook\":{\"url\":\"https://your-server-webhook-api.com\",\"headers\":{},\"metadata\":{},\"events\":[\"completed\"]}}'.",
    },

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
components/firecrawl/actions/crawl-url/crawl-url.mjs (1)

71-71: ⚠️ Potential issue

Fix the URL formatting and escape quotes in description.

The documentation URL has mismatched brackets and there might still be potential parsing issues with backticks and quotes in the JSON example.

-      description: "Additional parameters to send in the request. [https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable. For example, to add the `webhook` param, use the value `{\"webhook\": {\"url\": \"https://your-server-webhook-api.com\",\"headers\": {},\"metadata\": {},\"events\": [\"completed\"]}}`",
+      description: "Additional parameters to send in the request. See [documentation](https://docs.firecrawl.dev/api-reference/endpoint/crawl-post) for available parameters. Values will be parsed as JSON where applicable. For example, to add the webhook param, use the value '{\"webhook\":{\"url\":\"https://your-server-webhook-api.com\",\"headers\":{},\"metadata\":{},\"events\":[\"completed\"]}}'",
🧹 Nitpick comments (1)
components/firecrawl/actions/crawl-url/crawl-url.mjs (1)

66-72: Consider adding error handling for JSON parsing.

While the code uses parseObjectEntries to handle JSON parsing, there's no explicit error handling if the additionalOptions contains invalid JSON.

async run({ $ }) {
  const {
    firecrawl, additionalOptions, ...data
  } = this;
+  let parsedOptions = {};
+  try {
+    if (additionalOptions) {
+      parsedOptions = parseObjectEntries(additionalOptions);
+    }
+  } catch (error) {
+    throw new Error(`Error parsing additionalOptions: ${error.message}`);
+  }
  const response = await firecrawl.crawl({
    $,
    data: {
      ...data,
-      ...(additionalOptions && parseObjectEntries(additionalOptions)),
+      ...parsedOptions,
    },
  });
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1516703 and c207f4f.

📒 Files selected for processing (1)
  • components/firecrawl/actions/crawl-url/crawl-url.mjs (2 hunks)
🔇 Additional comments (5)
components/firecrawl/actions/crawl-url/crawl-url.mjs (5)

1-1: Good addition of utility import.

The imported parseObjectEntries utility function is appropriately used later to handle JSON parsing for the additionalOptions, which improves code maintainability.


7-8: Version bump correctly reflects significant changes.

The version has been appropriately incremented from 0.0.1 to 1.0.0, indicating a major update with breaking changes to the component's interface. The updated description with documentation link is helpful for users.


24-27: Label and property name for includePaths now match correctly.

The label now correctly states "Include Paths" which aligns with the property name and prevents user confusion.


75-83: Clean code refactoring of the run method.

The updated implementation properly destructures properties and uses the parseObjectEntries utility to handle JSON parsing, resulting in cleaner and more maintainable code.


86-86: Updated summary message to use correct ID property.

The summary message has been updated to use response.id instead of the previous response.jobId, which aligns with the updated API response structure.

@GTFalcao
Collaborator Author

GTFalcao commented Mar 7, 2025

/approve

@GTFalcao GTFalcao merged commit 33f6217 into master Mar 7, 2025
11 checks passed
@GTFalcao GTFalcao deleted the 15780-firecrawl-v1-migration branch March 7, 2025 21:17
@coderabbitai coderabbitai bot mentioned this pull request Mar 31, 2025

Development

Successfully merging this pull request may close these issues.

[BUG] firecrawl action to crawl url throws error for invalid url

4 participants