This discussion was converted from issue #338 on December 23, 2025 12:08.
When using the Spider API from n8n (HTTP Request node) to extract content from a website, the text returned by both /crawl and /scrape endpoints is coming back as one long, “compressed” string without visible line breaks or paragraph separation, which makes it very hard to read or post‑process.
Title
Text output from /crawl and /scrape is returned as a single unformatted blob (hard to read / no paragraphs)
Issue description
When I use the Spider API from n8n (HTTP Request node) to extract content from a website, both the /crawl and /scrape endpoints return the text as one long, “compressed” string with no visible line breaks or paragraph separation, which makes it hard to read or post‑process. This is especially noticeable on sites whose page content is in Hebrew.
What I’m doing
Using POST https://api.spider.cloud/crawl to discover URLs and then POST https://api.spider.cloud/scrape to extract the page content.
Requests are sent as JSON via n8n’s HTTP Request node (v3).
Example /crawl body (simplified):
What I expected
Since return_format is set to "markdown" and readability is true, I expected the response body to include visible paragraph breaks, headings, and line breaks that reflect the page structure (e.g. \n\n between paragraphs, # for headings, etc.).
In other words, a reasonably formatted Markdown or plain text representation of the page, suitable for direct reading or passing to an LLM without extra heavy preprocessing.
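For anyone trying to reproduce this outside n8n, here is a minimal Python sketch of a request of the kind described above. It is not the exact body from my workflow: the `url` and `limit` field names, the `metadata` value, and the Bearer authorization header are assumptions for illustration, while `return_format`, `readability`, `metadata` and the endpoint URLs are the ones mentioned in this post. The helper names `build_payload` and `build_request` are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.spider.cloud"  # base URL as used in this post


def build_payload(url: str, limit: int = 1) -> dict:
    """Build a JSON body; `url` and `limit` are assumed field names."""
    return {
        "url": url,
        "limit": limit,
        "return_format": "markdown",
        "readability": True,  # a real boolean, not the string "true"
        "metadata": True,     # assumed value; sent as a boolean as well
    }


def build_request(endpoint: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to /crawl or /scrape."""
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("scrape", build_payload("https://example.com"), "YOUR_API_KEY")
    # Print the exact bytes that would go over the wire, to compare with n8n.
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Printing the serialized body makes it easy to confirm that the booleans are sent as `true` rather than `"true"`, which is the same thing I checked in the n8n node.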
What actually happens
The content (or equivalent text field) is returned as a single long string, with the text “glued together” and minimal or no visible line breaks.
Even when inspecting the raw JSON (outside of n8n’s UI), the text is effectively one blob, so it’s not just a visualization issue.
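To make the "not just a visualization issue" check concrete, this is roughly how the decoded text can be inspected. It is only a sketch: the `content` field name follows the wording above and may differ in the actual response, the sample JSON is made up, and `newline_report` is a hypothetical helper.

```python
import json


def newline_report(text: str) -> dict:
    """Count line-break characters in an already decoded string."""
    return {
        "length": len(text),
        "newlines": text.count("\n"),           # real line feeds
        "paragraph_breaks": text.count("\n\n"),  # blank-line separators
        "escaped_newlines": text.count("\\n"),   # literal backslash + n left in the text
    }


if __name__ == "__main__":
    # Made-up sample response; "content" is an assumed field name.
    raw = '{"content": "# Title\\n\\nFirst paragraph.\\nSecond line."}'
    item = json.loads(raw)
    print(newline_report(item["content"]))
```

If `newlines` and `paragraph_breaks` are both zero for a long page, the text really is a single blob. A non-zero `escaped_newlines` count would instead point to double-escaped line breaks somewhere in the pipeline.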
Environment
Spider API via HTTPS
n8n version: 1.122.5 (Self‑Hosted)
HTTP Request node v3, Body Content Type = JSON, Using Fields Below
metadata and readability are sent as booleans, not strings
Any guidance or clarification on how to get better‑formatted text from the API would be greatly appreciated.