Text output from /crawl and /scrape is returned as a single unformatted blob (hard to read / no paragraphs) in Hebrew website text #340

Gilad123 · 2025-12-08T20:16:11Z

Issue description
When I use the Spider API from n8n (HTTP Request node) to extract content from a website, both the /crawl and /scrape endpoints return the text as one long, “compressed” string with no visible line breaks or paragraph separation, which makes it very hard to read or post‑process. The problem is especially noticeable on sites whose page content is in Hebrew.

What I’m doing
Using POST https://api.spider.cloud/crawl to discover URLs and then POST https://api.spider.cloud/scrape to extract the page content.

Requests are sent as JSON via n8n’s HTTP Request node (v3).

Example /crawl body (simplified):

Since return_format is set to "markdown" and readability is true, I expected the response body to include visible paragraph breaks, headings, and line breaks that reflect the page structure (e.g. \n\n between paragraphs, # for headings, etc.).

In other words, a reasonably formatted Markdown or plain text representation of the page, suitable for direct reading or passing to an LLM without extra heavy preprocessing.

What actually happens
The content (or equivalent text field) is returned as a single long string, with the text “glued together” and minimal or no visible line breaks.

Even when inspecting the raw JSON (outside of n8n’s UI), the text is effectively one blob, so it’s not just a visualization issue.
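
For anyone who wants to reproduce the check on the raw JSON, here is a minimal sketch of how the line breaks could be counted outside of n8n. It assumes the response is a JSON array (or a single object) whose items carry `url` and `content` keys; the `newline_stats` helper and the `field` parameter are hypothetical names, and the field should be adjusted to whatever text field the API actually returns.

```python
import json


def newline_stats(raw_json: str, field: str = "content") -> list[dict]:
    """Count line breaks in the text field of each page in a raw JSON response.

    Assumption: the response is a list of page objects (or one object)
    with "url" and `field` keys. This is not confirmed API behavior.
    """
    pages = json.loads(raw_json)
    if isinstance(pages, dict):
        pages = [pages]

    stats = []
    for page in pages:
        text = page.get(field) or ""
        stats.append({
            "url": page.get("url"),
            "chars": len(text),
            # Real newline characters after JSON decoding.
            "newlines": text.count("\n"),
            # Blank-line separators, i.e. Markdown paragraph breaks.
            "paragraph_breaks": text.count("\n\n"),
            # Literal backslash + "n" left in the text, which would
            # suggest the content was escaped twice somewhere upstream.
            "escaped_newlines": text.count("\\n"),
        })
    return stats
```

If `newlines` is zero for a long page, the text really is a single blob in the payload. If `escaped_newlines` is high instead, the breaks are present but double-escaped, which would point at the transport or the client rather than the extraction itself.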

Environment
Spider API via HTTPS

n8n version: 1.122.5 (Self‑Hosted)

HTTP Request node v3, Body Content Type = JSON, Using Fields Below

metadata and readability are sent as booleans, not strings
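
To illustrate the boolean point, here is a small sketch of how such a body serializes. It is not the exact request body used above: it only includes the options named in this post (`return_format`, `readability`, `metadata`), the `url` key is an assumption, and `build_body` is a hypothetical helper.

```python
import json


def build_body(url: str) -> str:
    """Serialize a request body with real JSON booleans, not strings."""
    body = {
        "url": url,                   # assumed field name for the target page
        "return_format": "markdown",  # option named in this post
        "readability": True,          # Python True serializes to JSON true
        "metadata": True,
    }
    return json.dumps(body)


if __name__ == "__main__":
    print(build_body("https://example.com"))
```

In the serialized output the flags appear as `true`, not `"true"`, which is what the n8n node sends in my setup.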

Any guidance or clarification on how to get better‑formatted text from the API would be greatly appreciated.
