From eddc615ff25555e7dff5b20c9dff3316ed5ad86e Mon Sep 17 00:00:00 2001 From: sbadoiu Date: Thu, 25 Sep 2025 14:54:30 +0100 Subject: [PATCH 1/7] add dynamic content info --- .../browser-rendering/single-page-application.mdx | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 src/content/partials/browser-rendering/single-page-application.mdx diff --git a/src/content/partials/browser-rendering/single-page-application.mdx b/src/content/partials/browser-rendering/single-page-application.mdx new file mode 100644 index 000000000000000..98ec0f79284ba3e --- /dev/null +++ b/src/content/partials/browser-rendering/single-page-application.mdx @@ -0,0 +1,10 @@ +### Single Page Applications (SPAs) + +When scraping a Single Page Application (SPA) with dynamic content, you must ensure the page has fully loaded. To do this, you have two main options: + +:::note +1. Use `waitForSelector` to wait for a specific element to appear on the page. This is often the most reliable and efficient method. +2. 
Use `goToOptions` with "networkidle0" or "networkidle2" +- `"networkidle0"` waits for all network connections to be idle, meaning all resources (including asynchronous JavaScript) have been loaded +- `"networkidle2"` is a more efficient alternative that waits until there are only two or fewer ongoing network connections +::: From 883dfb6cab4883ed6b13d71dc24720863de62aa9 Mon Sep 17 00:00:00 2001 From: sbadoiu Date: Thu, 25 Sep 2025 14:55:56 +0100 Subject: [PATCH 2/7] add dynamic content info to content endpoint --- .../docs/browser-rendering/rest-api/content-endpoint.mdx | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/content/docs/browser-rendering/rest-api/content-endpoint.mdx b/src/content/docs/browser-rendering/rest-api/content-endpoint.mdx index 7c50f322f85c1b7..c8307452ea006c1 100644 --- a/src/content/docs/browser-rendering/rest-api/content-endpoint.mdx +++ b/src/content/docs/browser-rendering/rest-api/content-endpoint.mdx @@ -63,3 +63,8 @@ Many more options exist, like setting HTTP headers using `setExtraHTTPHeaders`, file="setting-custom-user-agent" product="browser-rendering" /> + + From b78e12b93bc17848a314937f24936aee818163fb Mon Sep 17 00:00:00 2001 From: sbadoiu Date: Thu, 25 Sep 2025 15:13:07 +0100 Subject: [PATCH 3/7] add dynamic content info to all endpoints --- .../docs/browser-rendering/rest-api/json-endpoint.mdx | 6 ++++++ .../docs/browser-rendering/rest-api/links-endpoint.mdx | 6 ++++++ .../docs/browser-rendering/rest-api/markdown-endpoint.mdx | 6 ++++++ .../docs/browser-rendering/rest-api/pdf-endpoint.mdx | 6 ++++++ .../docs/browser-rendering/rest-api/scrape-endpoint.mdx | 6 ++++++ .../browser-rendering/rest-api/screenshot-endpoint.mdx | 6 ++++++ src/content/docs/browser-rendering/rest-api/snapshot.mdx | 5 +++++ .../browser-rendering/single-page-application.mdx | 8 ++++---- 8 files changed, 45 insertions(+), 4 deletions(-) diff --git a/src/content/docs/browser-rendering/rest-api/json-endpoint.mdx 
b/src/content/docs/browser-rendering/rest-api/json-endpoint.mdx index 0d9bba0df2b105e..00a3a7fe40e49f7 100644 --- a/src/content/docs/browser-rendering/rest-api/json-endpoint.mdx +++ b/src/content/docs/browser-rendering/rest-api/json-endpoint.mdx @@ -354,3 +354,9 @@ In this example, Browser Rendering first calls Anthropic's Claude Sonnet 4 model file="setting-custom-user-agent" product="browser-rendering" /> + + + diff --git a/src/content/docs/browser-rendering/rest-api/links-endpoint.mdx b/src/content/docs/browser-rendering/rest-api/links-endpoint.mdx index ac8cc44d9c79ed1..ec451448d48512d 100644 --- a/src/content/docs/browser-rendering/rest-api/links-endpoint.mdx +++ b/src/content/docs/browser-rendering/rest-api/links-endpoint.mdx @@ -233,3 +233,9 @@ curl -X POST 'https://api.cloudflare.com/client/v4/accounts//browser- file="setting-custom-user-agent" product="browser-rendering" /> + + + diff --git a/src/content/docs/browser-rendering/rest-api/markdown-endpoint.mdx b/src/content/docs/browser-rendering/rest-api/markdown-endpoint.mdx index 44b04c2fc28e9d7..3fbc02a67abdaff 100644 --- a/src/content/docs/browser-rendering/rest-api/markdown-endpoint.mdx +++ b/src/content/docs/browser-rendering/rest-api/markdown-endpoint.mdx @@ -103,3 +103,9 @@ curl -X 'POST' 'https://api.cloudflare.com/client/v4/accounts//browse 1. **Content extraction:** Convert a blog post or article into Markdown format for storage or further processing. 2. **Static site generation:** Retrieve structured Markdown content for use in static site generators like Jekyll or Hugo. 3. **Automated summarization:** Extract key content from web pages while ignoring CSS, scripts, or unnecessary elements. 
+ + + diff --git a/src/content/docs/browser-rendering/rest-api/pdf-endpoint.mdx b/src/content/docs/browser-rendering/rest-api/pdf-endpoint.mdx index 8261c23b3a3bf39..8d338338e2e5bef 100644 --- a/src/content/docs/browser-rendering/rest-api/pdf-endpoint.mdx +++ b/src/content/docs/browser-rendering/rest-api/pdf-endpoint.mdx @@ -142,3 +142,9 @@ curl -X POST https://api.cloudflare.com/client/v4/accounts//browser- file="setting-custom-user-agent" product="browser-rendering" /> + + + diff --git a/src/content/docs/browser-rendering/rest-api/scrape-endpoint.mdx b/src/content/docs/browser-rendering/rest-api/scrape-endpoint.mdx index 86ea9575e0d06f8..fd5a0b9e6da8f05 100644 --- a/src/content/docs/browser-rendering/rest-api/scrape-endpoint.mdx +++ b/src/content/docs/browser-rendering/rest-api/scrape-endpoint.mdx @@ -108,3 +108,9 @@ Many more options exist, like setting HTTP credentials using `authenticate`, set file="setting-custom-user-agent" product="browser-rendering" /> + + + diff --git a/src/content/docs/browser-rendering/rest-api/screenshot-endpoint.mdx b/src/content/docs/browser-rendering/rest-api/screenshot-endpoint.mdx index 50c7be03a708f7a..12abe26546f0840 100644 --- a/src/content/docs/browser-rendering/rest-api/screenshot-endpoint.mdx +++ b/src/content/docs/browser-rendering/rest-api/screenshot-endpoint.mdx @@ -164,3 +164,9 @@ Many more options exist, like setting HTTP credentials using `authenticate`, set file="setting-custom-user-agent" product="browser-rendering" /> + + + diff --git a/src/content/docs/browser-rendering/rest-api/snapshot.mdx b/src/content/docs/browser-rendering/rest-api/snapshot.mdx index 0d6faabed910e04..978078d05dea928 100644 --- a/src/content/docs/browser-rendering/rest-api/snapshot.mdx +++ b/src/content/docs/browser-rendering/rest-api/snapshot.mdx @@ -108,3 +108,8 @@ curl -X POST 'https://api.cloudflare.com/client/v4/accounts//browser- /> Many more options exist, like setting HTTP credentials using `authenticate`, setting `cookies`, and using 
`gotoOptions` to control page load behaviour - check the endpoint [reference](/api/resources/browser_rendering/subresources/snapshot/) for all available parameters. + + diff --git a/src/content/partials/browser-rendering/single-page-application.mdx b/src/content/partials/browser-rendering/single-page-application.mdx index 98ec0f79284ba3e..3aef6dc8179a659 100644 --- a/src/content/partials/browser-rendering/single-page-application.mdx +++ b/src/content/partials/browser-rendering/single-page-application.mdx @@ -3,8 +3,8 @@ When scraping a Single Page Application (SPA) with dynamic content, you must ensure the page has fully loaded. To do this, you have two main options: :::note -1. Use `waitForSelector` to wait for a specific element to appear on the page. This is often the most reliable and efficient method. -2. Use `goToOptions` with "networkidle0" or "networkidle2" -- `"networkidle0"` waits for all network connections to be idle, meaning all resources (including asynchronous JavaScript) have been loaded -- `"networkidle2"` is a more efficient alternative that waits until there are only two or fewer ongoing network connections +Use `waitForSelector` to wait for a specific element to appear on the page. This is often the most reliable and efficient method. 
+Use `goToOptions` with "networkidle0" or "networkidle2" + - `"networkidle0"` waits for all network connections to be idle, meaning all resources (including asynchronous JavaScript) have been loaded + - `"networkidle2"` is a more efficient alternative that waits until there are only two or fewer ongoing network connections ::: From 75df8550678a3d9542cbfc267f385396eb19a296 Mon Sep 17 00:00:00 2001 From: sbadoiu Date: Thu, 25 Sep 2025 15:15:24 +0100 Subject: [PATCH 4/7] add dynamic content info to puppeteer --- src/content/docs/browser-rendering/platform/puppeteer.mdx | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/content/docs/browser-rendering/platform/puppeteer.mdx b/src/content/docs/browser-rendering/platform/puppeteer.mdx index 7ed91d8163f0612..23b6546e9bde035 100644 --- a/src/content/docs/browser-rendering/platform/puppeteer.mdx +++ b/src/content/docs/browser-rendering/platform/puppeteer.mdx @@ -71,6 +71,11 @@ await page.setUserAgent( The `userAgent` parameter does not bypass bot protection. Requests from Browser Rendering will always be identified as a bot. ::: + + ## Session management In order to facilitate browser session management, we've added new methods to `puppeteer`: From 9034d65d849ffba6c5ff13ccf7974a27596fc944 Mon Sep 17 00:00:00 2001 From: sbadoiu Date: Thu, 25 Sep 2025 15:26:14 +0100 Subject: [PATCH 5/7] remove dynamic content info to puppeteer --- src/content/docs/browser-rendering/platform/puppeteer.mdx | 5 ----- 1 file changed, 5 deletions(-) diff --git a/src/content/docs/browser-rendering/platform/puppeteer.mdx b/src/content/docs/browser-rendering/platform/puppeteer.mdx index 23b6546e9bde035..7ed91d8163f0612 100644 --- a/src/content/docs/browser-rendering/platform/puppeteer.mdx +++ b/src/content/docs/browser-rendering/platform/puppeteer.mdx @@ -71,11 +71,6 @@ await page.setUserAgent( The `userAgent` parameter does not bypass bot protection. Requests from Browser Rendering will always be identified as a bot. 
::: - - ## Session management In order to facilitate browser session management, we've added new methods to `puppeteer`: From 569a6b8306a62cbea1dfd32abe92febe848feda4 Mon Sep 17 00:00:00 2001 From: sbadoiu Date: Thu, 25 Sep 2025 15:41:49 +0100 Subject: [PATCH 6/7] add backticks --- .../partials/browser-rendering/single-page-application.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/content/partials/browser-rendering/single-page-application.mdx b/src/content/partials/browser-rendering/single-page-application.mdx index 3aef6dc8179a659..3fb2458a1202259 100644 --- a/src/content/partials/browser-rendering/single-page-application.mdx +++ b/src/content/partials/browser-rendering/single-page-application.mdx @@ -4,7 +4,7 @@ When scraping a Single Page Application (SPA) with dynamic content, you must ens :::note Use `waitForSelector` to wait for a specific element to appear on the page. This is often the most reliable and efficient method. -Use `goToOptions` with "networkidle0" or "networkidle2" +Use `goToOptions` with `"networkidle0"` or `"networkidle2"` - `"networkidle0"` waits for all network connections to be idle, meaning all resources (including asynchronous JavaScript) have been loaded - `"networkidle2"` is a more efficient alternative that waits until there are only two or fewer ongoing network connections ::: From 2942e1f5feebe16dcadd6c2826717c0250a75c4e Mon Sep 17 00:00:00 2001 From: sbadoiu Date: Thu, 25 Sep 2025 15:50:19 +0100 Subject: [PATCH 7/7] add new paragraph --- .../partials/browser-rendering/single-page-application.mdx | 1 + 1 file changed, 1 insertion(+) diff --git a/src/content/partials/browser-rendering/single-page-application.mdx b/src/content/partials/browser-rendering/single-page-application.mdx index 3fb2458a1202259..73b84a929c1b775 100644 --- a/src/content/partials/browser-rendering/single-page-application.mdx +++ b/src/content/partials/browser-rendering/single-page-application.mdx @@ -4,6 +4,7 @@ When scraping a Single 
Page Application (SPA) with dynamic content, you must ens :::note Use `waitForSelector` to wait for a specific element to appear on the page. This is often the most reliable and efficient method. + Use `goToOptions` with `"networkidle0"` or `"networkidle2"` - `"networkidle0"` waits for all network connections to be idle, meaning all resources (including asynchronous JavaScript) have been loaded - `"networkidle2"` is a more efficient alternative that waits until there are only two or fewer ongoing network connections
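
The two waiting strategies described in the new partial could be expressed as REST API request bodies along these lines. This is a minimal sketch, not the documented schema. It assumes that `waitForSelector` takes an object with a `selector` field and that the wait condition is passed as `gotoOptions.waitUntil`. The spelling `gotoOptions` follows the existing endpoint pages in this diff, whereas the partial writes `goToOptions`. The type `ContentRequestBody` and both helper functions are hypothetical names introduced only for illustration, so check the endpoint reference before relying on any field.

```typescript
// Sketch only: the field names below are assumptions based on the
// `waitForSelector` and `gotoOptions` parameters named in the docs.

type WaitUntil = "networkidle0" | "networkidle2";

// Hypothetical shape of a request body for the Browser Rendering REST endpoints.
interface ContentRequestBody {
  url: string;
  waitForSelector?: { selector: string };
  gotoOptions?: { waitUntil: WaitUntil };
}

// Option 1: wait for an element that only exists once the SPA has rendered.
function waitForElement(url: string, selector: string): ContentRequestBody {
  return { url, waitForSelector: { selector } };
}

// Option 2: wait for network activity to settle.
// "networkidle0" waits until no connections remain open;
// "networkidle2" tolerates up to two ongoing connections.
function waitForNetworkIdle(
  url: string,
  waitUntil: WaitUntil = "networkidle2",
): ContentRequestBody {
  return { url, gotoOptions: { waitUntil } };
}

// Usage: serialize either body and send it as the JSON payload of the request.
console.log(JSON.stringify(waitForElement("https://example.com", "#app")));
console.log(JSON.stringify(waitForNetworkIdle("https://example.com", "networkidle0")));
```

Waiting on a selector ties the request to one known element, which is why the partial calls it the most reliable option. The network-idle variants are a fallback for pages where no stable selector is available.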