initial docs for json endpoint

daisyfaithauma · daisyfaithauma · commit e4af71181d58 · 2025-03-19T16:14:33.000Z
diff --git a/src/content/docs/browser-rendering/rest-api/json-endpoint.mdx b/src/content/docs/browser-rendering/rest-api/json-endpoint.mdx
@@ -0,0 +1,135 @@
+---
+pcx_content_type: how-to
+title: Capture Webpage Data in JSON Format
+sidebar:
+  order: 9
+---
+
+The `/json` endpoint extracts structured data from a webpage. You describe the expected output with a **prompt**, a **response_format** parameter (which accepts a JSON schema), or both. The endpoint returns the extracted data in JSON format.
+
+## Parameters
+
+| Parameter           | Mandatory | Note                                                                                                        |
+| ------------------- | --------- | ----------------------------------------------------------------------------------------------------------- |
+| **url**             | **yes**   | The URL of the webpage to extract data from.                                                                |
+| **prompt**          | **no**    | Instruction describing the data to extract. Supply at least one of `prompt` or `response_format`.           |
+| **response_format** | **no**    | Expected output structure; may include a JSON schema. Supply at least one of `prompt` or `response_format`. |
+
+## Basic Usage
+
+### With a Prompt and JSON Schema
+
+This example provides both a prompt and a JSON schema. The prompt asks for the first heading when several exist, and the schema constrains the result to an object with a required `h1` string and an optional `h2` string.
+
+```bash
+curl --request POST 'https://api.cloudflare.com/client/v4/accounts/CF_ACCOUNT_ID/browser-rendering/json' \
+  --header 'authorization: Bearer CF_API_TOKEN' \
+  --header 'content-type: application/json' \
+  --data '{
+    "url": "http://demoto.xyz/headings",
+    "prompt": "Get the heading from the page. If there are many then grab the first one.",
+    "response_format": {
+      "type": "json_schema",
+      "json_schema": {
+        "type": "object",
+        "properties": {
+          "h1": {
+            "type": "string"
+          },
+          "h2": {
+            "type": "string"
+          }
+        },
+        "required": [
+          "h1"
+        ]
+      }
+    }
+  }'
+```
+
+#### JSON Response
+
+```json title="json response"
+{
+	"success": true,
+	"result": {
+		"h1": "Heading 1",
+		"h2": "Heading 2"
+	}
+}
+```
+
+### With Only a Prompt
+
+This example provides only a prompt. Because no schema constrains the output, the prompt itself describes the desired shape (an object with `h1` and `h2` keys).
+
+```bash
+curl --request POST 'https://api.cloudflare.com/client/v4/accounts/CF_ACCOUNT_ID/browser-rendering/json' \
+  --header 'authorization: Bearer CF_API_TOKEN' \
+  --header 'content-type: application/json' \
+  --data '{
+    "url": "http://demoto.xyz/headings",
+    "prompt": "Get the heading from the page in the form of an object like h1, h2. If there are many headings of the same kind then grab the first one."
+  }'
+```
+
+#### JSON Response
+
+```json title="json response"
+{
+	"success": true,
+	"result": {
+		"h1": "Heading 1",
+		"h2": "Heading 2"
+	}
+}
+```
+
+### With Only a JSON Schema (No Prompt)
+
+This example supplies only a JSON schema, passed via the `response_format` parameter. The schema defines the structure of the extracted data: here `h1` is required and `h2` is optional.
+
+```bash
+curl --request POST 'https://api.cloudflare.com/client/v4/accounts/CF_ACCOUNT_ID/browser-rendering/json' \
+  --header 'authorization: Bearer CF_API_TOKEN' \
+  --header 'content-type: application/json' \
+  --data '{
+    "url": "http://demoto.xyz/headings",
+    "response_format": {
+      "type": "json_schema",
+      "json_schema": {
+        "type": "object",
+        "properties": {
+          "h1": {
+            "type": "string"
+          },
+          "h2": {
+            "type": "string"
+          }
+        },
+        "required": [
+          "h1"
+        ]
+      }
+    }
+  }'
+```
+
+#### JSON Response
+
+```json title="json response"
+{
+	"success": true,
+	"result": {
+		"h1": "Heading 1",
+		"h2": "Heading 2"
+	}
+}
+```
+
+## Potential Use Cases
+
+1. **Extract Movie Data:** Retrieve details like name, genre, and release date for the top 10 action movies from the IMDb Top 250 list by supplying the appropriate IMDb link and JSON schema.
+2. **Weather Information:** Fetch current weather conditions for a location (e.g., Edinburgh) by supplying a link to a weather website (for example, BBC Weather).
+3. **Trending News:** Extract top trending posts on Hacker News by providing the Hacker News link along with a JSON schema that includes post title and body.