Commit e4af711

initial docs for json endpoint
1 parent a45f272 commit e4af711

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
---
pcx_content_type: how-to
title: Capture Webpage Data in JSON Format
sidebar:
  order: 9
---

The `/json` endpoint extracts structured data from a webpage. Describe the expected output with a `prompt`, a `response_format` parameter (which accepts a JSON schema), or both. The endpoint returns the extracted data as JSON.

## Parameters

| Parameter           | Mandatory | Note                                                                                                                                    |
| ------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| **url**             | **yes**   | The URL of the webpage to extract data from.                                                                                            |
| **prompt**          | **no**    | Instruction describing the data to extract. **At least one of `prompt` or `response_format` must be supplied.**                         |
| **response_format** | **no**    | **At least one of `prompt` or `response_format` must be supplied.** May include a JSON schema that defines the structure of the output. |

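The rules in the table can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the API: the helper name `build_json_payload` is hypothetical, and the only rules it enforces are the ones stated above (`url` is mandatory, and at least one of `prompt` or `response_format` must be present).

```python
from __future__ import annotations

import json
from typing import Any, Optional


def build_json_payload(
    url: str,
    prompt: Optional[str] = None,
    response_format: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a request body for /json (hypothetical helper, client-side checks only)."""
    if not url:
        raise ValueError("url is mandatory")
    if prompt is None and response_format is None:
        raise ValueError("supply at least one of prompt or response_format")

    payload: dict[str, Any] = {"url": url}
    # Only include the optional keys that were actually provided.
    if prompt is not None:
        payload["prompt"] = prompt
    if response_format is not None:
        payload["response_format"] = response_format
    return payload


if __name__ == "__main__":
    body = build_json_payload("http://demoto.xyz/headings", prompt="Get the first heading.")
    print(json.dumps(body, indent=2))
```

Failing early with a `ValueError` avoids spending an API call on a request that the table says is incomplete.
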
## Basic Usage

### With a Prompt and JSON Schema

This example captures webpage data by providing both a prompt and a JSON schema. The prompt asks for only the first heading when several exist, so the response contains the first occurrence of each heading level (for example, `h1` and `h2`).

```bash
curl --request POST 'https://api.cloudflare.com/client/v4/accounts/CF_ACCOUNT_ID/browser-rendering/json' \
  --header 'authorization: Bearer CF_API_TOKEN' \
  --header 'content-type: application/json' \
  --data '{
    "url": "http://demoto.xyz/headings",
    "prompt": "Get the heading from the page. If there are many then grab the first one.",
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "type": "object",
        "properties": {
          "h1": {
            "type": "string"
          },
          "h2": {
            "type": "string"
          }
        },
        "required": [
          "h1"
        ]
      }
    }
  }'
```

#### JSON Response

```json title="json response"
{
  "success": true,
  "result": {
    "h1": "Heading 1",
    "h2": "Heading 2"
  }
}
```

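A client typically checks `success` before reading `result`. The sketch below shows one way to unwrap the response shown above in Python. The function name `extract_result` is hypothetical, and the `errors` key it reads on failure is an assumption about the failure envelope that this page does not document.

```python
import json
from typing import Any


def extract_result(body: str) -> Any:
    """Return the `result` field of a /json response, or raise if `success` is not true."""
    envelope = json.loads(body)
    if not envelope.get("success"):
        # Assumption: failures carry an "errors" field; this page does not show one.
        raise RuntimeError(f"/json request failed: {envelope.get('errors')}")
    return envelope["result"]


# The sample response from this section, used here instead of a live call.
sample = '{"success": true, "result": {"h1": "Heading 1", "h2": "Heading 2"}}'
headings = extract_result(sample)
print(headings["h1"])  # prints: Heading 1
```
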
### With Only a Prompt

This example supplies only a prompt. The endpoint relies on the prompt alone to decide which heading data to extract and how to structure it.

```bash
curl --request POST 'https://api.cloudflare.com/client/v4/accounts/CF_ACCOUNT_ID/browser-rendering/json' \
  --header 'authorization: Bearer CF_API_TOKEN' \
  --header 'content-type: application/json' \
  --data '{
    "url": "http://demoto.xyz/headings",
    "prompt": "Get the heading from the page in the form of an object like h1, h2. If there are many headings of the same kind then grab the first one."
  }'
```

#### JSON Response

```json title="json response"
{
  "success": true,
  "result": {
    "h1": "Heading 1",
    "h2": "Heading 2"
  }
}
```

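The same request can be issued from a script instead of curl. The following Python sketch uses only the standard library and mirrors the URL, headers, and body of the curl command above. The function names `build_request` and `capture_json` and the 60-second timeout are illustrative assumptions, not part of the API.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://api.cloudflare.com/client/v4/accounts"


def build_request(account_id: str, api_token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl example issues."""
    return urllib.request.Request(
        url=f"{API_BASE}/{account_id}/browser-rendering/json",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def capture_json(account_id: str, api_token: str, payload: dict, timeout: float = 60.0) -> Any:
    """Send the request and return the decoded JSON response body."""
    request = build_request(account_id, api_token, payload)
    # The timeout value is an arbitrary choice for this sketch.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `capture_json("CF_ACCOUNT_ID", "CF_API_TOKEN", {"url": "http://demoto.xyz/headings", "prompt": "..."})` with real credentials would perform the request. Keeping `build_request` separate makes it possible to inspect the request without touching the network.
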
### With Only a JSON Schema (No Prompt)

This example supplies only a JSON schema, passed in the `response_format` parameter. The schema defines the structure of the extracted data.

```bash
curl --request POST 'https://api.cloudflare.com/client/v4/accounts/CF_ACCOUNT_ID/browser-rendering/json' \
  --header 'authorization: Bearer CF_API_TOKEN' \
  --header 'content-type: application/json' \
  --data '{
    "url": "http://demoto.xyz/headings",
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "type": "object",
        "properties": {
          "h1": {
            "type": "string"
          },
          "h2": {
            "type": "string"
          }
        },
        "required": [
          "h1"
        ]
      }
    }
  }'
```

#### JSON Response

```json title="json response"
{
  "success": true,
  "result": {
    "h1": "Heading 1",
    "h2": "Heading 2"
  }
}
```

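This page does not state how strictly the returned `result` follows the supplied schema, so a caller may want a defensive check. The sketch below is a deliberately small Python check that covers only what the example schema uses (flat `properties` of type `string` plus a `required` list). The function name `find_schema_problems` is hypothetical, and this is not a general JSON Schema validator.

```python
def find_schema_problems(result: dict, schema: dict) -> list:
    """List mismatches between a flat result object and a simple object schema."""
    problems = []
    # Every key named in "required" must be present in the result.
    for key in schema.get("required", []):
        if key not in result:
            problems.append(f"missing required key: {key}")
    # Keys declared as strings must hold string values when present.
    for key, spec in schema.get("properties", {}).items():
        if key in result and spec.get("type") == "string" and not isinstance(result[key], str):
            problems.append(f"{key} should be a string")
    return problems


# The schema from the request above.
schema = {
    "type": "object",
    "properties": {"h1": {"type": "string"}, "h2": {"type": "string"}},
    "required": ["h1"],
}

print(find_schema_problems({"h1": "Heading 1", "h2": "Heading 2"}, schema))  # prints: []
print(find_schema_problems({"h2": 2}, schema))
# prints: ['missing required key: h1', 'h2 should be a string']
```
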
## Potential Use Cases

1. **Extract Movie Data:** Retrieve details such as name, genre, and release date for the top 10 action movies in the IMDb Top 250 list by supplying the appropriate IMDb link and a JSON schema.
2. **Weather Information:** Fetch current weather conditions for a location (for example, Edinburgh) by supplying a link to a weather website such as BBC Weather.
3. **Trending News:** Extract the top trending posts on Hacker News by supplying the Hacker News link along with a JSON schema that includes the post title and body.
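
As an illustration of the third use case, a request body might look like the Python sketch below. Everything specific here is an assumption made for illustration: the Hacker News URL, the prompt wording, and the `posts` array wrapper. This page only demonstrates flat string properties, so support for nested arrays and objects in `response_format` is not confirmed by it.

```python
import json

# Hypothetical request body for the "Trending News" use case.
hacker_news_payload = {
    "url": "https://news.ycombinator.com/",  # assumed link; any Hacker News listing page
    "prompt": "Extract the top trending posts.",  # illustrative wording
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "type": "object",
            "properties": {
                # Assumption: the endpoint accepts array and nested object schemas.
                "posts": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "title": {"type": "string"},
                            "body": {"type": "string"},
                        },
                        "required": ["title"],
                    },
                }
            },
            "required": ["posts"],
        },
    },
}

# Serialize to the JSON text that would be sent as the request body.
print(json.dumps(hacker_news_payload, indent=2))
```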
