
Commit 79680f6

adding improvements from the latest round of reviews

1 parent: d08b951

3 files changed: +30 -30 lines changed

src/content/docs/r2-sql/get-started.mdx

Lines changed: 9 additions & 11 deletions
@@ -28,7 +28,7 @@ This guide will instruct you through:

 1. Sign up for a [Cloudflare account](https://dash.cloudflare.com/sign-up).
 2. Install [Node.js](https://nodejs.org/en/).
-3. Install [Wrangler](/workers/wranger/install-and-update).
+3. Install [Wrangler](/workers/wrangler/install-and-update).

 :::note[Node.js version manager]
 Use a Node version manager like [Volta](https://volta.sh/) or [nvm](https://github.com/nvm-sh/nvm) to avoid permission issues and change Node.js versions.
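For instance, with nvm you can install and switch to a current LTS release:

```bash
# Install and activate the latest LTS release of Node.js via nvm
nvm install --lts
nvm use --lts
```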
@@ -47,13 +47,13 @@ You will need API tokens to interact with Cloudflare services.

 2. Select **Manage API tokens**.

-3. Select **Create API token**.
+3. Select **Create User API token**.

 4. Select the **R2 Token** text to edit your API token name.

 5. Under **Permissions**, choose the **Admin Read & Write** permission.

-6. Select **Create API Token**.
+6. Select **Create User API Token**.

 7. Note the **Token value**.

@@ -99,8 +99,6 @@ Create an R2 bucket:
 </TabItem>
 </Tabs>

-## 3. Enable R2 Data Catalog
-
 <Tabs syncKey='CLIvDash'>
 <TabItem label='Wrangler CLI'>

@@ -138,12 +136,12 @@ Copy the warehouse (ACCOUNTID_BUCKETNAME) and paste it in the `export` below. We
 export $WAREHOUSE= #Paste your warehouse here
 ```
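Note that when assigning the variable the leading `$` should be dropped (`$` reads a variable rather than setting it); a minimal sketch with a hypothetical warehouse value:

```bash
# Hypothetical warehouse string (ACCOUNTID_BUCKETNAME); paste your own value
export WAREHOUSE=a1b2c3d4e5_my-demo-bucket
echo "$WAREHOUSE"
```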

-## 4. Create the data Pipeline
+## 3. Create the data Pipeline

 <Tabs syncKey='CLIvDash'>
 <TabItem label='Wrangler CLI'>

-### 4.1. Create the Pipeline Stream
+### 3.1. Create the Pipeline Stream

 First, create a schema file called `demo_schema.json` with the following `json` schema:
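The hunk cuts off before the schema itself; purely as an illustration (the field names below are guesses inferred from the events sent later in the guide, not the file's actual contents), such a schema file takes this shape:

```bash
# Hypothetical sketch of demo_schema.json; field names are assumptions
# based on the user_id/numbers fields referenced later in the guide.
cat > demo_schema.json << 'EOF'
{
  "fields": [
    {"name": "user_id", "type": "int64", "required": true},
    {"name": "numbers", "type": "int64", "required": false}
  ]
}
EOF
```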

@@ -201,7 +199,7 @@ Input Schema:
 └────────────┴────────┴────────────┴──────────┘
 ```

-### 4.2. Create the Pipeline Sink
+### 3.2. Create the Pipeline Sink

 Create a sink that writes data to your R2 bucket as Apache Iceberg tables:

@@ -219,7 +217,7 @@ npx wrangler pipelines sinks create demo_sink \
 This creates a `sink` configuration that will write to the Iceberg table `demo.first_table` in your R2 Data Catalog every 30 seconds. Pipelines automatically appends an `__ingest_ts` column that is used to partition the table by `DAY`.
 :::

-### 4.3. Create the Pipeline
+### 3.3. Create the Pipeline

 Pipelines are SQL statements that read data from the stream, do some work, and write it to the sink.
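As a hedged sketch of what such a statement looks like (the pipeline and stream names and the `--sql` flag are assumptions; the guide's own command follows in the file), filtering out `numbers` below 5 might read:

```bash
# Sketch only: demo_stream and the --sql flag are assumptions, not the
# guide's verbatim command; demo_sink is the sink created above.
npx wrangler pipelines create demo_pipeline \
  --sql "INSERT INTO demo_sink SELECT * FROM demo_stream WHERE numbers >= 5"
```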

@@ -295,7 +293,7 @@ export STREAM_ENDPOINT= #the http ingest endpoint from the output (see example b
 </Tabs>


-## 5. Send some data
+## 4. Send some data

 Next, send some events to our stream:

@@ -327,7 +325,7 @@ curl -X POST "$STREAM_ENDPOINT" \

 This will send 4 events in one `POST`. Since our Pipeline is filtering out records with `numbers` less than 5, `user_id` `3` and `4` should not appear in the table. Feel free to change values and send more events.
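For context, a hedged sketch of such a request (the payload shape and field values are illustrative assumptions consistent with the filter described above):

```bash
# Illustrative payload: user_id 3 and 4 carry numbers < 5, so the
# pipeline's filter should drop them before they reach the table.
curl -X POST "$STREAM_ENDPOINT" \
  -H "Content-Type: application/json" \
  -d '[{"user_id": 1, "numbers": 10},
       {"user_id": 2, "numbers": 7},
       {"user_id": 3, "numbers": 2},
       {"user_id": 4, "numbers": 1}]'
```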

-## 6. Query the table with R2 SQL
+## 5. Query the table with R2 SQL

 After you have sent your events to the stream, it will take about 30 seconds for the data to show in the table, since that is what we configured our `roll interval` to be in the Sink.
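Once the roll interval has elapsed, a query along these lines should confirm the filtered result (the `wrangler r2 sql query` form matches the tutorial later in this commit; the SELECT itself is illustrative):

```bash
# demo.first_table is the Iceberg table the sink above writes to.
npx wrangler r2 sql query "$WAREHOUSE" \
  "SELECT * FROM demo.first_table LIMIT 10"
```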

src/content/docs/r2-sql/query-data.mdx

Lines changed: 9 additions & 5 deletions
@@ -8,6 +8,9 @@ sidebar:
 import {
   Render,
   LinkCard,
+  Tabs,
+  TabItem,
+  Steps
 } from "~/components";

 :::note
@@ -24,8 +27,8 @@ R2 SQL can currently be accessed via Wrangler commands or a REST API.

 To query Apache Iceberg tables in R2 Data Catalog, you must provide a Cloudflare API token with R2 SQL, R2 Data Catalog, and R2 storage permissions.

-### Create API token in the dashboard
-
+<Tabs syncKey='CLIvDash'>
+<TabItem label='Dashboard'>
 Create an [API token](https://dash.cloudflare.com/profile/api-tokens) with:

 - Access to R2 Data Catalog (**minimum**: read-only)
@@ -34,8 +37,8 @@ Create an [API token](https://dash.cloudflare.com/profile/api-tokens) with:

 Wrangler now supports the environment variable `WRANGLER_R2_SQL_AUTH_TOKEN` which you can use to `export` your token.
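For example (placeholder value):

```bash
# Placeholder: substitute the token value you created above
export WRANGLER_R2_SQL_AUTH_TOKEN="your-api-token-here"
```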

-### Create API token via API
-
+</TabItem>
+<TabItem label='Via API'>
 To create an API token programmatically for use with R2 SQL, you will need to specify R2 SQL, R2 Data Catalog, and R2 storage permission groups in your [Access Policy](/r2/api/tokens/#access-policy).

 #### Example Access Policy
@@ -66,7 +69,8 @@ To create an API token programmatically for use with R2 SQL, you will need to sp
 }
 ]
 ```
-
+</TabItem>
+</Tabs>

 ## Query data via Wrangler

src/content/docs/r2-sql/tutorials/end-to-end-pipeline.mdx

Lines changed: 12 additions & 14 deletions
@@ -110,13 +110,11 @@ Create an R2 bucket:
 </TabItem>
 </Tabs>

-## 3. Enable R2 Data Catalog
+Enable the catalog on your R2 bucket:

 <Tabs syncKey='CLIvDash'>
 <TabItem label='Wrangler CLI'>

-Enable the catalog on your R2 bucket:
-
 ```bash
 npx wrangler r2 bucket catalog enable fraud-pipeline
 ```
@@ -177,9 +175,9 @@ npx wrangler r2 bucket catalog compaction enable fraud-pipeline --token $WRANGLE
 </TabItem>
 </Tabs>

-## 4. Set up the pipeline infrastructure
+## 3. Set up the pipeline infrastructure

-### 4.1. Create the Pipeline stream
+### 3.1. Create the Pipeline stream

 <Tabs syncKey='CLIvDash'>
 <TabItem label='Wrangler CLI'>
@@ -191,7 +189,7 @@ First, create a schema file called `raw_transactions_schema.json` with the follo
   "fields": [
     {"name": "transaction_id", "type": "string", "required": true},
     {"name": "user_id", "type": "int64", "required": true},
-    {"name": "amount", "type": "f64", "required": false},
+    {"name": "amount", "type": "float64", "required": false},
     {"name": "transaction_timestamp", "type": "string", "required": false},
     {"name": "location", "type": "string", "required": false},
     {"name": "merchant_category", "type": "string", "required": false},
@@ -242,7 +240,7 @@ Input Schema:
 ├───────────────────────┼────────┼────────────┼──────────┤
 │ user_id               │ int64  │            │ Yes      │
 ├───────────────────────┼────────┼────────────┼──────────┤
-│ amount                │ f64    │            │ No       │
+│ amount                │float64 │            │ No       │
 ├───────────────────────┼────────┼────────────┼──────────┤
 │ transaction_timestamp │ string │            │ No       │
 ├───────────────────────┼────────┼────────────┼──────────┤
@@ -254,7 +252,7 @@ Input Schema:
 └───────────────────────┴────────┴────────────┴──────────┘
 ```

-### 4.2. Create the data sink
+### 3.2. Create the data sink

 Create a sink that writes data to your R2 bucket as Apache Iceberg tables:

@@ -272,7 +270,7 @@ npx wrangler pipelines sinks create raw_events_sink \
 This creates a `sink` configuration that will write to the Iceberg table `fraud_detection.transactions` in your R2 Data Catalog every 30 seconds. Pipelines automatically appends an `__ingest_ts` column that is used to partition the table by `DAY`.
 :::

-### 4.3. Create the pipeline
+### 3.3. Create the pipeline

 Connect your stream to your sink with SQL:

@@ -304,7 +302,7 @@ npx wrangler pipelines create raw_events_pipeline \
   "fields": [
     {"name": "transaction_id", "type": "string", "required": true},
     {"name": "user_id", "type": "int64", "required": true},
-    {"name": "amount", "type": "f64", "required": false},
+    {"name": "amount", "type": "float64", "required": false},
     {"name": "transaction_timestamp", "type": "string", "required": false},
     {"name": "location", "type": "string", "required": false},
     {"name": "merchant_category", "type": "string", "required": false},
@@ -341,7 +339,7 @@ npx wrangler pipelines create raw_events_pipeline \
 </TabItem>
 </Tabs>

-## 5. Generate fraud detection data
+## 4. Generate sample fraud detection data

 Create a Python script to generate realistic transaction data with fraud patterns:

@@ -491,11 +489,11 @@ pip install requests
 python fraud_data_generator.py
 ```

-## 6. Query your fraud data with R2 SQL
+## 5. Query the data with R2 SQL

 Now you can analyze your fraud detection data using R2 SQL. Here are some example queries:

-### 6.1. View recent transactions
+### 5.1. View recent transactions

 ```bash
 npx wrangler r2 sql query "$WAREHOUSE" "
@@ -513,7 +511,7 @@ AND is_fraud = true
 LIMIT 10"
 ```
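The hunk shows only the tail of that query; an illustrative reconstruction (the SELECT list and any additional filter are assumptions; `fraud_detection.transactions` and `is_fraud` come from the tutorial itself) might be:

```bash
# Illustrative reconstruction; the tutorial's actual column list may differ.
npx wrangler r2 sql query "$WAREHOUSE" "
SELECT transaction_id, user_id, amount, transaction_timestamp
FROM fraud_detection.transactions
WHERE is_fraud = true
LIMIT 10"
```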

-### 6.2. Filter the raw transactions into a new table to highlight high-value transactions
+### 5.2. Filter the raw transactions into a new table to highlight high-value transactions

 Create a new sink that will write the filtered data to a new Apache Iceberg table in R2 Data Catalog:
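The diff ends before showing that step; as a hedged sketch of the idea it introduces (the pipeline and sink names, the stream name, the threshold, and the `--sql` flag are all assumptions):

```bash
# Sketch only: a pipeline that forwards only high-value transactions to a
# second sink; every name and the threshold here are assumptions.
npx wrangler pipelines create high_value_pipeline \
  --sql "INSERT INTO high_value_sink SELECT * FROM raw_events_stream WHERE amount > 500"
```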
