---
pcx_content_type: get-started
title: Getting started
head: []
sidebar:
  order: 2
description: Learn how to get up and running with R2 SQL using R2 Data Catalog and Pipelines
---

import {
	Render,
	LinkCard,
} from "~/components";

## Overview

This guide will instruct you through:

- Creating an [R2 bucket](/r2/buckets/) and enabling its [data catalog](/r2/data-catalog/).
- Using Wrangler to create a Pipelines stream, a sink, and the SQL pipeline that reads from the stream and writes to the sink.
- Sending data to the stream via its HTTP ingest endpoint.
- Querying the data using R2 SQL.

## Prerequisites

1. Sign up for a [Cloudflare account](https://dash.cloudflare.com/sign-up).
2. Install [Node.js](https://nodejs.org/en/).
3. Install [Wrangler](/workers/wrangler/install-and-update/).

:::note[Node.js version manager]
Use a Node version manager like [Volta](https://volta.sh/) or [nvm](https://github.com/nvm-sh/nvm) to avoid permission issues and to change Node.js versions. Wrangler requires a Node.js version of 16.17.0 or later.
:::

## 1. Set up authentication

You'll need an API token to interact with Cloudflare services.

### Custom API token

1. Go to **My Profile** → **API Tokens** in the Cloudflare dashboard.
2. Select **Create Token** → **Custom token**.
3. Add the following permissions:
   - **Workers Pipelines** - Read, Send, Edit
   - **Workers R2 Storage** - Edit, Read
   - **Workers R2 Data Catalog** - Edit, Read
   - **Workers R2 SQL** - Read

Export your new token as an environment variable:

```bash
export WRANGLER_R2_SQL_AUTH_TOKEN=your_token_here
```

If this is your first time using Wrangler, make sure to log in:

```bash
npx wrangler login
```

## 2. Create an R2 bucket

Create a new R2 bucket:

```bash
npx wrangler r2 bucket create r2-sql-demo
```

## 3. Enable R2 Data Catalog

Enable the [R2 Data Catalog](/r2/data-catalog/) feature on your bucket to use Apache Iceberg tables:

```bash
npx wrangler r2 bucket catalog enable r2-sql-demo
```

## 4. Create the data pipeline

### 1. Create the Pipeline stream

First, create a schema file called `demo_schema.json` with the following JSON schema:

```json
{
  "fields": [
    {"name": "user_id", "type": "int64", "required": true},
    {"name": "payload", "type": "string", "required": false},
    {"name": "numbers", "type": "int32", "required": false}
  ]
}
```
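Before wiring up the stream, it can help to sanity-check events against this schema locally. The helper below is an illustrative sketch that mirrors `demo_schema.json`; it is not part of Wrangler or the Pipelines API:

```python
# Hypothetical local validator mirroring demo_schema.json above.
SCHEMA = {
    "fields": [
        {"name": "user_id", "type": "int64", "required": True},
        {"name": "payload", "type": "string", "required": False},
        {"name": "numbers", "type": "int32", "required": False},
    ]
}

# Rough mapping from schema types to Python types for the check.
PYTHON_TYPES = {"int64": int, "int32": int, "string": str}

def validate_event(event: dict) -> bool:
    """Return True if the event satisfies the schema's fields and types."""
    for field in SCHEMA["fields"]:
        name = field["name"]
        if name not in event or event[name] is None:
            if field["required"]:
                return False  # missing a required field
            continue  # optional fields may be absent or null
        if not isinstance(event[name], PYTHON_TYPES[field["type"]]):
            return False  # wrong type for this field
    return True

print(validate_event({"user_id": 1, "payload": "ok", "numbers": 42}))  # True
print(validate_event({"payload": "missing user_id"}))                  # False
```

Events that fail a check like this would be rejected at ingest, so validating client-side saves a round trip.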
Next, create the stream we'll use to ingest events into:

```bash
npx wrangler pipelines streams create demo_stream \
  --schema-file demo_schema.json \
  --http-enabled true \
  --http-auth false
```

:::note
Note the **HTTP Ingest Endpoint URL** from the output. This is the endpoint you'll use to send data to your pipeline.
:::

```bash
# Set this to the HTTP ingest endpoint from the output (see example below)
export STREAM_ENDPOINT=
```
The output should look like this:

```sh
🌀 Creating stream 'demo_stream'...
✨ Successfully created stream 'demo_stream' with id 'stream_id'.

Creation Summary:
General:
  Name: demo_stream

HTTP Ingest:
  Enabled: Yes
  Authentication: No
  Endpoint: https://stream_id.ingest.cloudflare.com
  CORS Origins: None

Input Schema:
┌────────────┬────────┬────────────┬──────────┐
│ Field Name │ Type   │ Unit/Items │ Required │
├────────────┼────────┼────────────┼──────────┤
│ user_id    │ int64  │            │ Yes      │
├────────────┼────────┼────────────┼──────────┤
│ payload    │ string │            │ No       │
├────────────┼────────┼────────────┼──────────┤
│ numbers    │ int32  │            │ No       │
└────────────┴────────┴────────────┴──────────┘
```


### 2. Create the Pipeline sink

Create a sink that writes data to your R2 bucket as Apache Iceberg tables:

```bash
npx wrangler pipelines sinks create demo_sink \
  --type "r2-data-catalog" \
  --bucket "r2-sql-demo" \
  --roll-interval 30 \
  --namespace "demo" \
  --table "first_table" \
  --catalog-token $WRANGLER_R2_SQL_AUTH_TOKEN
```

:::note
This creates a sink configuration that writes to the Iceberg table `demo.first_table` in your R2 Data Catalog every 30 seconds. Pipelines automatically appends an `__ingest_ts` column, which is used to partition the table by `DAY`.
:::

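To see what `DAY` partitioning means in practice, the sketch below truncates ingest timestamps to their calendar day: rows whose `__ingest_ts` falls on the same day land in the same partition. The function name is illustrative, not part of the Pipelines API:

```python
from datetime import datetime, timezone

def day_partition(ingest_ts: datetime) -> str:
    """Truncate an ingest timestamp to its DAY partition value."""
    return ingest_ts.strftime("%Y-%m-%d")

a = datetime(2024, 6, 1, 9, 30, tzinfo=timezone.utc)
b = datetime(2024, 6, 1, 23, 59, tzinfo=timezone.utc)
c = datetime(2024, 6, 2, 0, 1, tzinfo=timezone.utc)

# a and b share a partition; c falls into the next day's partition.
print(day_partition(a), day_partition(b), day_partition(c))
# 2024-06-01 2024-06-01 2024-06-02
```

Day partitioning means queries that filter on ingest time only need to scan the matching days' files.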
### 3. Create the pipeline

A pipeline is a SQL statement that reads data from the stream, optionally transforms it, and writes it to the sink:

```bash
npx wrangler pipelines create demo_pipeline \
  --sql "INSERT INTO demo_sink SELECT * FROM demo_stream WHERE numbers > 5;"
```

:::note
Note that this statement includes a filter: only events where `numbers` is greater than 5 are written to the sink.
:::

## 5. Send some data

Next, let's send some events to our stream. Since the stream was created with `--http-auth false`, no Authorization header is needed:

```bash
curl -X POST "$STREAM_ENDPOINT" \
  -H "Content-Type: application/json" \
  -d '[
    {
      "user_id": 1,
      "payload": "you should see this",
      "numbers": 42
    },
    {
      "user_id": 2,
      "payload": "you should also see this",
      "numbers": 100
    },
    {
      "user_id": 3,
      "payload": null,
      "numbers": 1
    },
    {
      "user_id": 4,
      "numbers": null
    }
  ]'
```
This sends 4 events in one `POST`. Since our pipeline only accepts records where `numbers` is greater than 5, `user_id` `3` and `4` should not appear in the table. Feel free to change values and send more events.

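To double-check which events should survive the filter, you can mirror the pipeline's `WHERE numbers > 5` predicate locally. This is a plain Python sketch of the filter logic, not how Pipelines actually executes SQL:

```python
# The same four events sent in the curl request above.
events = [
    {"user_id": 1, "payload": "you should see this", "numbers": 42},
    {"user_id": 2, "payload": "you should also see this", "numbers": 100},
    {"user_id": 3, "payload": None, "numbers": 1},
    {"user_id": 4, "numbers": None},
]

# Mirror the pipeline's WHERE numbers > 5. In SQL, comparing NULL is
# never true, so user_id 4 is dropped along with user_id 3.
kept = [e for e in events if e.get("numbers") is not None and e["numbers"] > 5]

print([e["user_id"] for e in kept])  # [1, 2]
```

Only `user_id` 1 and 2 pass the predicate, which matches what you should see in the table below.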
## 6. Query the table with R2 SQL

After you've sent your events to the stream, it will take about 30 seconds for the data to appear in the table, since that's the `roll-interval` we configured on the sink.

```bash
npx wrangler r2 sql query "SELECT * FROM demo.first_table LIMIT 10"
```

<LinkCard
  title="Managing R2 Data Catalogs"
  href="/r2/data-catalog/manage-catalogs/"
  description="Enable or disable R2 Data Catalog on your bucket, retrieve configuration details, and authenticate your Iceberg engine."
/>

<LinkCard
  title="Try another example"
  href="/r2/sql/tutorials/end-to-end-pipeline"
  description="Detailed tutorial for setting up a simple fraud detection data pipeline and generating events for it in Python."
/>