
Commit ae072d8

maheshwaripd, cpena, and kodster28 authored
Apply suggestions from code review
Co-authored-by: Denise Peña <[email protected]>
Co-authored-by: Kody Jackson <[email protected]>
1 parent 7dcde9e commit ae072d8

File tree

10 files changed, +65 -66 lines changed


src/content/docs/pipelines/build-with-pipelines/http.mdx

Lines changed: 7 additions & 7 deletions
@@ -10,7 +10,7 @@ head:
import { Render, PackageManagers } from "~/components";

- Pipelines support data ingestion over HTTP. When you create a new pipeline, you'll receive a globally scalable ingestion endpoint. To ingest data, make HTTP POST requests to the endpoint.
+ Pipelines support data ingestion over HTTP. When you create a new pipeline, you will receive a globally scalable ingestion endpoint. To ingest data, make HTTP POST requests to the endpoint.

```sh
$ npx wrangler@latest pipelines create my-clickstream-pipeline --r2-bucket my-bucket
@@ -57,7 +57,7 @@ curl -X POST https://<PIPELINE-ID>.pipelines.cloudflare.com \
```
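For reference, a complete ingestion request along the lines of the truncated `curl` in the hunk above might look like this sketch (the endpoint placeholder follows the docs; the event fields are illustrative):

```sh
# POST a small batch of JSON-serializable records to the ingestion endpoint
curl -X POST https://<PIPELINE-ID>.pipelines.cloudflare.com \
  -H "Content-Type: application/json" \
  -d '[{"event": "pageview", "url": "/home"}, {"event": "click", "url": "/pricing"}]'
```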
## Turning HTTP ingestion off
- By default, ingestion via HTTP is turned on. You can turn it off by excluding it from the list of sources, by using `--sources` when creating or updating a pipeline.
+ By default, ingestion via HTTP is turned on. You can turn it off by excluding it from the list of sources by using `--sources` when creating or updating a pipeline.

```sh
$ npx wrangler pipelines create [PIPELINE-NAME] --r2-bucket [R2-BUCKET-NAME] --sources worker
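If HTTP ingestion needs to be re-enabled later, the same flag should accept both sources. This is a sketch that assumes `http` and `worker` are the valid source names (only `worker` is confirmed above):

```sh
# Re-enable both ingestion sources on an existing pipeline
$ npx wrangler pipelines update [PIPELINE-NAME] --sources http,worker
```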
@@ -76,9 +76,9 @@ Once authentication is turned on, you will need to include a Cloudflare API token

### Get API token
1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com) and select your account.
- 2. Navigate to your [API Keys](https://dash.cloudflare.com/profile/api-tokens)
- 3. Select *Create Token*
- 4. Choose the template for Workers Pipelines. Click on *continue to summary*, and finally on *create token*. Make sure to copy the API token, and save it securely.
+ 2. Navigate to your [API Keys](https://dash.cloudflare.com/profile/api-tokens).
+ 3. Select **Create Token**.
+ 4. Choose the template for Workers Pipelines. Select **Continue to summary** > **Create token**. Make sure to copy the API token and save it securely.

### Making authenticated requests
Include the API token you created in the previous step in the headers for your request:
@@ -91,7 +91,7 @@ curl https://<PIPELINE-ID>.pipelines.cloudflare.com
```
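Put together, an authenticated request carries the token in an `Authorization` header. A sketch, assuming a Bearer token stored in the `API_TOKEN` environment variable:

```sh
# Authenticated ingestion request; the event payload is illustrative
curl -X POST https://<PIPELINE-ID>.pipelines.cloudflare.com \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '[{"event": "click", "path": "/pricing"}]'
```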
## Specifying CORS Settings
- If you want to use your pipeline to ingest client side data, such as website clicks, you'll need to configure your [Cross-Origin Resource Sharing (CORS) settings](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS).
+ If you want to use your pipeline to ingest client side data, such as website clicks, you will need to configure your [Cross-Origin Resource Sharing (CORS) settings](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS).

Without setting your CORS settings, browsers will restrict requests made to your pipeline endpoint. For example, if your website domain is `https://my-website.com`, and you want to post client side data to your pipeline at `https://<PIPELINE-ID>.pipelines.cloudflare.com`, without CORS settings, the request will fail.

@@ -106,4 +106,4 @@ You can specify that all cross origin requests are accepted. We recommend only u
$ npx wrangler pipelines update [PIPELINE-NAME] --cors-origins "*"
```

- After your the `--cors-origins` have been set on your pipeline, your pipeline will respond to preflight requests and POST requests with the appropriate `Access-Control-Allow-Origin` headers set.
+ After the `--cors-origins` have been set on your pipeline, your pipeline will respond to preflight requests and `POST` requests with the appropriate `Access-Control-Allow-Origin` headers set.
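To restrict ingestion to a single site instead of all origins, the same flag should accept a specific origin, for example the `https://my-website.com` domain mentioned above:

```sh
# Allow cross-origin requests only from one site
$ npx wrangler pipelines update [PIPELINE-NAME] --cors-origins "https://my-website.com"
```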

src/content/docs/pipelines/build-with-pipelines/output-settings.mdx

Lines changed: 12 additions & 12 deletions
@@ -1,6 +1,6 @@
---
title: Customize output settings
- pcx_content_type: concept
+ pcx_content_type: how-to
sidebar:
  order: 3
head:
@@ -10,7 +10,7 @@ head:

import { Render, PackageManagers } from "~/components";

- Pipelines convert a stream of records into output files, and deliver the files to an R2 bucket in your account. This guide details how you can change the output destination, and how to customize batch settings to generate query ready files.
+ Pipelines convert a stream of records into output files and deliver the files to an R2 bucket in your account. This guide details how you can change the output destination and customize batch settings to generate query ready files.

## Configure an R2 bucket as a destination
To create or update a pipeline using Wrangler, run the following command in a terminal:
@@ -19,9 +19,9 @@ To create or update a pipeline using Wrangler, run the following command in a te
npx wrangler pipelines create [PIPELINE-NAME] --r2-bucket [R2-BUCKET-NAME]
```

- After running this command, you'll be prompted to authorize Cloudflare Workers Pipelines to create an R2 API token on your behalf. Your pipeline uses the R2 API token to load data into your bucket. You can approve the request through the browser link which will open automatically.
+ After running this command, you will be prompted to authorize Cloudflare Workers Pipelines to create an R2 API token on your behalf. Your pipeline uses the R2 API token to load data into your bucket. You can approve the request through the browser link which will open automatically.

- If you prefer not to authenticate this way, you may pass your [R2 API Token](/r2/api/tokens/) to Wrangler:
+ If you prefer not to authenticate this way, you can pass your [R2 API Token](/r2/api/tokens/) to Wrangler:

```sh
npx wrangler pipelines create [PIPELINE-NAME] --r2 [R2-BUCKET-NAME] --r2-access-key-id [ACCESS-KEY-ID] --r2-secret-access-key [SECRET-ACCESS-KEY]
```
@@ -40,18 +40,18 @@ Output files are named using a [ULID](https://github.com/ulid/spec) slug, follow
When configuring your pipeline, you can define how records are batched before they are delivered to R2. Batches of records are written out to a single output file.

Batching can:
- 1. Reduce the number of output files written to R2, and thus reduce the [cost of writing data to R2](/r2/pricing/#class-a-operations)
- 2. Increase the size of output files, making them more efficient to query
+ - Reduce the number of output files written to R2 and thus reduce the [cost of writing data to R2](/r2/pricing/#class-a-operations).
+ - Increase the size of output files, making them more efficient to query.

There are three ways to define how ingested data is batched:

- 1. `batch-max-mb`: The maximum amount of data that will be batched, in megabytes. Default is 10 MB, maximum is 100 MB.
- 2. `batch-max-rows`: The maximum number of rows or events in a batch before data is written. Default, and maximum, is 10,000 rows.
- 3. `batch-max-seconds`: The maximum duration of a batch before data is written, in seconds. Default is 15 seconds, maximum is 300 seconds.
+ 1. `batch-max-mb`: The maximum amount of data that will be batched, in megabytes. Default is `10 MB`, maximum is `100 MB`.
+ 2. `batch-max-rows`: The maximum number of rows or events in a batch before data is written. Default, and maximum, is `10,000` rows.
+ 3. `batch-max-seconds`: The maximum duration of a batch before data is written, in seconds. Default is `15 seconds`, maximum is `300 seconds`.

Batch definitions are hints. A pipeline will follow these hints closely, but batches might not be exact.

- All three batch definitions work together. Whichever limit is reached first triggers the delivery of a batch.
+ All three batch definitions work together, and whichever limit is reached first triggers the delivery of a batch.

For example, a `batch-max-mb` = 100 MB and a `batch-max-seconds` = 100 means that if 100 MB of events are posted to the pipeline, the batch will be delivered. However, if it takes longer than 100 seconds for 100 MB of events to be posted, a batch of all the messages that were posted during those 100 seconds will be created.
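Expressed as flags, that worked example corresponds to a command like this (the pipeline name is a placeholder):

```sh
# Deliver a batch at 100 MB or after 100 seconds, whichever comes first
npx wrangler pipelines update [PIPELINE-NAME] --batch-max-mb 100 --batch-max-seconds 100
```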

@@ -66,7 +66,7 @@ For example:
npx wrangler pipelines update [PIPELINE-NAME] --batch-max-mb 100 --batch-max-rows 10000 --batch-max-seconds 300
```

- #### Batch size limits
+ ### Batch size limits

| Setting | Default | Minimum | Maximum |
| ----------------------------------------- | ----------- | --------- | ----------- |
@@ -96,7 +96,7 @@ For example:
npx wrangler pipelines update [PIPELINE-NAME] --r2-prefix test
```

- After running the above command, the output files generated by your pipeline will be stored under the prefix "test". Files will remain partitioned. Your output will look like this:
+ After running the above command, the output files generated by your pipeline will be stored under the prefix `test`. Files will remain partitioned. Your output will look like this:

```sh
- test/event_date=2025-04-01/hr=15/01JQWBZCZBAQZ7RJNZHN38JQ7V.json.gz
- test/event_date=2025-04-01/hr=15/01JQWC16FXGP845EFHMG1C0XNW.json.gz
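To spot-check one of those output files locally, Wrangler's R2 commands can fetch it. A sketch, assuming the bucket is named `my-bucket`:

```sh
# Download one delivered object, then decompress the newline-delimited JSON
npx wrangler r2 object get my-bucket/test/event_date=2025-04-01/hr=15/01JQWBZCZBAQZ7RJNZHN38JQ7V.json.gz --file output.json.gz
gunzip output.json.gz
```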

src/content/docs/pipelines/build-with-pipelines/shards.mdx

Lines changed: 10 additions & 10 deletions
@@ -24,12 +24,12 @@ The default shard count will be set to `auto` in the future, with support for au
Each pipeline is composed of stateless, independent shards. These shards are spun up when a pipeline is created. Each shard is composed of layers of [Durable Objects](/durable-objects). The Durable Objects buffer data, replicate for durability, handle compression, and deliver to R2.

When a record is sent to a pipeline:
- 1. The Pipelines [Worker](/workers) receives the record
- 2. The record is routed to to one of the shards
- 3. The record is handled by a set of Durable Objects, which commmit the record to storage, and replicate for durability.
- 4. Records accumulate, until the [batch definitions](/pipelines/build-with-pipelines/output-settings/#customize-batch-behavior) are met.
- 5. The batch is written to an output file, and optionally compressed.
- 6. The output file is delivered to the configured R2 bucket
+ 1. The Pipelines [Worker](/workers) receives the record.
+ 2. The record is routed to one of the shards.
+ 3. The record is handled by a set of Durable Objects, which commit the record to storage and replicate for durability.
+ 4. Records accumulate until the [batch definitions](/pipelines/build-with-pipelines/output-settings/#customize-batch-behavior) are met.
+ 5. The batch is written to an output file and optionally compressed.
+ 6. The output file is delivered to the configured R2 bucket.

Increasing the number of shards will increase the maximum throughput of a pipeline, as well as the number of output files created.

@@ -43,12 +43,12 @@ Increasing the shard count also increases the number of output files that your p

## How should I decide the number of shards to use?
Choose a shard count based on these factors:
- * How many requests per second you will make to your pipeline
- * How much data per second you will send to your pipeline
+ * The number of requests per second you will make to your pipeline
+ * The amount of data per second you will send to your pipeline

- Each shard is capable of handling approximately 7,000 requests per second, or ingesting 7 MB / s of data. Either factor might act as the bottleneck, so choose the shard count based on the higher number.
+ Each shard is capable of handling approximately 7,000 requests per second, or ingesting 7 MB/s of data. Either factor might act as the bottleneck, so choose the shard count based on the higher number.

- For example, if you estimate that you will ingest 70 MB / s, making 70,000 requests per second, setup a pipeline with 10 shards. However, if you estimate that you will ingest 70 MB / s while making 100,000 requests per second, setup a pipeline with 15 shards.
+ For example, if you estimate that you will ingest 70 MB/s, making 70,000 requests per second, set up a pipeline with 10 shards. However, if you estimate that you will ingest 70 MB/s while making 100,000 requests per second, set up a pipeline with 15 shards.
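Continuing that example, the shard count is applied when creating or updating the pipeline. A sketch, assuming the flag is named `--shard-count`:

```sh
# Provision 15 shards for ~100,000 requests per second at ~70 MB/s
npx wrangler pipelines update [PIPELINE-NAME] --shard-count 15
```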

## Limits
| Setting | Default | Minimum | Maximum |

src/content/docs/pipelines/concepts/how-pipelines-work.mdx

Lines changed: 5 additions & 5 deletions
@@ -5,7 +5,7 @@ sidebar:
  order: 1
---

- Cloudflare Pipelines let you ingest data from a source, and deliver to a sink. It's built for high volume, real time data streams. Each pipeline can ingest up to 100 MB/s of data, via HTTP or a Worker, and load the data as files in an R2 bucket.
+ Cloudflare Pipelines let you ingest data from a source and deliver it to a sink. It is built for high volume, real time data streams. Each pipeline can ingest up to 100 MB/s of data, via HTTP or a Worker, and load the data as files in an R2 bucket.

This guide explains how a pipeline works.

@@ -24,7 +24,7 @@ Multiple sources can be active on a single pipeline simultaneously. For example,
Pipelines can ingest JSON serializable records.

### Sinks
- Pipelines supports delivering data into [R2 Object Storage](/r2/). Ingested data is delivered as newline delimited JSON files (`ndjson`), with optional compression. Multiple pipelines can be configured to deliver data to the same R2 bucket.
+ Pipelines supports delivering data into [R2 Object Storage](/r2/). Ingested data is delivered as newline delimited JSON files (`ndjson`) with optional compression. Multiple pipelines can be configured to deliver data to the same R2 bucket.

## Data durability
Pipelines are designed to be reliable. Any data which is successfully ingested will be delivered to the configured R2 bucket, provided that the [R2 API credentials associated with a pipeline](/r2/api/tokens/) remain valid.
@@ -43,8 +43,8 @@ Pipelines update without dropping records. Updating an existing pipeline effecti
This means that updates might take a few minutes to go into effect. For example, if you update a pipeline's sink, previously ingested data might continue to be delivered into the old sink.

## Backpressure behavior
- If you send too much data, the pipeline will communicate backpressure by returning a 429 response to HTTP requests, or throwing an error if using the Workers API. Refer to the [limits](/pipelines/platform/limits) to learn how much volume a single pipeline can support. You might see 429 responses if you are sending too many requests, or sending too much data.
+ If you send too much data, the pipeline will communicate backpressure by returning a 429 response to HTTP requests, or throwing an error if using the Workers API. Refer to the [limits](/pipelines/platform/limits) to learn how much volume a single pipeline can support. You might see 429 responses if you are sending too many requests or sending too much data.

If you are consistently seeing backpressure from your pipeline, consider the following strategies:
- * Increase the [shard count](/pipelines/build-with-pipelines/shards), to increase the maxiumum throughput of your pipeline.
- * Send data to a second pipeline if you receive an error. You can setup multiple pipelines to write to the same R2 bucket.
+ * Increase the [shard count](/pipelines/build-with-pipelines/shards) to increase the maximum throughput of your pipeline.
+ * Send data to a second pipeline if you receive an error. You can set up multiple pipelines to write to the same R2 bucket.
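On the client side, transient 429 responses can also be absorbed with retries. curl's built-in retry flags sketch the idea (curl treats 429 as a retryable status):

```sh
# Retry up to 5 times with a 2-second delay when the pipeline signals backpressure
curl -X POST https://<PIPELINE-ID>.pipelines.cloudflare.com \
  --retry 5 --retry-delay 2 \
  -H "Content-Type: application/json" \
  -d '[{"event": "pageview"}]'
```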

src/content/docs/pipelines/getting-started.mdx

Lines changed: 11 additions & 11 deletions
@@ -13,10 +13,10 @@ import { Render, PackageManagers } from "~/components";
Cloudflare Pipelines allows you to ingest high volumes of real time streaming data, and load it into [R2 Object Storage](/r2/), without managing any infrastructure.

By following this guide, you will:
- 1. Setup an R2 bucket
- 2. Create a pipeline, with HTTP as a source, and an R2 bucket as a sink
- 3. Send data to your pipeline's HTTP ingestion endpoint
- 4. Verify the output delivered to R2
+ 1. Set up an R2 bucket.
+ 2. Create a pipeline, with HTTP as a source, and an R2 bucket as a sink.
+ 3. Send data to your pipeline's HTTP ingestion endpoint.
+ 4. Verify the output delivered to R2.
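For reference, step 1 is a single command (using the bucket name this guide assumes elsewhere):

```sh
# Create the R2 bucket the pipeline will deliver to
npx wrangler r2 bucket create my-bucket
```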

:::note

@@ -53,7 +53,7 @@ To create a pipeline using Wrangler, run the following command in a terminal, an
npx wrangler pipelines create my-clickstream-pipeline --r2-bucket my-bucket --batch-max-seconds 5 --compression none
```

- After running this command, you'll be prompted to authorize Cloudflare Workers Pipelines to create an R2 API token on your behalf. These tokens used by your pipeline when loading data into your bucket. You can approve the request through the browser link which will open automatically.
+ After running this command, you will be prompted to authorize Cloudflare Workers Pipelines to create an R2 API token on your behalf. These tokens are used by your pipeline when loading data into your bucket. You can approve the request through the browser link which will open automatically.

If you prefer not to authenticate this way, you may pass your [R2 API Token](/r2/api/tokens/) to Wrangler:
```sh
@@ -62,12 +62,12 @@ npx wrangler pipelines create my-clickstream-pipeline --r2-bucket my-bucket --r2

When choosing a name for your pipeline:

- 1. Ensure it is descriptive and relevant to the type of events you intend to ingest. You cannot change the name of the pipeline after creating it.
- 2. Pipeline names must be between 1 and 63 characters long.
- 3. The name cannot contain special characters outside dashes (`-`).
+ - Ensure it is descriptive and relevant to the type of events you intend to ingest. You cannot change the name of the pipeline after creating it.
+ - The pipeline name must be between 1 and 63 characters long.
+ - The name cannot contain special characters outside dashes (`-`).
4. The name must start and end with a letter or a number.

- You'll notice that we have set two optional flags while creating the pipeline: `--batch-max-seconds` and `--compression`. We've added these flags to make it faster for you to see the output of your first pipeline. For production use cases, we recommend keeping the default settings.
+ You will notice two optional flags are set while creating the pipeline: `--batch-max-seconds` and `--compression`. These flags are added to make it faster for you to see the output of your first pipeline. For production use cases, we recommend keeping the default settings.

Once you create your pipeline, you will receive an HTTP endpoint which you can post data to. You should see output as shown below:

@@ -133,7 +133,7 @@ Open the [R2 dashboard](https://dash.cloudflare.com/?to=/:account/r2/overview),

## Next steps

- * Learn about how to [setup authentication, or CORS settings](/pipelines/build-with-pipelines/http), on your HTTP endpoint
- * Send data to your Pipeline from a Cloudflare Worker, using our [Workers API documentation](/pipelines/build-with-pipelines/workers-apis)
+ * Learn how to [set up authentication or CORS settings](/pipelines/build-with-pipelines/http) on your HTTP endpoint.
+ * Send data to your pipeline from a Cloudflare Worker using the [Workers API documentation](/pipelines/build-with-pipelines/workers-apis).

If you have any feature requests or notice any bugs, share your feedback directly with the Cloudflare team by joining the [Cloudflare Developers community on Discord](https://discord.cloudflare.com).

src/content/docs/pipelines/observability/metrics.mdx

Lines changed: 2 additions & 2 deletions
@@ -45,9 +45,9 @@ The `pipelinesDeliveryAdaptiveGroups` dataset provides the following dimensions

## Query via the GraphQL API

- You can programmatically query analytics for your Workflows via the [GraphQL Analytics API](/analytics/graphql-api/). This API queries the same datasets as the Cloudflare dashboard, and supports GraphQL [introspection](/analytics/graphql-api/features/discovery/introspection/).
+ You can programmatically query analytics for your pipelines via the [GraphQL Analytics API](/analytics/graphql-api/). This API queries the same datasets as the Cloudflare dashboard and supports GraphQL [introspection](/analytics/graphql-api/features/discovery/introspection/).

- Pipelines GraphQL datasets require an `accountTag` filter, with your Cloudflare account ID.
+ Pipelines GraphQL datasets require an `accountTag` filter with your Cloudflare account ID.

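A minimal curl sketch of such a query: the required `accountTag` filter is the key part, and the concrete datasets and fields should be discovered via introspection (this skeleton only selects `__typename`):

```sh
# Query the GraphQL Analytics API, scoped to one account via accountTag
curl https://api.cloudflare.com/client/v4/graphql \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"query":"{ viewer { accounts(filter: {accountTag: \"<ACCOUNT_ID>\"}) { __typename } } }"}'
```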
### Measure total bytes & records ingested over time period
5353
