
Commit 591e655

Improved language

1 parent ae072d8
3 files changed: +6 −8 lines changed

src/content/docs/pipelines/build-with-pipelines/shards.mdx

Lines changed: 3 additions & 3 deletions
@@ -34,14 +34,14 @@ When a record is sent to a pipeline:
 Increasing the number of shards will increase the maximum throughput of a pipeline, as well as the number of output files created.
 
 ### Example
-Your workload might require making 5,0000 requests per second to a pipeline. If you create a pipeline with a single shard, all 5,000 requests will be routed to the same shard. If your pipeline has been configured with a maximum batch duration of 1 second, every second, all 5,000 requests will be batched, and a single file will be delivered.
+Your workload might require making 5,000 requests per second to a pipeline. If you create a pipeline with a single shard, all 5,000 requests will be routed to the same shard. If your pipeline has been configured with a maximum batch duration of 1 second, every second, all 5,000 requests will be batched, and a single file will be delivered.
 
 Increasing the shard count to 2 will double the number of output files. The 5,000 requests will be split into 2,500 requests to each shard. Every second, each shard will create a batch of data, and deliver to R2.
 
-## Why shouldn't I set the shard count to the maximum?
+## Considerations while increasing the shard count
 Increasing the shard count also increases the number of output files that your pipeline generates. This in turn increases the [cost of writing data to R2](/r2/pricing/#class-a-operations), as each file written to R2 counts as a single class A operation. Additionally, smaller files are slower, and more expensive, to query. Rather than setting the maximum, choose a shard count based on your workload needs.
 
-## How should I decide the number of shards to use?
+## Determine the right number of shards
 Choose a shard count based on these factors:
 * The number of requests per second you will make to your pipeline
 * The amount of data per second you will send to your pipeline
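To make the arithmetic in the example above concrete, here is a small TypeScript sketch. It is illustrative only: the field names and numbers are assumptions for this back-of-envelope model, not a Pipelines API.

```ts
// Back-of-envelope model of how shard count affects output file volume,
// mirroring the example in shards.mdx. All names here are illustrative.

interface ShardingScenario {
  requestsPerSecond: number; // incoming load on the pipeline
  shardCount: number; // configured number of shards
  maxBatchDurationSeconds: number; // how often each shard flushes a batch
}

// Requests are split evenly across shards, and each shard delivers one
// file per batch duration, so files/s scales with the shard count.
function estimateOutput(s: ShardingScenario) {
  const requestsPerShard = s.requestsPerSecond / s.shardCount;
  const filesPerSecond = s.shardCount / s.maxBatchDurationSeconds;
  return { requestsPerShard, filesPerSecond };
}

// One shard: all 5,000 req/s hit the same shard; one file per second.
console.log(estimateOutput({ requestsPerSecond: 5_000, shardCount: 1, maxBatchDurationSeconds: 1 }));
// Two shards: 2,500 req/s per shard; two files per second.
console.log(estimateOutput({ requestsPerSecond: 5_000, shardCount: 2, maxBatchDurationSeconds: 1 }));
```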

src/content/docs/pipelines/concepts/how-pipelines-work.mdx

Lines changed: 2 additions & 4 deletions
@@ -7,8 +7,6 @@ sidebar:
 
 Cloudflare Pipelines let you ingest data from a source and deliver to a sink. It is built for high volume, real time data streams. Each pipeline can ingest up to 100 MB/s of data, via HTTP or a Worker, and load the data as files in an R2 bucket.
 
-This guide explains how a pipeline works.
-
 ![Pipelines Architecture](~/assets/images/pipelines/architecture.png)
 
 ## Supported sources, data formats, and sinks
@@ -27,15 +25,15 @@ Pipelines can ingest JSON serializable records.
 Pipelines supports delivering data into [R2 Object Storage](/r2/). Ingested data is delivered as newline delimited JSON files (`ndjson`) with optional compression. Multiple pipelines can be configured to deliver data to the same R2 bucket.
 
 ## Data durability
-Pipelines are designed to be reliable. Any data which is successfully ingested will be delivered to the configured R2 bucket, provided that the [R2 API credentials associated with a pipeline](/r2/api/tokens/) remain valid.
+Pipelines are designed to be reliable. Any data which is successfully ingested will be delivered, at least once, to the configured R2 bucket, provided that the [R2 API credentials associated with a pipeline](/r2/api/tokens/) remain valid. Ordering of records is best effort.
 
 Each pipeline maintains a storage buffer. Requests to send data to a pipeline receive a successful response only after the data is committed to this storage buffer.
 
 Ingested data accumulates, until a sufficiently [large batch of data](/pipelines/build-with-pipelines/output-settings/#customize-batch-behavior) has been filled. Once the batch reaches its target size, the entire batch of data is converted to a file and delivered to R2.
 
 Transient failures, such as network connectivity issues, are automatically retried.
 
-However, if the [R2 API credentials associated with a pipeline](/r2/api/tokens/) expire or are revoked, data delivery will fail. In this scenario, some data might continue to accumulate in the buffers, but the pipeline will eventually start rejecting requests.
+However, if the [R2 API credentials associated with a pipeline](/r2/api/tokens/) expire or are revoked, data delivery will fail. In this scenario, some data might continue to accumulate in the buffers, but the pipeline will eventually start rejecting requests once the buffers are full.
 
 ## Updating a pipeline
 Pipelines update without dropping records. Updating an existing pipeline effectively creates a new instance of the pipeline. Requests are gracefully re-routed to the new instance. The old instance continues to write data into your configured sink. Once the old instance is fully drained, it is spun down.
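The durability contract described in this hunk, where a send succeeds only once data is committed to the storage buffer, can be illustrated with a minimal Worker sketch. The binding name `PIPELINE`, the record shape, and the `send()` signature are assumptions for illustration, not a definitive interface.

```ts
// Minimal Worker sketch of the durability contract described above:
// send() resolves only after the records are committed to the pipeline's
// storage buffer, and delivery to R2 is then at-least-once.
// The binding name and record shape are assumptions for illustration.

export interface Env {
  PIPELINE: { send(records: object[]): Promise<void> };
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const record = { url: request.url, receivedAt: Date.now() };
    try {
      // Resolves once the data is buffered. Because delivery is
      // at-least-once and ordering is best effort, downstream
      // consumers should be prepared to deduplicate.
      await env.PIPELINE.send([record]);
      return new Response("accepted", { status: 202 });
    } catch {
      // send() can fail when the pipeline is rejecting requests,
      // for example once buffers fill after credential expiry.
      return new Response("pipeline unavailable", { status: 503 });
    }
  },
};
```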

src/content/docs/pipelines/platform/limits.mdx

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ import { Render } from "~/components"
 | Maximum batch duration | 300s |
 
 
-## What happens if I exceed the requests per second or throughput limits?
+## Exceeding requests per second or throughput limits
 If you consistently exceed the requests per second or throughput limits, your pipeline might not be able to keep up with the load. The pipeline will communicate backpressure by returning a 429 response to HTTP requests or throwing an error if using the Workers API.
 
 If you are consistently seeing backpressure from your pipeline, consider the following strategies:
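One such strategy is retrying with exponential backoff when the pipeline signals backpressure via a 429 response. A minimal sketch, assuming a generic HTTP ingestion endpoint (the URL is a placeholder, not a documented Pipelines address):

```ts
// Sketch of one backpressure strategy: retry HTTP ingestion with
// exponential backoff on 429. The endpoint URL is a placeholder.

async function sendWithBackoff(
  endpoint: string,
  records: object[],
  maxAttempts = 5,
): Promise<Response> {
  let delayMs = 100;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetch(endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(records),
    });
    // Anything other than 429 is a final answer (success or hard error).
    if (res.status !== 429) return res;
    // 429 signals backpressure: wait, then retry with a doubled delay.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    delayMs *= 2;
  }
  throw new Error(`pipeline still backpressured after ${maxAttempts} attempts`);
}
```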
