src/content/docs/pipelines/build-with-pipelines/shards.mdx
+3 −3 (3 additions & 3 deletions)
@@ -34,14 +34,14 @@ When a record is sent to a pipeline:
Increasing the number of shards will increase the maximum throughput of a pipeline, as well as the number of output files created.
### Example
-Your workload might require making 5,0000 requests per second to a pipeline. If you create a pipeline with a single shard, all 5,000 requests will be routed to the same shard. If your pipeline has been configured with a maximum batch duration of 1 second, every second, all 5,000 requests will be batched, and a single file will be delivered.
+Your workload might require making 5,000 requests per second to a pipeline. If you create a pipeline with a single shard, all 5,000 requests will be routed to the same shard. If your pipeline has been configured with a maximum batch duration of 1 second, every second, all 5,000 requests will be batched, and a single file will be delivered.
Increasing the shard count to 2 will double the number of output files. The 5,000 requests will be split into 2,500 requests to each shard. Every second, each shard will create a batch of data, and deliver to R2.
-## Why shouldn't I set the shard count to the maximum?
+## Considerations while increasing the shard count
Increasing the shard count also increases the number of output files that your pipeline generates. This in turn increases the [cost of writing data to R2](/r2/pricing/#class-a-operations), as each file written to R2 counts as a single class A operation. Additionally, smaller files are slower, and more expensive, to query. Rather than setting the maximum, choose a shard count based on your workload needs.
-## How should I decide the number of shards to use?
+## Determine the right number of shards
Choose a shard count based on these factors:
* The number of requests per second you will make to your pipeline
* The amount of data per second you will send to your pipeline
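To make the trade-off in this diff concrete, here is a minimal TypeScript sketch of the arithmetic: each shard flushes at most one file per batch-duration window, and each file written to R2 counts as one Class A operation. The helper name and the per-million-operation cost figure are illustrative assumptions, not values from these docs; check the linked R2 pricing page for actual rates.

```ts
// Back-of-the-envelope sketch of how shard count drives output file
// volume and R2 Class A operation counts, per the shards.mdx diff above.
// The cost rate is a placeholder assumption, not an official price.

interface ShardEstimate {
  filesPerSecond: number;
  filesPerMonth: number;
  estimatedClassACostUSD: number;
}

function estimateOutputFiles(
  shardCount: number,
  maxBatchDurationSeconds: number,
  classACostPerMillionUSD = 4.5 // assumed placeholder rate
): ShardEstimate {
  // Each shard flushes at most one file per batch-duration window.
  const filesPerSecond = shardCount / maxBatchDurationSeconds;
  const filesPerMonth = filesPerSecond * 60 * 60 * 24 * 30;
  // Each file written to R2 counts as one Class A operation.
  const estimatedClassACostUSD =
    (filesPerMonth / 1_000_000) * classACostPerMillionUSD;
  return { filesPerSecond, filesPerMonth, estimatedClassACostUSD };
}

// The doc's example: 5,000 req/s with a 1-second batch duration.
console.log(estimateOutputFiles(1, 1)); // 1 file/s
console.log(estimateOutputFiles(2, 1)); // 2 files/s -- double the Class A ops
```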
src/content/docs/pipelines/concepts/how-pipelines-work.mdx
+2 −4 (2 additions & 4 deletions)
@@ -7,8 +7,6 @@ sidebar:
Cloudflare Pipelines let you ingest data from a source and deliver to a sink. It is built for high volume, real time data streams. Each pipeline can ingest up to 100 MB/s of data, via HTTP or a Worker, and load the data as files in an R2 bucket.
@@ -27,15 +25,15 @@ Pipelines can ingest JSON serializable records.
Pipelines supports delivering data into [R2 Object Storage](/r2/). Ingested data is delivered as newline delimited JSON files (`ndjson`) with optional compression. Multiple pipelines can be configured to deliver data to the same R2 bucket.
## Data durability
-Pipelines are designed to be reliable. Any data which is successfully ingested will be deliveredto the configured R2 bucket, provided that the [R2 API credentials associated with a pipeline](/r2/api/tokens/) remain valid.
+Pipelines are designed to be reliable. Any data which is successfully ingested will be delivered, at least once, to the configured R2 bucket, provided that the [R2 API credentials associated with a pipeline](/r2/api/tokens/) remain valid. Ordering of records is best effort.
Each pipeline maintains a storage buffer. Requests to send data to a pipeline receive a successful response only after the data is committed to this storage buffer.
Ingested data accumulates, until a sufficiently [large batch of data](/pipelines/build-with-pipelines/output-settings/#customize-batch-behavior) has been filled. Once the batch reaches its target size, the entire batch of data is converted to a file and delivered to R2.
Transient failures, such as network connectivity issues, are automatically retried.
-However, if the [R2 API credentials associated with a pipeline](/r2/api/tokens/) expire or are revoked, data delivery will fail. In this scenario, some data might continue to accumulate in the buffers, but the pipeline will eventually start rejecting requests.
+However, if the [R2 API credentials associated with a pipeline](/r2/api/tokens/) expire or are revoked, data delivery will fail. In this scenario, some data might continue to accumulate in the buffers, but the pipeline will eventually start rejecting requests once the buffers are full.
## Updating a pipeline
Pipelines update without dropping records. Updating an existing pipeline effectively creates a new instance of the pipeline. Requests are gracefully re-routed to the new instance. The old instance continues to write data into your configured sink. Once the old instance is fully drained, it is spun down.
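The durability wording in this diff (a successful response only after data is committed to the storage buffer, at-least-once delivery) maps onto how a Worker would use a pipeline. A minimal sketch, assuming a Pipelines binding named `PIPELINE` whose `send()` accepts an array of JSON-serializable records; the binding shape and error behavior below are assumptions for illustration, not an exact contract.

```ts
// Minimal Worker sketch, assuming a Pipelines binding named PIPELINE.
// A resolved send() means the records were committed to the pipeline's
// storage buffer (the durability guarantee described above); it does NOT
// mean the batch has already been delivered to R2.

interface Env {
  PIPELINE: { send(records: object[]): Promise<void> }; // assumed binding shape
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const record = {
      url: request.url,
      method: request.method,
      receivedAt: new Date().toISOString(),
    };
    try {
      // Resolves once the data is committed to the storage buffer.
      await env.PIPELINE.send([record]);
      return new Response("accepted", { status: 202 });
    } catch {
      // send() is assumed to throw when the pipeline applies backpressure,
      // for example once buffers are full.
      return new Response("pipeline unavailable, retry later", { status: 503 });
    }
  },
};
```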
src/content/docs/pipelines/platform/limits.mdx
+1 −1 (1 addition & 1 deletion)
@@ -19,7 +19,7 @@ import { Render } from "~/components"
| Maximum batch duration | 300s |
-## What happens if I exceed the requests per second or throughput limits?
+## Exceeding requests per second or throughput limits
If you consistently exceed the requests per second or throughput limits, your pipeline might not be able to keep up with the load. The pipeline will communicate backpressure by returning a 429 response to HTTP requests or throwing an error if using the Workers API.
If you are consistently seeing backpressure from your pipeline, consider the following strategies:
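One plausible strategy is client-side retry with exponential backoff when the pipeline signals backpressure via 429. A minimal TypeScript sketch, assuming a placeholder HTTP ingestion endpoint and arbitrary retry parameters; substitute your pipeline's actual endpoint.

```ts
// Hedged sketch of handling the 429 backpressure signal described in
// limits.mdx: back off and retry instead of dropping records.
// The endpoint URL is a placeholder, not a real pipeline endpoint.

const PIPELINE_URL = "https://example-pipeline.example.com"; // placeholder

async function sendWithBackoff(
  records: object[],
  maxAttempts = 5
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetch(PIPELINE_URL, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(records),
    });
    if (response.ok) return;
    if (response.status !== 429) {
      throw new Error(`pipeline request failed: ${response.status}`);
    }
    // 429 signals backpressure: wait, then retry with exponential backoff.
    const delayMs = 2 ** attempt * 100;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error("pipeline still applying backpressure after retries");
}
```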