src/content/docs/pipelines/concepts/how-pipelines-work.mdx
4 additions & 4 deletions
@@ -5,9 +5,9 @@ sidebar:
  order: 1
---

-Cloudflare Pipelines let you ingest data from a source, and deliver to a destination. It's built for high volume, real time data streams. Each Pipeline can ingest up to 100 MB/s of data, via HTTP or a Worker, and load the data as files in an R2 bucket.
+Cloudflare Pipelines let you ingest data from a source and deliver it to a destination. It's built for high-volume, real-time data streams. Each pipeline can ingest up to 100 MB/s of data via HTTP or a Worker, and load the data as files in an R2 bucket.
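To make the Worker ingestion path concrete, here is a minimal sketch; the binding name `MY_PIPELINE`, the `send()` signature, and the record shape are assumptions for illustration, not part of this change:

```ts
// Sketch of ingesting records from a Worker, assuming a Pipelines binding
// named MY_PIPELINE (configured in wrangler.toml) whose send() accepts an
// array of JSON-serializable records. The record shape is illustrative.
interface Env {
  MY_PIPELINE: { send(records: object[]): Promise<void> };
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const records = [{ event: "pageview", url: request.url, ts: Date.now() }];
    await env.MY_PIPELINE.send(records); // one call can carry many records
    return new Response("ok");
  },
};
```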
@@ -16,7 +16,7 @@ Pipelines supports ingestion via [HTTP](/pipelines/build-with-pipelines/http), o

A pipeline can ingest JSON-serializable records.

-Finally, Pipelines supports R2 as a sink. Ingested data is written to output files, compressed, and delivered to an R2 bucket. Output files are generated as newline delimited JSON files (`ndjson`). The filename of each output file is prefixed by the event date and time, to make querying the data more efficient. For example, an output fle might be named like this: `event_date=2025-04-03/hr=15/01JQY361X75TMYSQZGWC6ZDMR2.json.gz`. Each line in an output file maps to a single record ingested by a pipeline.
+Finally, Pipelines supports writing data to [R2 Object Storage](/r2/). Ingested data is written to output files, compressed, and delivered to an R2 bucket. Output files are generated as newline-delimited JSON (`ndjson`) files. The filename of each output file is prefixed with the event date and time, to make querying the data more efficient. For example, an output file might be named like this: `event_date=2025-04-03/hr=15/01JQY361X75TMYSQZGWC6ZDMR2.json.gz`. Each line in an output file maps to a single record ingested by a pipeline.

We plan to support more sources, data formats, and sinks in the future.
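As an illustration of the output layout described above, a Worker could read one hour of a pipeline's output back out of R2; the `BUCKET` binding name and the prefix are assumptions, and this is a sketch rather than a recommended pattern:

```ts
// Sketch: reading a pipeline's gzipped ndjson output from R2 in a Worker.
// Assumes an R2 binding named BUCKET; the prefix matches the example
// filename above. Pagination of the listing is omitted for brevity.
interface Env {
  BUCKET: R2Bucket;
}

export default {
  async fetch(_req: Request, env: Env): Promise<Response> {
    const records: unknown[] = [];
    // Output files are grouped by event date and hour, so one prefix
    // selects a single hour of ingested data.
    const listing = await env.BUCKET.list({ prefix: "event_date=2025-04-03/hr=15/" });
    for (const object of listing.objects) {
      const file = await env.BUCKET.get(object.key);
      if (!file) continue;
      // Files are gzip-compressed; decompress, then split into lines.
      const text = await new Response(
        file.body.pipeThrough(new DecompressionStream("gzip")),
      ).text();
      for (const line of text.split("\n")) {
        if (line.trim()) records.push(JSON.parse(line)); // one record per line
      }
    }
    return Response.json({ count: records.length });
  },
};
```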
@@ -27,7 +27,7 @@ Any data sent to a pipeline is durably committed to storage. Pipelines use [SQLi

Ingested data is buffered until a sufficiently large batch of data has accumulated. Batching is useful to reduce the number of output files written out to R2. [Batch sizes are customizable](/pipelines/build-with-pipelines/output-settings/#customize-batch-behavior), in terms of data volume, rows, or time.

-Finally, the the batch of data is converted into output files, which are compressed, and delivered to the configured R2 bucket. Any transient failures, such as network failures, are automatically retried.
+Finally, when a batch has reached its target size, it is written out to a file. The file is compressed and delivered to the configured R2 bucket. Any transient failures, such as network failures, are automatically retried.

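The batch behavior reads naturally as a first-threshold-wins flush rule. The sketch below is conceptual, not Pipelines' internal implementation, and the example limits are made up:

```ts
// Conceptual sketch of the batching rule described above: a batch is
// flushed to R2 once it hits a size, row-count, or age threshold,
// whichever comes first. Values are illustrative, not Pipelines defaults.
interface BatchLimits {
  maxBytes: number;   // e.g. data volume limit
  maxRows: number;    // e.g. row-count limit
  maxSeconds: number; // e.g. maximum batch age
}

function shouldFlush(
  bufferedBytes: number,
  bufferedRows: number,
  batchAgeSeconds: number,
  limits: BatchLimits,
): boolean {
  return (
    bufferedBytes >= limits.maxBytes ||
    bufferedRows >= limits.maxRows ||
    batchAgeSeconds >= limits.maxSeconds
  );
}
```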
## How a Pipeline handles updates

Data delivery is guaranteed even while updating an existing pipeline. An update effectively creates a new deployment that carries over all of your previously configured options. Requests are gracefully re-routed to the new pipeline, while the old pipeline continues to write data to your destination. Once the old pipeline is fully drained, it is spun down.