Merged

46 commits
bec1ef4
update local playbook + ss schema id validation
micheleRP Mar 31, 2025
896d86a
update playbook
micheleRP Mar 31, 2025
399fde9
fix link
micheleRP Mar 31, 2025
9549015
fix playbook
micheleRP Apr 1, 2025
1e1f74b
single sourcing updates
micheleRP Apr 1, 2025
d2efddf
conditionalize audit logging
micheleRP Apr 1, 2025
866b6bc
minor edits
micheleRP Apr 2, 2025
093d21e
single source data transforms
micheleRP Apr 2, 2025
fcdadc4
single source transforms SDK reference
micheleRP Apr 2, 2025
549aa20
fix links
micheleRP Apr 2, 2025
3639348
tag cluster properties in cloud
micheleRP Apr 2, 2025
c503200
add SS + conditionalizing
micheleRP Apr 3, 2025
107c2ec
clean up conditionalizing
micheleRP Apr 3, 2025
4aacff3
Update modules/develop/pages/data-transforms/deploy.adoc
micheleRP Apr 3, 2025
3d6eb11
fix tags
micheleRP Apr 3, 2025
3e1904d
add topic properties reference
micheleRP Apr 4, 2025
792e2f7
typo
micheleRP Apr 4, 2025
95abfe2
Fix broken tag
JakeSCahill Apr 4, 2025
f66f6fa
Fix tags
JakeSCahill Apr 4, 2025
49214c1
unconditionalize fixes
micheleRP Apr 5, 2025
22cc20e
fix conditionals
micheleRP Apr 5, 2025
65959ce
fix conditionals
micheleRP Apr 6, 2025
31787e3
conditionalize console
micheleRP Apr 6, 2025
3e23dd1
conditionalize wasm properties in text
micheleRP Apr 6, 2025
c79d85b
fix audit_excluded_topics
micheleRP Apr 7, 2025
6ad5d7a
Merge branch 'main' into DOC-1160-single-source-SM-for-Cloud-cluster-…
JakeSCahill Apr 7, 2025
4d9e3d6
rename topic-iceberg-integration to about-iceberg-topics
micheleRP Apr 7, 2025
9dfe691
rename about-iceberg-topics.adoc
micheleRP Apr 7, 2025
0835a1d
fix errors
micheleRP Apr 7, 2025
5caf1cd
Merge branch 'main' into DOC-1160-single-source-SM-for-Cloud-cluster-…
micheleRP Apr 7, 2025
1760058
Update modules/console/pages/ui/edit-topic-configuration.adoc
micheleRP Apr 7, 2025
af96d92
incorporate feedback from code review
micheleRP Apr 7, 2025
1a93d46
Merge branch 'main' into DOC-1160-single-source-SM-for-Cloud-cluster-…
micheleRP Apr 7, 2025
d9f6e10
update to latest description
micheleRP Apr 7, 2025
beac26f
contact RP to configure auditable events
micheleRP Apr 8, 2025
c4645d3
minor edit
micheleRP Apr 8, 2025
f561099
fix conditionals in audit-logging.adoc
micheleRP Apr 8, 2025
4f190b3
conditionalize rpk cluster config in auditing
micheleRP Apr 8, 2025
2d2a18a
conditionalize audit-loggin
micheleRP Apr 8, 2025
65517c2
fix condition
micheleRP Apr 8, 2025
240d403
Fix single-sourcing
JakeSCahill Apr 8, 2025
572dab6
Update audit-logging.adoc
JakeSCahill Apr 8, 2025
7d24f98
fix links for Cloud properties
micheleRP Apr 8, 2025
f7609a9
Update audit-logging.adoc
JakeSCahill Apr 8, 2025
d19fe4d
style edit, fix typos
micheleRP Apr 8, 2025
0f373ec
fix Next steps + local-antora-playbook
micheleRP Apr 8, 2025
2 changes: 1 addition & 1 deletion modules/ROOT/nav.adoc
Original file line number Diff line number Diff line change
@@ -180,7 +180,7 @@
*** xref:manage:topic-recovery.adoc[Topic Recovery]
*** xref:manage:whole-cluster-restore.adoc[Whole Cluster Restore]
** xref:manage:iceberg/index.adoc[Iceberg]
*** xref:manage:iceberg/topic-iceberg-integration.adoc[About Iceberg Topics]
*** xref:manage:iceberg/about-iceberg-topics.adoc[About Iceberg Topics]
*** xref:manage:iceberg/use-iceberg-catalogs.adoc[Use Iceberg Catalogs]
*** xref:manage:iceberg/query-iceberg-topics.adoc[Query Iceberg Topics]
*** xref:manage:iceberg/redpanda-topics-iceberg-snowflake-catalog.adoc[Query Iceberg Topics with Snowflake]
21 changes: 13 additions & 8 deletions modules/console/pages/ui/data-transforms.adoc
@@ -1,43 +1,46 @@
= Manage Data Transforms in Redpanda Console
:description: You can use Redpanda Console to monitor the status and performance metrics of your transform functions. You can also view detailed logs and delete transform functions when they are no longer needed.
= Manage Data Transforms in {ui}
:description: Use {ui} to monitor the status and performance metrics of your transform functions. You can also view detailed logs and delete transform functions when they are no longer needed.
// tag::single-source[]

{description}

== Prerequisites

Before you begin, ensure that you have the following:

ifndef::env-cloud[]
- Redpanda Console must be xref:console:config/connect-to-redpanda.adoc[connected to a Redpanda cluster].
- Redpanda Console must be xref:console:config/connect-to-redpanda.adoc#admin[configured to connect to the Redpanda Admin API].
endif::[]
- xref:develop:data-transforms/configure.adoc#enable-transforms[Data transforms enabled] in your Redpanda cluster.
- At least one transform function deployed to your Redpanda cluster.

[[monitor]]
== Monitor transform functions

To monitor transform functions in Redpanda Console:
To monitor transform functions:

. Navigate to the *Transforms* menu.
. Click on the name of a transform function to view detailed information:
. Click the name of a transform function to view detailed information:
- The partitions that the function is running on
- The broker (node) ID
- Any lag (the amount of pending records on the input topic that have yet to be processed by the transform)

[[logs]]
== View logs

To view logs for a transform function in Redpanda Console:
To view logs for a transform function:

. Navigate to the *Transforms* menu.
. Click on the name of a transform function.
. Click the *Logs* tab to see the logs.

Redpanda Console displays a limited number of logs for transform functions. To view the full history of logs, use the xref:develop:data-transforms/monitor.adoc#logs[`rpk` command-line tool].
{ui} displays a limited number of logs for transform functions. To view the full history of logs, use the xref:develop:data-transforms/monitor.adoc#logs[`rpk` command-line tool].

[[delete]]
== Delete transform functions

To delete a transform function in Redpanda Console:
To delete a transform function:

1. Navigate to the *Transforms* menu.
2. Find the transform function you want to delete from the list.
@@ -50,4 +50,6 @@ Deleting a transform function will remove it from the cluster and stop any furth

- xref:develop:data-transforms/how-transforms-work.adoc[]
- xref:develop:data-transforms/deploy.adoc[]
- xref:develop:data-transforms/monitor.adoc[]
- xref:develop:data-transforms/monitor.adoc[]

// end::single-source[]
4 changes: 2 additions & 2 deletions modules/console/pages/ui/edit-topic-configuration.adoc
@@ -1,7 +1,7 @@
= Edit Topic Configuration in the {ui}
= Edit Topic Configuration in {ui}
:page-aliases: manage:console/edit-topic-configuration.adoc
// tag::single-source[]
:description: Use {ui} to edit the configuration of existing topics in a cluster.
// tag::single-source[]

{description}

2 changes: 1 addition & 1 deletion modules/console/pages/ui/programmable-push-filters.adoc
@@ -1,8 +1,8 @@
= Filter Messages with JavaScript in {ui}
:page-aliases: console:features/programmable-push-filters.adoc, reference:console/programmable-push-filters.adoc
// Do not put page aliases in the single-sourced content
// tag::single-source[]
:description: Learn how to filter Kafka records using custom JavaScript code within {ui}.
// tag::single-source[]

You can use push-down filters in {ui} to search through large Kafka topics that may contain millions of records. Filters are JavaScript functions executed on the backend, evaluating each record individually. Your function must return a boolean:

2 changes: 1 addition & 1 deletion modules/console/pages/ui/record-deserialization.adoc
@@ -1,7 +1,7 @@
= View Deserialized Messages in {ui}
:page-aliases: console:features/record-deserialization.adoc, manage:console/protobuf.adoc, reference:console/record-deserialization.adoc
// tag::single-source[]
:description: Learn how {ui} deserializes messages.
// tag::single-source[]

In Redpanda, the messages exchanged between producers and consumers contain raw bytes. Schemas work as an agreed-upon format, like a contract, for producers and consumers to serialize and deserialize those messages. If a producer breaks this contract, consumers can fail.

4 changes: 2 additions & 2 deletions modules/console/pages/ui/schema-reg.adoc
@@ -1,14 +1,14 @@
= Use Schema Registry in {ui}
:page-aliases: manage:schema-reg/schema-reg-ui.adoc
:page-categories: Management, Schema Registry
// tag::single-source[]
:description: Perform common Schema Registry management operations in the {ui}.
// tag::single-source[]

In {ui}, the *Schema Registry* menu lists registered and verified schemas, including their serialization format and versions. Select an individual schema to see which topics it applies to.

[NOTE]
====
The Schema Registry is built into Redpanda, and you can use it with the Schema Registry API or with {ui}. This section describes Schema Registry operations available in {ui}.
The Schema Registry is built into Redpanda, and you can use it with the Schema Registry API or with the UI. This section describes Schema Registry operations available in the UI.
====

ifndef::env-cloud[]
14 changes: 10 additions & 4 deletions modules/develop/pages/data-transforms/build.adoc
@@ -1,14 +1,20 @@
= Develop Data Transforms
:description: Learn how to initialize a data transforms project and write transform functions in your chosen language.
:page-categories: Development, Stream Processing, Data Transforms
// tag::single-source[]

{description}

== Prerequisites

You must have the following development tools installed on your host machine:

ifdef::env-cloud[]
* The xref:manage:rpk/rpk-install.adoc[`rpk` command-line client] installed.
endif::[]
ifndef::env-cloud[]
* The xref:get-started:rpk-install.adoc[`rpk` command-line client] installed on your host machine and configured to connect to your Redpanda cluster.
endif::[]
* For Golang projects, you must have at least version 1.20 of https://go.dev/doc/install[Go^].
* For Rust projects, you must have the latest stable version of https://rustup.rs/[Rust^].
* For JavaScript and TypeScript projects, you must have the https://nodejs.org/en/download/package-manager[latest long-term-support release of Node.js^].
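
The toolchain prerequisites above can be sanity-checked with a short shell sketch (an illustrative helper, assuming the standard `go`, `rustc`, and `node` binaries are on `PATH`; a missing tool only matters for that language's projects):

```shell
# Sketch: report which transform toolchains are available on this machine.
report=""
for tool in go rustc node; do
  if command -v "$tool" >/dev/null 2>&1; then
    report="$report$tool: installed\n"
  else
    report="$report$tool: not found\n"
  fi
done
printf '%b' "$report"
```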
@@ -46,9 +52,7 @@ For example, if you choose `tinygo-no-goroutines`, the following project files a
The `transform.go` file contains a boilerplate transform function.
The `transform.yaml` file specifies the configuration settings for the transform function.

See also:

- xref:develop:data-transforms/configure.adoc[]
See also: xref:develop:data-transforms/configure.adoc[]

== Build transform functions

@@ -284,7 +288,7 @@ See also:
- xref:develop:data-transforms/monitor#logs[View logs for transform functions]
- xref:develop:data-transforms/monitor.adoc[Monitor data transforms]
- xref:develop:data-transforms/configure.adoc#log[Configure transform logging]
- xref:reference:rpk/rpk-transform/rpk-transform-logs.adoc[]
- xref:reference:rpk/rpk-transform/rpk-transform-logs.adoc[`rpk transform logs` reference]

=== Avoid state management

@@ -458,3 +462,5 @@ xref:develop:data-transforms/configure.adoc[]
- xref:develop:data-transforms/how-transforms-work.adoc[]
- xref:reference:data-transforms/sdks.adoc[]
- xref:reference:rpk/rpk-transform/rpk-transform.adoc[`rpk transform` commands]

// end::single-source[]
22 changes: 15 additions & 7 deletions modules/develop/pages/data-transforms/configure.adoc
@@ -1,6 +1,7 @@
= Configure Data Transforms
:description: pass:q[Learn how to configure data transforms in Redpanda, including editing the `transform.yaml` file, environment variables, and memory settings. This topic covers both the configuration of transform functions and the WebAssembly (Wasm) engine's environment.]
:page-categories: Development, Stream Processing, Data Transforms
// tag::single-source[]

{description}

@@ -41,7 +42,7 @@ env:

You can set the name of the transform function, environment variables, and input and output topics on the command-line when you deploy the transform. These command-line settings take precedence over those specified in the `transform.yaml` file.

See xref:develop:data-transforms/deploy.adoc[].
See xref:develop:data-transforms/deploy.adoc[]
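
The file-based settings described above can be sketched as a minimal `transform.yaml` (the values and the `regex-filter`/`PATTERN` names are hypothetical; treat the file generated by `rpk transform init` as the authoritative schema):

```yaml
# Hypothetical transform.yaml sketch: function name, input and
# output topics, and environment variables for one transform.
name: regex-filter
input-topic: orders
output-topics:
  - orders-filtered
env:
  PATTERN: "^order-.*"
```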

[[built-in]]
=== Built-In environment variables
@@ -62,19 +63,20 @@ This section covers how to configure the Wasm engine environment using Redpanda

To use data transforms, you must enable it for a Redpanda cluster using the xref:reference:properties/cluster-properties.adoc#data_transforms_enabled[`data_transforms_enabled`] property.
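
As a sketch of the enablement step above (the property name comes from the text; the snippet is guarded so it is a harmless no-op on machines without `rpk`):

```shell
# Sketch: enable data transforms cluster-wide on a Self-Managed cluster.
if command -v rpk >/dev/null 2>&1; then
  rpk cluster config set data_transforms_enabled true || true
else
  echo "rpk not installed; skipping"
fi
done_marker="ok"
```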

[[resources]]
ifndef::env-cloud[]
=== Configure memory resources for data transforms

Redpanda reserves memory for each transform function within the broker. You need enough memory for your input record and output record to be in memory at the same time.

Set the following properties based on the number of functions you have and the amount of memory you anticipate needing.
Set the following based on the number of functions you have and the amount of memory you anticipate needing.

- xref:reference:properties/cluster-properties.adoc#data_transforms_per_core_memory_reservation[`data_transforms_per_core_memory_reservation`]: Increase this setting if you plan to deploy a large number of data transforms or if your transforms are memory-intensive. Reducing it may limit the number of concurrent transforms.

- xref:reference:properties/cluster-properties.adoc#data_transforms_per_function_memory_limit[`data_transforms_per_function_memory_limit`]: Adjust this setting if individual transform functions require more memory to process records efficiently. Reducing it may cause memory errors in complex transforms.

The maximum number of functions that can be deployed to a cluster is equal to `data_transforms_per_core_memory_reservation` / `data_transforms_per_function_memory_limit`. When that limit is hit, Redpanda cannot allocate memory for the VM and the transforms stay in `errored` states.
endif::[]
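
The capacity rule above (maximum functions = reservation divided by per-function limit) can be sketched numerically; the byte values here are assumed examples, not the shipped defaults:

```shell
# Sketch: how many transform functions fit in the per-core reservation.
per_core_reservation=$((20 * 1024 * 1024))  # assumed value for data_transforms_per_core_memory_reservation
per_function_limit=$((2 * 1024 * 1024))     # assumed value for data_transforms_per_function_memory_limit
max_functions=$((per_core_reservation / per_function_limit))
echo "max deployable functions: $max_functions"
```

With these assumed values, a 21st deployment would fail to allocate memory and the transform would sit in an `errored` state, as the text describes.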

ifndef::env-cloud[]
[[binary-size]]
=== Configure maximum binary size

@@ -88,25 +90,31 @@ Increase this setting if your Wasm binaries are larger than the default limit. S
You can set the interval at which data transforms commit their progress using the xref:reference:properties/cluster-properties.adoc#data_transforms_commit_interval_ms[`data_transforms_commit_interval_ms`] property.

Adjust this setting to control how frequently the transform function's progress is committed. Shorter intervals may provide more frequent progress updates but can increase load. Longer intervals reduce load but may delay progress updates.
endif::[]

[[log]]
=== Configure transform logging
The following properties configure logging for data transforms:

Redpanda provides several properties to configure logging for data transforms:

ifndef::env-cloud[]
- xref:reference:properties/cluster-properties.adoc#data_transforms_logging_buffer_capacity_bytes[`data_transforms_logging_buffer_capacity_bytes`]: Increase this value if your transform logs are large or if you need to buffer more log data before flushing. Reducing this value may cause more frequent log flushing.

- xref:reference:properties/cluster-properties.adoc#data_transforms_logging_flush_interval_ms[`data_transforms_logging_flush_interval_ms`]: Adjust this value to control how frequently logs are flushed to the `transform_logs` topic. Shorter intervals provide more frequent log updates but can increase load. Longer intervals reduce load but may delay log updates.
endif::[]

- xref:reference:properties/cluster-properties.adoc#data_transforms_logging_line_max_bytes[`data_transforms_logging_line_max_bytes`]: Increase this value if your log messages are frequently truncated. Setting this value too low may truncate important log information.

ifndef::env-cloud[]
[[runtime-limit]]
=== Configure runtime limits

You can set the maximum runtime for starting up a data transform and the time it takes for a single record to be transformed using the xref:reference:properties/cluster-properties.adoc#data_transforms_runtime_limit_ms[`data_transforms_runtime_limit_ms`] property.

Adjust this value only if your transform functions need more time to process each record or to start up.
endif::[]

== Next steps

xref:develop:data-transforms/deploy.adoc[].
xref:develop:data-transforms/deploy.adoc[]

// end::single-source[]
16 changes: 15 additions & 1 deletion modules/develop/pages/data-transforms/deploy.adoc
@@ -1,6 +1,7 @@
= Deploy Data Transforms
:description: Learn how to build, deploy, share, and troubleshoot data transforms in Redpanda.
:page-categories: Development, Stream Processing, Data Transforms
// tag::single-source[]

{description}

@@ -10,7 +11,12 @@
Before you begin, ensure that you have the following:

- xref:develop:data-transforms/configure.adoc#enable-transforms[Data transforms enabled] in your Redpanda cluster.
ifndef::env-cloud[]
- The xref:get-started:rpk-install.adoc[`rpk` command-line client] installed on your host machine and configured to connect to your Redpanda cluster.
endif::[]
ifdef::env-cloud[]
- The xref:manage:rpk/rpk-install.adoc[`rpk` command-line client].
endif::[]
- A xref:develop:data-transforms/build.adoc[data transform] project.

[[build]]
@@ -120,7 +126,14 @@ rpk transform delete <transform-name>

For more details about this command, see xref:reference:rpk/rpk-transform/rpk-transform-delete.adoc[].

ifndef::env-cloud[]
TIP: You can also xref:console:ui/data-transforms.adoc#delete[delete transform functions in Redpanda Console].
endif::[]

ifdef::env-cloud[]
TIP: You can also delete transform functions in Redpanda Cloud.
endif::[]


== Troubleshoot

@@ -144,5 +157,6 @@ Invalid WebAssembly - the binary is missing required transform functions. Check
All transform functions must register a callback with the `OnRecordWritten()` method. For more details, see xref:develop:data-transforms/build.adoc[].

== Next steps
xref:develop:data-transforms/monitor.adoc[Set up monitoring] for data transforms.

xref:develop:data-transforms/monitor.adoc[Set up monitoring] for data transforms.
// end::single-source[]
10 changes: 7 additions & 3 deletions modules/develop/pages/data-transforms/how-transforms-work.adoc
@@ -1,7 +1,7 @@
= How Data Transforms Work
:page-categories: Development, Stream Processing, Data Transforms
include::develop:partial$data-transforms-ga-notice.adoc[]
:description: Learn how Redpanda data transforms work.
// tag::single-source[]

Redpanda provides the framework to build and deploy inline transformations (data transforms) on data written to Redpanda topics, delivering processed and validated data to consumers in the format they expect. Redpanda does this directly inside the broker, eliminating the need to manage a separate stream processing environment or use third-party tools.

@@ -22,7 +22,9 @@ To execute a transform function, Redpanda uses just-in-time (JIT) compilation to
When you deploy a data transform to a Redpanda broker, it stores the Wasm bytecode and associated metadata, such as input and output topics and environment variables. The broker then replicates this data across the cluster using internal Kafka topics. When the data is distributed, each shard runs its own instance of the transform function. This process includes several resource management features:

- Each shard can run only one instance of the transform function at a time to ensure efficient resource utilization and prevent overload.
ifndef::env-cloud[]
- Memory for each function is reserved within the broker with the `data_transforms_per_core_memory_reservation` and `data_transforms_per_function_memory_limit` properties. See xref:develop:data-transforms/configure.adoc#resources[Configure memory for data transforms].
endif::[]
- CPU time is dynamically allocated to the Wasm runtime to ensure that the code does not run forever and cannot block the broker from handling traffic or doing other work, such as Tiered Storage uploads.

== Flow of data transforms
@@ -74,6 +76,8 @@ This section outlines the limitations of data transforms. These constraints are

== Suggested reading

- xref:reference:data-transform-golang-sdk.adoc[]
- xref:reference:data-transform-rust-sdk.adoc[]
- xref:reference:data-transforms/golang-sdk.adoc[]
- xref:reference:data-transforms/rust-sdk.adoc[]
- xref:reference:rpk/rpk-transform/rpk-transform.adoc[`rpk transform` commands]

// end::single-source[]
22 changes: 19 additions & 3 deletions modules/develop/pages/data-transforms/monitor.adoc
@@ -1,12 +1,19 @@
= Monitor Data Transforms
:description: This topic provides guidelines on how to monitor the health of your data transforms and view logs.
:page-categories: Development, Stream Processing, Data Transforms
// tag::single-source[]

{description}

== Prerequisites

xref:manage:monitoring.adoc[Set up monitoring] for your Redpanda cluster.
ifndef::env-cloud[]
xref:manage:monitoring.adoc[Set up monitoring] for your cluster.
endif::[]

ifdef::env-cloud[]
xref:manage:monitor-cloud.adoc[Set up monitoring] for your cluster.
endif::[]

== Performance

@@ -39,7 +46,9 @@ If memory usage is consistently high or exceeds the maximum allocated memory:

- Review and optimize your transform functions to reduce memory consumption. This step can involve optimizing data structures, reducing memory allocations, and ensuring efficient handling of records.

ifndef::env-cloud[]
- Consider increasing the allocated memory for the Wasm engine. Adjust the xref:develop:data-transforms/configure.adoc#resources[`data_transforms_per_core_memory_reservation`] and xref:develop:data-transforms/configure.adoc#resources[`data_transforms_per_function_memory_limit settings`] to provide more memory to each function and the overall Wasm engine.
endif::[]

== Throughput

@@ -62,11 +71,18 @@ rpk transform logs <transform-name>

Replace `<transform-name>` with the xref:develop:data-transforms/configure.adoc[configured name] of the transform function.

TIP: You can also xref:console:ui/data-transforms.adoc#logs[view logs in Redpanda Console].
ifndef::env-cloud[]
TIP: You can also xref:console:ui/data-transforms.adoc#logs[view logs in {ui}].
endif::[]

ifdef::env-cloud[]
TIP: You can also view logs in the UI.
endif::[]

By default, Redpanda provides several settings to manage logging for data transforms, such as buffer capacity, flush interval, and maximum log line length. These settings ensure that logging operates efficiently without overwhelming the system. However, you may need to adjust these settings based on your specific requirements and workloads. For information on how to configure logging, see the xref:develop:data-transforms/configure.adoc#log[Configure transform logging] section of the configuration guide.

== Suggested reading

- xref:reference:public-metrics-reference.adoc#data_transform_metrics[Data transforms metrics]
- xref:console:ui/data-transforms.adoc[]

// end::single-source[]
@@ -2,4 +2,7 @@
:description: Choose your deployment environment to get started with building and deploying your first transform function in Redpanda.
:page-aliases: reference:rpk/rpk-wasm/rpk-wasm.adoc, reference:rpk/rpk-wasm.adoc, reference:rpk/rpk-wasm/rpk-wasm-deploy.adoc, reference:rpk/rpk-wasm/rpk-wasm-generate.adoc, reference:rpk/rpk-wasm/rpk-wasm-remove.adoc, data-management:data-transform.adoc, labs:data-transform/index.adoc
:page-layout: index
:page-categories: Development, Stream Processing, Data Transforms
:page-categories: Development, Stream Processing, Data Transforms
// tag::single-source[]
Contributor:

Do we need the single-source tag here? I think this is just an index page in Self-Managed because there is a standalone quickstart for k8s

Contributor Author:

maybe not, but tagging it without using it does not do any harm, right @JakeSCahill? e.g., in the cluster properties reference, I left tags around properties that were removed from Cloud just to save time in the future (since I think they'll be coming soon). Let me know if we should remove instead.


// end::single-source[]