diff --git a/docs/tutorial/cleanup.md b/docs/tutorial/cleanup.md index cf571dc2..130d6887 100644 --- a/docs/tutorial/cleanup.md +++ b/docs/tutorial/cleanup.md @@ -4,32 +4,83 @@ This is a part of the [Charmed Apache Kafka Tutorial](index.md). (remove-kafka-and-juju)= -## Remove Charmed Apache Kafka and Juju - -If you're done using Charmed Apache Kafka and Juju and would like to free up resources on your machine, you can safely remove both. +## Remove tutorial ```{caution} -Removing Charmed Apache Kafka as shown below will delete all the data in the Apache Kafka. Further, when you remove Juju as shown below you lose access to any other applications you have hosted on Juju. +Removing a Juju model may result in data loss for all applications in this model. ``` -To remove Charmed Apache Kafka and the model it is hosted on run the command: +To remove Charmed Apache Kafka and the `tutorial` model it is hosted on, +along with all other applications: ```shell juju destroy-model tutorial --destroy-storage --force ``` -Next step is to remove the Juju controller. You can see all of the available controllers by entering `juju controllers`. To remove the controller enter: +This will remove all applications in the `tutorial` model (Charmed Apache Kafka, +OpenSearch, PostgreSQL). +Your Juju controller and other models (if any) will remain intact for future use. + +(remove-juju)= +## (Optional) Remove Juju and LXD + +If you don't need Juju anymore and want to free up additional resources on your machine, +you can remove the Juju controller and Juju itself. + +```{caution} +When you remove Juju as shown below, +you lose access to any other applications you have hosted on Juju. +``` + +### Remove the Juju controller + +Check the list of controllers: + +```shell +juju controllers +``` + +Remove the Juju controller created in this tutorial: ```shell juju destroy-controller overlord ``` -Finally to remove Juju altogether, enter: +### Remove Juju + +To remove Juju altogether: ```shell sudo snap remove juju --purge ``` +### Clean up LXD + +If you also want to remove LXD containers and free up all resources: + +List all remaining LXD containers: + +```shell +lxc list +``` + +Delete unnecessary containers: + +```shell +lxc delete --force +``` + +If you want to uninstall LXD completely: + +```shell +sudo snap remove lxd --purge +``` + +```{warning} +Only remove LXD if you're not using it for other purposes. +LXD may be managing other containers or VMs on your system. +``` + ## What's next? In this tutorial, we've successfully deployed Apache Kafka, added/removed replicas, added/removed users to/from the cluster, and even enabled and disabled TLS. @@ -42,4 +93,3 @@ If you're looking for what to do next you can: - [Report](https://github.com/canonical/kafka-operator/issues) any problems you encountered. - [Give us your feedback](https://matrix.to/#/#charmhub-data-platform:ubuntu.com). - [Contribute to the code base](https://github.com/canonical/kafka-operator) - diff --git a/docs/tutorial/deploy.md b/docs/tutorial/deploy.md index d4528629..3a242439 100644 --- a/docs/tutorial/deploy.md +++ b/docs/tutorial/deploy.md @@ -3,83 +3,101 @@ This is a part of the [Charmed Apache Kafka Tutorial](index.md). -## Deploy Charmed Apache Kafka - To deploy Charmed Apache Kafka, all you need to do is run the following commands, which will automatically fetch [Apache Kafka](https://charmhub.io/kafka?channel=4/edge) from [Charmhub](https://charmhub.io/) and deploy it to your model. 
-For example, to deploy a cluster of three Apache Kafka brokers, you can simply run: +Charmed Apache Kafka can run both with `roles=broker` and/or `roles=controller`. With this configuration option, the charm can be deployed either as a single application running both Apache Kafka brokers and KRaft controllers, or as multiple applications with a separate controller cluster and broker cluster. + +For this tutorial, we will deploy brokers separately. +To deploy a cluster of three Apache Kafka brokers: ```shell -juju deploy kafka -n 3 --channel 4/edge --roles=broker +juju deploy kafka -n 3 --channel 4/edge --config roles=broker ``` -Apache Kafka also uses the KRaft consensus protocol for coordinating broker information, topic + partition metadata and Access Control Lists (ACLs), ran as a quorum of controller nodes using the Raft consensus algorithm. +Juju will now fetch Charmed Apache Kafka and begin deploying it to the LXD cloud. +Now check the Juju model status: -```{note} -KRaft replaces the dependency on Apache ZooKeeper for metadata management. -For more information on the differences between the two solutions, please refer to the -[upstream Apache Kafka documentation](https://kafka.apache.org/41/getting-started/zk2kraft/). +```shell +juju status ``` -Charmed Apache Kafka can run both with `roles=broker` and/or `roles=controller`. With this configuration option, the charm can be deployed either as a single application running both Apache Kafka brokers and KRaft controllers, or as multiple applications with a separate controller cluster and broker cluster. +Wait for the `blocked` status with the message +`application needs to be related with a KRaft controller`. + +Apache Kafka uses the KRaft consensus protocol for coordinating broker information, +topic + partition metadata and Access Control Lists (ACLs), ran as a quorum of +controller nodes using the Raft consensus algorithm. KRaft replaces the dependency on +Apache ZooKeeper for metadata management. For more information on the differences +between the two solutions, please refer to the +[upstream Apache Kafka documentation](https://kafka.apache.org/41/getting-started/zk2kraft/). To deploy a cluster of three KRaft controllers, run: ```shell -juju deploy kafka -n 3 --channel 4/edge --roles=controller kraft +juju deploy kafka -n 3 --channel 4/edge --config roles=controller kraft ``` -After this, it is necessary to connect the two clusters, taking care to specify which cluster is the orchestrator: +After this, it is necessary to connect the two deployed applications, +taking care to specify which cluster is the orchestrator by selecting the specific relation types: ```shell juju integrate kafka:peer-cluster-orchestrator kraft:peer-cluster ``` -Juju will now fetch Charmed Apache Kafka and begin deploying both applications to the LXD cloud before connecting them to exchange access credentials and machine endpoints. This process can take several minutes depending on the resources available on your machine. You can track the progress by running: +Juju will now connect applications to exchange access credentials and machine endpoints. +This process can take several minutes depending on the resources available on your machine. +You can track the progress by running: ```shell -watch -n 1 --color juju status --color +watch juju status --color ``` -This command is useful for checking the status of both Charmed Apache Kafka applications, and for gathering information about the machines hosting the two applications. 
Some of the helpful information it displays includes IP addresses, ports, status etc. -The command updates the status of the cluster every second and as the application starts you can watch the status and messages both applications change. +This command is useful for checking the status of both Charmed Apache Kafka applications, +and for gathering information about the machines hosting the two applications. +Some of the helpful information it displays includes IP addresses, ports, status etc. +The command updates the status of the cluster every two seconds and as the application starts +you can watch the status and messages both applications change. -Wait until the application is ready - when it is ready, `watch -n 1 --color juju status --color` will show: +Wait until the applications are `active` and all units show `active`/`idle` status: ```shell -Model Controller Cloud/Region Version SLA Timestamp -tutorial overlord localhost/localhost 3.6.8 unsupported 15:53:00Z +Model Controller Cloud/Region Version SLA Timestamp +tutorial overlord localhost/localhost 3.6.13 unsupported 12:33:46Z App Version Status Scale Charm Channel Rev Exposed Message -kafka 4.0.0 active 3 kafka 4/edge 226 no -kraft 4.0.0 active 3 kafka 4/edge 226 no +kafka 4.0.0 active 3 kafka 4/edge 245 no +kraft 4.0.0 active 3 kafka 4/edge 245 no Unit Workload Agent Machine Public address Ports Message -kafka/0* active idle 0 10.233.204.241 19093/tcp -kafka/1 active idle 1 10.233.204.196 19093/tcp -kafka/2 active idle 2 10.233.204.148 19093/tcp -kraft/0 active idle 3 10.233.204.125 9098/tcp -kraft/1* active idle 4 10.233.204.36 9098/tcp -kraft/2 active idle 5 10.233.204.225 9098/tcp +kafka/0* active idle 0 10.109.154.47 19093/tcp +kafka/1 active idle 1 10.109.154.171 19093/tcp +kafka/2 active idle 2 10.109.154.82 19093/tcp +kraft/0* active idle 3 10.109.154.49 9098/tcp +kraft/1 active idle 4 10.109.154.148 9098/tcp +kraft/2 active idle 5 10.109.154.50 9098/tcp -Machine State Address Inst id Base AZ Message -0 started 10.233.204.241 juju-07a730-0 ubuntu@24.04 Running -1 started 10.233.204.196 juju-07a730-1 ubuntu@24.04 Running -2 started 10.233.204.148 juju-07a730-2 ubuntu@24.04 Running -3 started 10.233.204.125 juju-07a730-3 ubuntu@24.04 Running -4 started 10.233.204.36 juju-07a730-4 ubuntu@24.04 Running -5 started 10.233.204.225 juju-07a730-5 ubuntu@24.04 Running +Machine State Address Inst id Base AZ Message +0 started 10.109.154.47 juju-030538-0 ubuntu@24.04 dev Running +1 started 10.109.154.171 juju-030538-1 ubuntu@24.04 dev Running +2 started 10.109.154.82 juju-030538-2 ubuntu@24.04 dev Running +3 started 10.109.154.49 juju-030538-3 ubuntu@24.04 dev Running +4 started 10.109.154.148 juju-030538-4 ubuntu@24.04 dev Running +5 started 10.109.154.50 juju-030538-5 ubuntu@24.04 dev Running ``` -To exit the screen with `watch -n 1 --color juju status --color`, enter `Ctrl+c`. +To exit the screen, push `Ctrl+C`. ## Access Apache Kafka brokers -Once all the units are shown as `active|idle`, the credentials can be retrieved. +Once all the units are shown as `active`/`idle`, the credentials can be retrieved. -All sensitive configuration data used by Charmed Apache Kafka, such as passwords and SSL certificates, is stored in Juju secrets. See the [Juju secrets documentation](https://documentation.ubuntu.com/juju/3.6/reference/secret/) for more information. +All sensitive configuration data used by Charmed Apache Kafka, +such as passwords and SSL certificates, is stored in Juju secrets. 
+See the [Juju secrets documentation](https://documentation.ubuntu.com/juju/3.6/reference/secret/) +for more information. -To reveal the contents of the Juju secret containing sensitive cluster data for the Charmed Apache Kafka application, you can run: +To reveal the contents of the Juju secret containing sensitive cluster data +for the Charmed Apache Kafka application, you can run: ```shell juju show-secret --reveal cluster.kafka.app @@ -88,63 +106,86 @@ juju show-secret --reveal cluster.kafka.app The output of the previous command will look something like this: ```shell -d2lj5jgco3bs3dacm2tg: +d5ipahpdormt02antvpg: revision: 1 - checksum: a6517abdd5e22038bfafe988e6253bb03c0462067b50475789eb6bc658ee0b11 + checksum: f84bf383e76ddda391543d57a8b76dbef4e95813b820a466fb4815b098bda3b2 owner: kafka label: cluster.kafka.app - created: 2025-08-24T15:42:13Z - updated: 2025-08-24T15:42:13Z + created: 2026-01-13T00:43:58Z + updated: 2026-01-13T00:43:58Z content: - admin-password: dxpex3Uc1sWIBna83gELtJOhAuW2awji - sync-password: eqI0RLV1lRSaIIiDKf3yz0W66ajICmDT internal-ca: |- - + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- internal-ca-key: |- - + -----BEGIN RSA PRIVATE KEY----- + ... + -----END RSA PRIVATE KEY----- + operator-password: 0g7010iwtBrChk00Ad1pznzaZW0i2Pdt + replication-password: tatsvzFV3de4Ce2NEL2HVQWAlSpx7gyv ``` -The important line here for accessing the Apache Kafka cluster itself is `admin-password`, which tells us that `username=admin` and `password=dxpex3Uc1sWIBna83gELtJOhAuW2awji`. These are the credentials to use to successfully authenticate to the cluster. +The important line here for accessing the Apache Kafka cluster itself is `operator-password`, +which tells us that `username=operator` and `password=0g7010iwtBrChk00Ad1pznzaZW0i2Pdt`. +These are the credentials to use to successfully authenticate to the cluster. For simplicity, the password can also be directly retrieved by parsing the YAML response from the previous command directly using `yq`: ```shell -juju show-secret --reveal cluster.kafka.app | yq '.. | ."admin-password"? // empty' | tr -d '"' +juju show-secret --reveal cluster.kafka.app | yq -r '.[].content["operator-password"]' ``` ```{caution} -When no other application is integrated to Charmed Apache Kafka, the cluster is secured-by-default and external listeners (bound to port `9092`) are disabled, thus preventing any external incoming connection. +When no other application is integrated to Charmed Apache Kafka, +the cluster is secured-by-default and external listeners (bound to port `9092`) are disabled, +thus preventing any external incoming connection. ``` -We will also need a bootstrap server Apache Kafka broker address and port to initially connect to. When any application connects for the first time to a `bootstrap-server`, the client will automatically make a metadata request that returns the full set of Apache Kafka brokers with their addresses and ports. +We will also need a bootstrap server Apache Kafka broker address and port to initially connect to. +When any application connects for the first time to a `bootstrap-server`, +the client will automatically make a metadata request that returns the full set of +Apache Kafka brokers with their addresses and ports. To use `kafka/0` as the `bootstrap-server`, retrieve its IP address and add a port with: ```shell -bootstrap_address=$(juju show-unit kafka/0 | yq '.. | ."public-address"? // empty' | tr -d '"') +bootstrap_address=$(juju show-unit kafka/0 | yq '.. | ."public-address"? 
// ""' | tr -d '"' | tr -d '\r\n' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//') -export BOOTSTRAP_SERVER=$bootstrap_address:19093 +export BOOTSTRAP_SERVER="${bootstrap_address}:19093" ``` where `19093` refers to the available open internal port on the broker unit. -It is always possible to run a command from within the Apache Kafka cluster using the internal listeners and ports in place of the external ones. For an explanation of Charmed Apache Kafka listeners, please refer to [Apache Kafka listeners](reference-broker-listeners). +It is always possible to run a command from within the Apache Kafka cluster using +the internal listeners and ports in place of the external ones. +For an explanation of Charmed Apache Kafka listeners, please refer to +[Apache Kafka listeners](reference-broker-listeners). -To jump in to a running Charmed Apache Kafka unit and run a command, for example listing files in a directory, you can do the following: +To jump in to a running Charmed Apache Kafka unit and run a command, +for example listing files in a directory, you can do the following: ```shell juju ssh kafka/leader sudo -i "ls \$BIN/bin" ``` -where the printed result will be the output from the `ls \$BIN/bin` command being executed on the `kafka` leader unit. +where the printed result will be the output from the `ls \$BIN/bin` command being executed +on the `kafka` leader unit. ```{note} -Charmed Apache Kafka exports (among others) four different environment variables for conveniently referencing various file-system directories relevant to the workload, `$BIN`, `$LOGS`, `$CONF` and `$DATA` - more information on these directories can be found in [File system paths](reference-file-system-paths). +Charmed Apache Kafka exports (among others) four different environment variables for conveniently +referencing various file-system directories relevant to the workload, +`$BIN`, `$LOGS`, `$CONF` and `$DATA` - more information on these directories can be found in +[File system paths](reference-file-system-paths). ``` -When the unit has started, the Charmed Apache Kafka Operator installs the [`charmed-kafka`](https://snapcraft.io/charmed-kafka) snap in the unit that provides a number of snap commands (that corresponds to the shell-script `bin/kafka-*.sh` commands in the Apache Kafka distribution) for performing various administrative and operational tasks. +When the unit has started, the Charmed Apache Kafka Operator installs the +[`charmed-kafka`](https://snapcraft.io/charmed-kafka) snap in the unit that provides a number +of snap commands (that corresponds to the shell-script `bin/kafka-*.sh` commands +in the Apache Kafka distribution) for performing various administrative and operational tasks. -Within the machine, Charmed Apache Kafka also creates a `$CONF/client.properties` file that already provides the relevant settings to connect to the cluster using the CLI. +Within the machine, Charmed Apache Kafka also creates a `$CONF/client.properties` +file that already provides the relevant settings to connect to the cluster using the CLI. For example, in order to create a topic, you can run: @@ -180,10 +221,13 @@ juju ssh kafka/0 sudo -i \ --command-config \$CONF/client.properties" ``` -For a full list of the available Charmed Kafka command-line tools, please refer to [snap commands](reference-snap-commands). +For a full list of the available Charmed Kafka command-line tools, please refer to +[snap commands](reference-snap-commands) reference. ## What's next? 
-Although the commands above can run within the cluster, it is generally recommended during operations to enable external listeners and use these for running the admin commands from outside the cluster. -To do so, as we will see in the next section, we will deploy a [data-integrator](https://charmhub.io/data-integrator) charm and relate it to Charmed Apache Kafka. - +Although the commands above can run within the cluster, it is generally recommended +during operations to enable external listeners and use these for running the admin commands +from outside the cluster. +To do so, as we will see in the next section, we will deploy a +[data-integrator](https://charmhub.io/data-integrator) charm and relate it to Charmed Apache Kafka. diff --git a/docs/tutorial/enable-encryption.md b/docs/tutorial/enable-encryption.md index 6b5d1d31..5dbf423d 100644 --- a/docs/tutorial/enable-encryption.md +++ b/docs/tutorial/enable-encryption.md @@ -3,8 +3,6 @@ This is a part of the [Charmed Apache Kafka Tutorial](index.md). -## Transport Layer Security (TLS) - [TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security) is used to encrypt data exchanged between two applications; it secures data transmitted over the network. Typically, enabling TLS within a highly available database, and between a highly available database and client/server applications, requires domain-specific knowledge and a high level of expertise. Fortunately, the domain-specific knowledge has been encoded into Charmed Apache Kafka. This means (re-)configuring TLS on Charmed Apache Kafka is readily available and requires minimal effort on your end. Juju relations are particularly useful for enabling TLS. @@ -18,7 +16,7 @@ In this tutorial, we will distribute [self-signed certificates](https://en.wikip This setup is only for testing and demonstrating purposes and self-signed certificates are not recommended in a production cluster. For more information about which charm may better suit your use-case, please see the [Security with X.509 certificates](https://charmhub.io/topics/security-with-x-509-certificates) page. ``` -### Configure TLS +## Configure TLS Before enabling TLS on Charmed Apache Kafka we must first deploy the `self-signed-certificates` charm: @@ -26,82 +24,145 @@ Before enabling TLS on Charmed Apache Kafka we must first deploy the `self-signe juju deploy self-signed-certificates --config ca-common-name="Tutorial CA" ``` -Wait for the charm to settle into an `active/idle` state, as shown by the `juju status`: +Wait for the charm to settle into an `active`/`idle` state, as shown by the `juju status` command. + +
Output example ```shell -Model Controller Cloud/Region Version SLA Timestamp -tutorial overlord localhost/localhost 3.6.8 unsupported 23:27:35Z +Model Controller Cloud/Region Version SLA Timestamp +tutorial overlord localhost/localhost 3.6.13 unsupported 17:56:56Z + +App Version Status Scale Charm Channel Rev Exposed Message +data-integrator blocked 1 data-integrator latest/stable 180 no Please relate the data-integrator with the desired product +kafka 4.0.0 active 3 kafka 4/edge 245 no +kraft 4.0.0 active 3 kafka 4/edge 245 no +self-signed-certificates active 1 self-signed-certificates 1/stable 317 no + +Unit Workload Agent Machine Public address Ports Message +data-integrator/0* blocked idle 6 10.109.154.254 Please relate the data-integrator with the desired product +kafka/0* active idle 0 10.109.154.47 19093/tcp +kafka/1 active idle 1 10.109.154.171 19093/tcp +kafka/2 active idle 2 10.109.154.82 19093/tcp +kraft/0* active idle 3 10.109.154.49 9098/tcp +kraft/1 active idle 4 10.109.154.148 9098/tcp +kraft/2 active idle 5 10.109.154.50 9098/tcp +self-signed-certificates/0* active idle 8 10.109.154.248 + +Machine State Address Inst id Base AZ Message +0 started 10.109.154.47 juju-030538-0 ubuntu@24.04 dev Running +1 started 10.109.154.171 juju-030538-1 ubuntu@24.04 dev Running +2 started 10.109.154.82 juju-030538-2 ubuntu@24.04 dev Running +3 started 10.109.154.49 juju-030538-3 ubuntu@24.04 dev Running +4 started 10.109.154.148 juju-030538-4 ubuntu@24.04 dev Running +5 started 10.109.154.50 juju-030538-5 ubuntu@24.04 dev Running +6 started 10.109.154.254 juju-030538-6 ubuntu@24.04 dev Running +8 started 10.109.154.248 juju-030538-8 ubuntu@24.04 dev Running +``` -App Version Status Scale Charm Channel Rev Exposed Message -self-signed-certificates active 1 self-signed-certificates 1/edge 336 no +
-Unit Workload Agent Machine Public address Ports Message -self-signed-certificates/0* active idle 7 10.233.204.134 +To enable TLS on Charmed Apache Kafka, integrate with `self-signed-certificates` charm: -Machine State Address Inst id Base AZ Message -7 started 10.233.204.134 juju-07a730-7 ubuntu@24.04 Running +```shell +juju integrate kafka:certificates self-signed-certificates ``` -To enable TLS on Charmed Apache Kafka, integrate with `self-signed-certificates` charm: +After the charms settle into `active`/`idle` states, the Apache Kafka listeners +should now have been swapped to the default encrypted port `9093`. +This can be tested by testing whether the ports are open/closed with `telnet`: ```shell -juju integrate kafka:certificates self-signed-certificates +telnet 9092 +telnet 9093 +``` + +where `Public IP address` is the IP of any Charmed Apache Kafka application units. + +Both commands will be **unable to connect** now, as our Apache Kafka cluster +has no active listeners due to absence of integrated applications. + +```{caution} +When no other application is integrated to Charmed Apache Kafka, +the cluster is secured-by-default and external listeners (bound to port `9092`) are disabled, +thus preventing any external incoming connection. ``` -After the charms settle into `active/idle` states, the Apache Kafka listeners should now have been swapped to the -default encrypted port 9093. This can be tested by testing whether the ports are open/closed with `telnet`: +Let's integrate the `data-integrator` application to the Apache Kafka cluster: ```shell -telnet 9092 -telnet 9093 +juju integrate data-integrator kafka ``` -### Enable TLS encrypted connection +After all units are back to `active`/`idle`, you will see the new ports in the `juju status` output. +Now try connecting with `telnet` again: -Once TLS is configured on the cluster side, client applications should be configured as well to connect to -the correct port and trust the self-signed CA provided by the `self-signed-certificates` charm. +```shell +telnet 9092 +telnet 9093 +``` + +The `9092` port connection now should show a connection error, +while the `9093` port should establish a connection. -Make sure that the `kafka-test-app` is not connected to the Charmed Apache Kafka, by removing the relation if it exists: +## Enable TLS encrypted connection + +Once TLS is configured on the cluster side, client applications should be configured as well +to connect to the correct port and trust the self-signed CA provided by +the `self-signed-certificates` charm. 
+ +Let's deploy our [Apache Kafka Test App](https://charmhub.io/kafka-test-app) again: ```shell -juju remove-relation kafka-test-app kafka +juju deploy kafka-test-app --channel edge ``` -Then, enable encryption on the `kafka-test-app` by relating with the `self-signed-certificates` charm: +Then, enable encryption on the `kafka-test-app` by integrating with +the `self-signed-certificates` charm: ```shell juju integrate kafka-test-app self-signed-certificates ``` -We can then set up the `kafka-test-app` to produce messages with the usual configuration (note that there is no difference -here with the unencrypted workflow): +We can then set up the `kafka-test-app` to produce messages with the usual configuration +(note that that the process here is the same as with the unencrypted workflow): ```shell -juju config kafka-test-app topic_name=HOT-TOPIC role=producer num_messages=25 +juju config kafka-test-app topic_name=HOT-TOPIC role=producer num_messages=20 ``` -Then relate with the `kafka` cluster: +Finally, relate with the `kafka` cluster: ```shell juju integrate kafka kafka-test-app ``` -As before, you can check that the messages are pushed into the Charmed Apache Kafka cluster by inspecting the logs: +Wait for `active`/`idle` status in `juju status` and check that the messages are pushed into +the Charmed Apache Kafka cluster by inspecting the logs: ```shell juju exec --application kafka-test-app "tail /tmp/*.log" ``` -Note that if the `kafka-test-app` was running before, there may be multiple logs related to the different -runs. Refer to the latest logs produced and also check that in the logs the connection is indeed established -with the encrypted port `9093`. +Refer to the latest logs produced and also check that in the logs the connection +is indeed established with the encrypted port `9093`. -### Remove external TLS certificate +## Remove external TLS certificate -To remove the external TLS and return to the locally generated one, remove relation with certificates provider: +To remove the external TLS and return to the locally generated one, +remove relation with certificates provider: ```shell juju remove-relation kafka self-signed-certificates ``` The Charmed Apache Kafka application is not using TLS anymore for client connections. + +## Clean up + +Before proceeding further, let's remove the `kafka-test-app` application: + +```shell +juju remove-relation kafka-test-app kafka +juju remove-relation kafka-test-app self-signed-certificates +juju remove-application kafka-test-app --destroy-storage +``` diff --git a/docs/tutorial/environment.md b/docs/tutorial/environment.md index e7c2ac75..4693d4a2 100644 --- a/docs/tutorial/environment.md +++ b/docs/tutorial/environment.md @@ -3,8 +3,6 @@ This is a part of the [Charmed Apache Kafka Tutorial](index.md). -## Setup the environment - For this tutorial, we will need to set up the environment with two main components, and extra command-line tooling: * [LXD](https://github.com/canonical/lxd) - a simple and lightweight virtual machine provisioner @@ -12,19 +10,20 @@ For this tutorial, we will need to set up the environment with two main componen * [yq](https://github.com/mikefarah/yq) - a command-line YAML processor * [jq](https://github.com/jqlang/jq) - a command-line JSON processor -### Prepare LXD - -The fastest, simplest way to get started with Charmed Apache Kafka is to set up a local LXD cloud. LXD is a system container and virtual machine manager; Apache Kafka will be run in one of these containers and managed by Juju. 
While this tutorial covers the basics of LXD, you can [learn more about LXD here](https://documentation.ubuntu.com/lxd/stable-5.21/). LXD comes pre-installed on Ubuntu 24.04 LTS. Verify that LXD is installed by entering the command `which lxd` into the command line, this will output: +## Prepare LXD -```shell -/snap/bin/lxd +The fastest, simplest way to get started with Charmed Apache Kafka is to set up a local LXD cloud. +LXD is a system container and virtual machine manager; +Apache Kafka will be run in one of these containers and managed by Juju. +While this tutorial covers the basics of LXD, you can +[learn more about LXD here](https://documentation.ubuntu.com/lxd/stable-5.21/). -# or for some systems +LXD comes pre-installed on Ubuntu 24.04 LTS. Verify that LXD is installed by entering the command +`which lxd`. This will output `/snap/bin/lxd` or, for some systems, `/usr/sbin/lxd`. -/usr/sbin/lxd -``` - -Although LXD is already installed, we need to run `lxd init` to perform post-installation tasks. For this tutorial, the default parameters are preferred and the network bridge should be set to have no IPv6 addresses since Juju does not support IPv6 addresses with LXD: +Although LXD is already installed, we need to run `lxd init` to perform post-installation tasks. +For this tutorial, the default parameters are preferred and the network bridge should be set +to have no IPv6 addresses since Juju does not support IPv6 addresses with LXD: ```shell lxd init --auto @@ -33,29 +32,42 @@ lxc network set lxdbr0 ipv6.address none You can list all LXD containers by entering the command `lxc list` into the command line. However, at this point of the tutorial, none should exist and you'll only see this as output: -``` +```text +------+-------+------+------+------+-----------+ | NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS | +------+-------+------+------+------+-----------+ ``` -### Install and prepare Juju +## Install and prepare Juju -[Juju](https://juju.is/) is an Operator Lifecycle Manager (OLM) for clouds, bare metal, LXD or Kubernetes. We will be using it to deploy and manage Charmed Apache Kafka. As may be true for LXD, Juju is installed from a snap package: +[Juju](https://juju.is/) is an Operator Lifecycle Manager (OLM) for clouds, bare metal, +LXD or Kubernetes. We will be using it to deploy and manage Charmed Apache Kafka. +As may be true for LXD, Juju is installed from a snap package: ```shell sudo snap install juju ``` -Juju already has built-in knowledge of LXD and how it works, so there is no additional setup or configuration needed. A Juju controller will be deployed, which will in turn manage the operations of Charmed Apache Kafka. All we need to do is run the following command to bootstrap a Juju controller named `overlord` to LXD. This bootstrapping process can take several minutes depending on the resources available on your machine: +Juju already has built-in knowledge of LXD and how it works, so there is no additional setup +or configuration needed. A Juju controller will be deployed, which will in turn +manage the operations of Charmed Apache Kafka. All we need to do is run the following command +to bootstrap a Juju controller named `overlord` to LXD. This bootstrapping process can take +several minutes depending on the resources available on your machine: ```shell juju bootstrap localhost overlord ``` -The Juju controller exists within an LXD container. 
You can verify this by entering the command `lxc list` and you should see the following: +The Juju controller exists within an LXD container. +To verify this, check the list of containers: +```shell +lxc list ``` + +
Output example + +```text +---------------+---------+-----------------------+------+-----------+-----------+ | NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS | +---------------+---------+-----------------------+------+-----------+-----------+ @@ -63,20 +75,31 @@ The Juju controller exists within an LXD container. You can verify this by enter +---------------+---------+-----------------------+------+-----------+-----------+ ``` -where `` is a unique combination of numbers and letters such as `9d7e4e-0` +where `` is a unique combination of numbers and letters such as `9d7e4e-0`. + +
-The controller can work with different models; models host applications such as Charmed Apache Kafka. Set up a specific model for Charmed Apache Kafka named `tutorial`: +The controller can work with different models; +models host applications such as Charmed Apache Kafka. +Set up a specific model for Charmed Apache Kafka named `tutorial`: ```shell juju add-model tutorial ``` -You can now view the model you created above by entering the command `juju status` into the command line. You should see the following: +Check the status of the model you created: +```shell +juju status ``` -Model Controller Cloud/Region Version SLA Timestamp -tutorial overlord localhost/localhost 3.6.8 unsupported 23:20:53Z + +
Output example + +```text +Model Controller Cloud/Region Version SLA Timestamp +tutorial overlord localhost/localhost 3.6.13 unsupported 12:10:54Z Model "admin/tutorial" is empty. ``` +
diff --git a/docs/tutorial/integrate-with-client-applications.md b/docs/tutorial/integrate-with-client-applications.md index 1996ce33..fa100d9b 100644 --- a/docs/tutorial/integrate-with-client-applications.md +++ b/docs/tutorial/integrate-with-client-applications.md @@ -3,15 +3,13 @@ This is a part of the [Charmed Apache Kafka Tutorial](index.md). -## Integrate with client applications - As mentioned in the previous section of the Tutorial, the recommended way to create and manage users is by means of another charm: the [Data Integrator Charm](https://charmhub.io/data-integrator). This lets us to encode users directly in the Juju model, and - as shown in the following - rotate user credentials with and without application downtime using relations. ```{note} Relations, or what Juju documentation describes also as [Integrations](https://documentation.ubuntu.com/juju/3.6/reference/relation/), let two charms to exchange information and interact with one another. Creating a relation between Charmed Apache Kafka and the Data Integrator will automatically generate a username, password, and assign relevant permissions on a given topic. This is the simplest method to create and manage users in Charmed Apache Kafka. ``` -### Data Integrator charm +## Data Integrator charm The [Data Integrator charm](https://charmhub.io/data-integrator) is a bare-bones charm for central management of database users, providing support for different kinds of data platforms (e.g. MongoDB, MySQL, PostgreSQL, Apache Kafka, OpenSearch, etc.) with a consistent, opinionated and robust user experience. To deploy the Data Integrator charm we can use the command `juju deploy` we have learned above: @@ -19,52 +17,58 @@ The [Data Integrator charm](https://charmhub.io/data-integrator) is a bare-bones juju deploy data-integrator --config topic-name=test-topic --config extra-user-roles=producer,consumer ``` -The expected output: +
Output example ```shell -Located charm "data-integrator" in charm-hub, revision 11 -Deploying "data-integrator" from charm-hub charm "data-integrator", revision 11 in channel stable on noble +Deployed "data-integrator" from charm-hub charm "data-integrator", revision 180 in channel latest/stable on ubuntu@24.04/stable ``` -### Relate to Charmed Apache Kafka +
-Now that the Database Integrator charm has been set up, we can relate it to Charmed Apache Kafka. This will automatically create a username, password, and database for the Database Integrator charm. Relate the two applications with: +To automatically create a username, password, and database for the Database Integrator charm, +integrate it to the Charmed Apache Kafka: ```shell juju integrate data-integrator kafka ``` -Wait for `watch -n 1 --color juju status --color` to show: +Wait for the status to become `active`/`idle` with the +`watch juju status --color` command. + +
Output example ```shell -Model Controller Cloud/Region Version SLA Timestamp -tutorial overlord localhost/localhost 3.6.8 unsupported 17:00:08Z +Model Controller Cloud/Region Version SLA Timestamp +tutorial overlord localhost/localhost 3.6.13 unsupported 12:50:51Z App Version Status Scale Charm Channel Rev Exposed Message data-integrator active 1 data-integrator latest/stable 180 no -kafka 4.0.0 active 3 kafka 4/edge 226 no -kraft 4.0.0 active 3 kafka 4/edge 226 no - -Unit Workload Agent Machine Public address Ports Message -data-integrator/0* active idle 6 10.233.204.111 -kafka/0* active idle 0 10.233.204.241 9092,19093/tcp -kafka/1 active idle 1 10.233.204.196 9092,19093/tcp -kafka/2 active idle 2 10.233.204.148 9092,19093/tcp -kraft/0 active idle 3 10.233.204.125 9098/tcp -kraft/1* active idle 4 10.233.204.36 9098/tcp -kraft/2 active idle 5 10.233.204.225 9098/tcp - -Machine State Address Inst id Base AZ Message -0 started 10.233.204.241 juju-07a730-0 ubuntu@24.04 Running -1 started 10.233.204.196 juju-07a730-1 ubuntu@24.04 Running -2 started 10.233.204.148 juju-07a730-2 ubuntu@24.04 Running -3 started 10.233.204.125 juju-07a730-3 ubuntu@24.04 Running -4 started 10.233.204.36 juju-07a730-4 ubuntu@24.04 Running -5 started 10.233.204.225 juju-07a730-5 ubuntu@24.04 Running -6 started 10.233.204.111 juju-07a730-6 ubuntu@24.04 Running +kafka 4.0.0 active 3 kafka 4/edge 245 no +kraft 4.0.0 active 3 kafka 4/edge 245 no + +Unit Workload Agent Machine Public address Ports Message +data-integrator/0* active idle 6 10.109.154.254 +kafka/0* active executing 0 10.109.154.47 9092,19093/tcp +kafka/1 active idle 1 10.109.154.171 9092,19093/tcp +kafka/2 active idle 2 10.109.154.82 9092,19093/tcp +kraft/0* active idle 3 10.109.154.49 9098/tcp +kraft/1 active idle 4 10.109.154.148 9098/tcp +kraft/2 active idle 5 10.109.154.50 9098/tcp + +Machine State Address Inst id Base AZ Message +0 started 10.109.154.47 juju-030538-0 ubuntu@24.04 dev Running +1 started 10.109.154.171 juju-030538-1 ubuntu@24.04 dev Running +2 started 10.109.154.82 juju-030538-2 ubuntu@24.04 dev Running +3 started 10.109.154.49 juju-030538-3 ubuntu@24.04 dev Running +4 started 10.109.154.148 juju-030538-4 ubuntu@24.04 dev Running +5 started 10.109.154.50 juju-030538-5 ubuntu@24.04 dev Running +6 started 10.109.154.254 juju-030538-6 ubuntu@24.04 dev Running ``` -To retrieve information such as the username, password, and topic. Enter: +
+ +After the integration is all set, try retrieving credentials such as the username, +password, and topic: ```shell juju run data-integrator/leader get-credentials @@ -73,48 +77,62 @@ juju run data-integrator/leader get-credentials This should output something like: ```yaml +Running operation 1 with 1 task + - task 2 on unit-data-integrator-0 + +Waiting for task 2... kafka: - consumer-group-prefix: relation-9- - data: '{"extra-user-roles": "producer,consumer", "provided-secrets": "[\"mtls-cert\"]", - "requested-secrets": "[\"username\", \"password\", \"tls\", \"tls-ca\", \"uris\", - \"read-only-uris\"]", "topic": "test-topic"}' - endpoints: 10.233.204.148:9092,10.233.204.196:9092,10.233.204.241:9092 - password: JwxZmIgHIkafm0T6nyPIKbF8m29EALoI + consumer-group-prefix: relation-8- + data: '{"resource": "test-topic", "salt": "92Lgizh3GIHxOqTr", "extra-user-roles": + "producer,consumer", "provided-secrets": ["mtls-cert"], "requested-secrets": ["username", + "password", "tls", "tls-ca", "uris", "read-only-uris"]}' + endpoints: 10.109.154.171:9092,10.109.154.47:9092,10.109.154.82:9092 + password: Pw5UJtv1dcOkQzN47qVQNYNSsajeMD5Q + resource: test-topic + salt: kAh9cPhs7LnU9OBv tls: disabled topic: test-topic - username: relation-9 + username: relation-8 + version: v0 ok: "True" ``` -Make note of the values for `bootstrap-server`, `username` and `password`, we'll be using them later. - +Make note of the values for `endpoints`, `username` and `password`, we'll be using them later. -### Produce/consume messages +## Non-charmed applications -We will now use the username and password to produce some messages to Apache Kafka. To do so, we will first deploy the [Apache Kafka Test App](https://charmhub.io/kafka-test-app): a simplistic charm meant only for testing, that also bundles some Python scripts to push data to Apache Kafka, e.g: +We will now use the username and password to produce some messages to Apache Kafka. +To do so, we will first deploy the [Apache Kafka Test App](https://charmhub.io/kafka-test-app): +a simplistic charm meant only for testing, that also bundles some Python scripts to push data +to Apache Kafka: ```shell juju deploy kafka-test-app --channel edge ``` -Once the charm is up and running, you can log into the container +Wait for the charm to become `active`/`idle`, and log into the container: ```shell juju ssh kafka-test-app/0 /bin/bash ``` -and make sure that the Python virtual environment libraries are visible: +Make sure that the Python virtual environment libraries are visible: ```shell export PYTHONPATH="/var/lib/juju/agents/unit-kafka-test-app-0/charm/venv:/var/lib/juju/agents/unit-kafka-test-app-0/charm/lib" ``` -Once this is set up, you should be able to use the `client.py` script that exposes some functionality to produce and consume messages. -You can explore the usage of the script +Once this is set up, you can use the `client.py` script that exposes some functionality to produce and consume messages. + +Let's try that script runs: ```shell python3 -m charms.kafka.v0.client --help +``` +
Output example + +```text usage: client.py [-h] [-t TOPIC] [-u USERNAME] [-p PASSWORD] [-c CONSUMER_GROUP_PREFIX] [-s SERVERS] [-x SECURITY_PROTOCOL] [-n NUM_MESSAGES] [-r REPLICATION_FACTOR] [--num-partitions NUM_PARTITIONS] [--producer] [--consumer] [--cafile-path CAFILE_PATH] [--certfile-path CERTFILE_PATH] [--keyfile-path KEYFILE_PATH] [--mongo-uri MONGO_URI] [--origin ORIGIN] @@ -149,100 +167,166 @@ options: --origin ORIGIN ``` -Using this script, you can therefore start producing messages (change the values of `username`, `password` and `bootstrap-servers` to the ones obtained from the `data-integrator` application in the previous section): +
+ +Now let's try producing and then consuming some messages. + +Change the values of `username`, `password` and `endpoints` to the ones obtained +from the `data-integrator` application in the previous section and run the script +to produce message: ```shell python3 -m charms.kafka.v0.client \ - -u relation-6 \ - -p S4IeRaYaiiq0tsM7m2UZuP2mSI573IGV \ + -u \ + -p \ -t test-topic \ - -s "10.244.26.43:9092,10.244.26.6:9092,10.244.26.19:9092" \ + -s "" \ -n 10 \ -r 3 \ --num-partitions 1 \ - --producer \ + --producer ``` -Let this run for a few seconds, then halt the process with `Ctrl+c`. +Let this run for a few seconds, then halt the process by pushing `Ctrl+C`. Now, consume them with: ```shell python3 -m charms.kafka.v0.client \ - -u relation-6 \ - -p S4IeRaYaiiq0tsM7m2UZuP2mSI573IGV \ + -u \ + -p \ -t test-topic \ - -s "10.244.26.43:9092,10.244.26.6:9092,10.244.26.19:9092" \ - -c "cg" \ - --consumer \ + -s "" \ + --consumer ``` -Now you know how to use credentials provided by related charms to successfully read/write data from Charmed Apache Kafka! +After a few seconds, all previously produced messaged will be consumed, showed in the output, +but the script will continue indefinitely waiting for more. +Since we know that no more messages will be produced now, we can stop the script with `Ctrl+C`. -### Charm client applications +Now you know how to use credentials provided by related charms to successfully read/write data +from Charmed Apache Kafka! -Actually, the Data Integrator is only a very special client charm, that implements the `kafka_client` relation interface for exchanging data with Charmed Apache Kafka and user management via relations. +## Charmed applications -For example, the steps above for producing and consuming messages to Apache Kafka have also been implemented in the `kafka-test-app` charm (that also implements the `kafka_client` relation) providing a fully integrated charmed user experience, where producing/consuming messages can simply be achieved using relations. +The Data Integrator is a very special client charm, +that implements the `kafka_client` relation interface for exchanging data with +Charmed Apache Kafka and user management via relations. -#### Producing messages +For example, the steps above for producing and consuming messages to Apache Kafka +have also been implemented in the `kafka-test-app` charm (that also implements +the `kafka_client` relation) providing a fully integrated charmed user experience, +where producing/consuming messages can simply be achieved using relations. -To produce messages to Apache Kafka, we need to configure the `kafka-test-app` to act as a producer, publishing messages to a specific topic: +### Producing messages + +To produce messages to Apache Kafka, we need to configure the `kafka-test-app` +to act as a producer, publishing messages to a specific topic: ```shell juju config kafka-test-app topic_name=TOP-PICK role=producer num_messages=20 ``` -To start producing messages to Apache Kafka, we simply relate the Apache Kafka Test App with Apache Kafka: +To start producing messages to Apache Kafka, we simply integrate the Apache Kafka Test App +with Apache Kafka: ```shell juju integrate kafka-test-app kafka ``` ```{note} -This will both take care of creating a dedicated user (as was done for the `data-integrator`) as well as start a producer process publishing messages to the `TOP-PICK` topic, basically automating what was done before by hand. 
+This will both take care of creating a dedicated user (as was done for the `data-integrator`) +as well as start a producer process publishing messages to the `TOP-PICK` topic, +basically automating what was done before by hand. ``` -After some time, the `juju status` output should show +After some time, check the status: ```shell -Model Controller Cloud/Region Version SLA Timestamp -tutorial overlord localhost/localhost 3.6.8 unsupported 18:58:47+02:00 +juju status +``` + +
Output example -App Version Status Scale Charm Channel Rev Address Exposed Message -... -kafka-test-app active 1 kafka-test-app edge 8 10.152.183.60 no Topic TOP-PICK enabled with process producer -... +```shell +Model Controller Cloud/Region Version SLA Timestamp +tutorial overlord localhost/localhost 3.6.13 unsupported 14:27:10Z -Unit Workload Agent Address Ports Message -... -kafka-test-app/0* active idle 10.1.36.88 Topic TOP-PICK enabled with process producer -... +App Version Status Scale Charm Channel Rev Exposed Message +data-integrator active 1 data-integrator latest/stable 180 no +kafka 4.0.0 active 3 kafka 4/edge 245 no +kafka-test-app active 1 kafka-test-app latest/edge 15 no Topic TOP-PICK enabled with process producer +kraft 4.0.0 active 3 kafka 4/edge 245 no + +Unit Workload Agent Machine Public address Ports Message +data-integrator/0* active idle 6 10.109.154.254 +kafka-test-app/0* active idle 7 10.109.154.242 Topic TOP-PICK enabled with process producer +kafka/0* active executing 0 10.109.154.47 9092,19093/tcp +kafka/1 active idle 1 10.109.154.171 9092,19093/tcp +kafka/2 active idle 2 10.109.154.82 9092,19093/tcp +kraft/0* active idle 3 10.109.154.49 9098/tcp +kraft/1 active idle 4 10.109.154.148 9098/tcp +kraft/2 active idle 5 10.109.154.50 9098/tcp + +Machine State Address Inst id Base AZ Message +0 started 10.109.154.47 juju-030538-0 ubuntu@24.04 dev Running +1 started 10.109.154.171 juju-030538-1 ubuntu@24.04 dev Running +2 started 10.109.154.82 juju-030538-2 ubuntu@24.04 dev Running +3 started 10.109.154.49 juju-030538-3 ubuntu@24.04 dev Running +4 started 10.109.154.148 juju-030538-4 ubuntu@24.04 dev Running +5 started 10.109.154.50 juju-030538-5 ubuntu@24.04 dev Running +6 started 10.109.154.254 juju-030538-6 ubuntu@24.04 dev Running +7 started 10.109.154.242 juju-030538-7 ubuntu@22.04 dev Running ``` -announcing that the process has started. To make sure that this is indeed the case, you can check the logs of the process: +
+ +To make sure that the process has started, check the logs of the process: ```shell juju exec --application kafka-test-app "tail /tmp/*.log" ``` -To stop the process (although it is very likely that the process has already stopped given the low number of messages that were provided) and remove the user, you can just remove the relation: +Make sure to see the following messages: + +```text +INFO [__main__] (MainThread) (produce_message) Message published to topic=TOP-PICK, message content: {"timestamp": 1768919219.744478, "_id": "9f4da8c1df2547f18c4d3365f7fb1c54", "origin": "juju-030538-7 (10.109.154.242)", "content": "Message #11"} +``` + +To stop the process (although it is very likely that the process has already stopped +given the low number of messages that were provided) and remove the user, +you can just remove the relation: ```shell juju remove-relation kafka-test-app kafka ``` -#### Consuming messages +### Consuming messages -Note that the `kafka-test-app` charm can also similarly be used to consume messages by changing its configuration to: +The `kafka-test-app` charm can be used to consume messages by changing its configuration: ```shell juju config kafka-test-app topic_name=TOP-PICK role=consumer consumer_group_prefix=cg ``` -After configuring the Apache Kafka Test App, just relate it again with the Charmed Apache Kafka. This will again create a new user and start the consumer process. +After configuring the Apache Kafka Test App, just relate it again with the Charmed Apache Kafka. -## What's next? +```shell +juju integrate kafka-test-app kafka +``` + +This will again create a new user and start the consumer process. +You can check progress with `juju status`. -In the next section, we will learn how to rotate and manage the passwords for the Apache Kafka users, both the admin one and the ones managed by the Data Integrator. +Wait for everything to be `active` and `idle` again. +Now you can remove the relation and the entire `kafka-test-app` application entirely +as we won't need them anymore. + +```shell +juju remove-relation kafka-test-app kafka +juju remove-application kafka-test-app --destroy-storage +``` + +## What's next? +In the next section, we will learn how to rotate and manage the passwords for the Apache Kafka users, both the admin user and the ones managed by the Data Integrator. diff --git a/docs/tutorial/manage-passwords.md b/docs/tutorial/manage-passwords.md index c881fa84..991fba71 100644 --- a/docs/tutorial/manage-passwords.md +++ b/docs/tutorial/manage-passwords.md @@ -3,27 +3,31 @@ This is a part of the [Charmed Apache Kafka Tutorial](index.md). -## Manage passwords +Passwords help to secure the Apache Kafka cluster and are essential for security. +Over time it is a good practice to change the password frequently. +Here we will go through setting and changing the password both for the built-in user +and external Charmed Apache Kafka users managed by the `data-integrator`. -Passwords help to secure the Apache Kafka cluster and are essential for security. Over time it is a good practice to change the password frequently. Here we will go through setting and changing the password both for the `admin` user and external Charmed Apache Kafka users managed by the `data-integrator`. +## The built-in user -### The admin user +The built-in admin user (`operator`) password management is handled directly by the charm, +by using Juju actions. -The admin user password management is handled directly by the charm, by using Juju actions. 
+### Retrieve the password -#### Retrieve the password +As a reminder, the admin password is stored in a Juju secret that was created and managed by +the Charmed Apache Kafka application. The password in in the `operator-password` field. -As a reminder, the `admin` password is stored in a Juju secret that was created and managed by the Charmed Apache Kafka application. - -Get the current value of the `admin` user password from the secret with following: +Get the current value of the admin user password from the secret: ```shell -juju show-secret --reveal cluster.kafka.app | yq '.. | ."admin-password"? // empty' | tr -d '"' +juju show-secret --reveal cluster.kafka.app | yq -r '.[].content["operator-password"]' ``` -#### Change the password +### Change the password -You can change the admin password to a new password by creating a new Juju secret, and updating the Charmed Apache Kafka application of the correct secret to use. +You can change the admin password to a new password by creating a new Juju secret, +and updating the Charmed Apache Kafka application of the correct secret to use. First, create the Juju secret with the new password you wish to use: @@ -31,7 +35,8 @@ First, create the Juju secret with the new password you wish to use: juju add-secret internal-kafka-users admin=mynewpassword ``` -Note the generated secret ID that you see as a response. It will look something like `secret:d2lkl00co3bs3dacm300`. +Note the generated secret ID that you see as a response. +It will look something like `secret:d5nc29hlshbc45lnf07g`. Now, grant Charmed Apache Kafka access to the new secret: @@ -39,115 +44,152 @@ Now, grant Charmed Apache Kafka access to the new secret: juju grant-secret internal-kafka-users kafka ``` -Finally, inform Charmed Apache Kafka of the new secret to use for it's internal system users using the secret ID saved earlier: +Finally, inform Charmed Apache Kafka of the new secret to use for it's internal system users +using the secret ID saved earlier: ```shell -juju config kafka system-users=secret:d2lkl00co3bs3dacm300 +juju config kafka system-users=secret:d5nc29hlshbc45lnf07g ``` -Now, Charmed Apache Kafka will be able to read the new `admin` password from the correct secret, and will proceed to apply the new password on each unit with a rolling-restart of the services with the new configuration. +Now, Charmed Apache Kafka will be able to read the new admin password from the correct secret, +and will proceed to apply the new password on each unit with a rolling-restart of the services +with the new configuration. -### External Apache Kafka users +## External Apache Kafka users -Unlike internal user management of `admin` users, the password management for external Apache Kafka users is instead managed using relations. Let's see this into play with the Data Integrator charm, that we have deployed in the previous part of the tutorial. +Unlike internal user management of the built-in admin user, the password management for external +Apache Kafka users is instead managed using relations. Let's see this into play with +the Data Integrator charm, that we have deployed in the previous part of the tutorial. -#### Retrieve the password +### Retrieve the password -The `data-integrator` exposes an action to retrieve the credentials, e.g: +The `data-integrator` exposes an action to retrieve the credentials: ```shell juju run data-integrator/leader get-credentials ``` +
Output example + Running the command should output: -```shell +```shell kafka: - endpoints: 10.244.26.43:9092,10.244.26.6:9092,10.244.26.19:9092 - password: S4IeRaYaiiq0tsM7m2UZuP2mSI573IGV + consumer-group-prefix: relation-8- + data: '{"resource": "test-topic", "salt": "yOIRb9uVUuJuKFVc", "extra-user-roles": + "producer,consumer", "provided-secrets": ["mtls-cert"], "requested-secrets": ["username", + "password", "tls", "tls-ca", "uris", "read-only-uris"]}' + endpoints: 10.109.154.171:9092,10.109.154.47:9092,10.109.154.82:9092 + password: RdRjZkXUC3dAb5VRFw2470fnoKrsRIXU + resource: test-topic + salt: W34UoIPzckdMJ6DU tls: disabled topic: test-topic - username: relation-6 - zookeeper-uris: 10.244.26.121:2181,10.244.26.129:2181,10.244.26.174:2181,10.244.26.251:2181,10.244.26.28:2181/kafka + username: relation-8 + version: v0 ok: "True" ``` -#### Rotate the password +
+ +### Rotate the password -The easiest way to rotate user credentials using the `data-integrator` is by removing and then re-integrating the `data-integrator` with the `kafka` charm +The easiest way to rotate user credentials using the `data-integrator` is by removing +and then re-integrating the `data-integrator` with the `kafka` charm: ```shell juju remove-relation kafka data-integrator +``` -# wait for the relation to be torn down +Wait for the relation to be torn down and add integration again: +```shell juju integrate kafka data-integrator ``` -The successful credential rotation can be confirmed by retrieving the new password with the action `get-credentials` +The successful credential rotation can be confirmed by retrieving the new password +with the action `get-credentials`: ```shell -juju run data-integrator/leader get-credentials +juju run data-integrator/leader get-credentials ``` +
+
 Running the command should now output a different password:
 
-```shell
+```yaml
 kafka:
-  endpoints: 10.244.26.43:9092,10.244.26.6:9092,10.244.26.19:9092
-  password: ToVfqYQ7tWmNmjy2tJTqulZHmJxJqQ22
+  consumer-group-prefix: relation-9-
+  data: '{"resource": "test-topic", "salt": "iGWWWoUwCy39ou6f", "extra-user-roles":
+    "producer,consumer", "provided-secrets": ["mtls-cert"], "requested-secrets": ["username",
+    "password", "tls", "tls-ca", "uris", "read-only-uris"]}'
+  endpoints: 10.109.154.171:9092,10.109.154.47:9092,10.109.154.82:9092
+  password: EEiI2gboTp2dF0NOcogtbrOWBTxkd5YB
+  resource: test-topic
+  salt: 7WqLjlZjeUvlEWrA
  tls: disabled
   topic: test-topic
-  username: relation-11
-  zookeeper-uris: 10.244.26.121:2181,10.244.26.129:2181,10.244.26.174:2181,10.244.26.251:2181,10.244.26.28:2181/kafka
+  username: relation-9
+  version: v0
   ok: "True"
 ```
 
-To rotate external passwords with no or limited downtime, please refer to the how-to guide on [app management](how-to-client-connections).
+````
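+
+Note that the generated username rotates along with the password
+(`relation-8` became `relation-9` in the examples above), so client applications
+should always read both values from the relation data rather than caching them.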
+
+To rotate external passwords with no or limited downtime,
+see the how-to guide on [app management](how-to-client-connections).
 
-#### Remove the user
+### Remove the user
 
-To remove the user, remove the relation. Removing the relation automatically removes the user that was created when the relation was created. Enter the following to remove the relation:
+Removing the relation automatically removes the user that was created for it.
+To remove the user, remove the relation:
 
 ```shell
 juju remove-relation kafka data-integrator
 ```
 
+````{dropdown} Output example
+
 The output of the Juju model should be something like this:
 
-```shell
+```text
 Model     Controller  Cloud/Region         Version  SLA          Timestamp
-tutorial  overlord    localhost/localhost  3.6.8    unsupported  23:12:02Z
+tutorial  overlord    localhost/localhost  3.6.13   unsupported  17:12:02Z
 
 App              Version  Status   Scale  Charm            Channel        Rev  Exposed  Message
 data-integrator           blocked      1  data-integrator  latest/stable  180  no       Please relate the data-integrator with the desired product
-kafka            4.0.0    active       3  kafka            4/edge         226  no
-kraft            4.0.0    active       3  kafka            4/edge         226  no
-
-Unit                Workload  Agent  Machine  Public address  Ports      Message
-data-integrator/0*  blocked   idle   6        10.233.204.111             Please relate the data-integrator with the desired product
-kafka/0*            active    idle   0        10.233.204.241  19093/tcp
-kafka/1             active    idle   1        10.233.204.196  19093/tcp
-kafka/2             active    idle   2        10.233.204.148  19093/tcp
-kraft/0             active    idle   3        10.233.204.125  9098/tcp
-kraft/1*            active    idle   4        10.233.204.36   9098/tcp
-kraft/2             active    idle   5        10.233.204.225  9098/tcp
-
-Machine  State    Address         Inst id        Base          AZ  Message
-0        started  10.233.204.241  juju-07a730-0  ubuntu@24.04      Running
-1        started  10.233.204.196  juju-07a730-1  ubuntu@24.04      Running
-2        started  10.233.204.148  juju-07a730-2  ubuntu@24.04      Running
-3        started  10.233.204.125  juju-07a730-3  ubuntu@24.04      Running
-4        started  10.233.204.36   juju-07a730-4  ubuntu@24.04      Running
-5        started  10.233.204.225  juju-07a730-5  ubuntu@24.04      Running
-6        started  10.233.204.111  juju-07a730-6  ubuntu@24.04      Running
+kafka            4.0.0    active       3  kafka            4/edge         245  no
+kraft            4.0.0    active       3  kafka            4/edge         245  no
+
+Unit                Workload  Agent      Machine  Public address  Ports           Message
+data-integrator/0*  blocked   idle       6        10.109.154.254                  Please relate the data-integrator with the desired product
+kafka/0*            active    executing  0        10.109.154.47   9092,19093/tcp
+kafka/1             active    executing  1        10.109.154.171  9092,19093/tcp
+kafka/2             active    executing  2        10.109.154.82   9092,19093/tcp
+kraft/0*            active    idle       3        10.109.154.49   9098/tcp
+kraft/1             active    idle       4        10.109.154.148  9098/tcp
+kraft/2             active    idle       5        10.109.154.50   9098/tcp
+
+Machine  State    Address         Inst id        Base          AZ   Message
+0        started  10.109.154.47   juju-030538-0  ubuntu@24.04  dev  Running
+1        started  10.109.154.171  juju-030538-1  ubuntu@24.04  dev  Running
+2        started  10.109.154.82   juju-030538-2  ubuntu@24.04  dev  Running
+3        started  10.109.154.49   juju-030538-3  ubuntu@24.04  dev  Running
+4        started  10.109.154.148  juju-030538-4  ubuntu@24.04  dev  Running
+5        started  10.109.154.50   juju-030538-5  ubuntu@24.04  dev  Running
+6        started  10.109.154.254  juju-030538-6  ubuntu@24.04  dev  Running
 ```
 
+````
+
 ```{note}
-The operations above would also apply to charmed applications that implement the `kafka_client` relation, for which password rotation and user deletion can be achieved in the same consistent way.
+The operations above also apply to any charmed application that implements
+the `kafka_client` relation, for which password rotation and user deletion
+can be achieved in the same consistent way.
 ```
 
 ## What's next?
 
-In the next part, we will now see how easy it is to enable encryption across the board, to make sure no one is eavesdropping, sniffing or snooping your traffic by enabling TLS.
-
+In the next part, we will see how easy it is to enable encryption across the board
+by enabling TLS, to make sure no one is eavesdropping, sniffing, or snooping on your traffic.
diff --git a/docs/tutorial/rebalance-partitions.md b/docs/tutorial/rebalance-partitions.md
index 548ad304..33ea7279 100644
--- a/docs/tutorial/rebalance-partitions.md
+++ b/docs/tutorial/rebalance-partitions.md
@@ -1,99 +1,114 @@
 (tutorial-rebalance-partitions)=
-# 7. Rebalance and Reassign Partitions
+# 7. Rebalance and reassign partitions
 
 This is a part of the [Charmed Apache Kafka Tutorial](index.md).
 
-## Partition rebalancing and reassignment
+By default, when adding more brokers to a Charmed Apache Kafka cluster, the current
+allocated partitions on the original brokers are not automatically redistributed across
+the new brokers. This can lead to inefficient resource usage and over-provisioning.
+On the other hand, when removing brokers to reduce capacity, partitions assigned
+to the removed brokers are also not redistributed, which can result in under-replicated data
+at best and permanent data loss at worst.
 
-By default, when adding more brokers to a Charmed Apache Kafka cluster, the current allocated partitions on the original brokers are not automatically redistributed across the new brokers. This can lead to inefficient resource usage and over-provisioning. On the other hand, when removing brokers to reduce capacity, partitions assigned to the removed brokers are also not redistributed, which can result in under-replicated data at best and permanent data loss at worst.
+To address this, we can make use of
+[LinkedIn's Cruise Control](https://github.com/linkedin/cruise-control), which is bundled as part
+of the Charmed Apache Kafka [snap](https://github.com/canonical/charmed-kafka-snap)
+and [rock](https://github.com/canonical/charmed-kafka-rock).
 
-To address this, we can make use of [LinkedIn's Cruise Control](https://github.com/linkedin/cruise-control), which is bundled as part of the Charmed Apache Kafka [snap](https://github.com/canonical/charmed-kafka-snap) and [rock](https://github.com/canonical/charmed-kafka-rock).
-
-At a high level, Cruise Control is made up of the following five components:
 
-### Deploying partition balancer
+The Charmed Apache Kafka charm has a configuration option `roles`, which takes
+a list of possible values. Different roles can be configured to run on the same machine,
+or as separate Juju applications.
 
-The Charmed Apache Kafka charm has a configuration option `roles`, which takes a list of possible values.
-Different roles can be configured to run on the same machine, or as separate Juju applications.
+The `balancer` role is required to run Cruise Control.
+We will need to add this role to one of the existing Juju applications:
+either `kafka` with the `broker` role, or `kraft` with the `controller` role.
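+
+You can check which roles each application currently runs with `juju config`:
+
+```shell
+juju config kafka roles
+juju config kraft roles
+```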
-The two necessary roles for cluster rebalancing are:
-- `broker` - running Apache Kafka
-- `balancer` - running Cruise Control
+We recommend combining the `controller` and `balancer` roles on the `kraft` application,
+due to the higher performance demands of the `broker` role.
 
-```{note}
-It is recommended to deploy a separate Juju application for running Cruise Control in production environments.
-```
+## Setup
 
-For the purposes of this tutorial, we will be deploying a single Charmed Apache Kafka unit to serve as the `balancer`:
+Let's add the `balancer` role to the existing `kraft` Juju application:
 
-```bash
-juju deploy kafka --config roles=balancer cruise-control
+```shell
+juju config kraft roles=balancer,controller
 ```
 
-Earlier in the tutorial, we covered enabling TLS encryption, so we will repeat that step here for the new `cruise-control` application:
+Wait for the status to become `active`/`idle`:
 
-```bash
-juju integrate cruise-control:certificates self-signed-certificates
+```shell
+watch juju status --color
 ```
 
-Now, to make the new `cruise-control` application aware of the existing Apache Kafka cluster, we will integrate the two applications using the `peer_cluster` relation interface, ensuring that the `broker` cluster is using the `peer-cluster` relation-endpoint, and the `balancer` cluster is using the `peer-cluster-orchestrator` relation-endpoint:
+## Adding new brokers
+
+Let's scale out the `kafka` application to four units (adding one more):
 
 ```bash
-juju integrate kafka:peer-cluster-orchestrator cruise-control:peer-cluster
+juju add-unit kafka
 ```
 
-### Adding new brokers
+Wait for the additional unit to be fully deployed and active:
 
-After completing the steps in the [Integrate with client applications](integrate-with-client-applications) tutorial page, you should have three `kafka` units and a client application actively writing messages to an existing topic. Let's scale-out the `kafka` application to four units:
-
-```bash
-juju add-unit kafka 4
+```shell
+watch juju status --color
 ```
 
-By default, no partitions are allocated for the new unit `3`. You can see that by checking the log directory assignment:
+By default, no partitions are allocated for the new unit `3`,
+which should have broker ID `103`.
+You can check this via the log directory assignment:
 
 ```bash
-juju ssh kafka/leader sudo -i \
-    'charmed-kafka.log-dirs' \
-    '--describe' \
-    '--bootstrap-server :9093' \
-    '--command-config $CONF/client.properties' \
-    '2> /dev/null' \
-    | tail -1 | jq -c '.brokers[] | select(.broker == 3)' | jq
+juju ssh kafka/leader sudo -i charmed-kafka.log-dirs --describe \
+    --bootstrap-server :19093 \
+    --command-config '$CONF/client.properties' \
+    2>/dev/null \
+    | sed -n '/^{/p' \
+    | jq '.brokers[] | select(.broker == 103)'
 ```
 
-This should produce output similar to the result seen below, with no partitions allocated by default:
+This should produce output similar to the result seen below,
+with no partitions allocated by default:
 
 ```json
 {
-  "broker": 3,
+  "broker": 103,
   "logDirs": [
     {
       "error": null,
-      "logDir": "/var/snap/charmed-kafka/common/var/lib/kafka/data",
+      "logDir": "/var/snap/charmed-kafka/common/var/lib/kafka/data/11/log",
       "partitions": []
     }
   ]
 }
 ```
 
-Now, let's run the `rebalance` action to allocate some existing partitions from brokers `0`, `1` and `2` to broker `3`:
+Now, let's run the `rebalance` action to allocate some existing partitions
+from the other brokers to broker `103`:
 
 ```bash
-juju run cruise-control/0 rebalance mode=add brokerid=3 --wait=2m
+juju run kraft/0 rebalance mode=add brokerid=103 --wait=2m
 ```
 
-```{note}
-If this action fails with a message similar to `Cruise Control balancer service has not yet collected enough data to provide a partition reallocation proposal`, wait 20 minutes or so and try again. Cruise Control takes a while to collect sufficient metrics from an Apache Kafka cluster during a cold deployment.
+```{warning}
+If this action fails with a message similar to
+`Cruise Control balancer service has not yet collected enough data to provide a partition
+reallocation proposal`, wait for at least 20 minutes and try again.
+Cruise Control takes a long time (sometimes more than an hour) to collect sufficient metrics
+from an Apache Kafka cluster during a cold deployment.
 ```
 
-By default, the `rebalance` action runs as a "dryrun", where the returned result is what **would** happen were the partition rebalance actually executed. The action output has detailed information on the proposed allocation.
+By default, the `rebalance` action runs as a "dryrun", where the returned result
+is what **would** happen were the partition rebalance actually executed.
+The action output has detailed information on the proposed allocation.
 
 For example, the **summary** section might look similar to this:
 
@@ -115,15 +130,18 @@ summary:
     recentwindows: "1"
 
-If we are happy with this proposal, we can re-run the action, but this time instructing the charm to actually execute the proposal:
+If we are happy with this proposal, we can re-run the action,
+but this time instructing the charm to actually execute the proposal:
 
 ```bash
-juju run cruise-control/0 rebalance mode=add dryrun=false brokerid=3 --wait=10m
+juju run kraft/0 rebalance mode=add dryrun=false brokerid=103 --wait=10m
 ```
 
-Partition rebalances can take quite some time. To monitor the progress, in a separate terminal session, check the Juju debug logs to see it in progress:
+Partition rebalancing can take significant time.
+To monitor the progress, check the `juju debug-log` output in a separate terminal session:
 
-```
+```text
-unit-cruise-control-0: 22:18:41 INFO unit.cruise-control/0.juju-log Waiting for task execution to finish for user_task_id='d3e426a3-6c2e-412e-804c-8a677f2678af'...
-unit-cruise-control-0: 22:18:51 INFO unit.cruise-control/0.juju-log Waiting for task execution to finish for user_task_id='d3e426a3-6c2e-412e-804c-8a677f2678af'...
-unit-cruise-control-0: 22:19:02 INFO unit.cruise-control/0.juju-log Waiting for task execution to finish for user_task_id='d3e426a3-6c2e-412e-804c-8a677f2678af'...
+unit-kraft-0: 22:18:41 INFO unit.kraft/0.juju-log Waiting for task execution to finish for user_task_id='d3e426a3-6c2e-412e-804c-8a677f2678af'...
+unit-kraft-0: 22:18:51 INFO unit.kraft/0.juju-log Waiting for task execution to finish for user_task_id='d3e426a3-6c2e-412e-804c-8a677f2678af'...
+unit-kraft-0: 22:19:02 INFO unit.kraft/0.juju-log Waiting for task execution to finish for user_task_id='d3e426a3-6c2e-412e-804c-8a677f2678af'...
@@ -131,23 +149,23 @@ unit-cruise-control-0: 22:19:12 INFO unit.cruise-control/0.juju-log Waiting for
 ...
 ```
 
-Once the action is complete, verify the partitions using the same commands as before:
+Once the action is complete, verify the partitions on the newly added unit
+using the same commands as before:
 
 ```bash
-juju ssh kafka/leader sudo -i \
-    'charmed-kafka.log-dirs' \
-    '--describe' \
-    '--bootstrap-server :9093' \
-    '--command-config $CONF/client.properties' \
-    '2> /dev/null' \
-    | tail -1 | jq -c '.brokers[] | select(.broker == 3)' | jq
+juju ssh kafka/leader sudo -i charmed-kafka.log-dirs --describe \
+    --bootstrap-server :19093 \
+    --command-config '$CONF/client.properties' \
+    2>/dev/null \
+    | sed -n '/^{/p' \
+    | jq '.brokers[] | select(.broker == 103)'
 ```
 
-This should produce an output similar to the result seen below, with broker `3` now having assigned partitions present, completing the adding of a new broker to the cluster:
+This should produce an output similar to the result seen below, with broker `103`
+now having partitions assigned, completing the addition of a new broker to the cluster:
 
 ```json
 {
-  "broker": 3,
+  "broker": 103,
   "logDirs": [
     {
       "partitions": [
         {
           "partition": "test-topic-3",
           "size": 832,
           "offsetLag": 0,
           "isFuture": false
         },
-
+        ...
       ]
     }
+  ]
 }
 ```
 
-### Removing old brokers
+## Removing old brokers
 
-To safely scale-in an Apache Kafka cluster, we must make sure to carefully move any existing data from units about to be removed, to another unit that will persist.
+To safely scale in an Apache Kafka cluster, we must make sure to carefully move any existing data
+from units about to be removed, to another unit that will persist.
 
-In practice, this means running a `rebalance` Juju action as seen above, **BEFORE** scaling down the application. This ensures that data is moved, prior to the unit becoming unreachable and permanently losing the data on it.
+In practice, this means running a `rebalance` Juju action as seen above,
+**BEFORE** scaling down the application. This ensures that data is moved,
+prior to the unit becoming unreachable and permanently losing the data on it.
 
 ```{note}
-As partition data is replicated across a finite number of units based on the value of the Apache Kafka cluster's `replication.factor` property (default value is `3`), it is imperative to remove only one broker at a time, to avoid losing all available replicas for a given partition.
+As partition data is replicated across a finite number of units based on the value
+of the Apache Kafka cluster's `replication.factor` property (default value is `3`),
+it is imperative to remove only one broker at a time, to avoid losing all available
+replicas for a given partition.
 ```
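+
+Before removing a broker, you can also double-check a topic's replication factor.
+A minimal sketch, assuming the `test-topic` created earlier in this tutorial
+and the `charmed-kafka.topics` wrapper shipped with the snap:
+
+```bash
+# Assumes the test-topic from earlier steps and the charmed-kafka.topics snap wrapper
+juju ssh kafka/leader sudo -i charmed-kafka.topics --describe --topic test-topic \
+    --bootstrap-server :19093 \
+    --command-config '$CONF/client.properties'
+```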
 
-To remove the most recent broker unit `3` from the previous example, re-run the `rebalance` action with `mode=remove`:
+To remove the most recent broker unit `3` from the previous example,
+re-run the `rebalance` action with `mode=remove`:
 
 ```bash
-juju run cruise-control/0 rebalance mode=remove dryrun=false brokerid=3 --wait=10m
+juju run kraft/0 rebalance mode=remove dryrun=false brokerid=103 --wait=10m
 ```
 
-This does not remove the unit, but moves the partitions from the broker on unit number `3` to other brokers within the cluster.
+This does not remove the unit, but moves the partitions from the broker on unit number `3`
+to other brokers within the cluster.
 
-Once the action has been completed, verify that broker `3` no longer has any assigned partitions:
+Once the action has been completed, verify that broker `103` no longer has any assigned partitions:
 
 ```bash
-juju ssh kafka/leader sudo -i \
-    'charmed-kafka.log-dirs' \
-    '--describe' \
-    '--bootstrap-server :9093' \
-    '--command-config $CONF/client.properties' \
-    '2> /dev/null' \
-    | tail -1 | jq -c '.brokers[] | select(.broker == 3)' | jq
+juju ssh kafka/leader sudo -i charmed-kafka.log-dirs --describe \
+    --bootstrap-server :19093 \
+    --command-config '$CONF/client.properties' \
+    2>/dev/null \
+    | sed -n '/^{/p' \
+    | jq '.brokers[] | select(.broker == 103)'
 ```
 
-Make sure that broker `3` now has no partitions assigned, for example:
+Make sure that the broker has no partitions assigned, for example:
 
 ```json
 {
-  "broker": 3,
+  "broker": 103,
   "logDirs": [
     {
       "partitions": [],
       "error": null,
-      "logDir": "/var/lib/kafka/data"
+      "logDir": "/var/snap/charmed-kafka/common/var/lib/kafka/data/11/log"
     }
   ]
 }
 ```
 
-Now, it is safe to scale-in the cluster, removing the broker number `3` completely:
+Now, it is safe to scale in the cluster by removing unit `3` completely:
 
 ```bash
 juju remove-unit kafka/3
 ```
 
-### Full cluster rebalancing
+## Full cluster rebalancing
 
-Over time, an Apache Kafka cluster in production may develop an imbalance in partition allocation, with some brokers having greater/fewer allocated than others. This can occur as topic load fluctuates, partitions are added or removed due to reconfiguration, or new topics are created or deleted. Therefore, as part of regular cluster maintenance, administrators should periodically redistribute partitions across existing broker units to ensure optimal performance.
+Over time, an Apache Kafka cluster in production may develop an imbalance in partition allocation,
+with some brokers having more or fewer partitions allocated than others.
+This can occur as topic load fluctuates, partitions are added or removed due to reconfiguration,
+or new topics are created or deleted. Therefore, as part of regular cluster maintenance,
+administrators should periodically redistribute partitions across existing broker units
+to ensure optimal performance.
 
-Unlike `Adding new brokers` or `Removing old brokers`, this includes a full re-shuffle of partition allocation across all currently live broker units.
+Unlike `Adding new brokers` or `Removing old brokers`, this includes a full re-shuffle
+of partition allocation across all currently live broker units.
 
-To achieve this, re-run the `rebalance` action with the `mode=full`. You can do it in the "dryrun" mode (by default) for now:
+To achieve this, re-run the `rebalance` action with `mode=full`.
+You can run it in "dryrun" mode (the default) for now:
 
 ```bash
-juju run cruise-control/0 rebalance mode=full --wait=10m
+juju run kraft/0 rebalance mode=full --wait=10m
 ```
 
-Looking at the bottom of the output, see the value of the `balancedness` score before and after the proposed 'full' rebalance:
+Looking at the bottom of the output, see the value of the `balancedness` score
+before and after the proposed 'full' rebalance:
 
-```
+```text
 summary:
     ...
     ondemandbalancednessscoreafter: "90.06926434109423"
@@ -240,4 +274,3 @@ To implement the proposed changes, run the same command but with `dryrun=false`:
 
 ```bash
-juju run cruise-control/0 rebalance mode=full dryrun=false --wait=10m
+juju run kraft/0 rebalance mode=full dryrun=false --wait=10m
 ```
-
diff --git a/docs/tutorial/use-kafka-connect.md b/docs/tutorial/use-kafka-connect.md
index 3d7a2bd9..10fbc27e 100644
--- a/docs/tutorial/use-kafka-connect.md
+++ b/docs/tutorial/use-kafka-connect.md
@@ -3,47 +3,72 @@
 
 This is a part of the [Charmed Apache Kafka Tutorial](index.md).
 
-## Using Kafka Connect for ETL
+In this part of the tutorial, we are going to use
+[Kafka Connect](https://kafka.apache.org/41/kafka-connect/overview/), an ETL framework on top of
+Apache Kafka, to seamlessly move data between different charmed database technologies.
 
-In this part of the tutorial, we are going to use [Kafka Connect](https://kafka.apache.org/41/kafka-connect/overview/) - an ETL framework on top of Apache Kafka - to seamlessly move data between different charmed database technologies.
+We will follow a step-by-step process for moving data between
+[Canonical Data Platform charms](https://canonical.com/data) using Kafka Connect.
+Specifically, we will showcase a particular use case of loading data from a relational database
+(PostgreSQL) to a document store and search engine (OpenSearch), entirely using charmed solutions.
 
-We will follow a step-by-step process for moving data between [Canonical Data Platform charms](https://canonical.com/data) using Kafka Connect. Specifically, we will showcase a particular use-case of loading data from a relational database, i.e. PostgreSQL, to a document store and search engine, i.e. OpenSearch, entirely using charmed solutions.
+By the end, you should be able to use Kafka Connect integrator and Kafka Connect charms
+to streamline data ETL tasks on Canonical Data Platform charmed solutions.
 
-By the end, you should be able to use Kafka Connect integrator and Kafka Connect charms to streamline data ETL tasks on Canonical Data Platform charmed solutions.
+## Prerequisites
 
-### Prerequisites
-
-We will be deploying different charmed data solutions including PostgreSQL and OpenSearch. If you require more information or face issues deploying any of the mentioned products, you should consult the respective documentations:
+We will be deploying different charmed data solutions including PostgreSQL and OpenSearch.
+If you require more information or face issues deploying any of the mentioned products,
+you should consult the respective documentation:
 
 - For PostgreSQL, refer to [Charmed PostgreSQL tutorial](https://canonical-charmed-postgresql.readthedocs-hosted.com/14/tutorial/).
 - For OpenSearch, refer to [Charmed OpenSearch tutorial](https://canonical-charmed-opensearch.readthedocs-hosted.com/2/tutorial/).
 
-### Check current deployment
+## Check current deployment
+
+Up to this point, we should have three units of the Charmed Apache Kafka application.
+Check the current status of the Juju model:
+
+```shell
+juju status
+```
 
-Up to this point, we should have three units of Charmed Apache Kafka application. That means the `juju status` command should show an output similar to the following:
+````{dropdown} Output example
+
 ```text
-Model     Controller  Cloud/Region         Version  SLA          Timestamp
-tutorial  overlord    localhost/localhost  3.6.8    unsupported  01:02:27Z
+Model     Controller  Cloud/Region         Version  SLA          Timestamp
+tutorial  overlord    localhost/localhost  3.6.13   unsupported  18:27:29Z
 
 App                       Version  Status  Scale  Charm                     Channel        Rev  Exposed  Message
 data-integrator                    active      1  data-integrator           latest/stable  180  no
-kafka                     4.0.0    active      3  kafka                     4/edge         226  no
-kraft                     4.0.0    active      3  kafka                     4/edge         226  no
-self-signed-certificates           active      1  self-signed-certificates  1/edge         336  no
+kafka                     4.0.0    active      3  kafka                     4/edge         245  no
+kraft                     4.0.0    active      3  kafka                     4/edge         245  no
+self-signed-certificates           active      1  self-signed-certificates  1/stable       317  no
 
 Unit                         Workload  Agent  Machine  Public address  Ports           Message
-data-integrator/0*           active    idle   6        10.233.204.111
-kafka/0*                     active    idle   0        10.233.204.241  9093,19093/tcp
-kafka/1                      active    idle   1        10.233.204.196  9093,19093/tcp
-kafka/2                      active    idle   2        10.233.204.148  9093,19093/tcp
-kraft/0                      active    idle   3        10.233.204.125  9098/tcp
-kraft/1*                     active    idle   4        10.233.204.36   9098/tcp
-kraft/2                      active    idle   5        10.233.204.225  9098/tcp
-self-signed-certificates/0*  active    idle   7        10.233.204.134
-```
-
-### Set the necessary kernel properties for OpenSearch
+data-integrator/0*           active    idle   6        10.109.154.254
+kafka/0*                     active    idle   0        10.109.154.47   9093,19093/tcp
+kafka/1                      active    idle   1        10.109.154.171  9093,19093/tcp
+kafka/2                      active    idle   2        10.109.154.82   9093,19093/tcp
+kraft/0*                     active    idle   3        10.109.154.49   9098/tcp
+kraft/1                      active    idle   4        10.109.154.148  9098/tcp
+kraft/2                      active    idle   5        10.109.154.50   9098/tcp
+self-signed-certificates/0*  active    idle   8        10.109.154.248
+
+Machine  State    Address         Inst id        Base          AZ   Message
+0        started  10.109.154.47   juju-030538-0  ubuntu@24.04  dev  Running
+1        started  10.109.154.171  juju-030538-1  ubuntu@24.04  dev  Running
+2        started  10.109.154.82   juju-030538-2  ubuntu@24.04  dev  Running
+3        started  10.109.154.49   juju-030538-3  ubuntu@24.04  dev  Running
+4        started  10.109.154.148  juju-030538-4  ubuntu@24.04  dev  Running
+5        started  10.109.154.50   juju-030538-5  ubuntu@24.04  dev  Running
+6        started  10.109.154.254  juju-030538-6  ubuntu@24.04  dev  Running
+8        started  10.109.154.248  juju-030538-8  ubuntu@24.04  dev  Running
+```
+
+````
+
+## Set the necessary kernel properties for OpenSearch
 
 Since we will be deploying the OpenSearch charm, we need to make necessary kernel configurations
 required for OpenSearch charm to function properly,
@@ -77,7 +102,7 @@ EOF
 juju model-config --file=./cloudinit-userdata.yaml
 ```
 
-### Deploy the databases and Kafka Connect charms
+## Deploy the databases and Kafka Connect charms
 
 Deploy the PostgreSQL, OpenSearch, and Kafka Connect charms:
 
@@ -87,11 +112,15 @@ juju deploy postgresql --channel 14/stable
 juju deploy opensearch --channel 2/stable --config profile=testing
 ```
 
-OpenSearch charm requires a TLS relation to become active. We will use the [`self-signed-certificates` charm](https://charmhub.io/self-signed-certificates) that was deployed earlier in the [Enable Encryption](https://charmhub.io/kafka/docs/t-enable-encryption) part of this Tutorial.
+The OpenSearch charm requires a TLS relation to become active.
+We will use the [`self-signed-certificates` charm](https://charmhub.io/self-signed-certificates)
+that was deployed earlier in the
+[Enable Encryption](tutorial-enable-encryption) part of this tutorial.
 
-### Enable TLS
+## Enable TLS
 
-Using the `juju status` command, you should see that the Kafka Connect and OpenSearch applications are in `blocked` state. In order to activate them, we need to make necessary integrations using the `juju integrate` command.
+Using the `juju status` command, you should see that the Kafka Connect and OpenSearch applications
+are in the `blocked` state. To activate them, we need to set up the necessary integrations.
 
 First, activate the OpenSearch application by integrating it with the TLS operator:
 
@@ -105,46 +134,71 @@ Then, activate the Kafka Connect application by integrating it with the Apache K
 juju integrate kafka kafka-connect
 ```
 
-Finally, since we will be using TLS on the Kafka Connect interface, integrate the Kafka Connect application with the TLS operator:
+Finally, since we will be using TLS on the Kafka Connect interface, integrate the Kafka Connect
+application with the TLS operator:
 
 ```bash
 juju integrate kafka-connect self-signed-certificates
 ```
 
-Use the `watch -n 1 --color juju status --color` command to continuously probe your model's status. After a couple of minutes, all the applications should be in `active|idle` state, and you should see an output like the following, with 7 applications and 13 units:
+Use the `watch juju status --color` command to continuously probe your model's status.
+After a couple of minutes, all the applications should be in `active`/`idle` state.
+
+````{dropdown} Output example
+
 ```text
-Model     Controller  Cloud/Region         Version  SLA          Timestamp
-tutorial  overlord    localhost/localhost  3.6.8    unsupported  01:02:27Z
+Model     Controller  Cloud/Region         Version  SLA          Timestamp
+tutorial  overlord    localhost/localhost  3.6.13   unsupported  18:51:59Z
 
 App                       Version  Status  Scale  Charm                     Channel        Rev  Exposed  Message
 data-integrator                    active      1  data-integrator           latest/stable  180  no
-kafka                     4.0.0    active      3  kafka                     4/edge         226  no
-kraft                     4.0.0    active      3  kafka                     4/edge         226  no
-opensearch                         active      1  opensearch                2/edge         218  no
-postgresql                14.15    active      1  postgresql                14/stable      553  no
-
-self-signed-certificates           active      1  self-signed-certificates  1/edge         336  no
+kafka                     4.0.0    active      3  kafka                     4/edge         245  no
+kafka-connect                      active      1  kafka-connect             latest/edge    30   no
+kraft                     4.0.0    active      3  kafka                     4/edge         245  no
+opensearch                         active      1  opensearch                2/stable       314  no
+postgresql                14.20    active      1  postgresql                14/stable      987  no
+self-signed-certificates           active      1  self-signed-certificates  1/stable       317  no
 
 Unit                         Workload  Agent  Machine  Public address  Ports           Message
-data-integrator/0*           active    idle   6        10.233.204.111
-opensearch/0*                active    idle   11       10.233.204.172  9200/tcp
-postgresql/0*                active    idle   12       10.233.204.121  5432/tcp        Primary
-kafka/0*                     active    idle   0        10.233.204.241  9093,19093/tcp
-kafka/1                      active    idle   1        10.233.204.196  9093,19093/tcp
-kafka/2                      active    idle   2        10.233.204.148  9093,19093/tcp
-kraft/0                      active    idle   3        10.233.204.125  9098/tcp
-kraft/1*                     active    idle   4        10.233.204.36   9098/tcp
-kraft/2                      active    idle   5        10.233.204.225  9098/tcp
-self-signed-certificates/0*  active    idle   7        10.233.204.134
-```
-
-### Load test data
-
-In a real-world scenario, an application would typically write data to a PostgreSQL database. However, for the purposes of this tutorial, we’ll generate test data using a simple SQL script and load it into a PostgreSQL database using the `psql` command-line tool included with the PostgreSQL charm.
+data-integrator/0*           active    idle   6        10.109.154.254
+kafka-connect/0*             active    idle   10       10.109.154.69   8083/tcp
+kafka/0*                     active    idle   0        10.109.154.47   9092,19093/tcp
+kafka/1                      active    idle   1        10.109.154.171  9092,19093/tcp
+kafka/2                      active    idle   2        10.109.154.82   9092,19093/tcp
+kraft/0*                     active    idle   3        10.109.154.49   9098/tcp
+kraft/1                      active    idle   4        10.109.154.148  9098/tcp
+kraft/2                      active    idle   5        10.109.154.50   9098/tcp
+opensearch/0*                active    idle   12       10.109.154.204  9200/tcp
+postgresql/0*                active    idle   11       10.109.154.208  5432/tcp        Primary
+self-signed-certificates/0*  active    idle   8        10.109.154.248
+
+Machine  State    Address         Inst id         Base          AZ   Message
+0        started  10.109.154.47   juju-030538-0   ubuntu@24.04  dev  Running
+1        started  10.109.154.171  juju-030538-1   ubuntu@24.04  dev  Running
+2        started  10.109.154.82   juju-030538-2   ubuntu@24.04  dev  Running
+3        started  10.109.154.49   juju-030538-3   ubuntu@24.04  dev  Running
+4        started  10.109.154.148  juju-030538-4   ubuntu@24.04  dev  Running
+5        started  10.109.154.50   juju-030538-5   ubuntu@24.04  dev  Running
+6        started  10.109.154.254  juju-030538-6   ubuntu@24.04  dev  Running
+8        started  10.109.154.248  juju-030538-8   ubuntu@24.04  dev  Running
+10       started  10.109.154.69   juju-030538-10  ubuntu@22.04  dev  Running
+11       started  10.109.154.208  juju-030538-11  ubuntu@22.04  dev  Running
+12       started  10.109.154.204  juju-030538-12  ubuntu@24.04  dev  Running
+```
+
+````
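+
+Optionally, you can verify that the Kafka Connect REST API is listening on port `8083`.
+A minimal sketch, assuming the unit names shown above; the API is password-protected,
+so an HTTP `401` response still confirms the service is up:
+
+```shell
+# Assumes TLS is enabled on port 8083; a 401 response still confirms the API is up
+juju ssh kafka-connect/0 'curl -sk -o /dev/null -w "%{http_code}\n" https://localhost:8083'
+```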
+
+## Load test data
+
+In a real-world scenario, an application would typically write data to a PostgreSQL database.
+However, for the purposes of this tutorial, we’ll generate test data using a simple SQL script
+and load it into a PostgreSQL database using the `psql` command-line tool included with
+the PostgreSQL charm.
 
 ```{note}
-For more information on how to access a PostgreSQL database in the PostgreSQL charm, refer to [Access PostgreSQL](https://charmhub.io/postgresql/docs/t-access) page of the Charmed PostgreSQL tutorial.
+For more information on how to access a PostgreSQL database in the PostgreSQL charm,
+refer to the [Access PostgreSQL](https://charmhub.io/postgresql/docs/t-access) page
+of the Charmed PostgreSQL tutorial.
 ```
 
 First, create a SQL script by running the following command:
 
@@ -189,49 +243,58 @@ Next, copy the `populate.sql` script to the PostgreSQL unit using the `juju scp` 
 juju scp /tmp/populate.sql postgresql/0:/home/ubuntu/populate.sql
 ```
 
-Then, follow the [Access PostgreSQL](https://charmhub.io/postgresql/docs/t-access) tutorial to retrieve the password for the `operator` user on the PostgreSQL database using the `get-password` action:
+Then, retrieve the password for the `operator` user on the PostgreSQL database using
+the `get-password` action:
 
 ```bash
 juju run postgresql/leader get-password
 ```
 
-As a result, you should see output similar to the following:
+See the [Charmed PostgreSQL tutorial](https://charmhub.io/postgresql/docs/t-access) for more guidance if needed.
+
+````{dropdown} Output example
+
+As a result, you should see output containing the password:
 
 ```text
 ...
 password: bQOUgw8ZZgUyPA6n
 ```
 
+````
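+
+To capture the password programmatically on your host instead, a minimal sketch,
+assuming `yq` is installed and the output shape shown above:
+
+```shell
+# Assumes yq is installed; runs on your host, not inside the unit
+PG_PASSWORD=$(juju run postgresql/leader get-password | yq -r '.password')
+```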
+
 Make note of the password, and use `juju ssh` to connect to the PostgreSQL unit:
 
 ```bash
 juju ssh postgresql/leader
 ```
 
-Once connected to the unit, use the `psql` command line tool with the `operator` user credentials, to create the database named `tutorial`:
+Once connected to the unit, use the `psql` command line tool with the `operator`
+user credentials to create the database named `tutorial`:
 
 ```bash
 psql --host $(hostname -i) --username operator --password --dbname postgres \
   -c "CREATE DATABASE tutorial"
 ```
 
-You will be prompted to type the password, which you have obtained previously.
+You will be prompted for the password, which you obtained previously.
 
-Now, we can use the `populate.sql` script copied earlier into the PostgreSQL unit, to create a table named `posts` with some test data:
+Now, we can use the `populate.sql` script copied earlier into the PostgreSQL unit
+to create a table named `posts` with some test data:
 
 ```bash
 cat populate.sql | \
   psql --host $(hostname -i) --username operator --password --dbname tutorial
 ```
 
-To ensure that the test data is loaded successfully into the `posts` table, use the following command:
+To ensure that the test data is loaded successfully into the `posts` table:
 
 ```bash
 psql --host $(hostname -i) --username operator --password --dbname tutorial \
   -c 'SELECT COUNT(*) FROM posts'
 ```
 
 The output should indicate that the `posts` table has five rows now:
 
 ```text
  count
@@ -242,10 +305,14 @@ The output should indicate that the `posts` table has five rows now:
 
 Log out from the PostgreSQL unit using `exit` command or the `Ctrl+D` keyboard shortcut.
 
-### Deploy and integrate the `postgresql-connect-integrator` charm
+## Deploy and integrate the `postgresql-connect-integrator` charm
 
-Now that you have sample data loaded into PostgreSQL, it is time to deploy the `postgresql-connect-integrator` charm to enable integration of PostgreSQL and Kafka Connect applications.
-First, deploy the charm in `source` mode using the `juju deploy` command and provide the minimum necessary configurations:
+Now that you have sample data loaded into PostgreSQL, it is time to deploy
+the `postgresql-connect-integrator` charm to enable integration of the PostgreSQL
+and Kafka Connect applications.
+
+First, deploy the charm in `source` mode using the `juju deploy` command and provide
+the minimum necessary configurations:
 
 ```bash
 juju deploy postgresql-connect-integrator \
@@ -255,10 +322,10 @@ juju deploy postgresql-connect-integrator \
   --config topic_prefix=etl_
 ```
 
 Each Kafka Connect integrator application needs at least two relations:
 
-* with the Kafka Connect
-* with a Database charm (e.g. MySQL, PostgreSQL, OpenSearch, etc.)
+- with Kafka Connect
+- with a database charm (e.g. MySQL, PostgreSQL, OpenSearch, etc.)
 
 Integrate both Kafka Connect and PostgreSQL with the `postgresql-connect-integrator` charm:
 
@@ -267,21 +334,27 @@ juju integrate postgresql-connect-integrator postgresql
 juju integrate postgresql-connect-integrator kafka-connect
 ```
 
-After a couple of minutes, `juju status` command should show the `postgresql-connect-integrator` in `active|idle` state, with a message indicating that the ETL task is running:
+After a couple of minutes, the `juju status` command should show the
+`postgresql-connect-integrator` in `active`/`idle` state, with a message indicating
+that the ETL task is running:
 
 ```text
 ...
-postgresql-connect-integrator/0*  active  idle  13  10.38.169.83  8080/tcp  Task Status: RUNNING
+postgresql-connect-integrator  active  1  postgresql-connect-integrator  latest/edge  13  no  Task Status: RUNNING
 ...
 ```
 
-This means that the integrator application is actively copying data from the source database (named `tutorial`) into Apache Kafka topics prefixed with `etl_`.
-For example, rows in the `posts` table will be published into the Apache Kafka topic named `etl_posts`.
+This means that the integrator application is actively copying data from the source database
+(named `tutorial`) into Apache Kafka topics prefixed with `etl_`.
+For example, rows in the `posts` table will be published into the Apache Kafka topic
+named `etl_posts`.
 
-### Deploy and integrate the `opensearch-connect-integrator` charm
+## Deploy and integrate the `opensearch-connect-integrator` charm
 
-You are almost done with the ETL task, the only remaining part is to move data from Apache Kafka to OpenSearch.
-To do that, deploy another Kafka Connect integrator named `opensearch-connect-integrator` in the `sink` mode:
+You are almost done with the ETL task; the only remaining part is to move data from Apache Kafka
+to OpenSearch.
+To do that, deploy another Kafka Connect integrator named `opensearch-connect-integrator`
+in the `sink` mode:
 
 ```bash
 juju deploy opensearch-connect-integrator \
@@ -290,8 +363,10 @@ juju deploy opensearch-connect-integrator \
   --config topics="etl_posts"
 ```
 
-The above command deploys an integrator application to move messages from the `etl_posts` topic to the index in OpenSearch named `etl_posts`.
-And the `etl_posts` topic is filled by the `postgresql-connect-integrator` charm we deployed earlier.
+The above command deploys an integrator application to move messages from the `etl_posts` topic
+to the OpenSearch index named `etl_posts`.
+The `etl_posts` topic itself is filled by the `postgresql-connect-integrator` charm
+we deployed earlier.
 
 To activate the `opensearch-connect-integrator`, make the necessary integrations:
 
@@ -300,18 +375,21 @@ juju integrate opensearch-connect-integrator opensearch
 juju integrate opensearch-connect-integrator kafka-connect
 ```
 
-Wait a couple of minutes and run `juju status`, now both `opensearch-connect-integrator` and `postgresql-connect-integrator` applications should be in `active|idle` state, showing a message indicating that the ETL task is running:
+Wait a couple of minutes and run `juju status` again: both `opensearch-connect-integrator`
+and `postgresql-connect-integrator` applications should now be in `active`/`idle` state,
+showing a message indicating that the ETL task is running:
 
 ```text
 ...
-opensearch-connect-integrator/0*  active  idle  14  10.38.169.108  8080/tcp  Task Status: RUNNING
-postgresql-connect-integrator/0*  active  idle  13  10.38.169.83   8080/tcp  Task Status: RUNNING
+opensearch-connect-integrator/0*  active  idle  14  10.109.154.70   8080/tcp  Task Status: RUNNING
+postgresql-connect-integrator/0*  active  idle  13  10.109.154.173  8080/tcp  Task Status: RUNNING
 ...
 ```
 
-### Verify data transfer
+## Verify data transfer
 
-Now it's time to verify that the data is being copied from the PostgreSQL database to the OpenSearch index.
+Now it's time to verify that the data is being copied from the PostgreSQL database
+to the OpenSearch index.
 We can use the OpenSearch REST API for that purpose.
First, retrieve the admin user credentials for OpenSearch using `get-password` action:
 
@@ -324,42 +402,43 @@ As a result, you should see output similar to the following:
 
 ```text
 ...
-password: GoCNE5KdFywT4nF1GSrwpAGyqRLecSXC
+password: HTLPVZTzZPYhdrXyH3u8jvw42H9pWN4H
 username: admin
 ```
 
 Then, retrieve the OpenSearch unit IP and save it into an environment variable:
 
 ```bash
-OPENSEARCH_IP=$(juju ssh opensearch/0 'hostname -i')
+OPENSEARCH_IP=$(juju ssh opensearch/0 'hostname -i' | tr -d '\r\n')
 ```
 
-Now, using the password obtained above, send a request to the topic's `_search` endpoint, either using your browser or `curl`:
+**Using the password obtained above**, send a request to the topic's `_search` endpoint,
+either using your browser or `curl`:
 
 ```bash
-curl -u admin: -k -X GET https://$OPENSEARCH_IP:9200/etl_posts/_search
+curl -u admin: -k -sS "https://${OPENSEARCH_IP}:9200/etl_posts/_search?pretty=true"
 ```
 
-As a result you get a JSON response containing the search results, which should have five documents.
+As a result, you get a JSON response containing the search results, which should have five documents.
+The `hits.total` value should be `5`, as shown in the output example below:
 
 ```text
 {
-  "took": 15,
-  "timed_out": false,
-  "_shards": {
-    "total": 1,
-    "successful": 1,
-    "skipped": 0,
-    "failed": 0
+  "took" : 1,
+  "timed_out" : false,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
   },
-  "hits": {
-    "total": {
-      "value": 5,
-      "relation": "eq"
+  "hits" : {
+    "total" : {
+      "value" : 5,
+      "relation" : "eq"
     },
-    "max_score": 1.0,
-    "hits": [
+    "max_score" : 1.0,
+    "hits" : [
 ...
       ]
     }
 }
 ```
 
-Now let's insert a new post into the PostgreSQL database. First SSH in to the PostgreSQL leader unit:
+Now let's insert a new post into the PostgreSQL database.
+
+Get the password for the `operator` built-in user again:
+
+```shell
+juju run postgresql/leader get-password
+```
+
+SSH to the PostgreSQL leader unit:
 
 ```bash
 juju ssh postgresql/leader
 ```
 
-Then, insert a new post using following command and the password for the `operator` user on the PostgreSQL:
+Then, insert a new post using the following command and the password for the `operator` user
+on PostgreSQL:
 
 ```bash
 psql --host $(hostname -i) --username operator --password --dbname tutorial -c \
   "INSERT INTO posts (content, likes) VALUES ('my new post', 1)"
 ```
 
 Log out from the PostgreSQL unit using `exit` command or the `Ctrl+D` keyboard shortcut.
 
 Then, check that the data is automatically copied to the OpenSearch index:
 
 ```bash
-curl -u admin: -k -X GET https://$OPENSEARCH_IP:9200/etl_posts/_search
+curl -u admin: -k -sS "https://${OPENSEARCH_IP}:9200/etl_posts/_search?pretty=true"
 ```
 
 Which now should have six hits (output is truncated):
 
 ```text
 {
   ...
-  "hits": {
-    "total": {
-      "value": 6,
-      "relation": "eq"
-    },
+  "hits" : {
+    "total" : {
+      "value" : 6,
+      "relation" : "eq"
+    }
+  }
 ...
 }
 ```
 
-Congratulations! You have successfully completed an ETL job that continuously moves data from PostgreSQL to OpenSearch, using entirely charmed solutions.
-
+Congratulations! You have successfully completed an ETL job that continuously
+moves data from PostgreSQL to OpenSearch, using entirely charmed solutions.