# Managing Data Contracts in Confluent Cloud

Data contracts consist not only of schemas defining the structure of events, but also of rulesets allowing for more fine-grained validation,
controls, and discovery. In this demo, we'll evolve a schema by adding migration rules.

I'd like to credit my colleague Gilles Philippart, whose work is the basis for the schemas and rules in this demo. My goal here is to add examples with commonly used
developer tools and practices to his work in this area, which can be found [here on GitHub](https://github.com/gphilipp/migration-rules-demo).

## Running the Example

<img src="./images/overview.png" width="1000" height="600">

In the workflow above, we see these tools in action:
* The [Confluent Terraform Provider](https://registry.terraform.io/providers/confluentinc/confluent/latest/docs) is used to define Confluent Cloud assets (Kafka cluster(s), Data Governance, Kafka topics, and schema configurations).
* Using the newly created Schema Registry, data engineers and architects define the schemas of the events that comprise the organization's canonical data model - i.e. the entities, events, and commands shared across applications - along with the other parts of the data contract: data quality rules, metadata, and migration rules. A Gradle plugin registers the schemas and related elements of the data contract with the Schema Registry.
* Applications which produce and/or consume these event types can download the schemas from the Schema Registry. In our example, this is a JVM application built with Gradle. A Gradle plugin downloads the schemas, after which another Gradle plugin generates Java classes from those schemas - thus providing the application with compile-time type safety. A sketch of such a build configuration follows this list.
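As a concrete example, the code-generation half of that build might be wired up as follows. This is a minimal sketch, not the demo's actual build script: the choice of the [gradle-avro-plugin](https://github.com/davidmc24/gradle-avro-plugin) and the versions shown are assumptions, and the schema-download plugin is omitted:

```kotlin
// build.gradle.kts - hypothetical sketch of a consuming application's build.
plugins {
    kotlin("jvm") version "1.9.24"
    // Generates Java classes from the .avsc files it finds under src/main/avro,
    // giving the application compile-time type safety.
    id("com.github.davidmc24.gradle.plugin.avro") version "1.9.1"
}

repositories {
    mavenCentral()
    maven("https://packages.confluent.io/maven/") // Confluent serializers live here
}

dependencies {
    implementation("org.apache.avro:avro:1.11.3")
    implementation("io.confluent:kafka-avro-serializer:7.6.0")
}
```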

### Prerequisites

Clone the `confluentinc/demo-scene` GitHub repository (if you haven't already) and navigate to the `demo-scene` directory:

```shell
git clone git@github.com:confluentinc/demo-scene.git
cd demo-scene
```

Here are the tools needed to run this tutorial:
* [Confluent Cloud](http://confluent.cloud)
* [Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html)
* [Terraform](https://developer.hashicorp.com/terraform/install?product_intent=terraform)
* [jq](https://jqlang.github.io/jq/)
* [Gradle](https://gradle.org/install/) (version 8.5 or higher)
* JDK 17
* IDE of choice

> When installing and configuring the Confluent CLI, include the Confluent Cloud credentials as environment variables for future use. For instance, with bash or zsh, include these export statements:
>
> ```shell
> export CONFLUENT_CLOUD_API_KEY=<API KEY>
> export CONFLUENT_CLOUD_API_SECRET=<API SECRET>
> ```

Terraform can use the value of any environment variable whose name begins with `TF_VAR_` as the value of a Terraform variable of the same name. For more on this functionality, see the [Terraform documentation](https://developer.hashicorp.com/terraform/cli/config/environment-variables#tf_var_name).
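
On the Terraform side, the demo presumably declares a matching variable along these lines (a sketch; the actual declaration may differ):

```hcl
variable "org_id" {
  type        = string
  description = "ID of the Confluent Cloud organization housing the demo assets (set via TF_VAR_org_id)"
}
```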

Our example requires that we set the value of the `org_id` variable from Confluent Cloud, which denotes the organization that will house the Confluent Cloud assets we are creating. So let's export the Confluent Cloud organization ID to a Terraform environment variable.

This command may open a browser window asking you to authenticate to Confluent Cloud. Once that's complete, the JSON output of
`confluent organization list` is piped to `jq`, which selects the organization marked as current and extracts its `id` (the `-r` flags strip the surrounding quotes):

```shell
export TF_VAR_org_id=$(confluent organization list -o json | jq -c -r '.[] | select(.is_current)' | jq -r '.id')
```

### Create Assets in Confluent Cloud

From the root of `data-contracts`, run the provided setup script - `setup.sh`. You may need to edit permissions to make this file executable.

Because this script provisions multiple assets in Confluent Cloud and downloads the appropriate gradle-wrapper JAR files for each module,
it may take a few minutes to complete.

```shell
chmod +x setup.sh
./setup.sh
Setup Confluent Cloud and all Gradle Builds for Demo
-------------------------------------
Initializing Confluent Cloud Environment...
......
......
......
......
BUILD SUCCESSFUL in 2s
6 actionable tasks: 6 executed
Watched directory hierarchies: [/Users/sjacobs/code/confluentinc/demo-scene/data-contracts]
Code Generation Complete
-------------------------------------

Setup Complete!
```

Let's have a look at what we've created in Confluent Cloud. We find a new Environment:



With a Kafka cluster:



And Data Contracts:



Locally, we also create a `properties` file containing the parameters needed for our Kafka clients to connect to Confluent Cloud. For an example of this
`properties` file, see [confluent.properties.orig](shared/src/main/resources/confluent.properties.orig).
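
It will look something like this (all endpoints and credentials below are placeholders):

```properties
# Illustrative values only - setup writes your cluster's actual endpoints and keys.
bootstrap.servers=pkc-xxxxx.us-east-2.aws.confluent.cloud:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='<CLUSTER API KEY>' password='<CLUSTER API SECRET>';
schema.registry.url=https://psrc-xxxxx.us-east-2.aws.confluent.cloud
basic.auth.credentials.source=USER_INFO
basic.auth.user.info=<SR API KEY>:<SR API SECRET>
```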

> **NOTE**: The file-based approach we're using here is NOT recommended for a production-quality application. A secrets manager would be better suited - the major cloud providers all offer one, as do tools like HashiCorp Vault. Such tools also provide client libraries in Maven repositories, allowing JVM applications to access the secrets directly.

### Run the Examples

In `app-schema-v1`, the `ApplicationMain` object's `main` function starts a consumer in a new thread to subscribe to the `membership-avro` topic. It then begins
producing randomly generated events to `membership-avro` at a provided interval for a provided duration. By default, an event is produced every second for 100 seconds. These events are created and consumed using version 1 of the membership schema. The console output of the consumer should look something like this:

```shell
[Thread-0] INFO io.confluent.devrel.datacontracts.shared.BaseConsumer - Received Membership d0e65c83-b1c5-451d-b08b-8d1ed6fca8d6, {"user_id": "d0e65c83-b1c5-451d-b08b-8d1ed6fca8d6", "start_date": "2023-01-14", "end_date": "2025-05-28"}
[Thread-0] INFO io.confluent.devrel.datacontracts.shared.BaseConsumer - Received Membership 940cf6fa-eb12-46af-87e8-5a9bc33df119, {"user_id": "940cf6fa-eb12-46af-87e8-5a9bc33df119", "start_date": "2023-05-23", "end_date": "2025-07-02"}
```
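
From this output we can infer the rough shape of version 1 of the membership schema - something like the following Avro record (the `string` field types are an assumption; the demo may well use `date` logical types):

```json
{
  "type": "record",
  "name": "Membership",
  "fields": [
    {"name": "user_id", "type": "string"},
    {"name": "start_date", "type": "string"},
    {"name": "end_date", "type": "string"}
  ]
}
```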

The `app-schema-v2` module's `main` function starts a consumer subscribed to the `membership-avro` topic. But this time, events are consumed using
version 2 of the membership schema. Notice the `Map` of consumer overrides in the constructor of the `MembershipConsumer` in that module (see the sketch after the output below). With those overrides, the
same events which `app-schema-v1` produced using `major_version=1` of the schema are consumed using `major_version=2`:

```shell
[Thread-0] INFO io.confluent.devrel.datacontracts.shared.BaseConsumer - v2 - Received Membership b0e34c68-208c-4771-be19-79689fc9ad28, {"user_id": "b0e34c68-208c-4771-be19-79689fc9ad28", "validity_period": {"from": "2022-11-06", "to": "2025-09-03"}}
[Thread-0] INFO io.confluent.devrel.datacontracts.shared.BaseConsumer - v2 - Received Membership 517f8c7e-4ae5-47ea-93a2-c1f00669d330, {"user_id": "517f8c7e-4ae5-47ea-93a2-c1f00669d330", "validity_period": {"from": "2023-01-29", "to": "2025-02-19"}}
```
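
Those overrides likely look something like this sketch. The Schema Registry client's `use.latest.with.metadata` and `latest.compatibility.strict` settings are real, but the exact metadata key (`major_version`) is an assumption based on the naming above:

```kotlin
// Hypothetical consumer overrides: target the schema version whose metadata
// carries major_version=2, letting migration rules upgrade older events on read.
val consumerOverrides = mapOf(
    "use.latest.with.metadata" to "major_version=2", // pin to the v2 data contract
    "latest.compatibility.strict" to "false"         // allow reading across major versions
)
```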

This illustrates how the use of migration rules allows producers and consumers to independently make any code and configuration changes needed to accommodate schema changes.
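
The glue that makes this work is a pair of JSONata migration rules registered alongside version 2 of the schema. Based on the field changes we saw above, they look something along these lines (a sketch in the documented migration-rule format; the demo's actual rule names may differ):

```json
{
  "migrationRules": [
    {
      "name": "upgrade_membership_v1_to_v2",
      "kind": "TRANSFORM",
      "type": "JSONATA",
      "mode": "UPGRADE",
      "expr": "$merge([$sift($, function($v, $k) {$k != 'start_date' and $k != 'end_date'}), {'validity_period': {'from': start_date, 'to': end_date}}])"
    },
    {
      "name": "downgrade_membership_v2_to_v1",
      "kind": "TRANSFORM",
      "type": "JSONATA",
      "mode": "DOWNGRADE",
      "expr": "$merge([$sift($, function($v, $k) {$k != 'validity_period'}), {'start_date': validity_period.from, 'end_date': validity_period.to}])"
    }
  ]
}
```

The `UPGRADE` rule transforms v1 events for v2 consumers (folding `start_date` and `end_date` into `validity_period`), while the `DOWNGRADE` rule does the reverse for v1 consumers, so neither side has to change in lockstep with the other.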

## Teardown

When you're done with the demo, issue this command from the `cc-terraform` directory to destroy the Confluent Cloud environment
we created:

```shell
terraform destroy -auto-approve
```

Check the Confluent Cloud console to ensure this environment no longer exists.
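
You can also verify from the CLI - the environment should no longer appear in the output of:

```shell
confluent environment list
```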