
Commit 2582f2c

Merge pull request #206372 from TheovanKraay/cassandra-mi-doc-updates
Cassandra MI doc updates
2 parents 68b1a89 + 9c2b157 commit 2582f2c

File tree

3 files changed

+105
-43
lines changed

articles/managed-instance-apache-cassandra/configure-hybrid-cluster.md

Lines changed: 46 additions & 16 deletions
@@ -11,7 +11,7 @@ ms.devlang: azurecli
1111
---
1212
# Quickstart: Configure a hybrid cluster with Azure Managed Instance for Apache Cassandra
1313

14-
Azure Managed Instance for Apache Cassandra provides automated deployment and scaling operations for managed open-source Apache Cassandra datacenters. This service helps you accelerate hybrid scenarios and reduce ongoing maintenance.
14+
Azure Managed Instance for Apache Cassandra provides automated deployment and scaling operations for managed open-source Apache Cassandra datacenters. This service helps you accelerate hybrid scenarios and reduce ongoing maintenance.
1515

1616
This quickstart demonstrates how to use the Azure CLI commands to configure a hybrid cluster. If you have existing datacenters in an on-premises or self-hosted environment, you can use Azure Managed Instance for Apache Cassandra to add other datacenters to that cluster and maintain them.
1717

@@ -21,7 +21,7 @@ This quickstart demonstrates how to use the Azure CLI commands to configure a hy
2121

2222
* [Azure Virtual Network](../virtual-network/virtual-networks-overview.md) with connectivity to your self-hosted or on-premises environment. For more information on connecting on premises environments to Azure, see the [Connect an on-premises network to Azure](/azure/architecture/reference-architectures/hybrid-networking/) article.
2323

24-
## <a id="create-account"></a>Configure a hybrid cluster
24+
## <a id="configure-hybrid"></a>Configure a hybrid cluster
2525

2626
1. Sign in to the [Azure portal](https://portal.azure.com/) and navigate to your Virtual Network resource.
2727

@@ -195,38 +195,68 @@ This quickstart demonstrates how to use the Azure CLI commands to configure a hy
195195
> [!IMPORTANT]
196196
> If your existing Apache Cassandra cluster only has a single data center, and this is the first time a data center is being added, ensure that the `endpoint_snitch` parameter in `cassandra.yaml` is set to `GossipingPropertyFileSnitch`.
197197
198+
> [!IMPORTANT]
199+
> If your existing application code is using QUORUM for consistency, you should ensure that **prior to changing the replication settings in the step below**, your existing application code is using **LOCAL_QUORUM** to connect to your existing cluster (otherwise live updates will fail after you change replication settings in the below step). Once the replication strategy has been changed, you can revert to QUORUM if preferred.
200+
201+
198202
1. Finally, use the following CQL query to update the replication strategy in each keyspace to include all datacenters across the cluster:
199203

200204
```bash
201205
ALTER KEYSPACE "ks" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};
202206
```
203207

204-
You also need to update the password tables:
208+
You also need to update several system tables:
205209

206210
```bash
207211
ALTER KEYSPACE "system_auth" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};
212+
ALTER KEYSPACE "system_distributed" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};
213+
ALTER KEYSPACE "system_traces" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};
208214
```
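
Because the same replication map has to be applied to every keyspace, it can help to generate the statements rather than type them by hand. The following is a minimal sketch, not part of the service tooling: the helper name `alter_keyspace_statements` is hypothetical, and the datacenter names and replication factor of 3 are simply the ones used in the examples above.

```bash
# Hypothetical helper: prints one ALTER KEYSPACE statement per keyspace name.
# Datacenter names and the replication factor (3) mirror the examples above;
# adjust them to match your own cluster.
alter_keyspace_statements() {
  for ks in "$@"; do
    printf "ALTER KEYSPACE \"%s\" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};\n" "$ks"
  done
}

# Print the statements for an application keyspace and the system keyspaces.
alter_keyspace_statements ks system_auth system_distributed system_traces
```

The output can be saved to a file and run with `cqlsh -f`, which keeps the replication settings identical across keyspaces.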
209215

210216
> [!IMPORTANT]
211-
> If you are using hybrid cluster as a method of migrating historic data into the new Azure Managed Instance Cassandra data centers, ensure that you disable automatic repairs:
212-
> ```azurecli-interactive
213-
> az managed-cassandra cluster update --cluster-name --resource-group--repair-enabled false
214-
> ```
215-
> Then run `nodetool repair --full` on all the nodes in your existing cluster's data center. You should run this **only after all of the prior steps have been taken**. This should ensure that all historical data is replicated to your new data centers in Azure Managed Instance for Apache Cassandra. If you have a very large amount of data in your existing cluster, it may be necessary to run the repairs at the keyspace or even table level - see [here](https://cassandra.apache.org/doc/latest/cassandra/operating/repair.html) for more details on running repairs in Cassandra. Prior to changing the replication settings, you should also make sure that any application code that connects to your existing Cassandra cluster is using LOCAL_QUORUM. You should leave it at this setting during the migration (it can be switched back afterwards if required). After the migration is completed, you can enable automatic repair again, and point your application code to the new Cassandra Managed Instance data center's seed nodes (and revert the quorum settings if preferred).
216-
>
217-
> Finally, to decommission your old data center:
218-
>
219-
> - Run `ALTER KEYSPACE` for each keyspace, removing the old data center.
220-
> - We recommend running `nodetool repair` for each keyspace as well, before the below step.
221-
> - Run [nodetool decommision](https://cassandra.apache.org/doc/latest/cassandra/operating/topo_changes.html#removing-nodes) for each on premise data center node.
217+
> If the data center(s) in your existing cluster do not enforce [client-to-node encryption (SSL)](https://cassandra.apache.org/doc/3.11/cassandra/operating/security.html#client-to-node-encryption), and you intend for your application code to connect directly to Cassandra Managed Instance, you will also need to enable SSL in your application code.
218+
219+
220+
## <a id="hybrid-real-time-migration"></a>Use hybrid cluster for real-time migration
221+
222+
The above instructions provide guidance for configuring a hybrid cluster. However, a hybrid cluster is also a good way to achieve a zero-downtime migration. If you have an on-premises or other Cassandra environment that you want to decommission with zero downtime in favor of running your workload in Azure Managed Instance for Apache Cassandra, complete the following steps in this order:
223+
224+
1. Configure hybrid cluster - follow the instructions above.
225+
1. Temporarily disable automatic repairs in Azure Managed Instance for Apache Cassandra for the duration of the migration:
226+
227+
```azurecli-interactive
228+
az managed-cassandra cluster update --cluster-name <cluster-name> --resource-group <resource-group> --repair-enabled false
229+
```
230+
231+
1. Run `nodetool repair --full` on each node in your existing cluster's data center. You should run this **only after all of the prior steps have been taken**. This should ensure that all historical data is replicated to your new data centers in Azure Managed Instance for Apache Cassandra. For most installations, run only one or two repairs in parallel to avoid overloading the cluster. You can monitor a particular repair run by checking `nodetool netstats` and `nodetool compactionstats` against the specific node. If you have a very large amount of data in your existing cluster, it may be necessary to run the repairs at the keyspace or even table level. See [here](https://cassandra.apache.org/doc/latest/cassandra/operating/repair.html) for more details on running repairs in Cassandra.
232+
233+
222234
223235
> [!NOTE]
224236
> To speed up repairs, we advise increasing both stream throughput and compaction throughput (if system load permits), as in the example below:
225237
>```azure-cli
226238
> az managed-cassandra cluster invoke-command --resource-group $resourceGroupName --cluster-name $clusterName --host $host --command-name nodetool --arguments "setstreamthroughput"="" "7000"=""
227239
>
228-
> az managed-cassandra cluster invoke-command --resource-group $resourceGroupName --cluster-name $clusterName --host $host --command-name nodetool --arguments "setcompactionthroughput"="" "960"=""
229-
>```
240+
> az managed-cassandra cluster invoke-command --resource-group $resourceGroupName --cluster-name $clusterName --host $host --command-name nodetool --arguments "setcompactionthroughput"="" "960"=""
241+
242+
1. Cut over your application code to point to the seed nodes in your new Azure Managed Instance for Apache Cassandra data center(s).
243+
244+
> [!IMPORTANT]
245+
> As also mentioned in the hybrid setup instructions, if the data center(s) in your existing cluster do not enforce [client-to-node encryption (SSL)](https://cassandra.apache.org/doc/3.11/cassandra/operating/security.html#client-to-node-encryption), you will need to enable this in your application code, as Cassandra Managed Instance enforces this.
246+
247+
1. Run nodetool repair **again** on all the nodes in your existing cluster's data center, in the same manner as in step 3 above (to ensure any deltas are replicated following application cut over).
248+
249+
1. Run ALTER KEYSPACE for each keyspace, in the same manner as done earlier, but now removing your old data center(s).
250+
251+
1. Run [nodetool decommission](https://cassandra.apache.org/doc/latest/cassandra/tools/nodetool/decommission.html) for each old data center node.
252+
253+
1. Switch your application code back to QUORUM (if required or preferred).
254+
255+
1. Re-enable automatic repairs:
256+
257+
```azurecli-interactive
258+
az managed-cassandra cluster update --cluster-name <cluster-name> --resource-group <resource-group> --repair-enabled true
259+
```
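
The migration disables automatic repairs at the start and re-enables them at the end, so the same command is issued twice with only the final value changed. As a sketch under stated assumptions, the helper below (a hypothetical name, `repair_toggle_command`) only prints the command using the flags shown in the steps above and does not call the Azure CLI itself. The cluster and resource group names are placeholders.

```bash
# Hypothetical helper: prints (does not run) the repair toggle command.
# $1 = cluster name, $2 = resource group, $3 = true or false
repair_toggle_command() {
  case "$3" in
    true|false) ;;
    *) echo "third argument must be true or false" >&2; return 1 ;;
  esac
  printf 'az managed-cassandra cluster update --cluster-name %s --resource-group %s --repair-enabled %s\n' "$1" "$2" "$3"
}

# Placeholder names; substitute your own cluster and resource group.
repair_toggle_command my-cluster my-resource-group false
```

Printing the command first makes it easy to review the arguments before running it at the start and end of the migration.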
230260
231261
## Troubleshooting
232262

articles/managed-instance-apache-cassandra/create-cluster-portal.md

Lines changed: 29 additions & 15 deletions
@@ -132,20 +132,7 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
132132
133133
## Update Cassandra configuration
134134
135-
The service allows update to a limited set of Cassandra configurations on a datacenter via the portal or by [using CLI commands](manage-resources-cli.md#update-yaml). The following YAML settings are supported:
136-
137-
- column_index_size_in_kb
138-
- allocate_tokens_for_keyspace
139-
- compaction_throughput_mb_per_sec
140-
- read_request_timeout_in_ms
141-
- range_request_timeout_in_ms
142-
- aggregated_request_timeout_in_ms
143-
- write_request_timeout_in_ms
144-
- request_timeout_in_ms
145-
- internode_compression
146-
- batchlog_replay_throttle_in_kb
147-
148-
To update settings in the portal:
135+
The service lets you update the Cassandra YAML configuration on a datacenter via the portal or by [using CLI commands](manage-resources-cli.md#update-yaml). To update settings in the portal:
149136
150137
1. Find `Cassandra Configuration` under settings. Highlight the data center whose configuration you want to change, and click update:
151138
@@ -160,7 +147,34 @@ To update settings in the portal:
160147
:::image type="content" source="./media/create-cluster-portal/update-config-3.png" alt-text="Screenshot of the updated Cassandra config." lightbox="./media/create-cluster-portal/update-config-3.png" border="true":::
161148
162149
> [!NOTE]
163-
> Only overridden Cassandra configuration values are shown in the portal.
150+
> Only overridden Cassandra configuration values are shown in the portal.
151+
152+
> [!IMPORTANT]
153+
> Ensure the Cassandra YAML settings you provide are appropriate for the version of Cassandra you have deployed. See [here](https://github.com/apache/cassandra/blob/cassandra-3.11/conf/cassandra.yaml) for Cassandra v3.11 settings and [here](https://github.com/apache/cassandra/blob/cassandra-4.0/conf/cassandra.yaml) for v4.0. The following YAML settings are **not** allowed to be updated:
154+
>
155+
> - cluster_name
156+
> - seed_provider
157+
> - initial_token
158+
> - autobootstrap
159+
> - client_encryption_options
160+
> - server_encryption_options
161+
> - transparent_data_encryption_options
162+
> - audit_logging_options
163+
> - authenticator
164+
> - authorizer
165+
> - role_manager
166+
> - storage_port
167+
> - ssl_storage_port
168+
> - native_transport_port
169+
> - native_transport_port_ssl
170+
> - listen_address
171+
> - listen_interface
172+
> - broadcast_address
173+
> - hints_directory
174+
> - data_file_directories
175+
> - commitlog_directory
176+
> - cdc_raw_directory
177+
> - saved_caches_directory
164178
165179
## Troubleshooting
166180

articles/managed-instance-apache-cassandra/manage-resources-cli.md

Lines changed: 30 additions & 12 deletions
@@ -222,18 +222,7 @@ az managed-cassandra datacenter update \
222222

223223
### <a id="update-yaml"></a>Update Cassandra configuration
224224

225-
Change Cassandra configuration on a datacenter by using the [az managed-cassandra datacenter update](/cli/azure/managed-cassandra/datacenter#az-managed-cassandra-datacenter-update) command. You will need to base64 encode the YAML fragment by using an [online tool](https://www.base64encode.org/). The following YAML settings are supported:
226-
227-
- column_index_size_in_kb
228-
- allocate_tokens_for_keyspace
229-
- compaction_throughput_mb_per_sec
230-
- read_request_timeout_in_ms
231-
- range_request_timeout_in_ms
232-
- aggregated_request_timeout_in_ms
233-
- write_request_timeout_in_ms
234-
- request_timeout_in_ms
235-
- internode_compression
236-
- batchlog_replay_throttle_in_kb
225+
Change Cassandra configuration on a datacenter by using the [az managed-cassandra datacenter update](/cli/azure/managed-cassandra/datacenter#az-managed-cassandra-datacenter-update) command. You will need to base64 encode the YAML fragment by using an [online tool](https://www.base64encode.org/).
237226

238227
For example, the following YAML fragment:
239228

@@ -261,6 +250,35 @@ az managed-cassandra datacenter update \
261250
--base64-encoded-cassandra-yaml-fragment $yamlFragment
262251
```
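
As an alternative to an online tool, the fragment can be encoded locally. This is a minimal sketch, assuming a `base64` utility is available on your machine. The two settings in the fragment are illustrative only, and the result is stored in the `$yamlFragment` variable used in the command above.

```bash
# Illustrative fragment only; replace it with the settings you want to change.
yaml='compaction_throughput_mb_per_sec: 32
read_request_timeout_in_ms: 10000'

# Encode the fragment and strip the line wrapping that some base64
# implementations add to long output.
yamlFragment=$(printf '%s' "$yaml" | base64 | tr -d '\n')
echo "$yamlFragment"
```

Encoding locally also avoids pasting configuration into a third-party website.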
263252

253+
> [!IMPORTANT]
254+
> Ensure the Cassandra YAML settings you provide are appropriate for the version of Cassandra you have deployed. See [here](https://github.com/apache/cassandra/blob/cassandra-3.11/conf/cassandra.yaml) for Cassandra v3.11 settings and [here](https://github.com/apache/cassandra/blob/cassandra-4.0/conf/cassandra.yaml) for v4.0. The following YAML settings are **not** allowed to be updated:
255+
>
256+
> - cluster_name
257+
> - seed_provider
258+
> - initial_token
259+
> - autobootstrap
260+
> - client_encryption_options
261+
> - server_encryption_options
262+
> - transparent_data_encryption_options
263+
> - audit_logging_options
264+
> - authenticator
265+
> - authorizer
266+
> - role_manager
267+
> - storage_port
268+
> - ssl_storage_port
269+
> - native_transport_port
270+
> - native_transport_port_ssl
271+
> - listen_address
272+
> - listen_interface
273+
> - broadcast_address
274+
> - hints_directory
275+
> - data_file_directories
276+
> - commitlog_directory
277+
> - cdc_raw_directory
278+
> - saved_caches_directory
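
Before encoding a fragment, it may be worth checking it against the list above. The following is a rough sketch rather than a validation the service provides: the helper name `find_disallowed_keys` is hypothetical, it only inspects top-level keys that start at the beginning of a line, and `auto_bootstrap` is included as an assumption because that is how the key is spelled in `cassandra.yaml`.

```bash
# Names copied from the list above. auto_bootstrap is added as an assumption,
# since that is the spelling used in cassandra.yaml.
disallowed_keys="cluster_name seed_provider initial_token autobootstrap auto_bootstrap
client_encryption_options server_encryption_options transparent_data_encryption_options
audit_logging_options authenticator authorizer role_manager storage_port ssl_storage_port
native_transport_port native_transport_port_ssl listen_address listen_interface
broadcast_address hints_directory data_file_directories commitlog_directory
cdc_raw_directory saved_caches_directory"

# Hypothetical helper: prints each disallowed top-level key found in the
# YAML fragment passed as $1. Prints nothing if the fragment is clean.
find_disallowed_keys() {
  for key in $disallowed_keys; do
    if printf '%s\n' "$1" | grep -q "^${key}:"; then
      printf '%s\n' "$key"
    fi
  done
}

# Example: the first key is on the disallowed list, the second is not.
find_disallowed_keys 'cluster_name: test
read_request_timeout_in_ms: 10000'
```

An empty result means that none of the listed keys appear at the top level of the fragment. It does not confirm that the remaining settings are valid for your Cassandra version.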
279+
280+
281+
264282
### <a id="get-datacenters-cluster"></a>Get the datacenters in a cluster
265283

266284
Get datacenters in a cluster by using the [az managed-cassandra datacenter list](/cli/azure/managed-cassandra/datacenter#az-managed-cassandra-datacenter-list) command:
