articles/managed-instance-apache-cassandra/configure-hybrid-cluster.md
46 additions & 16 deletions
Original file line number
Diff line number
Diff line change
@@ -11,7 +11,7 @@ ms.devlang: azurecli
11
11
---
12
12
# Quickstart: Configure a hybrid cluster with Azure Managed Instance for Apache Cassandra
13
13
14
-
Azure Managed Instance for Apache Cassandra provides automated deployment and scaling operations for managed open-source Apache Cassandra datacenters. This service helps you accelerate hybrid scenarios and reduce ongoing maintenance.
14
+
Azure Managed Instance for Apache Cassandra provides automated deployment and scaling operations for managed open-source Apache Cassandra datacenters. This service helps you accelerate hybrid scenarios and reduce ongoing maintenance.
15
15
16
16
This quickstart demonstrates how to use Azure CLI commands to configure a hybrid cluster. If you have existing datacenters in an on-premises or self-hosted environment, you can use Azure Managed Instance for Apache Cassandra to add other datacenters to that cluster and maintain them.
17
17
@@ -21,7 +21,7 @@ This quickstart demonstrates how to use the Azure CLI commands to configure a hy
21
21
22
22
* [Azure Virtual Network](../virtual-network/virtual-networks-overview.md) with connectivity to your self-hosted or on-premises environment. For more information on connecting on-premises environments to Azure, see the [Connect an on-premises network to Azure](/azure/architecture/reference-architectures/hybrid-networking/) article.
23
23
24
-
## <a id="create-account"></a>Configure a hybrid cluster
24
+
## <a id="configure-hybrid"></a>Configure a hybrid cluster
25
25
26
26
1. Sign in to the [Azure portal](https://portal.azure.com/) and navigate to your Virtual Network resource.
27
27
@@ -195,38 +195,68 @@ This quickstart demonstrates how to use the Azure CLI commands to configure a hy
195
195
> [!IMPORTANT]
196
196
> If your existing Apache Cassandra cluster only has a single data center, and this is the first time a data center is being added, ensure that the `endpoint_snitch` parameter in `cassandra.yaml` is set to `GossipingPropertyFileSnitch`.
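
One way to confirm the snitch setting before adding the new data center is to search `cassandra.yaml` for it. The following is a minimal sketch: the function name `uses_gossiping_snitch` is hypothetical, and the location of `cassandra.yaml` depends on how Cassandra was installed.

```bash
# Succeeds (exit code 0) only if the given cassandra.yaml sets
# endpoint_snitch to GossipingPropertyFileSnitch, with or without
# the fully qualified class name.
uses_gossiping_snitch() {
  grep -Eq '^endpoint_snitch:[[:space:]]*(org\.apache\.cassandra\.locator\.)?GossipingPropertyFileSnitch([[:space:]]|$)' "$1"
}
```

For example, `uses_gossiping_snitch /etc/cassandra/cassandra.yaml && echo "snitch OK"` prints the message only when the setting matches; the path shown is an assumption and should be replaced with the actual location on each node.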
197
197
198
+
> [!IMPORTANT]
199
+
> If your existing application code uses QUORUM for consistency, switch it to **LOCAL_QUORUM** for connections to your existing cluster **before changing the replication settings in the step below**. Otherwise, live updates will fail after you change the replication settings. Once the replication strategy has been changed, you can revert to QUORUM if preferred.
200
+
201
+
198
202
1. Finally, use the following CQL query to update the replication strategy in each keyspace to include all datacenters across the cluster:
199
203
200
204
```bash
201
205
ALTER KEYSPACE "ks" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};
202
206
```
203
207
204
-
You also need to update the password tables:
208
+
You also need to update several system tables:
205
209
206
210
```bash
207
211
ALTER KEYSPACE "system_auth" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};
212
+
ALTER KEYSPACE "system_distributed" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};
213
+
ALTER KEYSPACE "system_traces" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'on-premise-dc': 3, 'managed-instance-dc': 3};
208
214
```
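
When a cluster has many keyspaces, the statements above can be generated rather than typed by hand. The sketch below only prints CQL and does not run anything against a cluster; the function name `alter_keyspace_stmt` is hypothetical, and the data center names and replication factors are the sample values used above.

```bash
# Print an ALTER KEYSPACE statement that replicates keyspace "$1" to two
# data centers. $2/$3: existing data center name and replication factor;
# $4/$5: new data center name and replication factor.
alter_keyspace_stmt() {
  printf "ALTER KEYSPACE \"%s\" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', '%s': %s, '%s': %s};\n" \
    "$1" "$2" "$3" "$4" "$5"
}

# Emit one statement per keyspace, including the system keyspaces.
for ks in ks system_auth system_distributed system_traces; do
  alter_keyspace_stmt "$ks" on-premise-dc 3 managed-instance-dc 3
done
```

The printed statements can be reviewed first and then pasted into a CQL session connected to the existing cluster.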
209
215
210
216
> [!IMPORTANT]
211
-
> If you are using hybrid cluster as a method of migrating historic data into the new Azure Managed Instance Cassandra data centers, ensure that you disable automatic repairs:
212
-
> ```azurecli-interactive
213
-
> az managed-cassandra cluster update --cluster-name <cluster-name> --resource-group <resource-group> --repair-enabled false
214
-
> ```
215
-
> Then run `nodetool repair --full` on all the nodes in your existing cluster's data center. You should run this **only after all of the prior steps have been taken**. This should ensure that all historical data is replicated to your new data centers in Azure Managed Instance for Apache Cassandra. If you have a very large amount of data in your existing cluster, it may be necessary to run the repairs at the keyspace or even table level - see [here](https://cassandra.apache.org/doc/latest/cassandra/operating/repair.html) for more details on running repairs in Cassandra. Prior to changing the replication settings, you should also make sure that any application code that connects to your existing Cassandra cluster is using LOCAL_QUORUM. You should leave it at this setting during the migration (it can be switched back afterwards if required). After the migration is completed, you can enable automatic repair again, and point your application code to the new Cassandra Managed Instance data center's seed nodes (and revert the quorum settings if preferred).
216
-
>
217
-
> Finally, to decommission your old data center:
218
-
>
219
-
> - Run `ALTER KEYSPACE` for each keyspace, removing the old data center.
220
-
> - We recommend running `nodetool repair` for each keyspace as well, before the below step.
221
-
> - Run [nodetool decommission](https://cassandra.apache.org/doc/latest/cassandra/operating/topo_changes.html#removing-nodes) for each on premise data center node.
217
+
> If the data center(s) in your existing cluster do not enforce [client-to-node encryption (SSL)](https://cassandra.apache.org/doc/3.11/cassandra/operating/security.html#client-to-node-encryption), and you intend for your application code to connect directly to Cassandra Managed Instance, you will also need to enable SSL in your application code.
218
+
219
+
220
+
## <a id="hybrid-real-time-migration"></a>Use hybrid cluster for real-time migration
221
+
222
+
The above instructions provide guidance for configuring a hybrid cluster. A hybrid cluster can also be used to achieve a seamless zero-downtime migration. If you have an on-premises or other Cassandra environment that you want to decommission with zero downtime, in favour of running your workload in Azure Managed Instance for Apache Cassandra, complete the following steps in this order:
223
+
224
+
1. Configure hybrid cluster - follow the instructions above.
225
+
1. Temporarily disable automatic repairs in Azure Managed Instance for Apache Cassandra for the duration of the migration:
226
+
227
+
```azurecli-interactive
228
+
az managed-cassandra cluster update --cluster-name <cluster-name> --resource-group <resource-group> --repair-enabled false
229
+
```
230
+
231
+
1. Run `nodetool repair --full` on each node in your existing cluster's data center. You should run this **only after all of the prior steps have been taken**. This should ensure that all historical data is replicated to your new data centers in Azure Managed Instance for Apache Cassandra. For most installations, run only one or two repairs in parallel to avoid overloading the cluster. You can monitor a particular repair run by checking `nodetool netstats` and `nodetool compactionstats` against the specific node. If you have a very large amount of data in your existing cluster, it may be necessary to run the repairs at the keyspace or even table level. See the [Apache Cassandra repair documentation](https://cassandra.apache.org/doc/latest/cassandra/operating/repair.html) for more details on running repairs in Cassandra.
232
+
233
+
222
234
223
235
> [!NOTE]
224
236
> To speed up repairs, we advise increasing both stream throughput and compaction throughput (if system load permits), as in the example below:
1. Cut over your application code to point to the seed nodes in your new Azure Managed Instance for Apache Cassandra data center(s).
243
+
244
+
> [!IMPORTANT]
245
+
> As also mentioned in the hybrid setup instructions, if the data center(s) in your existing cluster do not enforce [client-to-node encryption (SSL)](https://cassandra.apache.org/doc/3.11/cassandra/operating/security.html#client-to-node-encryption), you will need to enable this in your application code, as Cassandra Managed Instance enforces this.
246
+
247
+
1. Run `nodetool repair` **again** on all the nodes in your existing cluster's data center, in the same manner as in step 3 above (to ensure any deltas are replicated following application cut over).
248
+
249
+
1. Run `ALTER KEYSPACE` for each keyspace, in the same manner as done earlier, but now removing your old data center(s).
250
+
251
+
1. Run [nodetool decommission](https://cassandra.apache.org/doc/latest/cassandra/tools/nodetool/decommission.html) for each old data center node.
252
+
253
+
1. Switch your application code back to QUORUM (if required/preferred).
254
+
255
+
1. Re-enable automatic repairs:
256
+
257
+
```azurecli-interactive
258
+
az managed-cassandra cluster update --cluster-name <cluster-name> --resource-group <resource-group> --repair-enabled true
articles/managed-instance-apache-cassandra/create-cluster-portal.md
29 additions & 15 deletions
Original file line number
Diff line number
Diff line change
@@ -132,20 +132,7 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
132
132
133
133
## Update Cassandra configuration
134
134
135
-
The service allows update to a limited set of Cassandra configurations on a datacenter via the portal or by [using CLI commands](manage-resources-cli.md#update-yaml). The following YAML settings are supported:
136
-
137
-
- column_index_size_in_kb
138
-
- allocate_tokens_for_keyspace
139
-
- compaction_throughput_mb_per_sec
140
-
- read_request_timeout_in_ms
141
-
- range_request_timeout_in_ms
142
-
- aggregated_request_timeout_in_ms
143
-
- write_request_timeout_in_ms
144
-
- request_timeout_in_ms
145
-
- internode_compression
146
-
- batchlog_replay_throttle_in_kb
147
-
148
-
To update settings in the portal:
135
+
The service allows updates to the Cassandra YAML configuration on a datacenter via the portal or by [using CLI commands](manage-resources-cli.md#update-yaml). To update settings in the portal:
149
136
150
137
1. Find `Cassandra Configuration` under settings. Highlight the data center whose configuration you want to change, and click update:
151
138
@@ -160,7 +147,34 @@ To update settings in the portal:
160
147
:::image type="content" source="./media/create-cluster-portal/update-config-3.png" alt-text="Screenshot of the updated Cassandra config." lightbox="./media/create-cluster-portal/update-config-3.png" border="true":::
161
148
162
149
> [!NOTE]
163
-
> Only overridden Cassandra configuration values are shown in the portal.
150
+
> Only overridden Cassandra configuration values are shown in the portal.
151
+
152
+
> [!IMPORTANT]
153
+
> Ensure the Cassandra YAML settings you provide are appropriate for the version of Cassandra you have deployed. See the [Cassandra v3.11 settings](https://github.com/apache/cassandra/blob/cassandra-3.11/conf/cassandra.yaml) and the [v4.0 settings](https://github.com/apache/cassandra/blob/cassandra-4.0/conf/cassandra.yaml). The following YAML settings are **not** allowed to be updated:
Change Cassandra configuration on a datacenter by using the [az managed-cassandra datacenter update](/cli/azure/managed-cassandra/datacenter#az-managed-cassandra-datacenter-update) command. You will need to base64 encode the YAML fragment by using an [online tool](https://www.base64encode.org/). The following YAML settings are supported:
226
-
227
-
- column_index_size_in_kb
228
-
- allocate_tokens_for_keyspace
229
-
- compaction_throughput_mb_per_sec
230
-
- read_request_timeout_in_ms
231
-
- range_request_timeout_in_ms
232
-
- aggregated_request_timeout_in_ms
233
-
- write_request_timeout_in_ms
234
-
- request_timeout_in_ms
235
-
- internode_compression
236
-
- batchlog_replay_throttle_in_kb
225
+
Change Cassandra configuration on a datacenter by using the [az managed-cassandra datacenter update](/cli/azure/managed-cassandra/datacenter#az-managed-cassandra-datacenter-update) command. You will need to base64 encode the YAML fragment by using an [online tool](https://www.base64encode.org/).
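
As an alternative to an online tool, the fragment can be encoded locally with the standard `base64` utility. The sketch below is illustrative only: the function name `encode_yaml_fragment` is hypothetical, and the setting values are placeholders rather than recommendations.

```bash
# Illustrative YAML fragment; the values are placeholders.
yaml_fragment='column_index_size_in_kb: 16
read_request_timeout_in_ms: 10000'

# Base64-encode the text in "$1" onto a single line. GNU base64 wraps
# long output at 76 columns, so the newlines are stripped afterwards.
encode_yaml_fragment() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

encoded=$(encode_yaml_fragment "$yaml_fragment")
echo "$encoded"
```

Decoding the result with `base64 --decode` should return the original fragment, which is a quick check before passing the encoded value to the update command.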
237
226
238
227
For example, the following YAML fragment:
239
228
@@ -261,6 +250,35 @@ az managed-cassandra datacenter update \
> Ensure the Cassandra YAML settings you provide are appropriate for the version of Cassandra you have deployed. See the [Cassandra v3.11 settings](https://github.com/apache/cassandra/blob/cassandra-3.11/conf/cassandra.yaml) and the [v4.0 settings](https://github.com/apache/cassandra/blob/cassandra-4.0/conf/cassandra.yaml). The following YAML settings are **not** allowed to be updated:
255
+
>
256
+
> - cluster_name
257
+
> - seed_provider
258
+
> - initial_token
259
+
> - autobootstrap
260
+
> - client_encryption_options
261
+
> - server_encryption_options
262
+
> - transparent_data_encryption_options
263
+
> - audit_logging_options
264
+
> - authenticator
265
+
> - authorizer
266
+
> - role_manager
267
+
> - storage_port
268
+
> - ssl_storage_port
269
+
> - native_transport_port
270
+
> - native_transport_port_ssl
271
+
> - listen_address
272
+
> - listen_interface
273
+
> - broadcast_address
274
+
> - hints_directory
275
+
> - data_file_directories
276
+
> - commitlog_directory
277
+
> - cdc_raw_directory
278
+
> - saved_caches_directory
279
+
280
+
281
+
264
282
### <a id="get-datacenters-cluster"></a>Get the datacenters in a cluster
265
283
266
284
Get datacenters in a cluster by using the [az managed-cassandra datacenter list](/cli/azure/managed-cassandra/datacenter#az-managed-cassandra-datacenter-list) command: