|
| 1 | +--- |
| 2 | +title: Materialized Views for Azure Cosmos DB API for Cassandra (Preview) |
| 3 | +description: This documentation is provided as a resource for participants in the preview of Azure Cosmos DB Cassandra API Materialized View. |
| 4 | +author: dileepraotv-github |
| 5 | +ms.service: cosmos-db |
| 6 | +ms.subservice: cosmosdb-cassandra |
| 7 | +ms.topic: how-to |
| 8 | +ms.date: 01/06/2022 |
| 9 | +ms.author: turao |
| 10 | +--- |
| 11 | + |
| 12 | +# Enable materialized views for Azure Cosmos DB API for Cassandra operations (Preview) |
| 13 | +[!INCLUDE[appliesto-cassandra-api](../includes/appliesto-cassandra-api.md)] |
| 14 | + |
| 15 | +> [!IMPORTANT] |
| 16 | +> Materialized Views for Azure Cosmos DB API for Cassandra is currently in gated preview. Please send an email to [email protected] to try this feature. |
| 17 | +> The Materialized View preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. |
| 18 | +> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). |
| 19 | +
|
| 20 | +## Feature overview |
| 21 | + |
| 22 | +Materialized views provide an efficient way to query a base table (a container in Cosmos DB) with filters on non-primary-key columns. When users write to the base table, the materialized view is built automatically in the background. The view can have a different primary key for lookups, contains only the columns projected from the base table, and is read-only. |
| 23 | + |
| 24 | +You can query a column store without specifying a partition key by using secondary indexes. However, such queries aren't efficient for columns with high cardinality (all data is scanned for a small result set) or for columns with low cardinality. They are expensive because they run as cross-partition queries. |
| 25 | + |
| 26 | +With materialized views, you can: |
| 27 | +- Use them as global secondary indexes and avoid expensive cross-partition scans. |
| 28 | +- Provide a SQL-based conditional predicate so that only certain columns, and only data that meets the precondition, are populated. |
| 29 | +- Create real-time MVs that simplify real-time, event-based scenarios in which customers today use a change feed trigger for precondition checks to populate new collections. |
| 30 | + |
| 31 | +## Main benefits |
| 32 | + |
| 33 | +- With materialized views (server-side denormalization), you can avoid multiple independent tables and client-side denormalization. |
| 34 | +- The materialized view feature takes on the responsibility of updating views to keep them consistent with the base table, so you can avoid dual writes to the base table and the view. |
| 35 | +- Materialized views help optimize read performance. |
| 36 | +- You can specify throughput for the materialized view independently. |
| 37 | +- You can configure the MV builder layer to match your requirements for hydrating the view. |
| 38 | +- Write operations are faster because data only needs to be written to the base table. |
| 39 | +- Additionally, this implementation on Cosmos DB is based on a pull model, which doesn't affect writer performance. |
| 40 | + |
| 41 | + |
| 42 | + |
| 43 | +## How to get started? |
| 44 | + |
| 45 | +New Cassandra API accounts with materialized views enabled can be provisioned on your subscription by using REST API calls from the Azure CLI. |
| 46 | + |
| 47 | +### Log in to the Azure command line interface |
| 48 | + |
| 49 | +Install the Azure CLI as described in [How to install the Azure CLI | Microsoft Docs](https://docs.microsoft.com/cli/azure/install-azure-cli) and sign in with the following command: |
| 50 | + ```azurecli-interactive |
| 51 | + az login |
| 52 | + ``` |
| 53 | + |
| 54 | +### Create an account |
| 55 | + |
| 56 | +To create an account with support for customer managed keys and materialized views, skip to the section **Create an account with support for customer managed keys and materialized views**. |
| 57 | + |
| 58 | +To create an account, first create body.txt with the content shown below, and then run the following command. Replace {{subscriptionId}} with your subscription ID, {{resourceGroup}} with the name of a resource group that you created in advance, and {{accountName}} with a name for your Cassandra API account. |
| 59 | + |
| 60 | + ```azurecli-interactive |
| 61 | + az rest --method PUT --uri "https://management.azure.com/subscriptions/{{subscriptionId}}/resourcegroups/{{resourceGroup}}/providers/Microsoft.DocumentDb/databaseAccounts/{{accountName}}?api-version=2021-11-15-preview" --body @body.txt |
| 62 | + # body.txt content: |
| 63 | + { |
| 64 | + "location": "East US", |
| 65 | + "properties": |
| 66 | + { |
| 67 | + "databaseAccountOfferType": "Standard", |
| 68 | + "locations": [ { "locationName": "East US" } ], |
| 69 | + "capabilities": [ { "name": "EnableCassandra" }, { "name": "CassandraEnableMaterializedViews" }], |
| 70 | + "enableMaterializedViews": true |
| 71 | + } |
| 72 | + } |
| 73 | + ``` |
| 74 | + |
| 75 | + Wait a few minutes, and then check for completion with the following command. The provisioningState in the output should be Succeeded: |
| 76 | + ``` |
| 77 | + az rest --method GET --uri "https://management.azure.com/subscriptions/{{subscriptionId}}/resourcegroups/{{resourceGroup}}/providers/Microsoft.DocumentDb/databaseAccounts/{{accountName}}?api-version=2021-11-15-preview" |
| 78 | + ``` |
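
Rather than re-running the GET request by hand, the check can be scripted. The following is a minimal sketch and is not part of the preview documentation: the function name `wait_for_provisioning`, the polling interval, and the attempt limit are hypothetical, and it assumes the state is exposed at `properties.provisioningState` in the response. `--query` and `--output` are the standard global Azure CLI arguments.

```shell
# Hypothetical helper: poll the account until provisioning reports Succeeded.
# Usage: wait_for_provisioning <subscriptionId> <resourceGroup> <accountName> [interval_seconds] [max_attempts]
# Note: variables are not declared local, so they are visible to the calling shell.
wait_for_provisioning() {
  wfp_uri="https://management.azure.com/subscriptions/$1/resourcegroups/$2/providers/Microsoft.DocumentDb/databaseAccounts/$3?api-version=2021-11-15-preview"
  wfp_interval="${4:-30}"
  wfp_max="${5:-20}"
  wfp_attempt=1
  wfp_state=""
  while [ "$wfp_attempt" -le "$wfp_max" ]; do
    # Assumption: the state lives at properties.provisioningState in the response.
    wfp_state="$(az rest --method GET --uri "$wfp_uri" --query "properties.provisioningState" --output tsv)"
    if [ "$wfp_state" = "Succeeded" ]; then
      echo "$wfp_state"
      return 0
    fi
    wfp_attempt=$((wfp_attempt + 1))
    sleep "$wfp_interval"
  done
  echo "Timed out; last state: ${wfp_state:-unknown}" >&2
  return 1
}
```

For example, `wait_for_provisioning "<subscriptionId>" "<resourceGroup>" "<accountName>"` checks every 30 seconds, up to 20 times, and returns a non-zero exit code if the account never reaches Succeeded.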
| 79 | +### Create an account with support for customer managed keys and materialized views |
| 80 | + |
| 81 | +This step is optional – you can skip this step if you don't want to use Customer Managed Keys for your Cosmos DB account. |
| 82 | + |
| 83 | +To use the Customer Managed Keys feature and materialized views together on a Cosmos DB account, you must first configure managed identities with Azure Active Directory for your account and then enable support for materialized views. |
| 84 | + |
| 85 | +You can use the documentation [here](https://docs.microsoft.com/azure/cosmos-db/how-to-setup-cmk) to configure your Cosmos DB Cassandra account with customer managed keys and set up managed identity access to the key vault. Make sure you follow all the steps in [Using a managed identity in Azure key vault access policy](https://docs.microsoft.com/azure/cosmos-db/how-to-setup-managed-identity). The next step is to enable materialized views on the account. |
| 86 | + |
| 87 | +Once your account is set up with CMK and managed identity, you can enable materialized views on the account by setting the “enableMaterializedViews” property to true in the request body. |
| 88 | + |
| 89 | + ```azurecli-interactive |
| 90 | + az rest --method PATCH --uri "https://management.azure.com/subscriptions/{{subscriptionId}}/resourcegroups/{{resourceGroup}}/providers/Microsoft.DocumentDb/databaseAccounts/{{accountName}}?api-version=2021-07-01-preview" --body @body.txt |
| 91 | + |
| 92 | + |
| 93 | +# body.txt content: |
| 94 | +{ |
| 95 | + "properties": |
| 96 | + { |
| 97 | + "enableMaterializedViews": true |
| 98 | + } |
| 99 | +} |
| 100 | + ``` |
| 101 | + |
| 102 | + |
| 103 | + Wait a few minutes, and then check for completion with the following command. The provisioningState in the output should be Succeeded: |
| 104 | + ``` |
| 105 | +az rest --method GET --uri "https://management.azure.com/subscriptions/{{subscriptionId}}/resourcegroups/{{resourceGroup}}/providers/Microsoft.DocumentDb/databaseAccounts/{{accountName}}?api-version=2021-07-01-preview" |
| 106 | +``` |
| 107 | + |
| 108 | +Perform another PATCH to set the “CassandraEnableMaterializedViews” capability, and wait for it to succeed: |
| 109 | + |
| 110 | +``` |
| 111 | +az rest --method PATCH --uri "https://management.azure.com/subscriptions/{{subscriptionId}}/resourcegroups/{{resourceGroup}}/providers/Microsoft.DocumentDb/databaseAccounts/{{accountName}}?api-version=2021-07-01-preview" --body @body.txt |
| 112 | + |
| 113 | +# body.txt content: |
| 114 | +{ |
| 115 | + "properties": |
| 116 | + { |
| 117 | + "capabilities": |
| 118 | +[{"name":"EnableCassandra"}, |
| 119 | + {"name":"CassandraEnableMaterializedViews"}] |
| 120 | + } |
| 121 | +} |
| 122 | +``` |
| 123 | + |
| 124 | +### Create materialized view builder |
| 125 | + |
| 126 | +Following this step, you'll also need to provision a Materialized View Builder: |
| 127 | + |
| 128 | +``` |
| 129 | +az rest --method PUT --uri "https://management.azure.com/subscriptions/{{subscriptionId}}/resourcegroups/{{resourceGroup}}/providers/Microsoft.DocumentDb/databaseAccounts/{{accountName}}/services/materializedViewsBuilder?api-version=2021-07-01-preview" --body @body.txt |
| 130 | + |
| 131 | +# body.txt content: |
| 132 | +{ |
| 133 | + "properties": |
| 134 | + { |
| 135 | + "serviceType": "materializedViewsBuilder", |
| 136 | + "instanceCount": 1, |
| 137 | + "instanceSize": "Cosmos.D4s" |
| 138 | + } |
| 139 | +} |
| 140 | +``` |
| 141 | + |
| 142 | +Wait a couple of minutes, and then check the status with the following command. The status in the output should be Running: |
| 143 | + |
| 144 | +``` |
| 145 | +az rest --method GET --uri "https://management.azure.com/subscriptions/{{subscriptionId}}/resourcegroups/{{resourceGroup}}/providers/Microsoft.DocumentDb/databaseAccounts/{{accountName}}/services/materializedViewsBuilder?api-version=2021-07-01-preview" |
| 146 | +``` |
| 147 | + |
| 148 | +## Caveats and current limitations |
| 149 | + |
| 150 | +Once your account and Materialized View Builder are set up, you should be able to create materialized views per the documentation [here](https://cassandra.apache.org/doc/latest/cql/mvs.html). |
| 151 | + |
| 152 | +However, there are a few caveats with the Cosmos DB Cassandra API’s preview implementation of materialized views: |
| 153 | +- Materialized views can't be created on a table that existed before the account was onboarded to support materialized views. Create a new table after the account is onboarded, and define materialized views on that table. |
| 154 | +- For the MV definition’s WHERE clause, only “IS NOT NULL” filters are currently allowed. |
| 155 | +- After a materialized view is created against a base table, ALTER TABLE ADD operations on the base table’s schema are allowed only if none of the MVs have select * in their definition. |
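
To make these caveats concrete, here is a minimal sketch that writes a CQL script containing a base table and one view that stays within them. The keyspace, table, view, and column names are hypothetical and are not taken from the preview documentation. The sketch assumes the keyspace `uprofile` already exists and that the table is created after the account is enabled for materialized views.

```shell
# Write an illustrative CQL script; all names below are hypothetical.
cat > create_mv.cql <<'EOF'
-- Base table, created after the account is enabled for materialized views.
CREATE TABLE IF NOT EXISTS uprofile.user (
    user_id int PRIMARY KEY,
    user_name text,
    user_bcity text
);

-- View keyed by city. The WHERE clause uses only null checks, the columns are
-- listed explicitly instead of using a wildcard, and the base table's primary
-- key (user_id) is part of the view's primary key.
CREATE MATERIALIZED VIEW uprofile.user_by_bcity AS
    SELECT user_id, user_name, user_bcity
    FROM uprofile.user
    WHERE user_id IS NOT NULL AND user_bcity IS NOT NULL
    PRIMARY KEY (user_bcity, user_id);
EOF
```

The resulting file can be run with `cqlsh -f create_mv.cql`, together with the connection options for your account. Listing the columns explicitly also keeps ALTER TABLE ADD available on the base table later.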
| 156 | + |
| 157 | +In addition to the above, note the following limitations. |
| 158 | + |
| 159 | +### Availability zones limitations |
| 160 | + |
| 161 | +- Materialized views can't be enabled on an account that has regions with availability zones enabled. |
| 162 | +- Adding a new region with availability zones isn't supported once “enableMaterializedViews” is set to true on the account. |
| 163 | + |
| 164 | +### Periodic backup and restore limitations |
| 165 | + |
| 166 | +Materialized views aren't automatically restored by the restore process. After the restore is complete, set enableMaterializedViews on the restored account, provision the builders so that the views can be built, and then re-create the materialized views. |
| 167 | + |
| 168 | +Other limitations, similar to **Open Source Apache Cassandra** behavior: |
| 169 | + |
| 170 | +- Defining Conflict resolution policy on Materialized Views is not allowed. |
| 171 | +- Write operations from customer aren't allowed on Materialized views. |
| 172 | +- Cross document queries and use of aggregate functions aren't supported on Materialized views. |
| 173 | +- Modifying MaterializedViewDefinitionString after MV creation is not supported. |
| 174 | +- Deleting a base table is not allowed if at least one MV is defined on it. All the MVs must be deleted first, and then the base table can be deleted. |
| 175 | +- Defining materialized views on containers with static columns is not allowed. |
| 176 | + |
| 177 | +## Under the hood |
| 178 | + |
| 179 | +Azure Cosmos DB Cassandra API uses an MV builder compute layer to maintain materialized views. You can configure the MV builder compute instances according to your latency and lag requirements for hydrating the views. The compute containers are shared among all MVs within the database account. Each provisioned compute container spawns multiple tasks that, for every MV in the database account, read the change feed from the base table partitions, transform the data per the MV definition, and write it to the MV (which is itself another table). |
| 180 | + |
| 181 | +## Frequently asked questions (FAQs) … |
| 182 | + |
| 183 | + |
| 184 | +### What transformations/actions are supported? |
| 185 | + |
| 186 | +- Specifying a partition key that is different from the base table partition key. |
| 187 | +- Projecting a selected subset of columns from the base table. |
| 188 | +- Determining whether a row from the base table can be part of the materialized view, based on conditions evaluated on the primary key columns of the base table row. Supported filters: equalities, inequalities, contains. (Planned for GA) |
| 189 | + |
| 190 | +### What consistency levels will be supported? |
| 191 | + |
| 192 | +Data in a materialized view is eventually consistent. Users might read stale rows compared to the data in the base table, because some operations on MVs are redone. This behavior is acceptable since only eventual consistency is guaranteed on the MV. Customers can scale the MV builder layer up or down depending on how quickly the view needs to become consistent with the base table. |
| 193 | + |
| 194 | +### Will there be an autoscale layer for the MV builder instances? |
| 195 | + |
| 196 | +Autoscaling for the MV builder is not available right now. The MV builder instances can be scaled manually by modifying the instance count (scale out) or the instance size (scale up). |
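
As an illustration of manual scaling, the sketch below generates a builder request body with a different instance count. The helper name `make_builder_body` is hypothetical, and the sketch assumes that scaling is applied by re-issuing the same PUT request used to provision the builder with the updated values. The preview documentation doesn't spell out that procedure.

```shell
# Hypothetical helper: print a request body for the materializedViewsBuilder service.
# Usage: make_builder_body [instanceCount] [instanceSize]
make_builder_body() {
  mbb_count="${1:-1}"
  mbb_size="${2:-Cosmos.D4s}"
  cat <<EOF
{
  "properties":
  {
    "serviceType": "materializedViewsBuilder",
    "instanceCount": ${mbb_count},
    "instanceSize": "${mbb_size}"
  }
}
EOF
}

# Example: scale out to three instances of the same size.
make_builder_body 3 "Cosmos.D4s" > body.txt
```

The generated body.txt can then be passed with `--body @body.txt` to the `az rest --method PUT` command shown in the builder provisioning step.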
| 197 | + |
| 198 | +### Details on the billing model |
| 199 | + |
| 200 | +The proposed billing model charges customers for the following: |
| 201 | + |
| 202 | +**MV builder compute nodes**: The MV builder compute, a single-tenant layer. |
| 203 | + |
| 204 | +**Storage**: The OLTP storage of the base table and the MV, based on the existing storage meter for containers. LogStore won't be charged. |
| 205 | + |
| 206 | +**Request Units**: The provisioned RUs for the base container and the materialized view. |
| 207 | + |
| 208 | +### What are the different SKUs that will be available? |
| 209 | +Refer to the pricing page, [Azure Cosmos DB | Microsoft Azure](https://azure.microsoft.com/pricing/details/cosmos-db/), and check the instances listed under Dedicated Gateway. |
| 210 | + |
| 211 | +### What type of TTL support do we have? |
| 212 | + |
| 213 | +Setting a table-level TTL on an MV is not allowed. The TTL from base table rows is applied to the MV as well. |
| 214 | + |
| 215 | + |
| 216 | +### Initial troubleshooting if MVs aren't up to date |
| 217 | +- Check whether MV builder instances are provisioned. |
| 218 | +- Check whether enough RUs are provisioned on the base table. |
| 219 | +- Check for unavailability of the base table or the MV. |
| 220 | + |
| 221 | +### What type of monitoring is available in addition to the existing monitoring for Cassandra API? |
| 222 | + |
| 223 | +- Max Materialized View Catchup Gap in Minutes: a value of t indicates that rows written to the base table in the last t minutes are yet to be propagated to the MV. |
| 224 | +- Metrics related to RUs consumed on base table for MV build (read change feed cost) |
| 225 | +- Metrics related to RUs consumed on MV for MV build (write cost) |
| 226 | +- Metrics related to resource consumption on MV builders (CPU, memory usage metrics) |
| 227 | + |
| 228 | + |
| 229 | +### What are the restore options available for MVs? |
| 230 | +MVs can't be restored. Hence, MVs will need to be recreated once the base table is restored. |
| 231 | + |
| 232 | +### Can you create more than one view on a base table? |
| 233 | + |
| 234 | +Multiple views can be created on the same base table, up to an enforced limit of five views. |
| 235 | + |
| 236 | +### How is uniqueness enforced on the materialized view? What does the mapping between records in the base table and records in the materialized view look like? |
| 237 | + |
| 238 | +The partition and clustering keys of the base table are always part of the primary key of any materialized view defined on it, and they enforce uniqueness of the primary key after data repartitioning. |
| 239 | + |
| 240 | +### Can we add or remove columns on the base table once materialized view is defined? |
| 241 | + |
| 242 | +You can add a column to the base table, but only if none of the MVs defined on it have select * in their definition. You can't remove a column: Cassandra doesn't support dropping columns on a base table that has a materialized view defined on it. |
| 243 | + |
| 244 | +### Can we create MV on existing base table? |
| 245 | + |
| 246 | +No. Materialized views can't be created on a table that existed before the account was onboarded to support materialized views. Create a new table after the account is onboarded, and define materialized views on that table. Support for MVs on existing tables is planned for the future. |
| 247 | + |
| 248 | +### What are the conditions on which records won't make it to MV and how to identify such records? |
| 249 | + |
| 250 | +Below are some of the identified cases where data from the base table can't be written to the MV because it violates constraints on the MV table: |
| 251 | +- Rows that don't satisfy the partition key size limit in the materialized view |
| 252 | +- Rows that don't satisfy the clustering key size limit in the materialized view |
| 253 | + |
| 254 | +Currently these rows are dropped, but there are plans to expose details about dropped rows in the future so that users can reconcile the missing data. |