articles/cosmos-db/cassandra-faq.md
---
title: Frequently asked questions about the Cassandra API for Azure Cosmos DB
description: Get answers to frequently asked questions about the Cassandra API for Azure Cosmos DB.
author: TheovanKraay
ms.service: cosmos-db
ms.topic: conceptual
ms.date: 04/09/2020
ms.author: thvankra
---

# Frequently asked questions about the Cassandra API for Azure Cosmos DB
## What are some key differences between Apache Cassandra and the Cassandra API?

- Apache Cassandra recommends a 100-MB limit on the size of a partition key. The Cassandra API for Azure Cosmos DB allows up to 10 GB per partition.
- Apache Cassandra allows you to disable durable commits. You can skip writing to the commit log and go directly to the memtables. This can lead to data loss if the node goes down before memtables are flushed to SSTables on disk. Azure Cosmos DB always does durable commits to help prevent data loss.
- Apache Cassandra can see diminished performance if the workload involves many replacements or deletions. The reason is tombstones that the read workload needs to skip over to fetch the latest data. The Cassandra API won't see diminished read performance when the workload has many replacements or deletions.
- During scenarios of high replacement workloads, compaction needs to run to merge SSTables on disk. (A merge is needed because Apache Cassandra's writes are append only. Multiple updates are stored as individual SSTable entries that need to be periodically merged.) This situation can also lead to lowered read performance during compaction. This performance impact doesn't happen in the Cassandra API because the API doesn't implement compaction.
The Cassandra API for Azure Cosmos DB supports CQL version 3.x.

## Why is choosing throughput for a table a requirement?

Azure Cosmos DB sets the default throughput for your container based on where you create the table from: the Azure portal or CQL.

Azure Cosmos DB provides guarantees for performance and latency, with upper bounds on operations. These guarantees are possible when the engine can enforce governance on the tenant's operations. Setting throughput ensures that you get the guaranteed throughput and latency, because the platform reserves this capacity and guarantees operation success.

You can [elastically change throughput](manage-scale-cassandra.md) to benefit from the seasonality of your application and save costs.

The throughput concept is explained in the [Request Units in Azure Cosmos DB](request-units.md) article. The throughput for a table is equally distributed across the underlying physical partitions.
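Because a table's provisioned throughput is split evenly across its physical partitions, the per-partition budget is simple arithmetic. A minimal illustrative sketch (the function name and partition counts are hypothetical, not part of any SDK):

```python
def per_partition_throughput(table_rus: float, physical_partitions: int) -> float:
    """A table's provisioned RU/s are distributed equally across its
    underlying physical partitions."""
    if physical_partitions < 1:
        raise ValueError("a table has at least one physical partition")
    return table_rus / physical_partitions

# A table provisioned at 10,000 RU/s spread over 5 physical partitions
# gives each partition a 2,000 RU/s budget.
```

This is why a workload skewed toward one partition key can be throttled even though the table as a whole appears to have spare throughput.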
## What is the throughput of a table that's created through CQL?

Azure Cosmos DB uses Request Units per second (RU/s) as a currency for providing throughput. Tables created through CQL have 400 RU. You can change the RU from the Azure portal.

Azure Cosmos DB provides guarantees for performance and latency, with upper bounds on operations. These guarantees are possible when the engine can enforce governance on the tenant's operations. Setting throughput ensures that you get the guaranteed throughput and latency, because the platform reserves this capacity and guarantees operation success.

When you go over this capacity, you get the following error message that indicates your capacity was used up:

**0x1001 Overloaded: the request can't be processed because "Request Rate is large"**

It's essential to see what operations (and their volume) cause this issue. You can get an idea about consumed capacity going over the provisioned capacity with metrics in the Azure portal. Then you need to ensure that capacity is consumed nearly equally across all underlying partitions. If you see that one partition is consuming most of the throughput, you have a workload skew.

Metrics are available that show you how throughput is used over hours, over days, and per seven days, across partitions or in aggregate. For more information, see [Monitoring and debugging with metrics in Azure Cosmos DB](use-metrics.md).

Diagnostic logs are explained in the [Azure Cosmos DB diagnostic logging](logging.md) article.
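A common way to cope with the **0x1001 Overloaded** error while you investigate (or raise) provisioned throughput is to retry the throttled operation with exponential backoff. A hedged sketch; `OverloadedError` is a stand-in for whatever exception your Cassandra driver surfaces for this error, and the helper name is hypothetical:

```python
import random
import time

class OverloadedError(Exception):
    """Stand-in for the driver-level error raised on 0x1001 Overloaded
    ("Request Rate is large")."""

def execute_with_backoff(operation, max_retries=5, base_delay=0.05, sleep=time.sleep):
    """Run operation(), retrying with exponential backoff plus jitter when
    the request is throttled. Re-raises after max_retries retries."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except OverloadedError:
            if attempt == max_retries:
                raise
            # Back off before retrying so the partition's RU/s budget can refill.
            sleep(base_delay * (2 ** attempt) * (0.5 + random.random() / 2))
```

Backoff only masks the symptom; if one partition consistently throttles, the fix is to rebalance the partition key or increase throughput.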
## Does the primary key map to the partition key concept of Azure Cosmos DB?

Yes, the partition key is used to place the entity in the right location. In Azure Cosmos DB, it's used to find the right logical partition that's stored on a physical partition. The partitioning concept is well explained in the [Partition and scale in Azure Cosmos DB](partition-data.md) article. The essential takeaway here is that a logical partition shouldn't go over the 10-GB limit.

## What happens when I get a notification that a partition is full?

You should adhere to the 10-GB limit on the number of entities or items per logical partition.
## Can I use the API as a key value store with millions or billions of partition keys?

Azure Cosmos DB can store unlimited data by scaling out the storage. This storage is independent of the throughput. Yes, you can always use the Cassandra API just to store and retrieve keys and values by specifying the right primary/partition key. These individual keys get their own logical partition and sit atop a physical partition without issues.
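For a pure key-value pattern, the whole primary key is the partition key, so every distinct key becomes its own logical partition. A sketch with hypothetical keyspace, table, and column names (with cassandra-driver you'd execute the DDL once, then prepare and bind the statements):

```python
# Hypothetical names; "key text PRIMARY KEY" makes the key both the primary
# and the partition key, so each key maps to its own logical partition.
CREATE_KV = ("CREATE TABLE IF NOT EXISTS store.kv ("
             "key text PRIMARY KEY, "
             "value blob)")

def put_statement(key: str, value: bytes):
    """CQL and bind parameters for storing one key/value pair."""
    return "INSERT INTO store.kv (key, value) VALUES (?, ?)", (key, value)

def get_statement(key: str):
    """CQL and bind parameters for a point read by partition key."""
    return "SELECT value FROM store.kv WHERE key = ?", (key,)
```

Point reads and writes that specify the full partition key like this stay on a single partition, which is the cheapest access pattern in RU terms.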
## Can I create more than one table with the API?

Azure Cosmos DB is a resource-governed system for both data and control plane activities.
## What is the maximum number of tables that I can create?

There's no physical limit on the number of tables. If you have a large number of tables (where the total steady size goes over 10 TB of data) that need to be created, not the usual tens or hundreds, send email to [[email protected]](mailto:[email protected]).

## What is the maximum number of keyspaces that I can create?

There's no physical limit on the number of keyspaces because they're metadata containers. If you have a large number of keyspaces, send email to [[email protected]](mailto:[email protected]).
## Can I bring in a lot of data after starting from a normal table?

Yes. Assuming uniformly distributed partitions, the storage capacity is automatically managed and increases as you push in more data. So you can confidently import as much data as you need without managing and provisioning nodes and more. But if you're anticipating a lot of immediate data growth, it makes more sense to directly [provision for the anticipated throughput](set-throughput.md) rather than starting lower and increasing it immediately.
## Can I use YAML file settings to configure API behavior?

The Cassandra API for Azure Cosmos DB provides protocol-level compatibility for executing operations. It hides away the complexity of management, monitoring, and configuration. As a developer/user, you don't need to worry about availability, tombstones, key cache, row cache, bloom filters, and a multitude of other settings. The Cassandra API focuses on providing the read and write performance that you need without the overhead of configuration and management.

## Will the API support node addition, cluster status, and node status commands?

Azure Cosmos DB provides performance guarantees for reads, writes, and throughput.
Yes, TTL is supported.
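Because TTL is supported, you can expire rows at write time with the standard CQL `USING TTL` clause. A sketch with hypothetical keyspace, table, and column names:

```python
def insert_with_ttl_cql(keyspace: str, table: str, ttl_seconds: int) -> str:
    """Build a CQL INSERT whose row expires ttl_seconds after the write.
    Table and column names here are illustrative placeholders."""
    return (f"INSERT INTO {keyspace}.{table} (id, value) VALUES (?, ?) "
            f"USING TTL {ttl_seconds}")

# e.g., expire after one day; with cassandra-driver you'd prepare and bind:
# session.execute(session.prepare(insert_with_ttl_cql("ks", "events", 86400)),
#                 (row_id, payload))
```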
## How can I monitor infrastructure along with throughput?

Azure Cosmos DB is a platform service that helps you increase productivity without worrying about managing and monitoring infrastructure. For example, you don't need to monitor node status, replica status, gc, and OS parameters with various tools. You just need to take care of the throughput that's available in portal metrics, to see if you're getting throttled, and then increase or decrease that throughput. You can:

- Monitor [SLAs](monitor-accounts.md)
- Use [metrics](use-metrics.md)
- Use [diagnostic logs](logging.md)
## Which client SDKs can work with the API?

No, sstableloader isn't supported.
## Can I pair an on-premises Apache Cassandra cluster with the API?

At present, Azure Cosmos DB has an optimized experience for a cloud environment without the overhead of operations. If you require pairing, send mail to [[email protected]](mailto:[email protected]) with a description of your scenario. We're working on an offering to help pair the on-premises or cloud Cassandra cluster with the Cassandra API for Azure Cosmos DB.
## Does the API provide full backups?

Azure Cosmos DB provides two free full backups taken at four-hour intervals across all APIs. So you don't need to set up a backup schedule.

If you want to modify retention and frequency, send email to [[email protected]](mailto:[email protected]) or raise a support case. Information about backup capability is provided in the [Automatic online backup and restore with Azure Cosmos DB](../synapse-analytics/sql-data-warehouse/backup-and-restore.md) article.

No. The Cassandra API supports [secondary indexes](cassandra-secondary-index.md)
### Can I use the new Cassandra API SDK locally with the emulator?

Yes, this is supported. You can find details on how to enable this in the [Use the Azure Cosmos Emulator for local development and testing](local-emulator.md#cassandra-api) article.
## How can I migrate data from Apache Cassandra clusters to Azure Cosmos DB?