articles/cosmos-db/mongodb/vcore/scalability-overview.md (6 additions and 4 deletions)
# Storage heavy workloads and large disks
## No minimum storage requirements per cluster tier
As mentioned earlier, storage and compute are decoupled for billing and provisioning. While they function as a cohesive unit, they can be scaled independently, and there's no minimum storage requirement for any compute cluster tier. The M30 cluster tier can have 32TB disks provisioned. Conversely, the M200 cluster tier can have 32GB disks provisioned, so storage and compute costs can be optimized individually.
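
As a rough illustration of this decoupling, the sketch below sizes compute and storage for a workload independently. The tier names come from this article, but the vCore counts, the disk-size catalog, and the sizing rules are placeholder assumptions rather than official service limits.

```python
# Illustrative sketch of independent compute and storage sizing.
# Tier names (M30, M200) appear in this article; the vCore counts and
# disk-size catalog below are placeholder assumptions, not official SKU specs.
CLUSTER_TIERS = {"M30": 2, "M40": 4, "M50": 8, "M60": 16, "M200": 64}  # tier -> vCores (hypothetical)
DISK_SIZES_GB = [32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768]

def pick_tier(required_vcores: int) -> str:
    """Return the smallest tier that covers the compute requirement."""
    for tier, vcores in sorted(CLUSTER_TIERS.items(), key=lambda kv: kv[1]):
        if vcores >= required_vcores:
            return tier
    return max(CLUSTER_TIERS, key=CLUSTER_TIERS.get)

def pick_disk_gb(required_gb: int) -> int:
    """Return the smallest disk size that covers the storage requirement."""
    for size_gb in DISK_SIZES_GB:
        if size_gb >= required_gb:
            return size_gb
    return DISK_SIZES_GB[-1]

# The two choices are independent: a small tier can carry a large disk,
# and a large tier can carry a small disk.
print(pick_tier(2), pick_disk_gb(30_000))  # M30 32768  (M30 tier with a 32TB disk)
print(pick_tier(64), pick_disk_gb(20))     # M200 32    (M200 tier with a 32GB disk)
```

Because each dimension is chosen on its own, scaling one up never forces the other to scale with it.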
## Lower TCO with large disks (32TB and beyond)
Typically, NoSQL databases with a vCore-based model cap storage at 4TB per physical shard. The vCore-based service for Azure Cosmos DB for MongoDB provides up to 8x that capacity with 32TB disks, with plans to expand to 64TB and 128TB disks per shard in the near future. For storage heavy workloads, a 4TB storage capacity per physical shard requires a massive fleet of compute resources just to sustain the storage requirements of the workload. Compute is more expensive than storage, and overprovisioning compute because of capacity limits in a service can inflate costs rapidly.

Let's consider the lower TCO of the vCore-based Azure Cosmos DB for MongoDB for a storage heavy workload with 200TB of data.

| Storage size per shard | Min shards needed to sustain 200TB |
|------------------------|------------------------------------|
| 4TB                    | 50                                 |
| 32TB                   | 7                                  |
| 64TB                   | 4                                  |

Compute requirements drop sharply with larger disks. While 7 or 4 physical shards may not be enough to sustain the throughput requirements of the workload and more shards may be needed, even doubling or tripling the shard count along with the larger disks is still significantly more cost effective than a 50-shard cluster with smaller disks.
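
The shard counts in the table follow from dividing the total data size by the per-shard disk capacity and rounding up. The short sketch below reproduces that arithmetic; it's a back-of-the-envelope illustration only and ignores throughput, replication, and free-space headroom.

```python
import math

TOTAL_DATA_TB = 200
DISK_SIZES_TB = [4, 32, 64]  # per-shard disk capacities discussed in this article

for disk_tb in DISK_SIZES_TB:
    # Minimum number of physical shards needed just to hold the data.
    min_shards = math.ceil(TOTAL_DATA_TB / disk_tb)
    print(f"{disk_tb}TB per shard -> at least {min_shards} shards for {TOTAL_DATA_TB}TB")

# Output:
#   4TB per shard -> at least 50 shards for 200TB
#   32TB per shard -> at least 7 shards for 200TB
#   64TB per shard -> at least 4 shards for 200TB
```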
## Skip storage tiering with large disks
A common response to minimize compute costs in storage heavy scenarios is to "tier" the data: the transactional database is limited to the most frequently accessed "hot" data, while the larger volume of "cold" data is offloaded to a separate cold store. This adds operational complexity in the application layer, and latencies become unpredictable depending on which tier is being accessed. Furthermore, the availability of the entire system now depends on the resiliency of both the hot and cold data stores combined. With large disks in the vCore service, there's no need for tiered storage, since the cost of storage heavy workloads is significantly reduced.
## Next steps
- [Learn how to scale Azure Cosmos DB for MongoDB vCore cluster](./how-to-scale-cluster.md)
- [Check out indexing best practices](./how-to-create-indexes.md)