
Commit a141288

Update partitioning.md
1 parent fb05be3 commit a141288

1 file changed: +10 −10 lines changed


articles/cosmos-db/mongodb/vcore/partitioning.md

Lines changed: 10 additions & 10 deletions
@@ -26,7 +26,7 @@ For example, let's consider a collection called Employees with the document stru
 
 This table shows a mapping of shard key values to logical partitions.
 
-| Document Id | Shard Key Value | Logical Shard |
+| Document ID | Shard Key Value | Logical Shard |
 |-------------|-----------------|-------------- |
 | "12345" | "Steve Smith" | Shard 1 |
 | "23456" | "Jane Doe" | Shard 2 |
@@ -38,7 +38,7 @@ This table shows a mapping of shard key values to logical partitions.
 
 - There are also no limits to the size of a single logical shard.
 
-- In addition, the service does not limit transactions to the scope of a logical shard. The vCore based service for Azure Cosmos DB for MongoDB supports read and write transactions that are applicable across multiple logical shards and across multiple physical shards in the cluster.
+- In addition, the service doesn't limit transactions to the scope of a logical shard. The vCore based service for Azure Cosmos DB for MongoDB supports read and write transactions that are applicable across multiple logical shards and across multiple physical shards in the cluster.
 
 ## Physical shards
 Physical shards are the underlying machines and disks responsible for persisting the data and fulfilling database transactions. Unlike logical shards, the service manages physical shards under the covers.
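
The cross-shard transaction support described in the bullet above can be exercised directly from mongosh. The sketch below is illustrative only, reusing the `cosmicworks.employee` collection from the commands later in this diff; the `title` field and the specific `_id` values are assumptions, not part of the article.

```javascript
// Minimal sketch: one transaction updating documents whose shard key values
// ("Steve Smith", "Jane Doe") live on different logical and physical shards.
const session = db.getMongo().startSession();
session.startTransaction();
try {
    const employees = session.getDatabase("cosmicworks").employee;
    employees.updateOne({ "_id": "12345" }, { $set: { "title": "Manager" } });
    employees.updateOne({ "_id": "23456" }, { $set: { "title": "Engineer" } });
    session.commitTransaction(); // commits atomically across both shards
} catch (error) {
    session.abortTransaction();
    throw error;
} finally {
    session.endSession();
}
```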
@@ -48,11 +48,11 @@ The number of physical shards are defined when a cluster is created. Single shar
 ### Mapping logical shards to physical shards
 When new logical shards are added, the cluster seamlessly updates the mapping of logical to physical shards. Similarly, the assignment of the address space to each physical shard is changed as new physical shards are added to the cluster after which, logical shards are rebalanced across the cluster.
 
-The hash range used to map logical and physical shards is evenly distributed across the physical shards in the cluster. Each physical shard owns an evenly sized bucket of the hash range. For every document that is written, he value of the shard key property is hashed and the hash value determines the mapping of the document to the underlying physical shard. Internally, several logical shards map to a single physical shard. Moreover, logical shards are never split across physical shards and all the documents for a logical shard only map to one physical shard.
+The hash range used to map logical and physical shards is evenly distributed across the physical shards in the cluster. Each physical shard owns an evenly sized bucket of the hash range. For every document that is written, the value of the shard key property is hashed and the hash value determines the mapping of the document to the underlying physical shard. Internally, several logical shards map to a single physical shard. Moreover, logical shards are never split across physical shards and all the documents for a logical shard only map to one physical shard.
 
-Building on the prior example using a cluster with 2 physical shards, this table shows a sample mapping of documents to physical shards.
+Building on the prior example using a cluster with two physical shards, this table shows a sample mapping of documents to physical shards.
 
-| Document Id | Shard Key Value | Logical Shard | Physical Shard |
+| Document ID | Shard Key Value | Logical Shard | Physical Shard |
 |-------------|-----------------|-------------- |------------------|
 | "12345" | "Steve Smith" | Shard 1 | Physical Shard 1 |
 | "23456" | "Jane Doe" | Shard 2 | Physical Shard 2 |
@@ -64,19 +64,19 @@ Building on the prior example using a cluster with 2 physical shards, this table
 ### Capacity of physical shards
 The cluster tier that is selected when the cluster is provisioned determines the CPU and memory capacity of a physical shard. Similarly the storage SKU determines the storage and IOPS capacity of a physical shard. Larger cluster tiers provide more compute power and larger memory while larger storage disks provide more storage and IOPS. Read heavy workloads benefit from a larger cluster tier while write heavy workloads benefit from a larger storage SKU. The cluster tier can be scaled up and down after the cluster is created based on the changing needs of the application.
 
-In a multi-shard cluster, the capacity of each physical shard is the same. Scaling up the cluster tier or the storage SKU does not change the placement of logical shards on the physical shards. After a scale up operation, the number of physical shards remains the same thus avoiding the need to relabance the data in the cluster.
+In a multi-shard cluster, the capacity of each physical shard is the same. Scaling up the cluster tier or the storage SKU doesn't change the placement of logical shards on the physical shards. After a scale up operation, the number of physical shards remains the same thus avoiding the need to rebalance the data in the cluster.
 
-The compute, memory, storage and IOPS capacity of the physical shard The logical shards that are co-located within the same physical shard are collectively bound by the compute, memory, storage and IOPS capacity of the physical shard they are mapped to. Shard keys that do not have an even distribution of storage and request volumes can cause uneven storage and throughput consumption within the cluster. Hot partitions can cause physical shards to be unevenly utilized leading to unpredictable throughput and performance. Thus sharded clusters require careful planning upfront to ensure performance remains consistent as the requirements of the application change over time.
+The compute, memory, storage, and IOPS capacity of the physical shard determine the resources available for the logical shards. Shard keys that don't have an even distribution of storage and request volumes can cause uneven storage and throughput consumption within the cluster. Hot partitions can cause physical shards to be unevenly utilized leading to unpredictable throughput and performance. Thus sharded clusters require careful planning upfront to ensure performance remains consistent as the requirements of the application change over time.
 
 
 ### Replica sets
 Each physical shard consists of a set of replicas, also referred to as a replica set. Each replica hosts an instance of the database engine. A replica set makes the data store within the physical shard durable, highly available, and consistent. Each replica that makes up the physical shard inherits the partition's storage and compute capacity. Azure Cosmos DB for MongoDB vCore automatically manages replica sets.
 
 
 ## Best practices for sharding data
-- Sharding in Azure Cosmos DB for MongoDB vCore is not required unless the collection's storage and transaction volumes can exceed the capacity of a single physical shard. For instance, the service provides 32 TB disks per shard. If a collection requires more than 32 TB, it should be sharded.
+- Sharding in Azure Cosmos DB for MongoDB vCore isn't required unless the collection's storage and transaction volumes can exceed the capacity of a single physical shard. For instance, the service provides 32 TB disks per shard. If a collection requires more than 32 TB, it should be sharded.
 
-- It is not necessary to shard every collection in a cluster with multiple physical shards. Sharded and unsharded collections can coexist in the same cluster. The service optimally distributes the collections within the cluster to evenly utilize the cluster's compute and storage resources as evenly as possible.
+- It isn't necessary to shard every collection in a cluster with multiple physical shards. Sharded and unsharded collections can coexist in the same cluster. The service optimally distributes the collections within the cluster to evenly utilize the cluster's compute and storage resources as evenly as possible.
 
 - For read heavy applications, the shard key must be selected based on the most frequent query patterns. The most commonly used query filter for a collection should be chosen as the shard key to optimize the highest percentage of database transactions by localizing the search to a single physical shard.
 
@@ -106,7 +106,7 @@ use cosmicworks;
 sh.shardCollection("cosmicworks.employee", {"firstName": 1})
 ```
 
-The service does not index the shard key by default. Once the collection is sharded an index must be explicitly created on the shard key property.
+The service doesn't index the shard key by default. Once the collection is sharded an index must be explicitly created on the shard key property.
 
 ```javascript
 use cosmicworks;
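
The hunk ends inside the final code block, so the index statement itself isn't visible in this diff. A minimal sketch of what an explicit index on the shard key property could look like, assuming the `employee` collection and `firstName` key from the earlier command:

```javascript
use cosmicworks;

// The service doesn't index the shard key by default, so create it explicitly.
db.employee.createIndex({ "firstName": 1 })
```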
