articles/cosmos-db/partition-data.md
Lines changed: 6 additions & 5 deletions
@@ -5,13 +5,13 @@ author: markjbrown
 ms.author: mjbrown
 ms.service: cosmos-db
 ms.topic: conceptual
-ms.date: 04/23/2020
+ms.date: 04/24/2020
 
 ---
 
 # Partitioning and horizontal scaling in Azure Cosmos DB
 
-This article explains physical and logical partitions in Azure Cosmos DB. It also discusses best practices for scaling and partitioning.
+This article explains the difference between logical and physical partitions. It also discusses best practices for partitioning and gives an in-depth view at how horizontal scaling works in Azure Cosmos DB. It's not necessary to understand these internal details to [select your partition key](partitioning-overview.md#choose-partitionkey) but we have covered them so you have clarity for how Azure Cosmos DB works.
 
 ## Logical partitions
 
@@ -28,14 +28,15 @@ There is no limit to the number of logical partitions in your container. Each lo
 An Azure Cosmos container is scaled by distributing data and throughput across physical partitions. Internally, one or more logical partitions are mapped to a single physical partition. Most small Cosmos containers have many logical partitions but only require a single physical partition. Unlike logical partitions, physical partitions are an internal implementation of the system.
 
 The number of physical partitions in your Cosmos container depends on the following:
-- Amount of provisioned throughput (each individual physical partition can provide a throughput up to 10,000 request units per second)
-- Total data storage (each individual physical partition can store up to 50GB)
+
+- Amount of provisioned throughput (each individual physical partition can provide a throughput of up to 10,000 request units per second)
+- Total data storage (each individual physical partition can store up to 50GB)
 
 There is no limit to the total number of physical partitions in your container. As your provisioned throughput or data size grows, Azure Cosmos DB will automatically create new physical partitions by splitting existing ones. Physical partition splits do not impact your application's availability. After the physical partition split, all data within a single logical partition will still be stored on the same physical partition. A physical partition split simply creates a new mapping of logical partitions to physical partitions.
 
 Throughput provisioned for a container is divided evenly among physical partitions. A partition key design that doesn't distribute the throughput requests evenly might create "hot" partitions. Hot partitions might result in rate-limiting and in inefficient use of the provisioned throughput, and higher costs.
 
-You can see how many physical partitions in your container the **Storage** section of the **Metrics blade** of the Azure Portal:
+You can see your container's physical partitions in the **Storage** section of the **Metrics blade** of the Azure Portal:
 
 [](./media/partition-data/view-partitions-zoomed-in.png#lightbox)
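
As a rough illustration of the two limits quoted in the changed text (10,000 request units per second and 50GB per physical partition), the following Python sketch computes the smallest number of physical partitions that could satisfy both, and the even share of throughput each one would receive. The function names and constants are hypothetical and are not part of any Azure SDK. The real partition count is chosen internally by Azure Cosmos DB and may be higher, so treat this as back-of-the-envelope arithmetic only.

```python
import math

# Per-partition limits as stated in the article text (not read from any API).
MAX_RU_PER_PARTITION = 10_000   # request units per second
MAX_GB_PER_PARTITION = 50       # gigabytes of storage


def min_physical_partitions(provisioned_ru: int, storage_gb: float) -> int:
    """Lower bound on physical partitions implied by the two stated limits."""
    by_throughput = math.ceil(provisioned_ru / MAX_RU_PER_PARTITION)
    by_storage = math.ceil(storage_gb / MAX_GB_PER_PARTITION)
    # A container always has at least one physical partition.
    return max(1, by_throughput, by_storage)


def ru_per_partition(provisioned_ru: int, partition_count: int) -> float:
    """Provisioned throughput is divided evenly among physical partitions."""
    return provisioned_ru / partition_count


if __name__ == "__main__":
    # Hypothetical container: 30,000 RU/s provisioned, 120 GB stored.
    count = min_physical_partitions(provisioned_ru=30_000, storage_gb=120)
    print(count)
    print(ru_per_partition(30_000, count))
```

Under these assumptions, a workload that sends every request to one logical partition could use only that partition's even share of the provisioned throughput, which is the "hot" partition effect the text describes.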
articles/cosmos-db/partitioning-overview.md
Lines changed: 10 additions & 10 deletions
@@ -5,7 +5,7 @@ author: markjbrown
 ms.author: mjbrown
 ms.service: cosmos-db
 ms.topic: conceptual
-ms.date: 04/23/2020
+ms.date: 04/24/2020
 
 ---
 
@@ -27,11 +27,11 @@ Azure Cosmos DB uses hash-based partitioning to spread logical partitions across
 
 Transactions (in stored procedures or triggers) are allowed only against items in a single logical partition.
 
-To learn more about [how Azure Cosmos DB manages partitions](partition-data.md). (It's not necessary to understand the internal details to build or run your applications, but added here for a curious reader.)
+You can learn more about [how Azure Cosmos DB manages partitions](partition-data.md). (It's not necessary to understand the internal details to build or run your applications, but added here for a curious reader.)
 
 ## <a id="choose-partitionkey"></a>Choosing a partition key
 
-Selecting your partition key is a simple but important design choice in Azure Cosmos DB. Once you select your partition key, it is not possible to change it in-place. If you need change your partition key, you should move your data to a new container with your new desired partition key.
+Selecting your partition key is a simple but important design choice in Azure Cosmos DB. Once you select your partition key, it is not possible to change it in-place. If you need to change your partition key, you should move your data to a new container with your new desired partition key.
 
 For **all** containers, your partition key should:
 
@@ -43,9 +43,9 @@ If you need [multi-item ACID transactions](database-transactions-optimistic-conc
 
 ## Partition keys for read-heavy containers
 
-For most containers, the above criteria is all you need to consider when picking a partition key. For large read-heavy containers, however, you might want to choose a partition key that is a property that appears frequently as a filter in your queries. Queries can be [efficiently routed to only the relevant physical partitions](how-to-query-container.md#in-partition-query) by including the partition key in the filter predicate.
+For most containers, the above criteria is all you need to consider when picking a partition key. For large read-heavy containers, however, you might want to choose a partition key that appears frequently as a filter in your queries. Queries can be [efficiently routed to only the relevant physical partitions](how-to-query-container.md#in-partition-query) by including the partition key in the filter predicate.
 
-If most of your workload's requests are queries and most of your queries have an equality filter on the same property, this property can be a good partition key choice. For example, if you frequently run a query that filters on `UserID`, then selecting `UserID` as the partition key would reduce the number of [cross-partition queries](how-to-query-container#avoiding-cross-partition-queries).
+If most of your workload's requests are queries and most of your queries have an equality filter on the same property, this property can be a good partition key choice. For example, if you frequently run a query that filters on `UserID`, then selecting `UserID` as the partition key would reduce the number of [cross-partition queries](how-to-query-container.md#avoiding-cross-partition-queries).
 
 However, if your container is small, you probably don't have enough physical partitions to need to worry about the performance impact of cross-partition queries. Most small containers in Azure Cosmos DB only require one or two physical partitions.
 
@@ -58,16 +58,16 @@ If your container could grow to more than a few physical partitions, then you sh
 
 If your container has a property that has a wide range of possible values, it is likely a great partition key choice. One possible example of such a property is the *item ID*. For small read-heavy containers or write-heavy containers of any size, the *item ID* is naturally a great choice for the partition key.
 
-The *item ID* is a great partition key choice because:
+The *item ID* is a great partition key choice for the following reasons:
 
 * There are a wide range of possible values (one unique *item ID* per item).
 * Because there is a unique *item ID* per item, the *item ID* does a great job at evenly balancing RU consumption and data storage.
-* It allows you to easily do efficient point reads since you'll always know an item's partition key if you know its *item ID*.
+* You can easily do efficient point reads since you'll always know an item's partition key if you know its *item ID*.
 
-Some things to consider when selecting the *item ID* include:
+Some things to consider when selecting the *item ID* as the partition key include:
 
-* If the *item ID* is the partition key, it will become a unique identifier throughout your entire container. You won't be able to have items that have a duplicate *item ID*
-* If you have a read-heavy container that has a lot of [physical partitions].(partition-data.md#physical-partitions), all of your queries will be cross-partition
+* If the *item ID* is the partition key, it will become a unique identifier throughout your entire container. You won't be able to have items that have a duplicate *item ID*.
+* If you have a read-heavy container that has a lot of [physical partitions](partition-data.md#physical-partitions), all of your queries will be cross-partition
 * You can't run stored procedures or triggers across multiple logical partitions.
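
The routing behavior discussed in these hunks can be sketched in Python as a toy model: every item with the same partition key value lands on the same physical partition, so a query that filters on the partition key visits one partition, while a query without it fans out to all of them. This is a simplified sketch under stated assumptions. The MD5 hash, the modulo mapping, the partition count, and both function names are stand-ins chosen for illustration; the hash function and range assignment that Azure Cosmos DB uses internally are not described in the text and are not reproduced here.

```python
import hashlib

PHYSICAL_PARTITIONS = 4  # hypothetical count, chosen for illustration


def physical_partition_for(partition_key: str, count: int = PHYSICAL_PARTITIONS) -> int:
    """Map a partition key value to a physical partition index.

    MD5 plus modulo is a stand-in for the service's internal hashing.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % count


def partitions_to_query(filter_key=None, count: int = PHYSICAL_PARTITIONS) -> list:
    """Return the physical partitions a query would need to visit."""
    if filter_key is None:
        # No partition key in the filter: fan out to every partition.
        return list(range(count))
    # Partition key in the filter: a single partition holds all matches.
    return [physical_partition_for(filter_key, count)]


if __name__ == "__main__":
    # Filtering on the partition key (for example a UserID value) targets one partition.
    print(partitions_to_query("user-42"))
    # Omitting the partition key makes the query cross-partition.
    print(partitions_to_query())
```

With the *item ID* as the partition key, a point read always supplies the key and therefore targets one partition, whereas a query filtering on any other property takes the fan-out path in this model.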