Commit ac1ca46

committed
Consistency level updates
1 parent f38b42c commit ac1ca46

2 files changed

+63
-28
lines changed

articles/cosmos-db/consistency-levels-tradeoffs.md

Lines changed: 23 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -5,45 +5,58 @@ author: markjbrown
55
ms.author: mjbrown
66
ms.service: cosmos-db
77
ms.topic: conceptual
8-
ms.date: 07/23/2019
8+
ms.date: 04/06/2020
99
ms.reviewer: sngun
1010
---
1111

12-
# Consistency, availability, and performance tradeoffs
12+
# Consistency, availability, and performance tradeoffs
1313

1414
Distributed databases that rely on replication for high availability, low latency, or both must make tradeoffs. The tradeoffs are between read consistency vs. availability, latency, and throughput.
1515

16-
Azure Cosmos DB approaches data consistency as a spectrum of choices. This approach includes more options than the two extremes of strong and eventual consistency. You can choose from five well-defined models on the consistency spectrum. From strongest to weakest, the models are:
16+
Azure Cosmos DB approaches data consistency as a spectrum of choices. This approach includes more options than the two extremes of strong and eventual consistency. You can choose from five well-defined levels on the consistency spectrum. From strongest to weakest, the levels are:
1717

1818
- *Strong*
1919
- *Bounded staleness*
2020
- *Session*
2121
- *Consistent prefix*
2222
- *Eventual*
2323

24-
Each model provides availability and performance tradeoffs and is backed by comprehensive SLAs.
24+
Each level provides availability and performance tradeoffs and is backed by comprehensive SLAs.
2525

2626
## Consistency levels and latency
2727

28-
The read latency for all consistency levels is always guaranteed to be less than 10 milliseconds at the 99th percentile. This read latency is backed by the SLA. The average read latency, at the 50th percentile, is typically 2 milliseconds or less. Azure Cosmos accounts that span several regions and are configured with strong consistency are an exception to this guarantee.
28+
The read latency for all consistency levels is always guaranteed to be less than 10 milliseconds at the 99th percentile. This read latency is backed by the SLA. The average read latency, at the 50th percentile, is typically 4 milliseconds or less.
2929

30-
The write latency for all consistency levels is always guaranteed to be less than 10 milliseconds at the 99th percentile. This write latency is backed by the SLA. The average write latency, at the 50th percentile, is usually 5 milliseconds or less.
30+
The write latency for all consistency levels is always guaranteed to be less than 10 milliseconds at the 99th percentile. This write latency is backed by the SLA. The average write latency, at the 50th percentile, is usually 5 milliseconds or less. Azure Cosmos accounts that span several regions and are configured with strong consistency are an exception to this guarantee.
3131

32-
For Azure Cosmos accounts configured with strong consistency with more than one region, the write latency is guaranteed to be less than two times round-trip time (RTT) between any of the two farthest regions, plus 10 milliseconds at the 99th percentile.
32+
### Write latency and Strong consistency
3333

34-
The exact RTT latency is a function of speed-of-light distance and the Azure networking topology. Azure networking doesn't provide any latency SLAs for the RTT between any two Azure regions. For your Azure Cosmos account, replication latencies are displayed in the Azure portal. You can use the Azure portal (go to the Metrics blade) to monitor the replication latencies between various regions that are associated with your Azure Cosmos account.
34+
For Azure Cosmos accounts configured with strong consistency with more than one region, the write latency is equal to two times the round-trip time (RTT) between the two farthest regions, plus 10 milliseconds at the 99th percentile. High network RTT between the regions translates to higher latency for Cosmos DB requests, since strong consistency completes an operation only after ensuring that it has been committed to all regions within an account.
35+
36+
The exact RTT latency is a function of speed-of-light distance and the Azure networking topology. Azure networking doesn't provide any latency SLAs for the RTT between any two Azure regions. For your Azure Cosmos account, replication latencies are displayed in the Azure portal. You can use the Azure portal (go to the Metrics blade, select Consistency tab) to monitor the replication latencies between various regions that are associated with your Azure Cosmos account.
37+
38+
> [!IMPORTANT]
39+
> Strong consistency for accounts with regions spanning more than 5000 miles (8000 kilometers) is blocked by default due to high write latency. To enable this capability, contact support.
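
The write latency formula above (two times the RTT between the two farthest regions, plus 10 milliseconds) can be worked through with a small sketch. The function name and the RTT values below are hypothetical and chosen only for illustration; real RTTs depend on the Azure networking topology and are not covered by any SLA.

```python
def strong_write_latency_ms(rtt_ms: float, base_ms: float = 10.0) -> float:
    """Return 2 * RTT + base, the p99 write latency figure for strong consistency."""
    if rtt_ms < 0:
        raise ValueError("RTT must be non-negative")
    return 2 * rtt_ms + base_ms


# Hypothetical round-trip times in milliseconds between region pairs (assumed values).
rtts_ms = {
    ("West US 2", "East US 2"): 60.0,
    ("West US 2", "Australia East"): 170.0,
    ("East US 2", "Australia East"): 200.0,
}

# The formula uses the RTT between the two farthest regions.
farthest_rtt_ms = max(rtts_ms.values())
print(strong_write_latency_ms(farthest_rtt_ms))  # prints 410.0
```

With these assumed numbers, adding a distant region raises the write latency figure for every write in the account, which is consistent with the default block on strong consistency for widely separated regions.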
3540
3641
## Consistency levels and throughput
3742

38-
- For the same number of request units, the session, consistent prefix, and eventual consistency levels provide about two times the read throughput when compared with strong and bounded staleness.
43+
- For strong and bounded staleness, reads are done against two replicas in a four replica set (minority quorum) to provide consistency guarantees. Session, consistent prefix and eventual do single replica reads. The result is that, for the same number of request units, read throughput for strong and bounded staleness is half of the other consistency levels.
3944

4045
- For a given type of write operation, such as insert, replace, upsert, and delete, the write throughput for request units is identical for all consistency levels.
4146

47+
|**Consistency Level**|**Quorum Reads**|**Quorum Writes**|
48+
|--|--|--|
49+
|**Strong**|Local Minority|Global Majority|
50+
|**Bounded Staleness**|Local Minority|Local Majority|
51+
|**Session**|Single Replica (using session token)|Local Majority|
52+
|**Consistent Prefix**|Single Replica|Local Majority|
53+
|**Eventual**|Single Replica|Local Majority|
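
As a rough illustration of the read-throughput point above, the sketch below maps each consistency level to the number of replicas a read touches (two for a minority quorum in a four-replica set, one for a single-replica read) and derives the relative read throughput for the same number of request units. The dictionary and function names are hypothetical, and the model assumes read cost scales linearly with the number of replicas contacted.

```python
# Replicas contacted per read, taken from the table above (assumed linear cost model).
READ_REPLICAS = {
    "Strong": 2,             # local minority quorum
    "Bounded Staleness": 2,  # local minority quorum
    "Session": 1,            # single replica (using session token)
    "Consistent Prefix": 1,  # single replica
    "Eventual": 1,           # single replica
}


def relative_read_throughput(level: str) -> float:
    """Read throughput relative to a single-replica read, for the same request units."""
    return 1.0 / READ_REPLICAS[level]


for level in READ_REPLICAS:
    print(f"{level}: {relative_read_throughput(level):.1f}x")
```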
54+
4255
## <a id="rto"></a>Consistency levels and data durability
4356

4457
Within a globally distributed database environment there is a direct relationship between the consistency level and data durability in the presence of a region-wide outage. As you develop your business continuity plan, you need to understand the maximum acceptable time before the application fully recovers after a disruptive event. The time required for an application to fully recover is known as **recovery time objective** (**RTO**). You also need to understand the maximum period of recent data updates the application can tolerate losing when recovering after a disruptive event. The time period of updates that you might afford to lose is known as **recovery point objective** (**RPO**).
4558

46-
The table below defines the relationship between consistency model and data durability in the presence of region wide outage. It is important to note that in a distributed system, even with strong consistency, it is impossible to have a distributed database with an RPO and RTO of zero due to the CAP Theorem. To learn more about why, see [Consistency levels in Azure Cosmos DB](consistency-levels.md).
59+
The table below defines the relationship between consistency model and data durability in the presence of a region-wide outage. It is important to note that in a distributed system, even with strong consistency, it is impossible to have a distributed database with an RPO and RTO of zero due to the CAP Theorem. To learn more about why, see [Consistency levels in Azure Cosmos DB](consistency-levels.md).
4760

4861
|**Region(s)**|**Replication mode**|**Consistency level**|**RPO**|**RTO**|
4962
|---------|---------|---------|---------|---------|

articles/cosmos-db/consistency-levels.md

Lines changed: 40 additions & 18 deletions
Original file line number | Diff line number | Diff line change
@@ -5,23 +5,23 @@ author: markjbrown
55
ms.author: mjbrown
66
ms.service: cosmos-db
77
ms.topic: conceptual
8-
ms.date: 03/18/2020
8+
ms.date: 04/06/2020
99
---
1010
# Consistency levels in Azure Cosmos DB
1111

12-
Distributed databases that rely on replication for high availability, low latency, or both, make the fundamental tradeoff between the read consistency vs. availability, latency, and throughput. Most commercially available distributed databases ask developers to choose between the two extreme consistency models: *strong* consistency and *eventual* consistency. The linearizability or the strong consistency model is the gold standard of data programmability. But it adds a price of higher latency (in steady state) and reduced availability (during failures). On the other hand, eventual consistency offers higher availability and better performance, but makes it hard to program applications.
12+
Distributed databases that rely on replication for high availability, low latency, or both, make the fundamental tradeoff between the read consistency vs. availability, latency, and throughput. Most commercially available distributed databases ask developers to choose between the two extreme consistency models: *strong* consistency and *eventual* consistency. The linearizability of the strong consistency model is the gold standard of data programmability. But it adds a price of higher write latency (in steady state) and reduced availability (during failures). On the other hand, eventual consistency offers higher availability and better performance, but makes it hard to program applications.
1313

14-
Azure Cosmos DB approaches data consistency as a spectrum of choices instead of two extremes. Strong consistency and eventual consistency are at the ends of the spectrum, but there are many consistency choices along the spectrum. Developers can use these options to make precise choices and granular tradeoffs with respect to high availability and performance.
14+
Azure Cosmos DB approaches data consistency as a spectrum of choices instead of two extremes. Developers can use these options to make precise choices and granular tradeoffs with respect to high availability and performance.
1515

16-
With Azure Cosmos DB, developers can choose from five well-defined consistency models on the consistency spectrum. From strongest to more relaxed, the models include *strong*, *bounded staleness*, *session*, *consistent prefix*, and *eventual* consistency. The models are well-defined and intuitive and can be used for specific real-world scenarios. Each model provides [availability and performance tradeoffs](consistency-levels-tradeoffs.md) and is backed by the SLAs. The following image shows the different consistency levels as a spectrum.
16+
With Azure Cosmos DB, developers can choose from five well-defined consistency levels on the consistency spectrum. These levels include *strong*, *bounded staleness*, *session*, *consistent prefix*, and *eventual* consistency. The levels are well-defined and intuitive and can be used for specific real-world scenarios. Each level provides [availability and performance tradeoffs](consistency-levels-tradeoffs.md) and is backed by SLAs. The following image shows the different consistency levels as a spectrum.
1717

1818
![Consistency as a spectrum](./media/consistency-levels/five-consistency-levels.png)
1919

2020
The consistency levels are region-agnostic and are guaranteed for all operations regardless of the region from which the reads and writes are served, the number of regions associated with your Azure Cosmos account, or whether your account is configured with a single or multiple write regions.
2121

2222
## Scope of the read consistency
2323

24-
Read consistency applies to a single read operation scoped within a partition-key range or a logical partition. The read operation can be issued by a remote client or a stored procedure.
24+
Read consistency applies to a single read operation scoped within a logical partition. The read operation can be issued by a remote client or a stored procedure.
2525

2626
## Configure the default consistency level
2727

@@ -39,26 +39,49 @@ The semantics of the five consistency levels are described here:
3939

4040
![video](media/consistency-levels/strong-consistency.gif)
4141

42-
- **Bounded staleness**: The reads are guaranteed to honor the consistent-prefix guarantee. The reads might lag behind writes by at most *"K"* versions (that is, "updates") of an item or by *"T"* time interval. In other words, when you choose bounded staleness, the "staleness" can be configured in two ways:
42+
- **Bounded staleness**: The reads are guaranteed to honor the consistent-prefix guarantee. The reads might lag behind writes by at most *"K"* versions (that is, "updates") of an item or by *"T"* time interval. In other words, when you choose bounded staleness, the "staleness" can be configured in two ways:
4343

44-
* The number of versions (*K*) of the item
45-
* The time interval (*T*) by which the reads might lag behind the writes
44+
- The number of versions (*K*) of the item
45+
- The time interval (*T*) by which the reads might lag behind the writes
4646

47-
Bounded staleness offers total global order except within the "staleness window." The monotonic read guarantees exist within a region both inside and outside the staleness window. Strong consistency has the same semantics as the one offered by bounded staleness. The staleness window is equal to zero. Bounded staleness is also referred to as time-delayed linearizability. When a client performs read operations within a region that accepts writes, the guarantees provided by bounded staleness consistency are identical to those guarantees by the strong consistency.
47+
Bounded staleness offers total global order outside of the "staleness window." When a client performs read operations within a region that accepts writes, the guarantees provided by bounded staleness consistency are identical to the guarantees provided by strong consistency.
48+
49+
Inside the staleness window, Bounded Staleness provides the following consistency guarantees:
50+
51+
- Consistency for clients in the same region for a single-master account = Strong
52+
- Consistency for clients in different regions for a single-master account = Consistent Prefix
53+
- Consistency for clients writing to a single region for a multi-master account = Consistent Prefix
54+
- Consistency for clients writing to different regions for a multi-master account = Eventual
4855

4956
Bounded staleness is frequently chosen by globally distributed applications that expect low write latencies but require total global order guarantee. Bounded staleness is great for applications featuring group collaboration and sharing, stock ticker, publish-subscribe/queueing etc. The following graphic illustrates the bounded staleness consistency with musical notes. After the data is written to the "West US 2" region, the "East US 2" and "Australia East" regions read the written value based on the configured maximum lag time or the maximum operations:
5057

5158
![video](media/consistency-levels/bounded-staleness-consistency.gif)
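
The *K* and *T* limits described for bounded staleness can be pictured with a minimal sketch. The class, the function, and the limit values are hypothetical, and the sketch assumes a read counts as inside the bound only when it exceeds neither limit; the text above does not spell out how the two limits combine, so treat that as an assumption rather than the service's exact rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StalenessBound:
    max_versions: int   # K: maximum number of item versions a read may lag behind
    max_seconds: float  # T: maximum time interval a read may lag behind


def within_bound(bound: StalenessBound, versions_behind: int, seconds_behind: float) -> bool:
    """Return True if the observed lag exceeds neither K nor T (assumed combination rule)."""
    return (versions_behind <= bound.max_versions
            and seconds_behind <= bound.max_seconds)


# Hypothetical limits chosen only for illustration.
bound = StalenessBound(max_versions=100, max_seconds=5.0)
print(within_bound(bound, versions_behind=40, seconds_behind=1.2))   # prints True
print(within_bound(bound, versions_behind=150, seconds_behind=1.2))  # prints False
```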
5259

53-
- **Session**: Within a single client session reads are guaranteed to honor the consistent-prefix (assuming a single "writer" session), monotonic reads, monotonic writes, read-your-writes, and write-follows-reads guarantees. Clients outside of the session performing writes will see eventual consistency.
60+
- **Session**: Within a single client session reads are guaranteed to honor the consistent-prefix, monotonic reads, monotonic writes, read-your-writes, and write-follows-reads guarantees. This assumes a single "writer" session or sharing the session token for multiple writers.
61+
62+
Clients outside of the session performing writes will see the following guarantees:
63+
64+
- Consistency for clients in same region for a single-master account = Consistent Prefix
65+
- Consistency for clients in different regions for a single-master account = Consistent Prefix
66+
- Consistency for clients writing to a single region for a multi-master account = Consistent Prefix
67+
- Consistency for clients writing to multiple regions for a multi-master account = Eventual
5468

5569
Session consistency is the most widely used consistency level for both single-region and globally distributed applications. It provides write latencies, availability, and read throughput comparable to that of eventual consistency but also provides the consistency guarantees that suit the needs of applications written to operate in the context of a user. The following graphic illustrates the session consistency with musical notes. The "West US 2" and "East US 2" regions are using the same session (Session A), so they both read the data at the same time. The "Australia East" region is using "Session B", so it receives data later but in the same order as the writes.
5670

5771
![video](media/consistency-levels/session-consistency.gif)
5872

5973
- **Consistent prefix**: Updates that are returned contain some prefix of all the updates, with no gaps. Consistent prefix consistency level guarantees that reads never see out-of-order writes.
6074

61-
If writes were performed in the order `A, B, C`, then a client sees either `A`, `A,B`, or `A,B,C`, but never out of order like `A,C` or `B,A,C`. Consistent Prefix provides write latencies, availability, and read throughput comparable to that of eventual consistency, but also provides the order guarantees that suit the needs of scenarios where order is important. The following graphic illustrates the consistency prefix consistency with musical notes. In all the regions, the reads never see out of order writes:
75+
If writes were performed in the order `A, B, C`, then a client sees either `A`, `A,B`, or `A,B,C`, but never out of order like `A,C` or `B,A,C`. Consistent Prefix provides write latencies, availability, and read throughput comparable to that of eventual consistency, but also provides the order guarantees that suit the needs of scenarios where order is important.
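
The `A, B, C` example above can be expressed as a short check. This is only a sketch of the ordering rule, not of how the service enforces it; the function name is hypothetical, and treating an empty observation (a client that has seen nothing yet) as valid is an assumption.

```python
from typing import Sequence


def is_consistent_prefix(writes: Sequence[str], observed: Sequence[str]) -> bool:
    """Return True if `observed` is a gap-free prefix of `writes`, in the same order."""
    return (len(observed) <= len(writes)
            and list(observed) == list(writes[:len(observed)]))


writes = ["A", "B", "C"]
print(is_consistent_prefix(writes, ["A"]))            # prints True
print(is_consistent_prefix(writes, ["A", "B"]))       # prints True
print(is_consistent_prefix(writes, ["A", "C"]))       # prints False
print(is_consistent_prefix(writes, ["B", "A", "C"]))  # prints False
```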
76+
77+
Below are the consistency guarantees for Consistent Prefix:
78+
79+
- Consistency for clients in same region for a single-master account = Consistent Prefix
80+
- Consistency for clients in different regions for a single-master account = Consistent Prefix
81+
- Consistency for clients writing to a single region for a multi-master account = Consistent Prefix
82+
- Consistency for clients writing to multiple regions for a multi-master account = Eventual
83+
84+
The following graphic illustrates consistent prefix consistency with musical notes. In all the regions, the reads never see out-of-order writes:
6285

6386
![video](media/consistency-levels/consistent-prefix.gif)
6487

@@ -73,7 +96,7 @@ To learn more about consistency concepts, read the following articles:
7396

7497
- [High-level TLA+ specifications for the five consistency levels offered by Azure Cosmos DB](https://github.com/Azure/azure-cosmos-tla)
7598
- [Replicated Data Consistency Explained Through Baseball (video) by Doug Terry](https://www.youtube.com/watch?v=gluIh8zd26I)
76-
- [Replicated Data Consistency Explained Through Baseball (whitepaper) by Doug Terry](https://www.microsoft.com/en-us/research/publication/replicated-data-consistency-explained-through-baseball/?from=http%3A%2F%2Fresearch.microsoft.com%2Fpubs%2F157411%2Fconsistencyandbaseballreport.pdf)
99+
- [Replicated Data Consistency Explained Through Baseball (whitepaper) by Doug Terry](https://www.microsoft.com/research/publication/replicated-data-consistency-explained-through-baseball/)
77100
- [Session guarantees for weakly consistent replicated data](https://dl.acm.org/citation.cfm?id=383631)
78101
- [Consistency Tradeoffs in Modern Distributed Database Systems Design: CAP is Only Part of the Story](https://www.computer.org/csdl/magazine/co/2012/02/mco2012020037/13rRUxjyX7k)
79102
- [Probabilistic Bounded Staleness (PBS) for Practical Partial Quorums](https://vldb.org/pvldb/vol5/p776_peterbailis_vldb2012.pdf)
@@ -83,9 +106,8 @@ To learn more about consistency concepts, read the following articles:
83106

84107
To learn more about consistency levels in Azure Cosmos DB, read the following articles:
85108

86-
* [Choose the right consistency level for your application](consistency-levels-choosing.md)
87-
* [Consistency levels across Azure Cosmos DB APIs](consistency-levels-across-apis.md)
88-
* [Availability and performance tradeoffs for various consistency levels](consistency-levels-tradeoffs.md)
89-
* [Configure the default consistency level](how-to-manage-consistency.md#configure-the-default-consistency-level)
90-
* [Override the default consistency level](how-to-manage-consistency.md#override-the-default-consistency-level)
91-
109+
- [Choose the right consistency level for your application](consistency-levels-choosing.md)
110+
- [Consistency levels across Azure Cosmos DB APIs](consistency-levels-across-apis.md)
111+
- [Availability and performance tradeoffs for various consistency levels](consistency-levels-tradeoffs.md)
112+
- [Configure the default consistency level](how-to-manage-consistency.md#configure-the-default-consistency-level)
113+
- [Override the default consistency level](how-to-manage-consistency.md#override-the-default-consistency-level)
