Skip to content

Commit 7c3e75c

Browse files
authored
Merge pull request #114106 from timsander1/master
add pull model docs and samples
2 parents 6f1de10 + f56ee7c commit 7c3e75c

File tree

4 files changed

+194
-9
lines changed

4 files changed

+194
-9
lines changed

articles/cosmos-db/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -494,6 +494,8 @@
494494
- name: Trigger Azure Functions
495495
displayName: change feed
496496
href: change-feed-functions.md
497+
- name: Change feed pull model
498+
href: change-feed-pull-model.md
497499
- name: Globally distributed analytics and AI
498500
items:
499501
- name: Analytics use cases

articles/cosmos-db/change-feed-processor.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ ms.author: tisande
66
ms.service: cosmos-db
77
ms.devlang: dotnet
88
ms.topic: conceptual
9-
ms.date: 4/29/2020
9+
ms.date: 05/06/2020
1010
ms.reviewer: sngun
1111
---
1212

@@ -97,6 +97,7 @@ You are charged for RUs consumed, since data movement in and out of Cosmos conta
9797
You can now proceed to learn more about change feed processor in the following articles:
9898

9999
* [Overview of change feed](change-feed.md)
100+
* [Change feed pull model](change-feed-pull-model.md)
100101
* [How to migrate from the change feed processor library](how-to-migrate-from-change-feed-library.md)
101102
* [Using the change feed estimator](how-to-use-change-feed-estimator.md)
102103
* [Change feed processor start time](how-to-configure-change-feed-start-time.md)
Lines changed: 173 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,173 @@
1+
---
2+
title: Change feed pull model
3+
description: Learn how to use the Azure Cosmos DB change feed pull model to read the change feed and the differences between the pull model and Change Feed Processor
4+
author: timsander1
5+
ms.author: tisande
6+
ms.service: cosmos-db
7+
ms.devlang: dotnet
8+
ms.topic: conceptual
9+
ms.date: 05/06/2020
10+
ms.reviewer: sngun
11+
---
12+
13+
# Change feed pull model in Azure Cosmos DB
14+
15+
With the change feed pull model, you can consume the Azure Cosmos DB change feed at your own pace. As you can already do with the [change feed processor](change-feed-processor.md), you can use the change feed pull model to parallelize the processing of changes across multiple change feed consumers.
16+
17+
> [!NOTE]
18+
> The change feed pull model is currently in [preview in the Azure Cosmos DB .NET SDK](https://www.nuget.org/packages/Microsoft.Azure.Cosmos/3.9.0-preview) only. The preview is not yet available for other SDK versions.
19+
20+
## Consuming an entire container's changes
21+
22+
You can create a `FeedIterator` to process the change feed using the pull model. When you initially create a `FeedIterator`, you can specify an optional `StartTime` within the `ChangeFeedRequestOptions`. When left unspecified, the `StartTime` will be the current time.
23+
24+
The `FeedIterator` comes in two flavors. In addition to the examples below that return entity objects, you can also obtain the response with `Stream` support. Streams allow you to read data without having it first deserialized, saving on client resources.
25+
26+
Here's an example for obtaining a `FeedIterator` that returns entity objects, in this case a `User` object:
27+
28+
```csharp
29+
FeedIterator<User> iteratorWithPOCOS = container.GetChangeFeedIterator<User>();
30+
```
31+
32+
Here's an example for obtaining a `FeedIterator` that returns a `Stream`:
33+
34+
```csharp
35+
FeedIterator iteratorWithStreams = container.GetChangeFeedStreamIterator();
36+
```
37+
38+
Using a `FeedIterator`, you can easily process an entire container's change feed at your own pace. Here's an example:
39+
40+
```csharp
41+
FeedIterator<User> iteratorForTheEntireContainer= container.GetChangeFeedIterator(new ChangeFeedRequestOptions{StartTime = DateTime.MinValue});
42+
43+
while (iteratorForTheEntireContainer.HasMoreResults)
44+
{
45+
FeedResponse<User> users = await iteratorForTheEntireContainer.ReadNextAsync();
46+
47+
foreach (User user in users)
48+
{
49+
Console.WriteLine($"Detected change for user with id {user.id}");
50+
}
51+
}
52+
```
53+
54+
## Consuming a partition key's changes
55+
56+
In some cases, you may only want to process a specific partition key's changes. You can obtain a `FeedIterator` for a specific partition key and process the changes the same way that you can for an entire container:
57+
58+
```csharp
59+
FeedIterator<User> iteratorForThePartitionKey = container.GetChangeFeedIterator(new PartitionKey("myPartitionKeyValueToRead"), new ChangeFeedRequestOptions{StartTime = DateTime.MinValue});
60+
61+
while (iteratorForThePartitionKey.HasMoreResults)
62+
{
63+
FeedResponse<User> users = await iteratorForThePartitionKey.ReadNextAsync();
64+
65+
foreach (User user in users)
66+
{
67+
Console.WriteLine($"Detected change for user with id {user.id}");
68+
}
69+
}
70+
```
71+
72+
## Using FeedRange for parallelization
73+
74+
In the [change feed processor](change-feed-processor.md), work is automatically spread across multiple consumers. In the change feed pull model, you can use the `FeedRange` to parallelize the processing of the change feed. A `FeedRange` represents a range of partition key values.
75+
76+
Here's an example showing how to obtain a list of ranges for your container:
77+
78+
```csharp
79+
IReadOnlyList<FeedRange> ranges = await container.GetFeedRangesAsync();
80+
```
81+
82+
When you obtain of list of FeedRanges for your container, you'll get one `FeedRange` per [physical partition](partition-data.md#physical-partitions).
83+
84+
Using a `FeedRange`, you can then create a `FeedIterator` to parallelize the processing of the change feed across multiple machines or threads. Unlike the previous example that showed how to obtain a single `FeedIterator` for the entire container, you can use the `FeedRange` to obtain multiple FeedIterators which can process the change feed in parallel.
85+
86+
In the case where you want to use FeedRanges, you need to have an orchestrator process that obtains FeedRanges and distributes them to those machines. This distribution could be:
87+
88+
* Using `FeedRange.ToJsonString` and distributing this string value. The consumers can use this value with `FeedRange.FromJsonString`
89+
* If the distribution is in-process, passing the `FeedRange` object reference.
90+
91+
Here's a sample that shows how to read from the beginning of the container's change feed using two hypothetical separate machines that are reading in parallel:
92+
93+
Machine 1:
94+
95+
```csharp
96+
FeedIterator<User> iteratorA = container.GetChangeFeedIterator<Person>(ranges[0], new ChangeFeedRequestOptions{StartTime = DateTime.MinValue});
97+
while (iteratorA.HasMoreResults)
98+
{
99+
FeedResponse<User> users = await iteratorA.ReadNextAsync();
100+
101+
foreach (User user in users)
102+
{
103+
Console.WriteLine($"Detected change for user with id {user.id}");
104+
}
105+
}
106+
```
107+
108+
Machine 2:
109+
110+
```csharp
111+
FeedIterator<User> iteratorB = container.GetChangeFeedIterator<User>(ranges[1], new ChangeFeedRequestOptions{StartTime = DateTime.MinValue});
112+
while (iteratorB.HasMoreResults)
113+
{
114+
FeedResponse<User> users = await iteratorB.ReadNextAsync();
115+
116+
foreach (User user in users)
117+
{
118+
Console.WriteLine($"Detected change for user with id {user.id}");
119+
}
120+
}
121+
```
122+
123+
## Saving continuation tokens
124+
125+
You can save the position of your `FeedIterator` by creating a continuation token. A continuation token is a string value that keeps of track of your FeedIterator's last processed changes. This allows the `FeedIterator` to resume at this point later. The following code will read through the change feed since container creation. After no more changes are available, it will persist a continuation token so that change feed consumption can be later resumed.
126+
127+
```csharp
128+
FeedIterator<User> iterator = container.GetChangeFeedIterator<User>(ranges[0], new ChangeFeedRequestOptions{StartTime = DateTime.MinValue});
129+
130+
string continuation = null;
131+
132+
while (iterator.HasMoreResults)
133+
{
134+
FeedResponse<User> users = await iterator.ReadNextAsync();
135+
continuation = orders.ContinuationToken;
136+
137+
foreach (User user in Users)
138+
{
139+
Console.WriteLine($"Detected change for user with id {user.id}");
140+
}
141+
}
142+
143+
// Some time later
144+
FeedIterator<User> iteratorThatResumesFromLastPoint = container.GetChangeFeedIterator<User>(continuation);
145+
```
146+
147+
## Comparing with change feed processor
148+
149+
Many scenarios can process the change feed using either the [change feed processor](change-feed-processor.md) or the pull model. The pull model's continuation tokens and the change feed processor's lease container are both "bookmarks" for the last processed item (or batch of items) in the change feed.
150+
However, you can't convert continuation tokens to a lease container (or vice versa).
151+
152+
You should consider using the pull model in these scenarios:
153+
154+
- You want to do a one-time read of the existing data in the change feed
155+
- You only want to read changes from a particular partition key
156+
- You don't want a push model and want to consume the change feed at your own pace
157+
158+
Here's some key differences between the change feed processor and pull model:
159+
160+
| | Change feed processor| Pull model |
161+
| --- | --- | --- |
162+
| Keeping track of current point in processing change feed | Lease (stored in an Azure Cosmos DB container) | Continuation token (stored in memory or manually persisted) |
163+
| Ability to replay past changes | Yes, with push model | Yes, with pull model|
164+
| Polling for future changes | Automatically checks for changes based on user-specified `WithPollInterval` | Manual |
165+
| Process changes from entire container | Yes, and automatically parallelized across multiple threads/machine consuming from the same container| Yes, and manually parallelized using FeedTokens |
166+
| Process changes from just a single partition key | Not supported | Yes|
167+
| Support level | Generally available | Preview |
168+
169+
## Next steps
170+
171+
* [Overview of change feed](change-feed.md)
172+
* [Using the change feed processor](change-feed-processor.md)
173+
* [Trigger Azure Functions](change-feed-functions.md)

articles/cosmos-db/read-change-feed.md

Lines changed: 17 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,33 +1,42 @@
11
---
22
title: Accessing change feed in Azure Cosmos DB Azure Cosmos DB
3-
description: This article describes different options available to read and access change feed in Azure Cosmos DB Azure Cosmos DB.
4-
author: TheovanKraay
5-
ms.author: thvankra
3+
description: This article describes different options available to read and access change feed in Azure Cosmos DB.
4+
author: timsander1
5+
ms.author: tisande
66
ms.service: cosmos-db
77
ms.topic: conceptual
8-
ms.date: 11/25/2019
9-
8+
ms.date: 05/06/2020
9+
ms.reviewer: sngun
1010
---
11+
1112
# Reading Azure Cosmos DB change feed
1213

1314
You can work with the Azure Cosmos DB change feed using any of the following options:
1415

1516
* Using Azure Functions
16-
* Using the change feed processor library
17+
* Using the change feed processor
1718
* Using the Azure Cosmos DB SQL API SDK
19+
* Using the change feed pull model (preview)
1820

1921
## Using Azure Functions
2022

2123
Azure Functions is the simplest and recommended option. When you create an Azure Functions trigger for Cosmos DB, you can select the container to connect, and the Azure Function gets triggered whenever there is a change to the container. Triggers can be created by using the Azure Functions portal, the Azure Cosmos DB portal or programmatically with SDKs. Visual Studio and VS Code provide support to write Azure Functions, and you can even use the Azure Functions CLI for cross-platform development. You can write and debug the code on your desktop, and then deploy the function with one click. See [Serverless database computing using Azure Functions](serverless-computing-database.md) and [Using change feed with Azure Functions](change-feed-functions.md)) articles to learn more.
2224

23-
## Using the change feed processor library
25+
## Using the change feed processor
2426

25-
The change feed processor library hides complexity and still gives you a complete control of the change feed. The library follows the observer pattern, where your processing function is called by the library. If you have a high throughput change feed, you can instantiate multiple clients to read the change feed. Because you're using change feed processor library, it will automatically divide the load among the different clients without you having to implement this logic. All the complexity is handled by the library. If you want to have your own load balancer, then you can implement `IPartitionLoadBalancingStrategy` for a custom partition strategy to process change feed. To learn more, see [using change feed processor library](change-feed-processor.md).
27+
The change feed processor hides complexity and still gives you a complete control of the change feed. The library follows the observer pattern, where your processing function is called by the library. If you have a high throughput change feed, you can instantiate multiple clients to read the change feed. Because you're using change feed processor library, it will automatically divide the load among the different clients without you having to implement this logic. All the complexity is handled by the library. To learn more, see [using change feed processor](change-feed-processor.md). The change feed processor is part of the [Azure Cosmos DB SDK V3](https://github.com/Azure/azure-cosmos-dotnet-v3).
2628

2729
## Using the Azure Cosmos DB SQL API SDK
2830

2931
With the SDK, you get a low-level control of the change feed. You can manage the checkpoint, access a particular logical partition key, etc. If you have multiple readers, you can use `ChangeFeedOptions` to distribute read load to different threads or different clients.
3032

33+
## Using the change feed pull model
34+
35+
The [change feed pull model](change-feed-pull-model.md) allows you to consume the change feed at your own pace and parallelize processing of changes with FeedRanges. A FeedRange spans a range of partition key values. Using the change feed pull model, it is also easy to process changes for a specific partition key.
36+
37+
> [!NOTE]
38+
> The change feed pull model is currently in [preview in the Azure Cosmos DB .NET SDK](https://www.nuget.org/packages/Microsoft.Azure.Cosmos/3.9.0-preview) only. The preview is not yet available for other SDK versions.
39+
3140
## Change feed in APIs for Cassandra and MongoDB
3241

3342
Change feed functionality is surfaced as change stream in MongoDB API and Query with predicate in Cassandra API. To learn more about the implementation details for MongoDB API, see the [Change streams in the Azure Cosmos DB API for MongoDB](mongodb-change-streams.md).

0 commit comments

Comments
 (0)