Merge pull request #114106 from timsander1/master

GitHubber17 · web-flow · commit 7c3e75c38504 · 2020-05-07T20:33:24.000-07:00
add pull model docs and samples
diff --git a/articles/cosmos-db/TOC.yml b/articles/cosmos-db/TOC.yml
@@ -494,6 +494,8 @@
       - name: Trigger Azure Functions
         displayName: change feed
         href: change-feed-functions.md
+      - name: Change feed pull model
+        href: change-feed-pull-model.md
     - name: Globally distributed analytics and AI
       items:
       - name: Analytics use cases
diff --git a/articles/cosmos-db/change-feed-processor.md b/articles/cosmos-db/change-feed-processor.md
@@ -6,7 +6,7 @@ ms.author: tisande
 ms.service: cosmos-db
 ms.devlang: dotnet
 ms.topic: conceptual
-ms.date: 4/29/2020
+ms.date: 05/06/2020
 ms.reviewer: sngun
 ---
 
@@ -97,6 +97,7 @@ You are charged for RUs consumed, since data movement in and out of Cosmos conta
 You can now proceed to learn more about change feed processor in the following articles:
 
 * [Overview of change feed](change-feed.md)
+* [Change feed pull model](change-feed-pull-model.md)
 * [How to migrate from the change feed processor library](how-to-migrate-from-change-feed-library.md)
 * [Using the change feed estimator](how-to-use-change-feed-estimator.md)
 * [Change feed processor start time](how-to-configure-change-feed-start-time.md)
diff --git a/articles/cosmos-db/change-feed-pull-model.md b/articles/cosmos-db/change-feed-pull-model.md
@@ -0,0 +1,173 @@
+---
+title: Change feed pull model
+description: Learn how to use the Azure Cosmos DB change feed pull model to read the change feed and the differences between the pull model and Change Feed Processor
+author: timsander1
+ms.author: tisande
+ms.service: cosmos-db
+ms.devlang: dotnet
+ms.topic: conceptual
+ms.date: 05/06/2020
+ms.reviewer: sngun
+---
+
+# Change feed pull model in Azure Cosmos DB
+
+With the change feed pull model, you can consume the Azure Cosmos DB change feed at your own pace. As you can already do with the [change feed processor](change-feed-processor.md), you can use the change feed pull model to parallelize the processing of changes across multiple change feed consumers.
+
+> [!NOTE]
+> The change feed pull model is currently in [preview in the Azure Cosmos DB .NET SDK](https://www.nuget.org/packages/Microsoft.Azure.Cosmos/3.9.0-preview) only. The preview is not yet available for other SDK versions.
+
+## Consuming an entire container's changes
+
+You can create a `FeedIterator` to process the change feed using the pull model. When you initially create a `FeedIterator`, you can specify an optional `StartTime` within the `ChangeFeedRequestOptions`. When left unspecified, the `StartTime` will be the current time.
+
+The `FeedIterator` comes in two flavors. In addition to the examples below that return entity objects, you can also obtain the response with `Stream` support. Streams allow you to read data without having it first deserialized, saving on client resources.
+
+Here's an example for obtaining a `FeedIterator` that returns entity objects, in this case a `User` object:
+
+```csharp
+FeedIterator<User> iteratorWithPOCOS = container.GetChangeFeedIterator<User>();
+```
+
+Here's an example for obtaining a `FeedIterator` that returns a `Stream`:
+
+```csharp
+FeedIterator iteratorWithStreams = container.GetChangeFeedStreamIterator();
+```
+
+Using a `FeedIterator`, you can easily process an entire container's change feed at your own pace. Here's an example:
+
+```csharp
+FeedIterator<User> iteratorForTheEntireContainer = container.GetChangeFeedIterator<User>(new ChangeFeedRequestOptions { StartTime = DateTime.MinValue });
+
+while (iteratorForTheEntireContainer.HasMoreResults)
+{
+    FeedResponse<User> users = await iteratorForTheEntireContainer.ReadNextAsync();
+
+    foreach (User user in users)
+    {
+        Console.WriteLine($"Detected change for user with id {user.id}");
+    }
+}
+```
+
+## Consuming a partition key's changes
+
+In some cases, you may only want to process a specific partition key's changes. You can obtain a `FeedIterator` for a specific partition key and process the changes the same way that you can for an entire container:
+
+```csharp
+FeedIterator<User> iteratorForThePartitionKey = container.GetChangeFeedIterator<User>(new PartitionKey("myPartitionKeyValueToRead"), new ChangeFeedRequestOptions { StartTime = DateTime.MinValue });
+
+while (iteratorForThePartitionKey.HasMoreResults)
+{
+    FeedResponse<User> users = await iteratorForThePartitionKey.ReadNextAsync();
+
+    foreach (User user in users)
+    {
+        Console.WriteLine($"Detected change for user with id {user.id}");
+    }
+}
+```
+
+## Using FeedRange for parallelization
+
+In the [change feed processor](change-feed-processor.md), work is automatically spread across multiple consumers. In the change feed pull model, you can use the `FeedRange` to parallelize the processing of the change feed. A `FeedRange` represents a range of partition key values.
+
+Here's an example showing how to obtain a list of ranges for your container:
+
+```csharp
+IReadOnlyList<FeedRange> ranges = await container.GetFeedRangesAsync();
+```
+
+When you obtain a list of FeedRanges for your container, you'll get one `FeedRange` per [physical partition](partition-data.md#physical-partitions).
+
+Using a `FeedRange`, you can then create a `FeedIterator` to parallelize the processing of the change feed across multiple machines or threads. Unlike the previous example that showed how to obtain a single `FeedIterator` for the entire container, you can use the `FeedRange` to obtain multiple FeedIterators which can process the change feed in parallel.
+
+If you want to use FeedRanges, you need an orchestrator process that obtains the FeedRanges and distributes them to the consuming machines or threads. You can distribute them in either of these ways:
+
+* Use `FeedRange.ToJsonString` and distribute the resulting string value. The consumers can turn this value back into a `FeedRange` with `FeedRange.FromJsonString`.
+* If the distribution is in-process, pass the `FeedRange` object reference.
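+
+As a rough illustration of the first option, the following sketch serializes each `FeedRange` on the orchestrator and rebuilds it on a consumer. It assumes the preview SDK surface named in this article (`GetFeedRangesAsync`, `ToJsonString`, `FromJsonString`). The `FeedRangeDistribution` class and its method names are hypothetical, and how the strings travel between machines (queue, configuration store, and so on) is left out.
+
+```csharp
+using System.Collections.Generic;
+using System.Linq;
+using System.Threading.Tasks;
+using Microsoft.Azure.Cosmos;
+
+// Hypothetical helper; not part of the SDK.
+public static class FeedRangeDistribution
+{
+    // Orchestrator side: turn each FeedRange into a string that can be sent to a consumer.
+    public static async Task<List<string>> SerializeRangesAsync(Container container)
+    {
+        IReadOnlyList<FeedRange> ranges = await container.GetFeedRangesAsync();
+        return ranges.Select(range => range.ToJsonString()).ToList();
+    }
+
+    // Consumer side: rebuild the FeedRange from the string it was handed.
+    public static FeedRange DeserializeRange(string serializedRange)
+    {
+        return FeedRange.FromJsonString(serializedRange);
+    }
+}
+```
+
+A consumer could then pass the rebuilt `FeedRange` to `GetChangeFeedIterator` in the same way the samples in this section pass `ranges[0]`.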
+
+Here's a sample that shows how to read from the beginning of the container's change feed using two hypothetical separate machines that are reading in parallel:
+
+Machine 1:
+
+```csharp
+FeedIterator<User> iteratorA = container.GetChangeFeedIterator<User>(ranges[0], new ChangeFeedRequestOptions{StartTime = DateTime.MinValue});
+while (iteratorA.HasMoreResults)
+{
+    FeedResponse<User> users = await iteratorA.ReadNextAsync();
+
+    foreach (User user in users)
+    {
+        Console.WriteLine($"Detected change for user with id {user.id}");
+    }
+}
+```
+
+Machine 2:
+
+```csharp
+FeedIterator<User> iteratorB = container.GetChangeFeedIterator<User>(ranges[1], new ChangeFeedRequestOptions{StartTime = DateTime.MinValue});
+while (iteratorB.HasMoreResults)
+{
+    FeedResponse<User> users = await iteratorB.ReadNextAsync();
+
+    foreach (User user in users)
+    {
+        Console.WriteLine($"Detected change for user with id {user.id}");
+    }
+}
+```
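+
+When the consumers are threads in a single process rather than separate machines, the same idea can be sketched with one task per `FeedRange`. This is only a minimal sketch under the assumption that the preview API behaves as shown in the samples above. `ParallelChangeFeedReader` and `UserRecord` are hypothetical names; `UserRecord` stands in for the `User` entity used elsewhere in this article.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Linq;
+using System.Threading.Tasks;
+using Microsoft.Azure.Cosmos;
+
+// Hypothetical entity type standing in for the article's User object.
+public class UserRecord
+{
+    public string id { get; set; }
+}
+
+// Hypothetical helper; not part of the SDK.
+public static class ParallelChangeFeedReader
+{
+    // Starts one reader per FeedRange and waits until all of them have drained their range.
+    public static async Task ReadAllRangesAsync(Container container)
+    {
+        IReadOnlyList<FeedRange> ranges = await container.GetFeedRangesAsync();
+        IEnumerable<Task> readers = ranges.Select(range => ReadRangeAsync(container, range));
+        await Task.WhenAll(readers);
+    }
+
+    // Reads one FeedRange from the beginning of the change feed.
+    private static async Task ReadRangeAsync(Container container, FeedRange range)
+    {
+        FeedIterator<UserRecord> iterator = container.GetChangeFeedIterator<UserRecord>(
+            range,
+            new ChangeFeedRequestOptions { StartTime = DateTime.MinValue });
+
+        while (iterator.HasMoreResults)
+        {
+            FeedResponse<UserRecord> users = await iterator.ReadNextAsync();
+
+            foreach (UserRecord user in users)
+            {
+                Console.WriteLine($"Detected change for user with id {user.id}");
+            }
+        }
+    }
+}
+```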
+
+## Saving continuation tokens
+
+You can save the position of your `FeedIterator` by creating a continuation token. A continuation token is a string value that keeps track of your FeedIterator's last processed changes. This allows the `FeedIterator` to resume at this point later. The following code reads through the change feed since container creation. After no more changes are available, it keeps the last continuation token so that change feed consumption can be resumed later.
+
+```csharp
+FeedIterator<User> iterator = container.GetChangeFeedIterator<User>(ranges[0], new ChangeFeedRequestOptions{StartTime = DateTime.MinValue});
+
+string continuation = null;
+
+while (iterator.HasMoreResults)
+{
+    FeedResponse<User> users = await iterator.ReadNextAsync();
+    continuation = users.ContinuationToken;
+
+    foreach (User user in users)
+    {
+        Console.WriteLine($"Detected change for user with id {user.id}");
+    }
+}
+
+// Some time later
+FeedIterator<User> iteratorThatResumesFromLastPoint = container.GetChangeFeedIterator<User>(continuation);
+```
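+
+The sample above keeps the token only in memory. If the token needs to survive a process restart, it has to be written somewhere durable. The following sketch uses a local file purely as an assumption; the `ContinuationStore` class and the file name are hypothetical, and a database or blob would serve equally well.
+
+```csharp
+using System.IO;
+
+// Hypothetical helper; not part of the SDK.
+public static class ContinuationStore
+{
+    // Hypothetical location for the saved token.
+    private const string TokenPath = "changefeed-continuation.txt";
+
+    // Writes the token to disk; does nothing when there is no token to save.
+    public static void Save(string continuation)
+    {
+        if (!string.IsNullOrEmpty(continuation))
+        {
+            File.WriteAllText(TokenPath, continuation);
+        }
+    }
+
+    // Returns the saved token, or null when none has been saved yet.
+    public static string Load()
+    {
+        return File.Exists(TokenPath) ? File.ReadAllText(TokenPath) : null;
+    }
+}
+```
+
+A later run could call `ContinuationStore.Load()` and, when the result is not null, pass it to `GetChangeFeedIterator<User>(continuation)` as in the sample above.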
+
+## Comparing with change feed processor
+
+Many scenarios can process the change feed using either the [change feed processor](change-feed-processor.md) or the pull model. The pull model's continuation tokens and the change feed processor's lease container are both "bookmarks" for the last processed item (or batch of items) in the change feed.
+However, you can't convert continuation tokens to a lease container (or vice versa).
+
+You should consider using the pull model in these scenarios:
+
+- You want to do a one-time read of the existing data in the change feed
+- You only want to read changes from a particular partition key
+- You don't want a push model and want to consume the change feed at your own pace
+
+Here are some key differences between the change feed processor and the pull model:
+
+| Feature | Change feed processor | Pull model |
+| --- | --- | --- |
+| Keeping track of current point in processing change feed | Lease (stored in an Azure Cosmos DB container) | Continuation token (stored in memory or manually persisted) |
+| Ability to replay past changes | Yes, with push model | Yes, with pull model|
+| Polling for future changes | Automatically checks for changes based on user-specified `WithPollInterval` | Manual |
+| Process changes from entire container | Yes, and automatically parallelized across multiple threads/machines consuming from the same container | Yes, and manually parallelized using FeedRanges |
+| Process changes from just a single partition key | Not supported | Yes|
+| Support level | Generally available | Preview |
+
+## Next steps
+
+* [Overview of change feed](change-feed.md)
+* [Using the change feed processor](change-feed-processor.md)
+* [Trigger Azure Functions](change-feed-functions.md)
diff --git a/articles/cosmos-db/read-change-feed.md b/articles/cosmos-db/read-change-feed.md
@@ -1,33 +1,42 @@
 ---
 title: Accessing change feed in Azure Cosmos DB Azure Cosmos DB 
-description: This article describes different options available to read and access change feed in Azure Cosmos DB Azure Cosmos DB.  
-author: TheovanKraay
-ms.author: thvankra
+description: This article describes different options available to read and access change feed in Azure Cosmos DB.  
+author: timsander1
+ms.author: tisande
 ms.service: cosmos-db
 ms.topic: conceptual
-ms.date: 11/25/2019
-
+ms.date: 05/06/2020
+ms.reviewer: sngun
 ---
+
 # Reading Azure Cosmos DB change feed
 
 You can work with the Azure Cosmos DB change feed using any of the following options:
 
 * Using Azure Functions
-* Using the change feed processor library
+* Using the change feed processor
 * Using the Azure Cosmos DB SQL API SDK
+* Using the change feed pull model (preview)
 
 ## Using Azure Functions
 
 Azure Functions is the simplest and recommended option. When you create an Azure Functions trigger for Cosmos DB, you can select the container to connect, and the Azure Function gets triggered whenever there is a change to the container. Triggers can be created by using the Azure Functions portal, the Azure Cosmos DB portal or programmatically with SDKs. Visual Studio and VS Code provide support to write Azure Functions, and you can even use the Azure Functions CLI for cross-platform development. You can write and debug the code on your desktop, and then deploy the function with one click. See [Serverless database computing using Azure Functions](serverless-computing-database.md) and [Using change feed with Azure Functions](change-feed-functions.md)) articles to learn more.
 
-## Using the change feed processor library
+## Using the change feed processor
 
-The change feed processor library hides complexity and still gives you a complete control of the change feed. The library follows the observer pattern, where your processing function is called by the library. If you have a high throughput change feed, you can instantiate multiple clients to read the change feed. Because you're using change feed processor library, it will automatically divide the load among the different clients without you having to implement this logic. All the complexity is handled by the library. If you want to have your own load balancer, then you can implement `IPartitionLoadBalancingStrategy` for a custom partition strategy to process change feed. To learn more, see [using change feed processor library](change-feed-processor.md).
+The change feed processor hides complexity and still gives you complete control of the change feed. The library follows the observer pattern, where your processing function is called by the library. If you have a high throughput change feed, you can instantiate multiple clients to read the change feed. Because you're using the change feed processor, it automatically divides the load among the different clients without you having to implement this logic. All the complexity is handled by the library. To learn more, see [using change feed processor](change-feed-processor.md). The change feed processor is part of the [Azure Cosmos DB SDK V3](https://github.com/Azure/azure-cosmos-dotnet-v3).
 
 ## Using the Azure Cosmos DB SQL API SDK
 
 With the SDK, you get a low-level control of the change feed. You can manage the checkpoint, access a particular logical partition key, etc. If you have multiple readers, you can use `ChangeFeedOptions` to distribute read load to different threads or different clients.
 
+## Using the change feed pull model
+
+The [change feed pull model](change-feed-pull-model.md) allows you to consume the change feed at your own pace and parallelize processing of changes with FeedRanges. A FeedRange spans a range of partition key values. Using the change feed pull model, it is also easy to process changes for a specific partition key.
+
+> [!NOTE]
+> The change feed pull model is currently in [preview in the Azure Cosmos DB .NET SDK](https://www.nuget.org/packages/Microsoft.Azure.Cosmos/3.9.0-preview) only. The preview is not yet available for other SDK versions.
+
 ## Change feed in APIs for Cassandra and MongoDB
 
 Change feed functionality is surfaced as change stream in MongoDB API and Query with predicate in Cassandra API. To learn more about the implementation details for MongoDB API, see the [Change streams in the Azure Cosmos DB API for MongoDB](mongodb-change-streams.md).