Commit 68c4884

Polishing for more clarity
1 parent 32b4f13 commit 68c4884

1 file changed (+13, -14 lines)

articles/cosmos-db/nosql/change-feed-processor.md

Lines changed: 13 additions & 14 deletions
@@ -31,7 +31,7 @@ There are four main components of implementing the change feed processor:
3131

3232
1. **The delegate:** The delegate is the code that defines what you, the developer, want to do with each batch of changes that the change feed processor reads.
3333

34-
To further understand how these four elements of change feed processor work together, let's look at an example in the following diagram. The monitored container stores documents and uses 'City' as the partition key. We see that the partition key values are distributed in ranges (each range representing a [physical partition](../partitioning-overview.md#physical-partitions)) that contain items.
34+
To further understand how these four elements of change feed processor work together, let's look at an example in the following diagram. The monitored container stores items and uses 'City' as the partition key. The partition key values are distributed in ranges (each range representing a [physical partition](../partitioning-overview.md#physical-partitions)) that contain items.
3535
There are two compute instances, and the change feed processor assigns different ranges to each instance to maximize compute distribution. Each instance has a unique name.
3636
Each range is being read in parallel and its progress is maintained separately from other ranges in the lease container through a *lease* document. The combination of the leases represents the current state of the change feed processor.
3737

@@ -45,14 +45,13 @@ The point of entry is always the monitored container, from a `Container` instanc
4545

4646
[!code-csharp[Main](~/samples-cosmosdb-dotnet-change-feed-processor/src/Program.cs?name=DefineProcessor)]
4747

48-
Where the first parameter is a distinct name that describes the goal of this processor and the second name is the delegate implementation that will handle changes.
48+
The first parameter is a distinct name that describes the goal of this processor, and the second parameter is the delegate implementation that handles changes.
4949

5050
An example of a delegate would be:
5151

52-
5352
[!code-csharp[Main](~/samples-cosmosdb-dotnet-change-feed-processor/src/Program.cs?name=Delegate)]
5453

55-
Afterwards, you define the compute instance name or unique identifier with `WithInstanceName`, this should be unique and different in each compute instance you're deploying, and finally, which is the container to maintain the lease state with `WithLeaseContainer`.
54+
Afterwards, you define the compute instance name or unique identifier with `WithInstanceName`, which should be unique and different in each compute instance you're deploying, and finally, which is the container to maintain the lease state with `WithLeaseContainer`.
5655

5756
Calling `Build` gives you the processor instance that you can start by calling `StartAsync`.
5857

@@ -67,7 +66,7 @@ The normal life cycle of a host instance is:
6766

6867
## Error handling
6968

70-
The change feed processor is resilient to user code errors. That means that if your delegate implementation has an unhandled exception (step #4), the thread processing that particular batch of changes will be stopped, and a new thread will be created. The new thread checks which was the latest point in time the lease store has for that range of partition key values, and restart from there, effectively sending the same batch of changes to the delegate. This behavior continues until your delegate processes the changes correctly and it's the reason the change feed processor has an "at least once" guarantee.
69+
The change feed processor is resilient to user code errors. If your delegate implementation has an unhandled exception (step #4), the thread processing that particular batch of changes stops, and a new thread is eventually created. The new thread checks the latest point in time that the lease store has saved for that range of partition key values and restarts from there, effectively sending the same batch of changes to the delegate. This behavior continues until your delegate processes the changes correctly, and it's the reason the change feed processor has an "at least once" guarantee.
7170

7271
> [!NOTE]
7372
> There is only one scenario where a batch of changes will not be retried. If the failure happens on the first ever delegate execution, the lease store has no previous saved state to be used on the retry. In those cases, the retry would use the [initial starting configuration](#starting-time), which might or might not include the last batch.
@@ -100,15 +99,15 @@ As mentioned before, within a deployment unit you can have one or more compute i
10099
1. All instances should have the same `processorName`.
101100
1. Each instance needs to have a different instance name (`WithInstanceName`).
102101

103-
If these three conditions apply, then the change feed processor distributes all the leases in the lease container across all running instances of that deployment unit and parallelize compute using an equal distribution algorithm. One lease can only be owned by one instance at a given time, so the number of instances shouldn't be greater than the number of leases.
102+
If these three conditions apply, then the change feed processor distributes all the leases in the lease container across all running instances of that deployment unit and parallelizes compute using an equal distribution algorithm. A lease is owned by one instance at a given time, so the number of instances shouldn't be greater than the number of leases.
104103

105104
The number of instances can grow and shrink, and the change feed processor will dynamically adjust the load by redistributing accordingly.
106105

107106
Moreover, the change feed processor can dynamically adjust to containers scale due to throughput or storage increases. When your container grows, the change feed processor transparently handles these scenarios by dynamically increasing the leases and distributing the new leases among existing instances.
108107

109108
## Starting time
110109

111-
By default, when a change feed processor starts the first time, it initializes the leases container, and start its [processing life cycle](#processing-life-cycle). Any changes that happened in the monitored container before the change feed processor was initialized for the first time won't be detected.
110+
By default, when a change feed processor starts the first time, it initializes the leases container and starts its [processing life cycle](#processing-life-cycle). Any changes that happened in the monitored container before the change feed processor is initialized for the first time aren't detected.
112111

113112
### Reading from a previous date and time
114113

@@ -139,7 +138,7 @@ An example of a delegate implementation would be:
139138
> In the above we pass a variable `options` of type `ChangeFeedProcessorOptions`, which can be used to set various values including `setStartFromBeginning`:
140139
> [!code-java[](~/azure-cosmos-java-sql-api-samples/src/main/java/com/azure/cosmos/examples/changefeed/SampleChangeFeedProcessor.java?name=ChangeFeedProcessorOptions)]
141140
142-
We assign this to a `changeFeedProcessorInstance`, passing parameters of compute instance name (`hostName`), the monitored container (here called `feedContainer`) and the `leaseContainer`. We then start the change feed processor:
141+
We assign the result of `buildChangeFeedProcessor()` to a `changeFeedProcessorInstance`, passing parameters of compute instance name (`hostName`), the monitored container (here called `feedContainer`) and the `leaseContainer`. We then start the change feed processor:
143142

144143
[!code-java[](~/azure-cosmos-java-sql-api-samples/src/main/java/com/azure/cosmos/examples/changefeed/SampleChangeFeedProcessor.java?name=StartChangeFeedProcessor)]
145144

@@ -157,7 +156,7 @@ The normal life cycle of a host instance is:
157156

158157
## Error handling
159158

160-
The change feed processor is resilient to user code errors. That means that if your delegate implementation has an unhandled exception (step #4), the thread processing that particular batch of changes will be stopped, and a new thread will be created. The new thread checks which was the latest point in time the lease store has for that range of partition key values, and restart from there, effectively sending the same batch of changes to the delegate. This behavior continues until your delegate processes the changes correctly and it's the reason the change feed processor has an "at least once" guarantee.
159+
The change feed processor is resilient to user code errors. If your delegate implementation has an unhandled exception (step #4), the thread processing that particular batch of changes is stopped, and a new thread is created. The new thread checks the latest point in time that the lease store has saved for that range of partition key values and restarts from there, effectively sending the same batch of changes to the delegate. This behavior continues until your delegate processes the changes correctly, and it's the reason the change feed processor has an "at least once" guarantee.
161160

162161
> [!NOTE]
163162
> There is only one scenario where a batch of changes will not be retried. If the failure happens on the first ever delegate execution, the lease store has no previous saved state to be used on the retry. In those cases, the retry would use the [initial starting configuration](#starting-time), which might or might not include the last batch.
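
The retry behavior described above can be illustrated with a minimal, SDK-free sketch. Everything here is hypothetical (the class `AtLeastOnceSketch`, the `process` method, and the integer `checkpoint` standing in for the lease state); it is not how the azure-cosmos library is implemented, and it does not model the first-execution case from the note. It only shows why a failing delegate sees the same batch again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of "at least once" delivery; not the azure-cosmos implementation.
class AtLeastOnceSketch {
    // Delivers batches in order. The checkpoint (standing in for the lease state)
    // advances only after the delegate returns without throwing.
    static int process(List<List<String>> batches, Consumer<List<String>> delegate) {
        int checkpoint = 0;  // index of the next batch to deliver
        int deliveries = 0;  // total delegate invocations, including retries
        while (checkpoint < batches.size()) {
            deliveries++;
            try {
                delegate.accept(batches.get(checkpoint));
                checkpoint++;  // success: progress is saved
            } catch (RuntimeException e) {
                // Failure: the checkpoint is unchanged, so the same batch is sent again.
            }
        }
        return deliveries;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        int[] failuresLeft = {1};  // the delegate fails once on the second batch
        int deliveries = process(
            List.of(List.of("a", "b"), List.of("c")),
            batch -> {
                seen.addAll(batch);
                if (batch.contains("c") && failuresLeft[0]-- > 0) {
                    throw new IllegalStateException("simulated delegate failure");
                }
            });
        System.out.println(deliveries + " deliveries, seen: " + seen);
    }
}
```

In this run the item `c` reaches the delegate twice, which is why delegate logic is usually written to tolerate duplicates.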
@@ -180,7 +179,7 @@ As mentioned before, within a deployment unit you can have one or more compute i
180179
1. All instances should have the same value set in `options.setLeasePrefix` (or none set at all).
181180
1. Each instance needs to have a different `hostName`.
182181

183-
If these three conditions apply, then the change feed processor distributes all the leases in the lease container across all running instances of that deployment unit and parallelize compute using an equal distribution algorithm. One lease can only be owned by one instance at a given time, so the number of instances shouldn't be greater than the number of leases.
182+
If these three conditions apply, then the change feed processor distributes all the leases in the lease container across all running instances of that deployment unit and parallelizes compute using an equal distribution algorithm. A lease is owned by one instance at a given time, so the number of instances shouldn't be greater than the number of leases.
184183

185184
The number of instances can grow and shrink, and the change feed processor will dynamically adjust the load by redistributing accordingly. Deployment units can share the same lease container, but they should each have a different `leasePrefix`.
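
As a rough illustration of why extra instances beyond the number of leases stay idle, here is a small sketch of an equal distribution. The round-robin strategy, the class `LeaseDistributionSketch`, and the lease and host names are assumptions made for this example; the SDK's actual balancing algorithm is not described in this article and may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of equal lease distribution; the real balancing logic lives in the SDK.
class LeaseDistributionSketch {
    // Assigns each lease to exactly one instance, round-robin.
    // Assumes the instance list is not empty.
    static Map<String, List<String>> distribute(List<String> leases, List<String> instances) {
        Map<String, List<String>> owned = new LinkedHashMap<>();
        for (String instance : instances) {
            owned.put(instance, new ArrayList<>());
        }
        for (int i = 0; i < leases.size(); i++) {
            String owner = instances.get(i % instances.size());
            owned.get(owner).add(leases.get(i));
        }
        return owned;
    }

    public static void main(String[] args) {
        List<String> leases = List.of("range-0", "range-1", "range-2");
        // Two instances share three leases.
        System.out.println(distribute(leases, List.of("host-a", "host-b")));
        // With more instances than leases, the extra instance owns nothing.
        System.out.println(distribute(leases, List.of("host-a", "host-b", "host-c", "host-d")));
    }
}
```

Calling `distribute` again with a longer or shorter instance list mimics the redistribution that happens when instances are added or removed.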
186185

@@ -196,7 +195,7 @@ It's possible to initialize the change feed processor to read changes starting a
196195

197196
### Reading from the beginning
198197

199-
In our above sample, we set `setStartFromBeginning` to `false`, which is the same as the default value. In other scenarios like data migrations or analyzing the entire history of a container, we need to read the change feed from **the beginning of that container's lifetime**. To do that, we can set `setStartFromBeginning` to `true`. The change feed processor will be initialized and start reading changes from the beginning of the lifetime of the container.
198+
In our sample, we set `setStartFromBeginning` to `false`, which is the same as the default value. In other scenarios like data migrations or analyzing the entire history of a container, we need to read the change feed from **the beginning of that container's lifetime**. To do that, we can set `setStartFromBeginning` to `true`. The change feed processor will be initialized and start reading changes from the beginning of the lifetime of the container.
200199

201200
> [!NOTE]
202201
> These customization options only work to set up the starting point in time of the change feed processor. Once the leases container is initialized for the first time, changing them has no effect.
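
The precedence described in this section (saved lease state first, then the start-from-beginning option, then the default) can be sketched as follows. The class `StartPointSketch`, the `resolveStart` method, and the string results are hypothetical and exist only to make the rule concrete; they are not part of the SDK.

```java
import java.util.Optional;

// Hypothetical model of how the starting point is chosen; not SDK code.
class StartPointSketch {
    // Saved lease state always wins. The start-from-beginning option only
    // matters when no state has been saved yet.
    static String resolveStart(Optional<String> savedContinuation, boolean startFromBeginning) {
        if (savedContinuation.isPresent()) {
            return "continuation:" + savedContinuation.get();
        }
        return startFromBeginning ? "beginning" : "now";
    }

    public static void main(String[] args) {
        // First start with the default option: earlier changes are not read.
        System.out.println(resolveStart(Optional.empty(), false));
        // First start with start-from-beginning enabled.
        System.out.println(resolveStart(Optional.empty(), true));
        // Leases already initialized: the option no longer has any effect.
        System.out.println(resolveStart(Optional.of("token-42"), true));
    }
}
```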
@@ -205,9 +204,9 @@ In our above sample, we set `setStartFromBeginning` to `false`, which is the sam
205204

206205
## Change feed and provisioned throughput
207206

208-
Change feed read operations on the monitored container consume [request units](../request-units.md). Make sure your monitored container isn't experiencing [throttling](troubleshoot-request-rate-too-large.md), otherwise you'll experience delays in receiving change feed events on your processors.
207+
Change feed read operations on the monitored container consume [request units](../request-units.md). Make sure your monitored container isn't experiencing [throttling](troubleshoot-request-rate-too-large.md), because it adds delays in receiving change feed events on your processors.
209208

210-
Operations on the lease container (updating and maintaining state) consume [request units](../request-units.md). The higher the number of instances using the same lease container, the higher the potential request units consumption is. Make sure your lease container isn't experiencing [throttling](troubleshoot-request-rate-too-large.md), otherwise you'll experience delays in receiving change feed events on your processors, in some cases where throttling is high, the processors might stop processing completely.
209+
Operations on the lease container (updating and maintaining state) consume [request units](../request-units.md). The higher the number of instances using the same lease container, the higher the potential request unit consumption is. Make sure your lease container isn't experiencing [throttling](troubleshoot-request-rate-too-large.md), because it adds delays in receiving change feed events and can even stop processing completely.
211210

212211
## Sharing the lease container
213212

@@ -232,7 +231,7 @@ The change feed processor can be hosted in any platform that supports long runni
232231
* A serverless function in [Azure Functions](/azure/architecture/best-practices/background-jobs#azure-functions).
233232
* An [ASP.NET hosted service](/aspnet/core/fundamentals/host/hosted-services).
234233

235-
While change feed processor can run in short lived environments, because the lease container maintains the state, the startup cycle of these environments add delay to receiving the notifications (due to the overhead of starting the processor every time the environment is started).
234+
While the change feed processor can run in short-lived environments because the lease container maintains the state, the startup cycle of these environments adds delay to receiving the notifications (due to the overhead of starting the processor every time the environment is started).
236235

237236
## Additional resources
238237
