Commit b019a48

Merge pull request #222814 from ealsur/users/ealsur/cosmosretry
Linking Cosmos DB bindings retry
2 parents 8a9fc22 + 317bd50 commit b019a48

2 files changed

+28
-31
lines changed

articles/azure-functions/functions-bindings-error-pages.md

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ title: Azure Functions error handling and retry guidance
33
description: Learn to handle errors and retry events in Azure Functions with links to specific binding errors, including information on retry policies.
44

55
ms.topic: conceptual
6-
ms.date: 08/03/2022
6+
ms.date: 01/03/2023
77
zone_pivot_groups: programming-languages-set-functions-lang-workers
88
---
99

@@ -49,7 +49,7 @@ There are two kinds of retries available for your functions: built-in retry beha
4949

5050
| Trigger/binding | Retry source | Configuration |
5151
| ---- | ---- | ----- |
52-
| Azure Cosmos DB | n/a | Not configurable |
52+
| Azure Cosmos DB | [Retry policies](#retry-policies) | Function-level |
5353
| Blob Storage | [Binding extension](functions-bindings-storage-blob-trigger.md#poison-blobs) | [host.json](functions-bindings-storage-queue.md#host-json) |
5454
| Event Grid | [Binding extension](../event-grid/delivery-and-retry.md) | Event subscription |
5555
| Event Hubs | [Retry policies](#retry-policies) | Function-level |

articles/cosmos-db/nosql/troubleshoot-changefeed-functions.md

Lines changed: 26 additions & 29 deletions
@@ -5,7 +5,7 @@ author: ealsur
55
ms.service: cosmos-db
66
ms.subservice: nosql
77
ms.custom: ignite-2022
8-
ms.date: 04/14/2022
8+
ms.date: 01/03/2023
99
ms.author: maquaran
1010
ms.topic: troubleshooting
1111
ms.reviewer: mjbrown
@@ -18,53 +18,50 @@ This article covers common issues, workarounds, and diagnostic steps, when you u
1818

1919
## Dependencies
2020

21-
The Azure Functions trigger and bindings for Azure Cosmos DB depend on the extension packages over the base Azure Functions runtime. Always keep these packages updated, as they might include fixes and new features that might address any potential issues you may encounter:
22-
23-
* For Azure Functions V2, see [Microsoft.Azure.WebJobs.Extensions.CosmosDB](https://www.nuget.org/packages/Microsoft.Azure.WebJobs.Extensions.CosmosDB).
24-
* For Azure Functions V1, see [Microsoft.Azure.WebJobs.Extensions.DocumentDB](https://www.nuget.org/packages/Microsoft.Azure.WebJobs.Extensions.DocumentDB).
21+
The Azure Functions trigger and bindings for Azure Cosmos DB depend on the extension package [Microsoft.Azure.WebJobs.Extensions.CosmosDB](https://www.nuget.org/packages/Microsoft.Azure.WebJobs.Extensions.CosmosDB) over the base Azure Functions runtime. Always keep these packages updated, as they might include fixes and new features that might address any potential issues you may encounter.
2522

2623
This article will always refer to Azure Functions V2 whenever the runtime is mentioned, unless explicitly specified.
2724

2825
## Consume the Azure Cosmos DB SDK independently
2926

3027
The key functionality of the extension package is to provide support for the Azure Functions trigger and bindings for Azure Cosmos DB. It also includes the [Azure Cosmos DB .NET SDK](sdk-dotnet-core-v2.md), which is helpful if you want to interact with Azure Cosmos DB programmatically without using the trigger and bindings.
3128

32-
If want to use the Azure Cosmos DB SDK, make sure that you don't add to your project another NuGet package reference. Instead, **let the SDK reference resolve through the Azure Functions' Extension package**. Consume the Azure Cosmos DB SDK separately from the trigger and bindings
29+
If you want to use the Azure Cosmos DB SDK, make sure that you don't add to your project another NuGet package reference. Instead, **let the SDK reference resolve through the Azure Functions' Extension package**. Consume the Azure Cosmos DB SDK separately from the trigger and bindings
3330

34-
Additionally, if you are manually creating your own instance of the [Azure Cosmos DB SDK client](./sdk-dotnet-core-v2.md), you should follow the pattern of having only one instance of the client [using a Singleton pattern approach](../../azure-functions/manage-connections.md?tabs=csharp#azure-cosmos-db-clients). This process avoids the potential socket issues in your operations.
31+
Additionally, if you're manually creating your own instance of the [Azure Cosmos DB SDK client](./sdk-dotnet-core-v2.md), you should follow the pattern of having only one instance of the client [using a Singleton pattern approach](../../azure-functions/manage-connections.md?tabs=csharp#azure-cosmos-db-clients). This process avoids the potential socket issues in your operations.
3532

3633
## Common scenarios and workarounds
3734

3835
### Azure Function fails with error message collection doesn't exist
3936

40-
Azure Function fails with error message "Either the source collection 'collection-name' (in database 'database-name') or the lease collection 'collection2-name' (in database 'database2-name') does not exist. Both collections must exist before the listener starts. To automatically create the lease collection, set 'CreateLeaseCollectionIfNotExists' to 'true'"
37+
Azure Function fails with error message "Either the source collection 'collection-name' (in database 'database-name') or the lease collection 'collection2-name' (in database 'database2-name') doesn't exist. Both collections must exist before the listener starts. To automatically create the lease collection, set 'CreateLeaseCollectionIfNotExists' to 'true'"
4138

42-
This means that either one or both of the Azure Cosmos DB containers required for the trigger to work do not exist or are not reachable to the Azure Function. **The error itself will tell you which Azure Cosmos DB database and container is the trigger looking for** based on your configuration.
39+
This means that either one or both of the Azure Cosmos DB containers required for the trigger to work don't exist or aren't reachable to the Azure Function. **The error itself will tell you which Azure Cosmos DB database and container is the trigger looking for** based on your configuration.
4340

4441
1. Verify the `ConnectionStringSetting` attribute and that it **references a setting that exists in your Azure Function App**. The value on this attribute shouldn't be the Connection String itself, but the name of the Configuration Setting.
45-
2. Verify that the `databaseName` and `collectionName` exist in your Azure Cosmos DB account. If you are using automatic value replacement (using `%settingName%` patterns), make sure the name of the setting exists in your Azure Function App.
42+
2. Verify that the `databaseName` and `collectionName` exist in your Azure Cosmos DB account. If you're using automatic value replacement (using `%settingName%` patterns), make sure the name of the setting exists in your Azure Function App.
4643
3. If you don't specify a `LeaseCollectionName/leaseCollectionName`, the default is "leases". Verify that such container exists. Optionally you can set the `CreateLeaseCollectionIfNotExists` attribute in your Trigger to `true` to automatically create it.
47-
4. Verify your [Azure Cosmos DB account's Firewall configuration](../how-to-configure-firewall.md) to see to see that it's not it's not blocking the Azure Function.
44+
4. Verify your [Azure Cosmos DB account's Firewall configuration](../how-to-configure-firewall.md) to see that it's not blocking the Azure Function.
4845

4946
### Azure Function fails to start with "Shared throughput collection should have a partition key"
5047

51-
The previous versions of the Azure Cosmos DB Extension did not support using a leases container that was created within a [shared throughput database](../set-throughput.md#set-throughput-on-a-database). To resolve this issue, update the [Microsoft.Azure.WebJobs.Extensions.CosmosDB](https://www.nuget.org/packages/Microsoft.Azure.WebJobs.Extensions.CosmosDB) extension to get the latest version.
48+
The previous versions of the Azure Cosmos DB Extension didn't support using a leases container that was created within a [shared throughput database](../set-throughput.md#set-throughput-on-a-database). To resolve this issue, update the [Microsoft.Azure.WebJobs.Extensions.CosmosDB](https://www.nuget.org/packages/Microsoft.Azure.WebJobs.Extensions.CosmosDB) extension to get the latest version.
5249

5350
### Azure Function fails to start with "PartitionKey must be supplied for this operation."
5451

55-
This error means that you are currently using a partitioned lease collection with an old [extension dependency](#dependencies). Upgrade to the latest available version. If you are currently running on Azure Functions V1, you will need to upgrade to Azure Functions V2.
52+
This error means that you're currently using a partitioned lease collection with an old [extension dependency](#dependencies). Upgrade to the latest available version. If you're currently running on Azure Functions V1, you'll need to upgrade to Azure Functions V2.
5653

57-
### Azure Function fails to start with "Forbidden (403); Substatus: 5300... The given request [POST ...] cannot be authorized by AAD token in data plane"
54+
### Azure Function fails to start with "Forbidden (403); Substatus: 5300... The given request [POST ...] can't be authorized by AAD token in data plane"
5855

59-
This error means your Function is attempting to [perform a non-data operation using Azure AD identities](troubleshoot-forbidden.md#non-data-operations-are-not-allowed). You cannot use `CreateLeaseContainerIfNotExists = true` when using Azure AD identities.
56+
This error means your Function is attempting to [perform a non-data operation using Azure AD identities](troubleshoot-forbidden.md#non-data-operations-are-not-allowed). You can't use `CreateLeaseContainerIfNotExists = true` when using Azure AD identities.
6057

6158
### Azure Function fails to start with "The lease collection, if partitioned, must have partition key equal to id."
6259

63-
This error means that your current leases container is partitioned, but the partition key path is not `/id`. To resolve this issue, you need to recreate the leases container with `/id` as the partition key.
60+
This error means that your current leases container is partitioned, but the partition key path isn't `/id`. To resolve this issue, you need to recreate the leases container with `/id` as the partition key.
6461

65-
### You see a "Value cannot be null. Parameter name: o" in your Azure Functions logs when you try to Run the Trigger
62+
### You see a "Value can't be null. Parameter name: o" in your Azure Functions logs when you try to Run the Trigger
6663

67-
This issue appears if you are using the Azure portal and you try to select the **Run** button on the screen when inspecting an Azure Function that uses the trigger. The trigger does not require for you to select Run to start, it will automatically start when the Azure Function is deployed. If you want to check the Azure Function's log stream on the Azure portal, just go to your monitored container and insert some new items, you will automatically see the Trigger executing.
64+
This issue appears if you're using the Azure portal and you try to select the **Run** button on the screen when inspecting an Azure Function that uses the trigger. The trigger doesn't require for you to select Run to start, it will automatically start when the Azure Function is deployed. If you want to check the Azure Function's log stream on the Azure portal, just go to your monitored container and insert some new items, you'll automatically see the Trigger executing.
6865

6966
### My changes take too long to be received
7067

@@ -75,40 +72,40 @@ This scenario can have multiple causes and all of them should be checked:
7572
If it's the latter, there could be some delay between the changes being stored and the Azure Function picking them up. This is because internally, when the trigger checks for changes in your Azure Cosmos DB container and finds none pending to be read, it will sleep for a configurable amount of time (5 seconds, by default) before checking for new changes (to avoid high RU consumption). You can configure this sleep time through the `FeedPollDelay/feedPollDelay` setting in the [configuration](../../azure-functions/functions-bindings-cosmosdb-v2-trigger.md#configuration) of your trigger (the value is expected to be in milliseconds).
7673
3. Your Azure Cosmos DB container might be [rate-limited](../request-units.md).
7774
4. You can use the `PreferredLocations` attribute in your trigger to specify a comma-separated list of Azure regions to define a custom preferred connection order.
78-
5. The speed at which your Trigger receives new changes is dictated by the speed at which you are processing them. Verify the Function's [Execution Time / Duration](../../azure-functions/analyze-telemetry-data.md), if your Function is slow that will increase the time it takes for your Trigger to get new changes. If you see a recent increase in Duration, there could be a recent code change that might affect it. If the speed at which you are receiving operations on your Azure Cosmos DB container is faster than the speed of your Trigger, you will keep lagging behind. You might want to investigate in the Function's code, what is the most time consuming operation and how to optimize it.
75+
5. The speed at which your Trigger receives new changes is dictated by the speed at which you're processing them. Verify the Function's [Execution Time / Duration](../../azure-functions/analyze-telemetry-data.md), if your Function is slow that will increase the time it takes for your Trigger to get new changes. If you see a recent increase in Duration, there could be a recent code change that might affect it. If the speed at which you're receiving operations on your Azure Cosmos DB container is faster than the speed of your Trigger, you'll keep lagging behind. You might want to investigate in the Function's code, what is the most time consuming operation and how to optimize it.
7976

8077
### Some changes are repeated in my Trigger
8178

82-
The concept of a "change" is an operation on a document. The most common scenarios where events for the same document is received are:
79+
The concept of a "change" is an operation on a document. The most common scenarios where events for the same document are received are:
8380
* The account is using Eventual consistency. While consuming the change feed in an Eventual consistency level, there could be duplicate events in-between subsequent change feed read operations (the last event of one read operation appears as the first of the next).
8481
* The document is being updated. The Change Feed can contain multiple operations for the same documents, if that document is receiving updates, it can pick up multiple events (one for each update). One easy way to distinguish among different operations for the same document is to track the `_lsn` [property for each change](../change-feed.md#change-feed-and-_etag-_lsn-or-_ts). If they don't match, these are different changes over the same document.
85-
* If you are identifying documents just by `id`, remember that the unique identifier for a document is the `id` and its partition key (there can be two documents with the same `id` but different partition key).
82+
* If you're identifying documents just by `id`, remember that the unique identifier for a document is the `id` and its partition key (there can be two documents with the same `id` but different partition key).
8683

8784
### Some changes are missing in my Trigger
8885

89-
If you find that some of the changes that happened in your Azure Cosmos DB container are not being picked up by the Azure Function or some changes are missing in the destination when you are copying them, please follow the below steps.
86+
If you find that some of the changes that happened in your Azure Cosmos DB container aren't being picked up by the Azure Function or some changes are missing in the destination when you're copying them, follow the below steps.
9087

91-
When your Azure Function receives the changes, it often processes them, and could optionally, send the result to another destination. When you are investigating missing changes, make sure you **measure which changes are being received at the ingestion point** (when the Azure Function starts), not on the destination.
88+
When your Azure Function receives the changes, it often processes them, and could optionally, send the result to another destination. When you're investigating missing changes, make sure you **measure which changes are being received at the ingestion point** (when the Azure Function starts), not on the destination.
9289

9390
If some changes are missing on the destination, this could mean that there is some error happening during the Azure Function execution after the changes were received.
9491

95-
In this scenario, the best course of action is to add `try/catch` blocks in your code and inside the loops that might be processing the changes, to detect any failure for a particular subset of items and handle them accordingly (send them to another storage for further analysis or retry).
92+
In this scenario, the best course of action is to add `try/catch` blocks in your code and inside the loops that might be processing the changes, to detect any failure for a particular subset of items and handle them accordingly (send them to another storage for further analysis or retry). Alternatively, you can configure Azure Functions [retry policies](../../azure-functions/functions-bindings-error-pages.md#retries).
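
As an illustration of the `try/catch` advice above, here is a minimal sketch in Python (chosen only for brevity; it is not part of this commit). It assumes the batch of changes arrives as an iterable of dictionaries. The names `process_batch`, `handle`, and `dead_letter` are hypothetical and do not come from the Azure Functions or Azure Cosmos DB SDKs. Each item is handled inside its own `try` block, so one failing document doesn't hide the rest of the batch, and failures are set aside for later analysis or retry.

```python
from typing import Any, Callable, Dict, Iterable, List

Document = Dict[str, Any]


def process_batch(
    documents: Iterable[Document],
    handle: Callable[[Document], None],
    dead_letter: List[Dict[str, Any]],
) -> int:
    """Process each change on its own and return how many succeeded.

    `handle` is a hypothetical per-document callback (for example, a copy
    to another store). Failed documents are appended to `dead_letter`
    together with the error, instead of aborting the whole batch.
    """
    succeeded = 0
    for doc in documents:
        try:
            handle(doc)
            succeeded += 1
        except Exception as exc:  # deliberately broad: isolate any per-item failure
            dead_letter.append({"document": doc, "error": repr(exc)})
    return succeeded
```

In a real function, `dead_letter` would more likely be a queue or a separate container than an in-memory list. That choice is left open here.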
9693

9794
> [!NOTE]
98-
> The Azure Functions trigger for Azure Cosmos DB, by default, won't retry a batch of changes if there was an unhandled exception during your code execution. This means that the reason that the changes did not arrive at the destination is because that you are failing to process them.
95+
> The Azure Functions trigger for Azure Cosmos DB, by default, won't retry a batch of changes if there was an unhandled exception during your code execution. This means that the reason that the changes did not arrive at the destination might be because you are failing to process them.
9996
100-
If the destination is another Azure Cosmos DB container and you are performing Upsert operations to copy the items, **verify that the Partition Key Definition on both the monitored and destination container are the same**. Upsert operations could be saving multiple source items as one in the destination because of this configuration difference.
97+
If the destination is another Azure Cosmos DB container and you're performing Upsert operations to copy the items, **verify that the Partition Key Definition on both the monitored and destination container are the same**. Upsert operations could be saving multiple source items as one in the destination because of this configuration difference.
10198

102-
If you find that some changes were not received at all by your trigger, the most common scenario is that there is **another Azure Function running**. It could be another Azure Function deployed in Azure or an Azure Function running locally on a developer's machine that has **exactly the same configuration** (same monitored and lease containers), and this Azure Function is stealing a subset of the changes you would expect your Azure Function to process.
99+
If you find that some changes weren't received at all by your trigger, the most common scenario is that there's **another Azure Function running**. It could be another Azure Function deployed in Azure or an Azure Function running locally on a developer's machine that has **exactly the same configuration** (same monitored and lease containers), and this Azure Function is stealing a subset of the changes you would expect your Azure Function to process.
103100

104101
Additionally, the scenario can be validated, if you know how many Azure Function App instances you have running. If you inspect your leases container and count the number of lease items within, the distinct values of the `Owner` property in them should be equal to the number of instances of your Function App. If there are more owners than the known Azure Function App instances, it means that these extra owners are the ones "stealing" the changes.
105102

106103
One easy way to work around this situation, is to apply a `LeaseCollectionPrefix/leaseCollectionPrefix` to your Function with a new/different value or, alternatively, test with a new leases container.
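
The lease-owner check described above can be sketched in Python as follows (illustrative only; not part of this commit). It assumes the lease items have already been read from the leases container into dictionaries; how they are queried is not shown. The `Owner` property name is taken from the text, while the helper names and the choice to skip items without an `Owner` value are assumptions.

```python
from typing import Any, Iterable, Mapping, Set


def distinct_lease_owners(leases: Iterable[Mapping[str, Any]]) -> Set[str]:
    """Collect the distinct, non-empty `Owner` values found in lease items."""
    # Assumption: items with a missing or empty Owner are ignored.
    return {lease["Owner"] for lease in leases if lease.get("Owner")}


def extra_owner_count(leases: Iterable[Mapping[str, Any]], expected_instances: int) -> int:
    """Return how many owners exceed the known number of Function App instances."""
    return max(0, len(distinct_lease_owners(leases)) - expected_instances)
```

A result greater than zero suggests that some other host is using the same monitored and lease containers and is taking a share of the changes.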
107104

108105
### Need to restart and reprocess all the items in my container from the beginning
109106
To reprocess all the items in a container from the beginning:
110-
1. Stop your Azure function if it is currently running.
111-
1. Delete the documents in the lease collection (or delete and re-create the lease collection so it is empty)
107+
1. Stop your Azure function if it's currently running.
108+
1. Delete the documents in the lease collection (or delete and re-create the lease collection so it's empty)
112109
1. Set the [StartFromBeginning](../../azure-functions/functions-bindings-cosmosdb-v2-trigger.md#configuration) CosmosDBTrigger attribute in your function to true.
113110
1. Restart the Azure function. It will now read and process all changes from the beginning.
114111
