Commit eea86e3

Revamp troubleshooting article
1 parent 85e7733 commit eea86e3

1 file changed: +10 -8 lines changed
articles/cosmos-db/nosql/troubleshoot-service-unavailable.md

Lines changed: 10 additions & 8 deletions
@@ -2,24 +2,25 @@
 title: Troubleshoot Azure Cosmos DB service unavailable exceptions
 description: Learn how to diagnose and fix Azure Cosmos DB service unavailable exceptions.
 author: seesharprun
+ms.author: sidandrews
+ms.reviewer: mjbrown
 ms.service: cosmos-db
 ms.subservice: nosql
-ms.date: 08/31/2022
-ms.author: sidandrews
 ms.topic: troubleshooting
-ms.reviewer: mjbrown
+ms.date: 04/03/2023
 ---
 
 # Diagnose and troubleshoot Azure Cosmos DB service unavailable exceptions
+
 [!INCLUDE[NoSQL](../includes/appliesto-nosql.md)]
 
 The SDK wasn't able to connect to Azure Cosmos DB. This scenario can be transient or permanent depending on the network conditions.
 
-It is important to make sure the application design is following our [guide for designing resilient applications with Azure Cosmos DB SDKs](conceptual-resilient-sdk-applications.md) to make sure it correctly reacts to different network conditions. Your application should have retries in place for service unavailable errors.
+It's important to make sure the application design is following our [guide for designing resilient applications with Azure Cosmos DB SDKs](conceptual-resilient-sdk-applications.md) to make sure it correctly reacts to different network conditions. Your application should have retries in place for service unavailable errors.
 
 When evaluating the case for service unavailable errors:
 
-* What is the impact measured in volume of operations affected compared to the operations succeeding? Is it within the service SLAs?
+* What is the effect measured in volume of operations affected compared to the operations succeeding? Is it within the service SLAs?
 * Is the P99 latency / availability affected?
 * Are the failures affecting all your application instances or only a subset? When the issue is reduced to a subset of instances, it's commonly a problem related to those instances.
 
@@ -29,15 +30,16 @@ The following list contains known causes and solutions for service unavailable e
 
 ### Verify the substatus code
 
-In certain conditions, the HTTP 503 Service Unavailable error will include a substatus code that helps to identify the cause.
+In certain conditions, the HTTP 503 Service Unavailable error includes a substatus code that helps to identify the cause.
 
-| SubStatus Code | Description |
+| Substatus Code | Description |
 |----------|-------------|
 | 20001 | The service unavailable error happened because there are client side [connectivity issues](#client-side-transient-connectivity-issues) (failures attempting to connect). The client attempted to recover by [retrying](conceptual-resilient-sdk-applications.md#timeouts-and-connectivity-related-failures-http-408503) but all retries failed. |
 | 20002 | The service unavailable error happened because there are client side [timeouts](troubleshoot-dotnet-sdk-request-timeout.md#troubleshooting-steps). The client attempted to recover by [retrying](conceptual-resilient-sdk-applications.md#timeouts-and-connectivity-related-failures-http-408503) but all retries failed. |
 | 20003 | The service unavailable error happened because there are underlying I/O errors related to the operating system. See the exception details for the related I/O error. |
 | 20004 | The service unavailable error happened because [client machine's CPU is overloaded](troubleshoot-dotnet-sdk-request-timeout.md#high-cpu-utilization). |
-| 20005 | The service unavailable error happened because client machine's threadpool is starved. Verify any potential [blocking async calls in your code](https://github.com/davidfowl/AspNetCoreDiagnosticScenarios/blob/master/AsyncGuidance.md#avoid-using-taskresult-and-taskwait). |
+| 20005 | The service unavailable error happened because client machine's thread pool is starved. Verify any potential [blocking async calls in your code](https://github.com/davidfowl/AspNetCoreDiagnosticScenarios/blob/master/AsyncGuidance.md#avoid-using-taskresult-and-taskwait). |
+| 20006 | The connection between the service and client was interrupted or terminated in an unexpected manner. |
 | >= 21001 | This service unavailable error happened due to a transient service condition. Verify the conditions in the above section, confirm if you have retry policies in place. If the volume of these errors is high compared with successes, reach out to Azure Support. |
 
 ### The required ports are being blocked
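
Reviewer's sketch (not part of the commit): the substatus table as it stands after this change, including the new 20006 row, could be read as a simple lookup. The snippet below is a minimal illustration under stated assumptions. The names `SUBSTATUS_CAUSES` and `describe_substatus` are hypothetical and are not part of any Azure Cosmos DB SDK, and the descriptions are paraphrased from the table rather than quoted from an API.

```python
from typing import Optional

# Hypothetical lookup, paraphrased from the substatus table in the diff above.
# These names are illustrative only; they are not SDK identifiers.
SUBSTATUS_CAUSES = {
    20001: "client-side connectivity issues; all retries failed",
    20002: "client-side timeouts; all retries failed",
    20003: "underlying operating system I/O errors",
    20004: "client machine CPU is overloaded",
    20005: "client machine thread pool is starved",
    20006: "connection interrupted or terminated unexpectedly",
}


def describe_substatus(substatus: Optional[int]) -> str:
    """Return a short cause summary for a 503 substatus code."""
    if substatus is None:
        # The article notes that a substatus is only included in certain conditions.
        return "no substatus code reported"
    if substatus in SUBSTATUS_CAUSES:
        return SUBSTATUS_CAUSES[substatus]
    if substatus >= 21001:
        # The table groups every code at or above 21001 as a transient service condition.
        return "transient service condition"
    return "unknown substatus code"


if __name__ == "__main__":
    for code in (20005, 20006, 21008, None):
        print(code, "->", describe_substatus(code))
```

How an application obtains the substatus value depends on the SDK and language in use, so that step is deliberately left out of the sketch.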
