Commit 5f86257

committed
PR feedback
1 parent dfb35c4 commit 5f86257

2 files changed

+10
-9
lines changed

azure/ConsiderationsForServiceDesign.md

Lines changed: 8 additions & 7 deletions
@@ -230,29 +230,30 @@ status monitor LRO, whereas the status of all RELO operations is combined into t
230230
So status monitor LROs are "one-to-one" with their operation status, whereas RELO-style LROs are "many-to-one".
231231

232232
## Errors
233-
One of the most important parts of service design is also one of the most overlooked. The errors returned by your service are a critical part of your developer experience. Your service and your customer's application together form a distributed system. Errors are inevitable, but well-designed errors can help you avoid costly customer support incidents by empowering customers to self-diagnose problems.
233+
One of the most important parts of service design is also one of the most overlooked. The errors returned by your service are a critical part of your developer experience and are part of your API contract. Your service and your customer's application together form a distributed system. Errors are inevitable, but well-designed errors can help you avoid costly customer support incidents by empowering customers to self-diagnose problems.
234234

235-
First, you should always try to design errors out of existence if possible. You'll get a lot of this for free by following the Guidelines. Some examples include:
235+
First, you should always try to design errors out of existence if possible. You'll get a lot of this for free by following the [API Guidelines](https://aka.ms/azapi/guidelines). Some examples include:
236236
- Idempotent APIs solve a whole class of network issues where customers have no idea how to proceed if they send a request to the service but never get a response.
237-
- Accessing resources from multiple microservices can quickly lead to complex race conditions so conditional requests provide optimistic concurrency for safe usage.
238-
- Reframing the purpose of an API can obviate some errors. This is most often specific to your operations, but an example from the Guidelines is thinking about `DELETE`s as _"ensure no resource at this location exists"_ so they can return an easier to use `204` instead of _"delete this exact resource instance"_ which would fail with a non-idempotent `404`.
237+
- Accessing resources from multiple microservices can quickly lead to complex race conditions. These can be avoided by supporting conditional requests through an [optimistic concurrency strategy](https://github.com/microsoft/api-guidelines/blob/vNext/azure/Guidelines.md#optimistic-concurrency), e.g. by leveraging `If-Match`/`If-None-Match` request headers.
238+
- Reframing the purpose of an API can obviate some errors. This is most often specific to your operations, but an [example from the API Guidelines](https://github.com/microsoft/api-guidelines/blob/vNext/azure/Guidelines.md#http-return-codes) is thinking about `DELETE`s as _"ensure no resource at this location exists"_ so they can return an easier to use `204` instead of _"delete this exact resource instance"_ which would fail with a `404`.
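As a rough illustration of the last two bullets, here is a minimal sketch of a store that honors `If-Match`/`If-None-Match` and treats `DELETE` as "ensure no resource exists". The `ResourceStore` class, its method names, and the content-hash ETag are assumptions made for this example and are not taken from the Guidelines.

```python
import hashlib
import json


class ResourceStore:
    """Hypothetical in-memory store; a real service would sit behind HTTP."""

    def __init__(self):
        self._items = {}

    @staticmethod
    def _etag(body):
        # Derive the ETag from the content so any change yields a new value.
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'

    def put(self, name, body, if_match=None, if_none_match=None):
        """Return (status_code, etag); etag is None when a precondition fails."""
        current = self._items.get(name)
        if if_none_match == "*" and current is not None:
            return 412, None  # create-only request, but the resource exists
        if if_match is not None:
            if current is None:
                return 412, None  # nothing to match against
            if if_match != "*" and if_match != self._etag(current):
                return 412, None  # someone else changed the resource first
        self._items[name] = body
        return (200 if current is not None else 201), self._etag(body)

    def delete(self, name):
        # "Ensure no resource exists here": succeeds whether or not it existed.
        self._items.pop(name, None)
        return 204
```

A caller that reads a resource, keeps its ETag, and sends it back as `if_match` gets a `412` instead of silently overwriting a concurrent change, and can repeat a `delete` after a lost response without seeing a failure.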
239239

240240
There are two types of errors returned from your service and customers handle them differently.
241241
- Usage errors where the customer is calling your API incorrectly. The customer can easily make these errors go away by fixing their code. We expect most usage errors to be found during testing.
242242
- Runtime errors that can't be prevented by the customer and need to be recovered from. Some runtime errors like `429` throttling will be handled automatically by client libraries, but most will be situations like a `409` conflict requiring knowledge about the customer's application to remedy.
243243

244-
We provide appropriate [HTTP status codes](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status#client_error_responses) for customers to handle errors generically and error code strings in our common error schema and the `x-ms-error-code` header for customers to handle errors specifically. As an example, consider what a customer would do when trying to get the properties of a Storage blob:
244+
You should use appropriate [HTTP status codes](https://developer.mozilla.org/docs/Web/HTTP/Status#client_error_responses) for customers to handle errors generically and error code strings in our common error schema and the `x-ms-error-code` header for customers to handle errors specifically. As an example, consider what a customer would do when trying to get the properties of a Storage blob:
245245
- A `404` status code tells them the blob doesn't exist and the customer can report the error to their users
246246
- A `BlobNotFound` or `ContainerNotFound` error code will tell them why the blob doesn't exist so they can take steps to recreate it
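The split between generic and specific handling might look something like the sketch below. The function name, the returned strings, the lowercase header keys, and the `{"error": {"code": ...}}` body shape are assumptions made for illustration; check them against the actual service before relying on them.

```python
import json


def describe_failure(status, headers, body_text):
    """Pick a recovery step from the error code, falling back to the status code.

    `headers` is assumed to be a plain dict with lowercase keys.
    """
    code = headers.get("x-ms-error-code")
    if code is None:
        # Fall back to the JSON payload only when the header is missing.
        try:
            code = json.loads(body_text)["error"]["code"]
        except (ValueError, KeyError, TypeError):
            code = None

    # Specific handling: the error code says why the blob is missing.
    if code == "ContainerNotFound":
        return "create the container, then upload the blob"
    if code == "BlobNotFound":
        return "upload the blob"

    # Generic handling: the status code alone is enough to report the problem.
    if status == 404:
        return "report that the blob does not exist"
    return "unhandled"
```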
247247

248248
The common error schema in the Guidelines allows nested details and inner errors that have their own error codes, but the top-level error code is the most important. The HTTP status code and the top-level error code are the only parts of your error that we consider part of your API contract and that follow the same compatibility requirements as the rest of your API. Importantly, this means you **cannot change the HTTP status code or top-level error code for an API in a GA'ed service version**. You can only return new status codes and error codes in future API versions if customers make use of new features that trigger new classes of errors. Battle-tested error handling is some of the hardest code to get right and we can't break that for customers when they upgrade to the latest version. The rest of the properties in your error like `message`, `details`, etc., are not considered part of your API contract and can change to improve the diagnosability of your service.
249249

250-
We also return the top-level error code as the `x-ms-error-code` header so client libraries have the ability to automatically retry requests when possible without having to parse a JSON payload. We recommend unique error codes like `ContainerBeingDeleted` for every distinct recoverable error that can occur, but suggest reusing common error codes like `InvalidHeaderValue` for usage errors where a descriptive error message is more important for resolving the problem. The Storage [Common](https://docs.microsoft.com/en-us/rest/api/storageservices/common-rest-api-error-codes) and [Blob](https://docs.microsoft.com/en-us/rest/api/storageservices/blob-service-error-codes) error codes are a good starting point if you're looking for examples.
250+
You should also return the top-level error code as the `x-ms-error-code` header so client libraries have the ability to automatically retry requests when possible without having to parse a JSON payload. We recommend unique error codes like `ContainerBeingDeleted` for every distinct recoverable error that can occur, but suggest reusing common error codes like `InvalidHeaderValue` for usage errors where a descriptive error message is more important for resolving the problem. The Storage [Common](https://docs.microsoft.com/rest/api/storageservices/common-rest-api-error-codes) and [Blob](https://docs.microsoft.com/rest/api/storageservices/blob-service-error-codes) error codes are a good starting point if you're looking for examples. You can [define an enum in your spec](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/storage/data-plane/Microsoft.BlobStorage/preview/2021-04-10/blob.json#L10419) with `"modelAsString": true` that lists all of the top-level error codes to make it [easier for your customers to handle specific error codes](https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/storage/Azure.Storage.Blobs#troubleshooting).
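To make the two ideas in this paragraph concrete, here is a small sketch of a client-side retry check that reads only the `x-ms-error-code` header, plus an "open" enum that mirrors the intent of `"modelAsString": true`. The names `RETRYABLE_CODES`, `should_retry`, `KnownErrorCode`, and `parse_error_code`, and the choice of which codes count as retryable, are hypothetical and not part of any Azure client library.

```python
from enum import Enum

# Hypothetical retry policy: codes this client treats as transient.
RETRYABLE_CODES = frozenset({"ContainerBeingDeleted", "ServerBusy"})


def should_retry(headers, attempt, max_attempts=3):
    """Decide on a retry from the header alone, without parsing the JSON body.

    `headers` is assumed to be a plain dict with lowercase keys.
    """
    if attempt >= max_attempts:
        return False
    return headers.get("x-ms-error-code") in RETRYABLE_CODES


class KnownErrorCode(str, Enum):
    """Error codes known when this client was written."""

    CONTAINER_BEING_DELETED = "ContainerBeingDeleted"
    INVALID_HEADER_VALUE = "InvalidHeaderValue"


def parse_error_code(raw):
    # Codes added by later service versions pass through as plain strings
    # instead of failing, which is the point of an extensible enum.
    try:
        return KnownErrorCode(raw)
    except ValueError:
        return raw
```

Because `KnownErrorCode` also derives from `str`, customer code can compare a parsed value against either the enum member or the raw string.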
251251

252-
You should not use your OpenAPI/Swagger spec to document every error that can occur. The `"default"` response is the only thing AutoRest considers an error response unless you provide other annotations. Every unique status code turns into a separate code path in your client libraries so we do not encourage this practice. The only reason to document specific error codes is if they return a different error response than the default, but that is also heavily discouraged. You can [define an enum in your spec](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/storage/data-plane/Microsoft.BlobStorage/preview/2021-04-10/blob.json#L10419) with `"modelAsString": true` that lists all of the top-level error codes to make it [easier for your customers to handle specific error codes](https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/storage/Azure.Storage.Blobs#troubleshooting).
252+
You should not document specific error status codes in your OpenAPI/Swagger spec. The `"default"` response is the only thing AutoRest considers an error response unless you provide other annotations. Every unique status code turns into a separate code path in your client libraries so we do not encourage this practice. The only reason to document specific error status codes is if they return a different error response than the default, but that is also heavily discouraged.
253253

254254
Be as precise as possible when writing error messages. A message with just `Invalid Argument` is almost useless to a customer who sent 100KB of JSON to your endpoint. ``Query parameter `top` must be less than or equal to 1000`` tells a customer exactly what went wrong so they can quickly fix the problem. While writing great, understandable error messages, though, don't go overboard and include any sensitive customer information or secrets. Many developers will blindly write any error to logs that don't have the same level of access control as Azure resources.
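A validation helper that follows this advice could be sketched roughly as follows. The function name, the `(value, error)` return shape, and the use of `InvalidQueryParameterValue` as the reusable usage-error code are assumptions for this example; the helper deliberately names the parameter and the limit but does not echo the caller's input back into the message.

```python
def validate_top(raw, maximum=1000):
    """Return (value, None) on success or (None, error) with a precise message."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None, {
            "code": "InvalidQueryParameterValue",
            "message": "Query parameter `top` must be an integer",
        }
    if value > maximum:
        return None, {
            "code": "InvalidQueryParameterValue",
            "message": f"Query parameter `top` must be less than or equal to {maximum}",
        }
    return value, None
```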
255255

256+
All responses should include the `x-ms-request-id` header with a unique ID for the request, but this is particularly important for error responses. Service logs for the request should contain the `x-ms-request-id` so that support staff can use this value to diagnose specific customer-reported errors.
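One way to wire this up on the service side is sketched below, assuming a hypothetical `finish_response` hook that runs for every response. The hook name, the logger name, and the UUID format for the ID are illustrative choices, not requirements from the Guidelines.

```python
import logging
import uuid

logger = logging.getLogger("service")


def finish_response(status, headers, error_code=None):
    """Return response headers with a fresh request ID, logging failures."""
    request_id = str(uuid.uuid4())
    out = dict(headers)  # copy so the caller's dict is not mutated
    out["x-ms-request-id"] = request_id
    if error_code is not None:
        out["x-ms-error-code"] = error_code
        # The same ID goes to the log so support staff can find this request.
        logger.warning(
            "request failed: request_id=%s status=%d code=%s",
            request_id, status, error_code,
        )
    return out
```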
256257

257258
Finally, write sample code for your service's workflow and add the code you'd want customers using to gracefully recover from errors. Is it actually graceful? Is it something you'd be comfortable asking most customers to write? We also highly encourage reaching out to customers during private preview and asking them for code they've written against your service. Their error handling might match your expectations, you might find a strong need for better documentation, or you might find important opportunities to improve the errors you're returning.
258259

azure/Guidelines.md

Lines changed: 2 additions & 2 deletions
@@ -294,7 +294,7 @@ There are 2 kinds of errors:
294294

295295
*NOTE: `x-ms-error-code` values are part of your API contract (because customer code is likely to do comparisons against them) and cannot change in the future.*
296296

297-
:heavy_check_mark: **YOU MAY** implement the `x-ms-error-code` values as an enum with `"modelAsString": true` because it's possible to add new values over time. In particular, it's only a breaking change if the same conditions result in a different top-level error code.
297+
:heavy_check_mark: **YOU MAY** implement the `x-ms-error-code` values as an enum with `"modelAsString": true` because it's possible to add new values over time. In particular, it's only a breaking change if the same conditions result in a *different* top-level error code.
298298

299299
:warning: **YOU SHOULD NOT** add new top-level error codes to an existing API without bumping the service version.
300300

@@ -353,7 +353,7 @@ Example:
353353

354354
*Note: Do not use this mechanism to provide information developers need to rely on in code (ex: the error message can give details about why you've been throttled, but the `Retry-After` should be what developers rely on to back off).*
355355

356-
:warning: **YOU SHOULD NOT** use your OpenAPI/Swagger specification to document every failing status code or error code for each operation.
356+
:warning: **YOU SHOULD NOT** document specific error status codes in your OpenAPI/Swagger spec unless the "default" response cannot properly describe the specific error response (e.g. body schema is different).
357357

358358
### JSON
359359
Services, and the clients that access them, may be written in multiple languages. To ensure interoperability, JSON establishes the "lowest common denominator" type system, which is always sent over the wire as UTF-8 bytes. This system is very simple and consists of three types:
