Some Azure services return the offset and length of a substring within a string, for example, the offset and length of a name, email address, or phone number.
When a service response includes a string, the client's programming language deserializes that string into that language's internal string encoding. Below are the possible encodings and examples of languages that use each encoding:

| Encoding | Example languages |
| -------- | ------- |
| UTF-8 | Go, Rust, Ruby, PHP |
| UTF-16 | JavaScript, Java, C# |
| CodePoint (UTF-32) | Python |

Because the service doesn't know what language a client is written in or what string encoding that language uses, the service can't return a single, encoding-agnostic offset and length that every client can use to index into the string. To address this, the service response must include offset & length values for all 3 possible encodings, and the client code must select the values that match its language's internal string encoding.

For example, if a service response needed to identify offset & length values for "name" and "email" substrings, the JSON response would look like this:

```
{
  (... other properties not shown...)
  "fullString": "(...some string containing a name and an email address...)",
  "name": {
    "offset": {
      "utf8": 12,
      "utf16": 10,
      "codePoint": 4
    },
    "length": {
      "utf8": 10,
      "utf16": 8,
      "codePoint": 2
    }
  },
  "email": {
    "offset": {
      "utf8": 12,
      "utf16": 10,
      "codePoint": 4
    },
    "length": {
      "utf8": 10,
      "utf16": 8,
      "codePoint": 4
    }
  }
}
```

Then, the Go developer, for example, would get the substring containing the name using code like this:

```
// Go strings are UTF-8, so index fullString with the utf8 offset and length.
response := client.SomeMethodReturningJSONShownAbove(...)
start := response.name.offset.utf8
name := response.fullString[start : start+response.name.length.utf8]
```

The service must calculate and return the offset & length for all 3 encodings because converting offsets from one Unicode encoding to another is difficult and error-prone for clients. In other words, we do this to simplify client development and ensure customer success when isolating a substring.

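
As a rough illustration of the calculation described above, the following Go sketch measures a substring's offset and length in UTF-8 bytes, UTF-16 code units, and Unicode code points. The `Span` type, the `measure` and `locate` helpers, and the sample string are assumptions made for this example and are not part of any Azure service or SDK.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf16"
	"unicode/utf8"
)

// Span holds one value (an offset or a length) in each of the three encodings.
// The field names mirror the JSON example; the type itself is hypothetical.
type Span struct {
	UTF8      int
	UTF16     int
	CodePoint int
}

// measure counts s in UTF-8 bytes, UTF-16 code units, and Unicode code points.
func measure(s string) Span {
	return Span{
		UTF8:      len(s),
		UTF16:     len(utf16.Encode([]rune(s))),
		CodePoint: utf8.RuneCountInString(s),
	}
}

// locate returns the offset and length of the first occurrence of sub in full,
// expressed in all three encodings. ok is false when sub does not occur.
func locate(full, sub string) (offset, length Span, ok bool) {
	i := strings.Index(full, sub) // byte index, because Go strings are UTF-8
	if i < 0 {
		return Span{}, Span{}, false
	}
	// The offset is simply the size of everything that precedes the match.
	return measure(full[:i]), measure(sub), true
}

func main() {
	// U+1F600 takes 4 UTF-8 bytes, 2 UTF-16 code units, and 1 code point,
	// so the three offsets of the name that follows it all differ.
	full := "\U0001F600 Jos\u00e9 <jose@example.com>"
	offset, length, _ := locate(full, "Jos\u00e9")
	fmt.Printf("offset=%+v length=%+v\n", offset, length)

	// A Go client slices with the UTF-8 values.
	fmt.Println(full[offset.UTF8 : offset.UTF8+length.UTF8])
}
```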
## Getting Help: The Azure REST API Stewardship Board
The Azure REST API Stewardship board is a collection of dedicated architects who are passionate about helping Azure service teams build interfaces that are intuitive, maintainable, consistent, and most importantly, delight our customers. Because APIs affect nearly all downstream decisions, you are encouraged to reach out to the Stewardship board early in the development process. These architects will work with you to apply these guidelines and identify any hidden pitfalls in your design.

| 2023-May-12 | Explain service response for missing/unsupported `api-version`|
| 2023-Apr-21 | Update/clarify guidelines on POST method repeatability |
| 2023-Apr-07 | Update/clarify guidelines on polymorphism |
@@ -438,7 +439,7 @@ This indicates to client libraries and customers that values of the enumeration
Polymorphism in REST APIs refers to the ability of the same property of a request or response to have similar but different shapes. This is commonly expressed as a `oneOf` in JSON Schema or OpenAPI. To simplify determining which specific type a given request or response payload corresponds to, Azure requires the use of an explicit discriminator field.

Note: Polymorphic types can make your service more difficult for nominally typed languages to consume. See the corresponding section in the [Considerations for service design](./ConsiderationsForServiceDesign.md#avoid-surprises) for more information.

<a href="#json-use-discriminator-for-polymorphism" name="json-use-discriminator-for-polymorphism">:white_check_mark:</a> **DO** define a discriminator field indicating the kind of the resource and include any kind-specific fields in the body.
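
To show why an explicit discriminator helps nominally typed clients, here is a minimal Go sketch that reads the discriminator first and then decodes the payload into the matching concrete type. The `Cat` and `Dog` shapes, the `kind` values, and the `decodePet` helper are hypothetical and chosen only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical resource shapes; "kind" is the discriminator field.
type Cat struct {
	Kind  string `json:"kind"`
	Name  string `json:"name"`
	Lives int    `json:"lives"`
}

type Dog struct {
	Kind  string `json:"kind"`
	Name  string `json:"name"`
	Breed string `json:"breed"`
}

// decodePet reads only the discriminator first, then decodes the full payload
// into the concrete type that the discriminator names.
func decodePet(data []byte) (any, error) {
	var probe struct {
		Kind string `json:"kind"`
	}
	if err := json.Unmarshal(data, &probe); err != nil {
		return nil, err
	}
	switch probe.Kind {
	case "cat":
		var c Cat
		err := json.Unmarshal(data, &c)
		return c, err
	case "dog":
		var d Dog
		err := json.Unmarshal(data, &d)
		return d, err
	default:
		// Without a known discriminator value the client cannot pick a type.
		return nil, fmt.Errorf("unknown kind %q", probe.Kind)
	}
}

func main() {
	pet, err := decodePet([]byte(`{"kind":"dog","name":"Rex","breed":"collie"}`))
	fmt.Println(pet, err)
}
```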
@@ -838,7 +839,7 @@ For example:
### Repeatability of requests
Fault tolerant applications require that clients retry requests for which they never got a response, and services must handle these retried requests idempotently. In Azure, all HTTP operations are naturally idempotent except for POST used to create a resource and [POST when used to invoke an action](

<a href="#repeatability-headers" name="repeatability-headers">:ballot_box_with_check:</a> **YOU SHOULD** support repeatable requests as defined in [OASIS Repeatable Requests Version 1.0](https://docs.oasis-open.org/odata/repeatable-requests/v1.0/repeatable-requests-v1.0.html) for POST operations to make them retriable.
- The tracked time window (difference between the `Repeatability-First-Sent` value and the current time) **MUST** be at least 5 minutes.
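
On the client side, a repeatable POST might be prepared along the following lines. This is a sketch under assumptions: the `Repeatability-Request-ID` and `Repeatability-First-Sent` header names come from the OASIS specification, while the `newRequestID` and `newRepeatablePost` helpers and the example URL are hypothetical. A retry would resend the same request with both header values unchanged.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// newRequestID returns a random version 4 UUID string.
func newRequestID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// newRepeatablePost builds a POST request carrying both repeatability headers.
// The first-sent time is recorded once, when the request is created.
func newRepeatablePost(url, body string) (*http.Request, error) {
	id, err := newRequestID()
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Repeatability-Request-ID", id)
	// http.TimeFormat is the IMF-fixdate layout and requires a UTC time.
	req.Header.Set("Repeatability-First-Sent", time.Now().UTC().Format(http.TimeFormat))
	return req, nil
}

func main() {
	// The URL is a placeholder; no network call is made here.
	req, err := newRepeatablePost("https://example.invalid/widgets", `{"name":"w1"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get("Repeatability-Request-ID"))
	fmt.Println(req.Header.Get("Repeatability-First-Sent"))
}
```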
@@ -1098,6 +1099,14 @@ While it may be tempting to use a revision/version number for the resource as th
<a href="#condreq-etag-depends-on-encoding" name="condreq-etag-depends-on-encoding">:white_check_mark:</a> **DO**, when supporting multiple representations (e.g. Content-Encodings) for the same resource, generate different ETag values for the different representations.

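
One possible way to satisfy this, shown here only as a sketch, is to derive the ETag from the bytes of the representation actually sent, so that the identity and gzip encodings of the same resource naturally get different values. Hashing the wire bytes is an assumption of this example rather than a requirement of these guidelines, and the `etagFor` and `gzipBytes` helpers are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// etagFor derives a quoted ETag value from the bytes of one representation.
// Different encodings produce different bytes and therefore different ETags.
func etagFor(representation []byte) string {
	sum := sha256.Sum256(representation)
	return `"` + hex.EncodeToString(sum[:8]) + `"`
}

// gzipBytes returns the gzip-encoded form of data.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	plain := []byte(`{"name":"w1"}`)
	zipped, err := gzipBytes(plain)
	if err != nil {
		panic(err)
	}
	fmt.Println("identity:", etagFor(plain))
	fmt.Println("gzip:    ", etagFor(zipped))
}
```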
All string values in JSON are inherently Unicode and UTF-8 encoded, but clients written in a high-level programming language must work with strings in that language's string encoding, which may be UTF-8, UTF-16, or CodePoints (UTF-32).
When a service response includes a string offset or length value, it should specify these values in all 3 encodings to simplify client development and ensure customer success when isolating a substring.

<a href="#substrings-return-value-for-each-encoding" name="substrings-return-value-for-each-encoding">:white_check_mark:</a> **DO** include all 3 encodings (UTF-8, UTF-16, and CodePoint) for every string offset or length value in a service response.

<a href="#telemetry" name="telemetry"></a>
### Distributed Tracing & Telemetry
Azure SDK client guidelines specify that client libraries must send telemetry data through the `User-Agent` header, `X-MS-UserAgent` header, and Open Telemetry.