UTF-8 decode should not be required for response.clientDataJSON and cData #2100

Currently the spec states:

> Let JSONtext be the result of running [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) on the value of response.[clientDataJSON](https://www.w3.org/TR/webauthn-3/#dom-authenticatorresponse-clientdatajson).
>
>Note: Using any implementation of [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) is acceptable as long as it yields the same result as that yielded by the [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) algorithm. In particular, any leading byte order mark (BOM) MUST be stripped.

for [step 5 in Registering a New Credential](https://www.w3.org/TR/webauthn-3/#sctn-registering-a-new-credential) and

>Let JSONtext be the result of running [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) on the value of cData.
>
>Note: Using any implementation of [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) is acceptable as long as it yields the same result as that yielded by the [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) algorithm. In particular, any leading byte order mark (BOM) MUST be stripped.

for [step 8 in Verifying an Authentication Assertion](https://www.w3.org/TR/webauthn-3/#sctn-verifying-assertion).

This seems _slightly_ too strict. While the notes call out stripping a BOM, they also state "yields the _same_ result …" (emphasis added); however, [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) also requires decoding with the `"replacement"` error mode, so invalid byte sequences must be replaced with U+FFFD rather than rejected.
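To make the requirement concrete, here is a minimal Python sketch of what matching [UTF-8 decode](https://encoding.spec.whatwg.org/#utf-8-decode) seems to entail. The function name `utf8_decode_lossy` is hypothetical, and Python's `errors="replace"` is assumed to be a close enough stand-in for the WHATWG `"replacement"` error mode for illustration purposes.

```python
import codecs


def utf8_decode_lossy(data: bytes) -> str:
    """Approximate WHATWG "UTF-8 decode" (hypothetical helper, not from the spec).

    1. Strip a single leading UTF-8 BOM (EF BB BF) if present.
    2. Decode, replacing invalid byte sequences with U+FFFD instead of failing.
    """
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # Assumption: errors="replace" stands in for the "replacement" error mode.
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Invalid byte 0xFF is silently turned into U+FFFD rather than rejected.
    print(repr(utf8_decode_lossy(b'{"type":"\xff"}')))
```

Under this reading, an RP that simply rejects invalid UTF-8 does not yield "the same result" and so is technically non-conforming.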

According to [the serialization of the `CollectedClientData`](https://www.w3.org/TR/webauthn-3/#clientdatajson-serialization), it is impossible for invalid UTF-8 to be generated. This means that RPs should only have to worry about stripping a BOM, _not_ replacing invalid UTF-8 sequences with the "replacement character" (i.e., U+FFFD), since the existence of invalid UTF-8 implies the serialization algorithm has not been adhered to as mandated by the spec. I suppose one could argue that prepending the "zero width no-break space" character (i.e., U+FEFF) also violates the serialization algorithm; thus interpreting its presence as a byte order mark (BOM) and subsequently stripping it seems bizarre too.
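As a rough illustration of the alternative being argued for, the sketch below strips a leading BOM but treats invalid UTF-8 as an error instead of substituting U+FFFD. The name `utf8_decode_strict` is hypothetical and the code is only one way an RP might do this in Python, not something the spec describes.

```python
import codecs


def utf8_decode_strict(data: bytes) -> str:
    """Strip a leading UTF-8 BOM, then decode strictly (hypothetical helper).

    Raises UnicodeDecodeError on invalid UTF-8, on the assumption that such
    input cannot come from a conforming clientDataJSON serialization.
    """
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8", errors="strict")


if __name__ == "__main__":
    print(utf8_decode_strict(b'{"type":"webauthn.get"}'))
    try:
        utf8_decode_strict(b'{"type":"\xff"}')
    except UnicodeDecodeError as exc:
        # The RP can fail the ceremony here instead of parsing altered text.
        print("rejected:", exc.reason)
```

For any input produced by the serialization algorithm, this should give the same string as the lossy variant; the two only differ on input that already violates the spec.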
