UTF-8 decode should not be required for response.clientDataJSON and cData #2100

@zacknewman


Currently the spec states:

Let JSONtext be the result of running UTF-8 decode on the value of response.clientDataJSON.

Note: Using any implementation of UTF-8 decode is acceptable as long as it yields the same result as that yielded by the UTF-8 decode algorithm. In particular, any leading byte order mark (BOM) MUST be stripped.

for step 5 in Registering a New Credential and

Let JSONtext be the result of running UTF-8 decode on the value of cData.

Note: Using any implementation of UTF-8 decode is acceptable as long as it yields the same result as that yielded by the UTF-8 decode algorithm. In particular, any leading byte order mark (BOM) MUST be stripped.

for step 8 in Verifying an Authentication Assertion.

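This seems slightly too strict. While the notes call out stripping a BOM, they also require an implementation that "yields the same result" as UTF-8 decode, and UTF-8 decode additionally requires decoding with the "replacement" error mode, so malformed byte sequences are turned into U+FFFD instead of causing a failure.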
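To make the current requirement concrete, here is a rough Python sketch of the behavior the notes ask for: strip one leading BOM, then decode with replacement. The function name `utf8_decode_lenient` is hypothetical, and Python's `errors="replace"` is only assumed to approximate the replacement behavior of the UTF-8 decode algorithm; it is not taken from the spec.

```python
import codecs


def utf8_decode_lenient(data: bytes) -> str:
    """Approximate UTF-8 decode: drop a leading BOM, never fail on bad input."""
    # Strip a single leading byte order mark (EF BB BF) if present.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # Malformed sequences are replaced with U+FFFD instead of raising.
    return data.decode("utf-8", errors="replace")
```

With this behavior, `b'{"a":"\xff"}'` decodes without error even though no conforming client could have produced it.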

According to the serialization of CollectedClientData, it is impossible for invalid UTF-8 to be generated. RPs should therefore only have to strip a BOM, not replace invalid UTF-8 byte sequences with the "replacement character" (i.e., U+FFFD), because the existence of invalid UTF-8 implies the serialization algorithm mandated by the spec was not followed. One could also argue that prepending the "zero width no-break space" character (i.e., U+FEFF) violates the serialization algorithm as well, so interpreting it as a byte order mark (BOM) and stripping it seems just as bizarre.
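As a sketch of the alternative argued for here, an RP could decode strictly and treat malformed input as an error. This is only an illustration under my own assumptions, not something the spec currently allows: `decode_client_data_strict` and its `reject_bom` flag are hypothetical names, with the flag covering the stricter reading where a leading U+FEFF is also rejected.

```python
import codecs


def decode_client_data_strict(data: bytes, reject_bom: bool = False) -> str:
    """Decode clientDataJSON bytes, failing on anything a conforming client cannot emit."""
    if data.startswith(codecs.BOM_UTF8):
        if reject_bom:
            # Stricter reading: the serialization never emits a leading U+FEFF.
            raise ValueError("unexpected leading BOM in clientDataJSON")
        data = data[len(codecs.BOM_UTF8):]
    # errors="strict" is the default: invalid UTF-8 raises UnicodeDecodeError,
    # which is a subclass of ValueError.
    return data.decode("utf-8")
```

Valid input gives the same string as the lenient decode, so the two approaches only differ on data that already violates the serialization algorithm.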
