Description
ZnUrl seems to go against RFC 3986, in that it decodes the percent-encoded octets of some reserved characters back into the characters themselves. Take the following block:
[ :url | (ZnUrl fromString: url) asString ]
Examples of how this block transforms URLs:
- https://example.com/?a=b%3Dc ⇒ https://example.com/?a=b%3Dc
  The two URLs are exactly the same.
- https://example.com/?a~b%7Ec ⇒ https://example.com/?a~b~c
  The two URLs differ (%7E versus ~), but per section ‘2.3. Unreserved Characters’ in RFC 3986 they are equivalent: “URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent”.
The problem is in the third example:
- https://example.com/?a;b%3Bc ⇒ https://example.com/?a;b;c
  The two URLs differ (%3B versus ;), and per section ‘2.2. Reserved Characters’ in RFC 3986 they are not equivalent: “URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent”.
Note that the equals sign, used in the first example, is also a reserved character, and it is used as a delimiter in the URL-encoding of forms in HTML. As far as I understand, the intent of section 2.2 in RFC 3986 is that one could define a similar encoding that uses other reserved characters as delimiters. For instance, the queries of the URLs in the third example could be encodings of arrays of strings, in which the array #('a' 'b;c') is encoded as a;b%3Bc and the array #('a' 'b' 'c') as a;b;c.
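To make the array example concrete, here is a minimal sketch of such an encoding. It is written in Python rather than Smalltalk purely for illustration and does not use Zinc; the names encode_array and decode_array are hypothetical, and the choice of ; as the delimiter is the assumption from the paragraph above.

```python
from urllib.parse import quote, unquote


def encode_array(items):
    # Percent-encode each element (safe="" leaves only unreserved characters
    # untouched), so a literal ';' inside an element becomes %3B.
    return ";".join(quote(item, safe="") for item in items)


def decode_array(query):
    # Split on the literal ';' delimiters first, then decode each element.
    return [unquote(part) for part in query.split(";")]


two_elements = encode_array(["a", "b;c"])       # "a;b%3Bc"
three_elements = encode_array(["a", "b", "c"])  # "a;b;c"

# The two queries decode to different arrays, so they are not interchangeable.
print(decode_array(two_elements))
print(decode_array(three_elements))
```

If a URL library rewrites a;b%3Bc to a;b;c before the application sees the query, the first array can no longer be told apart from the second.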
Section ‘4.2.3. http(s) Normalization and Comparison’ in RFC 9110 states the following, for which it refers back to RFC 3986: “characters other than those in the "reserved" set are equivalent to their percent-encoded octets”.
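The rule quoted above can be sketched as a normalization step that decodes a percent-encoded octet only when it stands for an unreserved character. This is an illustrative Python sketch, not ZnUrl's implementation or a proposed patch; normalize_unreserved is a hypothetical name, and the unreserved set is the one from section 2.3 of RFC 3986 (letters, digits, and - . _ ~).

```python
import re
import string

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Matches one percent-encoded octet such as %7E or %3b.
PERCENT_ENCODED = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_unreserved(url):
    def replace(match):
        char = chr(int(match.group(1), 16))
        # Decode only unreserved characters; leave every other
        # percent-encoded octet exactly as it was written.
        return char if char in UNRESERVED else match.group(0)

    return PERCENT_ENCODED.sub(replace, url)


for url in (
    "https://example.com/?a=b%3Dc",
    "https://example.com/?a~b%7Ec",
    "https://example.com/?a;b%3Bc",
):
    print(url, "⇒", normalize_unreserved(url))
```

Under this sketch only the second example changes (%7E becomes ~), while %3D and %3B are preserved.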
See my comment in issue #89 for how this is related to that issue.