Commit 249f1e4

Merge pull request bitcoin#587 from mruddy/bip173
bip-0173: test vectors, HRP, and casing requirements updates
2 parents ed96b6b + 96d39e8 commit 249f1e4

1 file changed: +24 -9 lines changed

bip-0173.mediawiki

Lines changed: 24 additions & 9 deletions
@@ -76,7 +76,7 @@ increase, but that does not matter when copy-pasting addresses.</ref> format cal
 
 A Bech32<ref>'''Why call it Bech32?''' "Bech" contains the characters BCH (the error
 detection algorithm used) and sounds a bit like "base".</ref> string is at most 90 characters long and consists of:
-* The '''human-readable part''', which is intended to convey the type of data or anything else that is relevant for the reader. Its validity (including the used set of characters) is application specific, but restricted to ASCII characters with values in the range 33-126.
+* The '''human-readable part''', which is intended to convey the type of data, or anything else that is relevant to the reader. This part MUST contain 1 to 83 US-ASCII characters, with each character having a value in the range [33-126]. HRP validity may be further restricted by specific applications.
 * The '''separator''', which is always "1". In case "1" is allowed inside the human-readable part, the last one in the string is the separator<ref>'''Why include a separator in addresses?''' That way the human-readable
 part is unambiguously separated from the data part, avoiding potential
 collisions with other human-readable parts that share a prefix. It also
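
The revised human-readable part rule in this hunk (1 to 83 US-ASCII characters, each with a value in the range 33 to 126) can be checked on its own. The following is a minimal sketch, not code from the BIP: the helper name `is_valid_hrp` is hypothetical, and applications may restrict the HRP further than this generic check does.

```python
def is_valid_hrp(hrp: str) -> bool:
    """Hypothetical helper: apply the generic HRP rule from the updated text."""
    # The HRP must contain between 1 and 83 characters, inclusive.
    if not 1 <= len(hrp) <= 83:
        return False
    # Every character must be US-ASCII with a value in the range [33, 126].
    return all(33 <= ord(ch) <= 126 for ch in hrp)
```

Under this check, an empty HRP and HRPs containing 0x20, 0x7F, or 0x80 are rejected, which matches the reasons given for the corresponding invalid test vectors.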
@@ -153,7 +153,7 @@ guarantees detection of '''any error affecting at most 4 characters'''
 and has less than a 1 in 10<sup>9</sup> chance of failing to detect more
 errors. More details about the properties can be found in the
 Checksum Design appendix. The human-readable part is processed by first
-feeding the higher bits of each character's ASCII value into the
+feeding the higher bits of each character's US-ASCII value into the
 checksum calculation followed by a zero and then the lower bits of each<ref>'''Why are the high bits of the human-readable part processed first?'''
 This results in the actually checksummed data being ''[high hrp] 0 [low hrp] [data]''. This means that under the assumption that errors to the
 human readable part only change the low 5 bits (like changing an alphabetical character into another), errors are restricted to the ''[low hrp] [data]''
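
The ''[high hrp] 0 [low hrp]'' layout described in this hunk can be sketched as follows. This is an illustrative sketch only: the function name `hrp_expand` is an assumption, and the input is assumed to be already lowercase and restricted to US-ASCII values 33 to 126.

```python
def hrp_expand(hrp: str) -> list:
    """Expand an HRP into the values fed to the checksum (illustrative sketch)."""
    # Higher bits: everything above the low 5 bits of each character value.
    high = [ord(ch) >> 5 for ch in hrp]
    # Lower bits: the low 5 bits of each character value.
    low = [ord(ch) & 31 for ch in hrp]
    # A single zero separates the two halves.
    return high + [0] + low
```

For example, `hrp_expand("bc")` yields `[3, 3, 0, 2, 3]`: replacing "b" with "c" changes only a value in the second half, which is the property the footnote relies on.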
@@ -182,11 +182,15 @@ to make.
 
 '''Uppercase/lowercase'''
 
-Decoders MUST accept both uppercase and lowercase strings, but
-not mixed case. The lowercase form is used when determining a character's
-value for checksum purposes. For presentation, lowercase is usually
-preferable, but inside QR codes uppercase SHOULD be used, as those permit
-the use of
+The lowercase form is used when determining a character's value for checksum purposes.
+
+Encoders MUST always output an all lowercase Bech32 string.
+If an uppercase version of the encoding result is desired, (e.g.- for presentation purposes, or QR code use),
+then an uppercasing procedure can be performed external to the encoding process.
+
+Decoders MUST NOT accept strings where some characters are uppercase and some are lowercase (such strings are referred to as mixed case strings).
+
+For presentation, lowercase is usually preferable, but inside QR codes uppercase SHOULD be used, as those permit the use of
 ''[http://www.thonky.com/qr-code-tutorial/alphanumeric-mode-encoding alphanumeric mode]'', which is 45% more compact than the normal
 ''[http://www.thonky.com/qr-code-tutorial/byte-mode-encoding byte mode]''.
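
The casing rules in this hunk can be illustrated with two small helpers. This is a sketch under assumptions: the names `normalize_case` and `for_qr_code` are hypothetical and do not appear in the BIP; only the rules they implement (reject mixed case, checksum on the lowercase form, uppercase applied outside the encoder) come from the text.

```python
from typing import Optional


def normalize_case(bech: str) -> Optional[str]:
    """Hypothetical helper: return the lowercase form, or None for mixed case."""
    # A string is mixed case if it is neither all lowercase nor all uppercase.
    if bech.lower() != bech and bech.upper() != bech:
        return None
    # The lowercase form is what the checksum is computed over.
    return bech.lower()


def for_qr_code(encoded_lowercase: str) -> str:
    """Hypothetical helper: uppercase an encoder's lowercase output for QR use."""
    # Uppercasing happens outside the encoder, which always emits lowercase.
    return encoded_lowercase.upper()
```

With these helpers, both `A12UEL5L` and `a12uel5l` normalize to the same lowercase string, while a mixed case string such as `A12uel5l` is rejected before any checksum is computed.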

@@ -262,22 +266,33 @@ P2PKH addresses can be used.
 
 ===Test vectors===
 
-The following strings have a valid Bech32 checksum.
+The following strings are valid Bech32:
 * <tt>A12UEL5L</tt>
+* <tt>a12uel5l</tt>
 * <tt>an83characterlonghumanreadablepartthatcontainsthenumber1andtheexcludedcharactersbio1tt5tgs</tt>
 * <tt>abcdef1qpzry9x8gf2tvdw0s3jn54khce6mua7lmqqqxw</tt>
 * <tt>11qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqc8247j</tt>
 * <tt>split1checkupstagehandshakeupstreamerranterredcaperred2y9e3w</tt>
+* <tt>?1ezyfcl</tt> WARNING: During conversion to US-ASCII some encoders may set unmappable characters to a valid US-ASCII character, such as '?'. For example:
+
+<pre>
+>>> bech32_encode('\x80'.encode('ascii', 'replace').decode('ascii'), [])
+'?1ezyfcl'
+</pre>
 
-The following strings have an invalid Bech32 checksum (with reason for invalidity):
+The following string are not valid Bech32 (with reason for invalidity):
 * 0x20 + <tt>1nwldj5</tt>: HRP character out of range
 * 0x7F + <tt>1axkwrx</tt>: HRP character out of range
+* 0x80 + <tt>1eym55h</tt>: HRP character out of range
 * <tt>an84characterslonghumanreadablepartthatcontainsthenumber1andtheexcludedcharactersbio1569pvx</tt>: overall max length exceeded
 * <tt>pzry9x0s0muk</tt>: No separator character
 * <tt>1pzry9x0s0muk</tt>: Empty HRP
 * <tt>x1b4n0q5v</tt>: Invalid data character
 * <tt>li1dgmt3</tt>: Too short checksum
 * <tt>de1lg7wt</tt> + 0xFF: Invalid character in checksum
+* <tt>A1G7SGD8</tt>: checksum calculated with uppercase form of HRP
+* <tt>10a06t8</tt>: empty HRP
+* <tt>1qzzfhee</tt>: empty HRP
 
 The following list gives valid segwit addresses and the scriptPubKey that they
 translate to in hex.
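
The Bech32 test vectors in this hunk can be exercised with a compact encoder and decoder. The sketch below is modeled on the Python reference code for this BIP and reuses the `bech32_encode` name from the `<pre>` example; treat it as an illustration under that assumption rather than a normative implementation. A failed decode is signalled here by returning `(None, None)`.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]


def bech32_polymod(values):
    """Compute the BCH checksum state over a list of 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def bech32_hrp_expand(hrp):
    """Return [high bits of hrp] + [0] + [low bits of hrp]."""
    return [ord(x) >> 5 for x in hrp] + [0] + [ord(x) & 31 for x in hrp]


def bech32_create_checksum(hrp, data):
    """Compute the six 5-bit checksum values for an HRP and data part."""
    values = bech32_hrp_expand(hrp) + data
    polymod = bech32_polymod(values + [0, 0, 0, 0, 0, 0]) ^ 1
    return [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]


def bech32_encode(hrp, data):
    """Encode a lowercase HRP and 5-bit data values as a Bech32 string."""
    combined = data + bech32_create_checksum(hrp, data)
    return hrp + "1" + "".join(CHARSET[d] for d in combined)


def bech32_decode(bech):
    """Return (hrp, data) for a valid string, or (None, None) otherwise."""
    # Reject characters outside [33, 126] and mixed case strings.
    if any(ord(x) < 33 or ord(x) > 126 for x in bech):
        return (None, None)
    if bech.lower() != bech and bech.upper() != bech:
        return (None, None)
    bech = bech.lower()
    # The last "1" is the separator; the HRP must be non-empty, the
    # checksum must be six characters, and the total at most 90.
    pos = bech.rfind("1")
    if pos < 1 or pos + 7 > len(bech) or len(bech) > 90:
        return (None, None)
    if not all(x in CHARSET for x in bech[pos + 1:]):
        return (None, None)
    hrp = bech[:pos]
    data = [CHARSET.find(x) for x in bech[pos + 1:]]
    if bech32_polymod(bech32_hrp_expand(hrp) + data) != 1:
        return (None, None)
    return (hrp, data[:-6])
```

In this sketch `bech32_decode("A12UEL5L")` and `bech32_decode("a12uel5l")` both give the HRP `a` with an empty data part, while `A1G7SGD8`, `10a06t8`, and `1qzzfhee` are rejected, in line with the added invalid vectors.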
