@@ -25,80 +25,133 @@ Standard Annex #44, `"Unicode Character Database"
2525<https://www.unicode.org/reports/tr44/> `_.  It defines the
2626following functions:
2727
28+ .. seealso ::
29+ 
30+    The :ref: `unicode-howto ` for more information about Unicode and how to use
31+    this module.
32+ 
2833
2934.. function :: lookup(name) 
3035
3136   Look up character by name.  If a character with the given name is found, return
3237   the corresponding character.  If not found, :exc: `KeyError ` is raised.
38+    For example::
39+ 
40+       >>> unicodedata.lookup('LEFT CURLY BRACKET') 
41+       '{' 
42+ 
43+    The characters returned by this function are the same as those produced by the
44+    ``\N`` escape sequence in string literals. For example::
45+ 
46+       >>> unicodedata.lookup('MIDDLE DOT') == '\N{MIDDLE DOT}' 
47+       True 
3348
3449   .. versionchanged :: 3.3 
3550      Support for name aliases [# ]_ and named sequences [# ]_ has been added.
3651
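Callers that prefer a fallback value over an exception can wrap the call. The following is a minimal sketch; ``safe_lookup`` is a hypothetical helper used for illustration and is not part of the module:

```python
import unicodedata

def safe_lookup(name, default=None):
    """Return the character called *name*, or *default* if no such name exists.

    Hypothetical helper: unicodedata.lookup() itself raises KeyError.
    """
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return default

print(safe_lookup('LEFT CURLY BRACKET'))      # {
print(safe_lookup('NO SUCH CHARACTER NAME'))  # None
```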
3752
38- .. function :: name(chr[ , default] ) 
53+ .. function :: name(chr, default=None, / ) 
3954
4055   Returns the name assigned to the character *chr * as a string. If no
4156   name is defined, *default * is returned, or, if not given, :exc: `ValueError ` is
42-    raised.
57+    raised. For example::
58+ 
59+       >>> unicodedata.name('½') 
60+       'VULGAR FRACTION ONE HALF' 
61+       >>> unicodedata.name('\uFFFF', 'fallback') 
62+       'fallback' 
4363
4464
45- .. function :: decimal(chr[ , default] ) 
65+ .. function :: decimal(chr, default=None, / ) 
4666
4767   Returns the decimal value assigned to the character *chr * as integer.
4868   If no such value is defined, *default * is returned, or, if not given,
49-    :exc: `ValueError ` is raised.
69+    :exc: `ValueError ` is raised. For example:: 
5070
71+       >>> unicodedata.decimal('\N{ARABIC-INDIC DIGIT NINE}') 
72+       9 
73+       >>> unicodedata.decimal('\N{SUPERSCRIPT NINE}', -1) 
74+       -1 
5175
52- .. function :: digit(chr[, default]) 
76+ 
77+ .. function :: digit(chr, default=None, /) 
5378
5479   Returns the digit value assigned to the character *chr * as integer.
5580   If no such value is defined, *default * is returned, or, if not given,
56-    :exc: `ValueError ` is raised.
81+    :exc: `ValueError ` is raised::
82+ 
83+       >>> unicodedata.digit('\N{SUPERSCRIPT NINE}') 
84+       9 
5785
5886
59- .. function :: numeric(chr[ , default] ) 
87+ .. function :: numeric(chr, default=None, / ) 
6088
6189   Returns the numeric value assigned to the character *chr * as float.
6290   If no such value is defined, *default * is returned, or, if not given,
63-    :exc: `ValueError ` is raised.
91+    :exc: `ValueError ` is raised::
92+ 
93+       >>> unicodedata.numeric('½') 
94+       0.5 
6495
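The three numeric properties are progressively broader: a character with a decimal value also has a digit and a numeric value, but not the other way around. The sketch below illustrates this by passing ``None`` as *default*; ``numeric_values`` is a hypothetical helper name:

```python
import unicodedata

def numeric_values(ch):
    """Return (decimal, digit, numeric) for *ch*, using None where undefined."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

for ch in ('9', '\N{SUPERSCRIPT NINE}', '\N{VULGAR FRACTION ONE HALF}'):
    print(unicodedata.name(ch), numeric_values(ch))
# DIGIT NINE (9, 9, 9.0)
# SUPERSCRIPT NINE (None, 9, 9.0)
# VULGAR FRACTION ONE HALF (None, None, 0.5)
```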
6596
6697.. function :: category(chr) 
6798
6899   Returns the general category assigned to the character *chr * as
69-    string.
100+    string. General category names consist of two letters.
101+    See the `General Category Values section of the Unicode Character 
102+    Database documentation <https://www.unicode.org/reports/tr44/#General_Category_Values> `_
103+    for a list of category codes. For example::
104+ 
105+       >>> unicodedata.category('A')  # 'L'etter, 'u'ppercase 
106+       'Lu' 
70107
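Because the first letter of a category code names the major class (letter, number, punctuation, and so on), it can be used to group characters. A minimal sketch, with ``major_classes`` as a hypothetical helper name:

```python
import unicodedata
from collections import Counter

def major_classes(text):
    """Count characters by the first letter of their general category."""
    return Counter(unicodedata.category(ch)[0] for ch in text)

counts = major_classes('Hello, World 42!')
print(counts['L'], counts['N'], counts['P'], counts['Z'])  # 10 2 2 2
```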
71108
72109.. function :: bidirectional(chr) 
73110
74111   Returns the bidirectional class assigned to the character *chr * as
75112   string. If no such value is defined, an empty string is returned.
113+    See the `Bidirectional Class Values section of the Unicode Character 
114+    Database <https://www.unicode.org/reports/tr44/#Bidi_Class_Values> `_
115+    documentation for a list of bidirectional codes. For example::
116+ 
117+       >>> unicodedata.bidirectional('\N{ARABIC-INDIC DIGIT SEVEN}') # 'A'rabic, 'N'umber 
118+       'AN' 
76119
77120
78121.. function :: combining(chr) 
79122
80123   Returns the canonical combining class assigned to the character *chr *
81124   as integer. Returns ``0 `` if no combining class is defined.
125+    See the `Canonical Combining Class Values section of the Unicode Character 
126+    Database <https://www.unicode.org/reports/tr44/#Canonical_Combining_Class_Values>`_
127+    for more information.
82128
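One common use of the combining class is to drop accents after decomposing a string. The sketch below assumes that removing every character with a non-zero combining class is acceptable for the application; some marks have class ``0`` and are kept, so this is an approximation rather than a complete mark-stripping routine. ``strip_marks`` is a hypothetical helper name:

```python
import unicodedata

def strip_marks(text):
    """Decompose *text* (NFD) and drop characters with a non-zero combining class."""
    decomposed = unicodedata.normalize('NFD', text)
    return ''.join(ch for ch in decomposed if unicodedata.combining(ch) == 0)

print(unicodedata.combining('\N{COMBINING ACUTE ACCENT}'))  # 230
print(unicodedata.combining('a'))                           # 0
print(strip_marks('caf\N{LATIN SMALL LETTER E WITH ACUTE}'))  # cafe
```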
83129
84130.. function :: east_asian_width(chr) 
85131
86132   Returns the east asian width assigned to the character *chr * as
87-    string.
133+    string. For a list of widths and for more information, see
134+    `Unicode Standard Annex #11 <https://www.unicode.org/reports/tr11/>`_.
88135
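The width property is often used to estimate how many columns a string occupies in a monospaced display. The following sketch counts wide (``'W'``) and fullwidth (``'F'``) characters as two columns and everything else as one; it deliberately ignores combining and control characters, so it is only an approximation. ``display_width`` is a hypothetical helper name:

```python
import unicodedata

def display_width(text):
    """Estimate display columns: 2 for wide/fullwidth characters, 1 otherwise."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
        for ch in text
    )

print(unicodedata.east_asian_width('A'))  # Na
print(display_width('abc'))               # 3
print(display_width('\u65e5\u672c\u8a9e'))  # 6 (three wide CJK ideographs)
```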
89136
90137.. function :: mirrored(chr) 
91138
92139   Returns the mirrored property assigned to the character *chr * as
93140   integer. Returns ``1 `` if the character has been identified as a "mirrored"
94-    character in bidirectional text, ``0 `` otherwise.
141+    character in bidirectional text, ``0 `` otherwise. For example::
142+ 
143+       >>> unicodedata.mirrored('>') 
144+       1 
95145
96146
97147.. function :: decomposition(chr) 
98148
99149   Returns the character decomposition mapping assigned to the character
100150   *chr * as string. An empty string is returned in case no such mapping is
101-    defined.
151+    defined. For example::
152+ 
153+       >>> unicodedata.decomposition('Ã') 
154+       '0041 0303' 
102155
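The returned string is a space-separated list of hexadecimal code points, optionally preceded by a tag such as ``<compat>`` for compatibility mappings. A minimal sketch of turning it back into characters; ``parse_decomposition`` is a hypothetical helper name:

```python
import unicodedata

def parse_decomposition(ch):
    """Split the decomposition mapping of *ch* into (tag, characters).

    *tag* is None for a canonical mapping (or when no mapping exists),
    otherwise a string such as '<compat>'.
    """
    fields = unicodedata.decomposition(ch).split()
    tag = None
    if fields and fields[0].startswith('<'):
        tag = fields.pop(0)
    return tag, ''.join(chr(int(code_point, 16)) for code_point in fields)

print(parse_decomposition('\N{LATIN CAPITAL LETTER A WITH TILDE}'))
print(parse_decomposition('\N{ROMAN NUMERAL ONE}'))  # ('<compat>', 'I')
print(parse_decomposition('A'))                      # (None, '')
```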
103156
104157.. function :: normalize(form, unistr) 
@@ -122,9 +175,9 @@ following functions:
122175   normally would be unified with other characters. For example, U+2160 (ROMAN
123176   NUMERAL ONE) is really the same thing as U+0049 (LATIN CAPITAL LETTER I).
124177   However, it is supported in Unicode for compatibility with existing character
125-    sets (e.g.  gb2312).
178+    sets (for example, gb2312).
126179
127-    The normal form KD (NFKD) will apply the compatibility decomposition, i.e. 
180+    The normal form KD (NFKD) will apply the compatibility decomposition, that is, 
128181   replace all compatibility characters with their equivalents. The normal form KC
129182   (NFKC) first applies the compatibility decomposition, followed by the canonical
130183   composition.
@@ -133,6 +186,7 @@ following functions:
133186   a human reader, if one has combining characters and the other
134187   doesn't, they may not compare equal.
135188
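To compare such strings reliably, convert both to the same normal form first. The sketch below shows the difference between canonical and compatibility forms; ``equivalent`` is a hypothetical helper name:

```python
import unicodedata

def equivalent(a, b, form='NFC'):
    """Compare two strings after converting both to the same normal form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

composed = '\N{LATIN SMALL LETTER E WITH ACUTE}'  # one code point
decomposed = 'e\N{COMBINING ACUTE ACCENT}'        # two code points

print(composed == decomposed)            # False
print(equivalent(composed, decomposed))  # True
# Only the compatibility forms fold U+2160 ROMAN NUMERAL ONE into 'I'.
print(equivalent('\N{ROMAN NUMERAL ONE}', 'I', form='NFKC'))  # True
print(equivalent('\N{ROMAN NUMERAL ONE}', 'I', form='NFC'))   # False
```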
189+ 
136190.. function :: is_normalized(form, unistr) 
137191
138192   Return whether the Unicode string *unistr * is in the normal form *form *. Valid
@@ -154,24 +208,6 @@ In addition, the module exposes the following constant:
154208   Unicode database version 3.2 instead, for applications that require this
155209   specific version of the Unicode database (such as IDNA).
156210
157- Examples:
158- 
159-    >>> import unicodedata
160-    >>> unicodedata.lookup('LEFT CURLY BRACKET')
161-    '{'
162-    >>> unicodedata.name('/')
163-    'SOLIDUS'
164-    >>> unicodedata.decimal('9')
165-    9
166-    >>> unicodedata.decimal('a')
167-    Traceback (most recent call last):
168-      File "<stdin>", line 1, in <module>
169-    ValueError: not a decimal
170-    >>> unicodedata.category('A')  # 'L'etter, 'u'ppercase
171-    'Lu'
172-    >>> unicodedata.bidirectional('\u0660') # 'A'rabic, 'N'umber
173-    'AN'
174- 
175211
176212.. rubric :: Footnotes 
177213