@@ -68,11 +68,21 @@ The full details for each codec can also be looked up directly:
    Looks up the codec info in the Python codec registry and returns a
    :class:`CodecInfo` object as defined below.
 
-   Encodings are first looked up in the registry's cache. If not found, the list of
+   This function first normalizes the *encoding*: all ASCII letters are
+   converted to lower case and spaces are replaced with hyphens.
+   The encoding is then looked up in the registry's cache. If not found, the list of
    registered search functions is scanned. If no :class:`CodecInfo` object is
    found, a :exc:`LookupError` is raised. Otherwise, the :class:`CodecInfo` object
    is stored in the cache and returned to the caller.
 
+   .. versionchanged:: 3.9
+      Any characters except ASCII letters, digits, and a dot are converted to underscores.
+
+   .. versionchanged:: next
+      Characters are no longer converted to underscores.
+      Spaces are converted to hyphens.
+
+
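As a minimal sketch of the lookup behaviour described above (not part of the patch itself), the following shows a successful lookup and the :exc:`LookupError` path. The name ``"no-such-codec-demo"`` is made up for illustration and is assumed not to be registered.

```python
import codecs

# A successful lookup returns a CodecInfo object.
info = codecs.lookup("UTF-8")
print(type(info).__name__, info.name)

# An unknown name raises LookupError once the cache and every
# registered search function have failed to produce a CodecInfo.
# "no-such-codec-demo" is a hypothetical, unregistered name.
lookup_failed = False
try:
    codecs.lookup("no-such-codec-demo")
except LookupError as exc:
    lookup_failed = True
    print("lookup failed:", exc)
```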
 .. class:: CodecInfo(encode, decode, streamreader=None, streamwriter=None, incrementalencoder=None, incrementaldecoder=None, name=None)
 
    Codec details when looking up the codec registry. The constructor
@@ -167,14 +177,11 @@ function:
 .. function:: register(search_function, /)
 
    Register a codec search function. Search functions are expected to take one
-   argument, being the encoding name in all lower case letters with hyphens
-   and spaces converted to underscores, and return a :class:`CodecInfo` object.
+   argument, being the encoding name in all lower case letters with spaces
+   converted to hyphens, and return a :class:`CodecInfo` object.
    In case a search function cannot find a given encoding, it should return
    ``None``.
 
-   .. versionchanged:: 3.9
-      Hyphens and spaces are converted to underscore.
-
 
 .. function:: unregister(search_function, /)
 
@@ -1065,7 +1072,7 @@ or with dictionaries as mapping tables. The following table lists the codecs by
 name, together with a few common aliases, and the languages for which the
 encoding is likely used. Neither the list of aliases nor the list of languages
 is meant to be exhaustive. Notice that spelling alternatives that only differ in
-case or use a hyphen instead of an underscore are also valid aliases
+case or use a space or a hyphen instead of an underscore are also valid aliases
 because they are equivalent when normalized by
 :func:`~encodings.normalize_encoding`. For example, ``'utf-8'`` is a valid
 alias for the ``'utf_8'`` codec.
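A quick way to check the alias rule is to look up several spellings and compare the resulting codec names. The sketch below uses only case, hyphen, and underscore variants; spellings with a space are left out because their handling is what this change touches.

```python
import codecs

# Spellings that differ only in case or in hyphen versus underscore.
spellings = ["utf_8", "utf-8", "UTF-8", "Utf_8"]

# Collect the canonical name reported by each lookup.
names = {codecs.lookup(spelling).name for spelling in spellings}
print(names)
```

All four spellings should resolve to a single codec, so the set is expected to contain exactly one name.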