Commit 9417ea5

[3.13] gh-87281: Improve documentation for locale.setlocale() and locale.getlocale() (GH-137313) (GH-137723)
Add a section explaining the locale name formats.

(cherry picked from commit 15ab457)

Co-authored-by: Serhiy Storchaka <[email protected]>
1 parent 3a74d52 commit 9417ea5

1 file changed: +74 -10 lines changed

Doc/library/locale.rst

Lines changed: 74 additions & 10 deletions
@@ -34,12 +34,17 @@ The :mod:`locale` module defines the following exception and functions:
 
    If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
    setting for the *category*. The available categories are listed in the data
-   description below. *locale* may be a string, or an iterable of two strings
-   (language code and encoding). If it's an iterable, it's converted to a locale
-   name using the locale aliasing engine. An empty string specifies the user's
+   description below. *locale* may be a :ref:`string <locale_name>`, or a pair,
+   language code and encoding. An empty string specifies the user's
    default settings. If the modification of the locale fails, the exception
    :exc:`Error` is raised. If successful, the new locale setting is returned.
 
+   If *locale* is a pair, it is converted to a locale name using
+   the locale aliasing engine.
+   The language code has the same format as a :ref:`locale name <locale_name>`,
+   but without encoding and ``@``-modifier.
+   The language code and encoding can be ``None``.
+
    If *locale* is omitted or ``None``, the current setting for *category* is
    returned.
 
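The ``setlocale()`` wording in this hunk (a string or a pair, ``Error`` on failure, the new setting returned on success) can be illustrated with a minimal sketch. The helper name ``try_setlocale`` is hypothetical and not part of the commit; whether a pair such as ``("en_US", "UTF-8")`` is accepted depends on the locales installed on the system.

```python
import locale


def try_setlocale(category, value):
    """Apply *value* to *category*, return the new setting, then restore.

    Hypothetical helper: returns None when the platform rejects the
    requested locale (locale.Error is raised by setlocale()).
    """
    saved = locale.setlocale(category)       # locale omitted: query only
    try:
        return locale.setlocale(category, value)
    except locale.Error:
        return None
    finally:
        locale.setlocale(category, saved)    # put the previous setting back


# A locale name given as a string.
print(try_setlocale(locale.LC_CTYPE, "C"))
# A pair in which both the language code and the encoding are None.
print(try_setlocale(locale.LC_CTYPE, (None, None)))
# A (language code, encoding) pair; may print None if it is not installed.
print(try_setlocale(locale.LC_CTYPE, ("en_US", "UTF-8")))
```

Restoring the saved setting in ``finally`` keeps the sketch from leaking a locale change into the rest of the process.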

@@ -336,22 +341,26 @@ The :mod:`locale` module defines the following exception and functions:
    ``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
    ``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.
 
-   Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
-   *language code* and *encoding* may be ``None`` if their values cannot be
+   The language code has the same format as a :ref:`locale name <locale_name>`,
+   but without encoding and ``@``-modifier.
+   The language code and encoding may be ``None`` if their values cannot be
    determined.
+   The "C" locale is represented as ``(None, None)``.
 
    .. deprecated-removed:: 3.11 3.15
 
 
 .. function:: getlocale(category=LC_CTYPE)
 
-   Returns the current setting for the given locale category as sequence containing
-   *language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
-   except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
+   Returns the current setting for the given locale category as a tuple containing
+   the language code and encoding. *category* may be one of the :const:`!LC_\*`
+   values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
 
-   Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
-   *language code* and *encoding* may be ``None`` if their values cannot be
+   The language code has the same format as a :ref:`locale name <locale_name>`,
+   but without encoding and ``@``-modifier.
+   The language code and encoding may be ``None`` if their values cannot be
    determined.
+   The "C" locale is represented as ``(None, None)``.
 
 
 .. function:: getpreferredencoding(do_setlocale=True)
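The statement added in this hunk, that the "C" locale is represented as ``(None, None)``, can be checked with a short sketch. The helper name ``getlocale_in_c_locale`` is hypothetical; the sketch assumes the platform accepts ``"C"`` for ``LC_CTYPE``, which the new "Locale names" section says holds on all platforms.

```python
import locale


def getlocale_in_c_locale(category=locale.LC_CTYPE):
    """Switch *category* to the "C" locale, read it back, then restore.

    Hypothetical helper used only to illustrate the documented behaviour.
    """
    saved = locale.setlocale(category)       # remember the current setting
    try:
        locale.setlocale(category, "C")
        # getlocale() returns a (language code, encoding) tuple.
        return locale.getlocale(category)
    finally:
        locale.setlocale(category, saved)    # undo the temporary change


language_code, encoding = getlocale_in_c_locale()
print(language_code, encoding)
```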
@@ -606,6 +615,61 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
 part of a character class such as letter or whitespace.
 
 
+.. _locale_name:
+
+Locale names
+------------
+
+The format of the locale name is platform dependent, and the set of supported
+locales can depend on the system configuration.
+
+On Posix platforms, it usually has the format [1]_:
+
+.. productionlist:: locale_name
+   : language ["_" territory] ["." charset] ["@" modifier]
+
+where *language* is a two- or three-letter language code from `ISO 639`_,
+*territory* is a two-letter country or region code from `ISO 3166`_,
+*charset* is a locale encoding, and *modifier* is a script name,
+a language subtag, a sort order identifier, or other locale modifier
+(for example, "latin", "valencia", "stroke" and "euro").
+
+On Windows, several formats are supported. [2]_ [3]_
+A subset of `IETF BCP 47`_ tags:
+
+.. productionlist:: locale_name
+   : language ["-" script] ["-" territory] ["." charset]
+   : language ["-" script] "-" territory "-" modifier
+
+where *language* and *territory* have the same meaning as in Posix,
+*script* is a four-letter script code from `ISO 15924`_,
+and *modifier* is a language subtag, a sort order identifier
+or custom modifier (for example, "valencia", "stroke" or "x-python").
+Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
+Only UTF-8 encoding is allowed for BCP 47 tags.
+
+Windows also supports locale names in the format:
+
+.. productionlist:: locale_name
+   : language ["_" territory] ["." charset]
+
+where *language* and *territory* are full names, such as "English" and
+"United States", and *charset* is either a code page number (for example, "1252")
+or UTF-8.
+Only the underscore separator is supported in this format.
+
+The "C" locale is supported on all platforms.
+
+.. _ISO 639: https://www.iso.org/iso-639-language-code
+.. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
+.. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
+.. _ISO 15924: https://www.unicode.org/iso15924/
+
+.. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02>`_
+.. [2] `UCRT Locale names, Languages, and Country/Region strings <https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings>`_
+.. [3] `Locale Names <https://learn.microsoft.com/en-us/windows/win32/intl/locale-names>`_
+
+
 .. _embedding-locale:
 
 For extension writers and programs that embed Python
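The POSIX production added in the "Locale names" section, ``language ["_" territory] ["." charset] ["@" modifier]``, can be read as a simple splitter. The following is a minimal sketch, not the ``locale`` module's own parser: ``LocaleNameParts`` and ``split_posix_locale_name`` are hypothetical names, and the regular expression is deliberately lenient (it does not check ISO 639 or ISO 3166 code lengths, so ``"C"`` is accepted as a language).

```python
import re
from typing import NamedTuple, Optional


class LocaleNameParts(NamedTuple):
    """Hypothetical container for the four parts of a POSIX-style name."""
    language: str
    territory: Optional[str]
    charset: Optional[str]
    modifier: Optional[str]


# language ["_" territory] ["." charset] ["@" modifier]
_POSIX_NAME = re.compile(
    r"(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def split_posix_locale_name(name: str) -> LocaleNameParts:
    """Split *name* into its parts; absent optional parts become None."""
    match = _POSIX_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a POSIX-style locale name: {name!r}")
    return LocaleNameParts(**match.groupdict())


print(split_posix_locale_name("ca_ES.UTF-8@valencia"))
print(split_posix_locale_name("de_DE"))
```

The Windows formats described in the same section use different separators and full language names, so they would need their own patterns; this sketch covers only the POSIX production.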
