@@ -34,12 +34,17 @@ The :mod:`locale` module defines the following exception and functions:
3434
3535 If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
3636 setting for the *category*. The available categories are listed in the data
37- description below. *locale* may be a string, or an iterable of two strings
38- (language code and encoding). If it's an iterable, it's converted to a locale
39- name using the locale aliasing engine. An empty string specifies the user's
37+ description below. *locale* may be a :ref:`string <locale_name>`, or a pair,
38+ language code and encoding. An empty string specifies the user's
4039 default settings. If the modification of the locale fails, the exception
4140 :exc:`Error` is raised. If successful, the new locale setting is returned.
4241
42+ If *locale* is a pair, it is converted to a locale name using
43+ the locale aliasing engine.
44+ The language code has the same format as a :ref:`locale name <locale_name>`,
45+ but without encoding and ``@``-modifier.
46+ The language code and encoding can be ``None``.
47+
4348 If *locale* is omitted or ``None``, the current setting for *category* is
4449 returned.
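As a rough illustration of the three call forms described above, the following sketch queries the current setting, then tries a string and a pair. The helper name ``try_setlocale`` and the ``("en_US", "UTF-8")`` pair are assumptions made for this example; whether a given locale is available depends on the system configuration.

```python
import locale


def try_setlocale(category, value):
    """Hypothetical helper: return the new setting, or None on locale.Error."""
    try:
        return locale.setlocale(category, value)
    except locale.Error:
        return None


# Omitting *locale* only queries the current setting.
saved = locale.setlocale(locale.LC_CTYPE)

# A plain string; the "C" locale is supported on all platforms.
as_string = try_setlocale(locale.LC_CTYPE, "C")

# A (language code, encoding) pair; it may be unavailable on this system.
as_pair = try_setlocale(locale.LC_CTYPE, ("en_US", "UTF-8"))

# Restore the setting that was active before the experiment.
locale.setlocale(locale.LC_CTYPE, saved)
```

Catching :exc:`Error` keeps the sketch usable on systems where the requested locale is not installed.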
4550
@@ -336,22 +341,26 @@ The :mod:`locale` module defines the following exception and functions:
336341 ``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
337342 ``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.
338343
339- Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
340- *language code* and *encoding* may be ``None`` if their values cannot be
344+ The language code has the same format as a :ref:`locale name <locale_name>`,
345+ but without encoding and ``@``-modifier.
346+ The language code and encoding may be ``None`` if their values cannot be
341347 determined.
348+ The "C" locale is represented as ``(None, None)``.
342349
343350 .. deprecated-removed:: 3.11 3.15
344351
345352
346353.. function:: getlocale(category=LC_CTYPE)
347354
348- Returns the current setting for the given locale category as sequence containing
349- *language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
350- except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
355+ Returns the current setting for the given locale category as a tuple containing
356+ the language code and encoding. *category* may be one of the :const:`!LC_\*`
357+ values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
351358
352- Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
353- *language code* and *encoding* may be ``None`` if their values cannot be
359+ The language code has the same format as a :ref:`locale name <locale_name>`,
360+ but without encoding and ``@``-modifier.
361+ The language code and encoding may be ``None`` if their values cannot be
354362 determined.
363+ The "C" locale is represented as ``(None, None)``.
355364
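A minimal sketch of the tuple returned by :func:`getlocale`, assuming the process may switch ``LC_CTYPE`` temporarily. It selects the "C" locale, reads the setting back as a pair, and restores the previous setting; the variable names are illustrative only.

```python
import locale

# Remember the current LC_CTYPE setting so it can be restored afterwards.
saved = locale.setlocale(locale.LC_CTYPE)

# Switch to the "C" locale, which is supported on all platforms.
locale.setlocale(locale.LC_CTYPE, "C")

# getlocale() reports the setting as a (language code, encoding) tuple.
language_code, encoding = locale.getlocale(locale.LC_CTYPE)

# Restore the previous setting.
locale.setlocale(locale.LC_CTYPE, saved)
```

With the "C" locale active, both ``language_code`` and ``encoding`` are ``None``.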
356365
357366.. function:: getpreferredencoding(do_setlocale=True)
@@ -606,6 +615,61 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
606615part of a character class such as letter or whitespace.
607616
608617
618+ .. _locale_name:
619+
620+ Locale names
621+ ------------
622+
623+ The format of the locale name is platform dependent, and the set of supported
624+ locales can depend on the system configuration.
625+
626+ On Posix platforms, it usually has the format [1]_:
627+
628+ .. productionlist:: locale_name
629+ : language ["_" territory] ["." charset] ["@" modifier]
630+
631+ where *language* is a two- or three-letter language code from `ISO 639`_,
632+ *territory* is a two-letter country or region code from `ISO 3166`_,
633+ *charset* is a locale encoding, and *modifier* is a script name,
634+ a language subtag, a sort order identifier, or other locale modifier
635+ (for example, "latin", "valencia", "stroke" and "euro").
636+
637+ On Windows, several formats are supported. [2]_ [3]_
638+ A subset of `IETF BCP 47`_ tags:
639+
640+ .. productionlist:: locale_name
641+ : language ["-" script] ["-" territory] ["." charset]
642+ : language ["-" script] "-" territory "-" modifier
643+
644+ where *language* and *territory* have the same meaning as in Posix,
645+ *script* is a four-letter script code from `ISO 15924`_,
646+ and *modifier* is a language subtag, a sort order identifier
647+ or custom modifier (for example, "valencia", "stroke" or "x-python").
648+ Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
649+ Only UTF-8 encoding is allowed for BCP 47 tags.
650+
651+ Windows also supports locale names in the format:
652+
653+ .. productionlist:: locale_name
654+ : language ["_" territory] ["." charset]
655+
656+ where *language* and *territory* are full names, such as "English" and
657+ "United States", and *charset* is either a code page number (for example, "1252")
658+ or UTF-8.
659+ Only the underscore separator is supported in this format.
660+
661+ The "C" locale is supported on all platforms.
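The Posix production above can be sketched as a regular expression. This is an illustration of the grammar only, not how the :mod:`locale` module parses names; ``POSIX_LOCALE_RE`` and ``split_posix_locale_name`` are hypothetical names, and the pattern assumes the letter counts stated above for *language* and *territory*.

```python
import re

# language ["_" territory] ["." charset] ["@" modifier]
POSIX_LOCALE_RE = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def split_posix_locale_name(name):
    """Return the named parts as a dict, or None if *name* does not fit."""
    match = POSIX_LOCALE_RE.fullmatch(name)
    if match is None:
        return None
    # Optional parts that are absent are reported as None.
    return match.groupdict()


parts = split_posix_locale_name("ca_ES.UTF-8@valencia")
```

Names outside this grammar, such as the one-letter "C" locale, are not matched and yield ``None`` in this sketch.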
662+
663+ .. _ISO 639: https://www.iso.org/iso-639-language-code
664+ .. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
665+ .. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
666+ .. _ISO 15924: https://www.unicode.org/iso15924/
667+
668+ .. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02>`_
669+ .. [2] `UCRT Locale names, Languages, and Country/Region strings <https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings>`_
670+ .. [3] `Locale Names <https://learn.microsoft.com/en-us/windows/win32/intl/locale-names>`_
671+
672+
609673 .. _embedding-locale:
610674
611675For extension writers and programs that embed Python