@@ -34,12 +34,17 @@ The :mod:`locale` module defines the following exception and functions:

  If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
  setting for the *category*. The available categories are listed in the data
- description below. *locale* may be a string, or an iterable of two strings
- (language code and encoding). If it's an iterable, it's converted to a locale
- name using the locale aliasing engine. An empty string specifies the user's
+ description below. *locale* may be a :ref:`string <locale_name>`, or a pair,
+ language code and encoding. An empty string specifies the user's
  default settings. If the modification of the locale fails, the exception
  :exc:`Error` is raised. If successful, the new locale setting is returned.

+ If *locale* is a pair, it is converted to a locale name using
+ the locale aliasing engine.
+ The language code has the same format as a :ref:`locale name <locale_name>`,
+ but without encoding and ``@``-modifier.
+ The language code and encoding can be ``None``.
+
  If *locale* is omitted or ``None``, the current setting for *category* is
  returned.

@@ -345,22 +350,26 @@ The :mod:`locale` module defines the following exception and functions:
  ``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
  ``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.

- Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
- *language code* and *encoding* may be ``None`` if their values cannot be
+ The language code has the same format as a :ref:`locale name <locale_name>`,
+ but without encoding and ``@``-modifier.
+ The language code and encoding may be ``None`` if their values cannot be
  determined.
+ The "C" locale is represented as ``(None, None)``.

  .. deprecated-removed:: 3.11 3.15


  .. function:: getlocale(category=LC_CTYPE)

- Returns the current setting for the given locale category as sequence containing
- *language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
- except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
+ Returns the current setting for the given locale category as a tuple containing
+ the language code and encoding. *category* may be one of the :const:`!LC_\*`
+ values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.

- Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
- *language code* and *encoding* may be ``None`` if their values cannot be
+ The language code has the same format as a :ref:`locale name <locale_name>`,
+ but without encoding and ``@``-modifier.
+ The language code and encoding may be ``None`` if their values cannot be
  determined.
+ The "C" locale is represented as ``(None, None)``.


  .. function:: getpreferredencoding(do_setlocale=True)
@@ -615,6 +624,61 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
  part of a character class such as letter or whitespace.


+ .. _locale_name:
+
+ Locale names
+ ------------
+
+ The format of the locale name is platform dependent, and the set of supported
+ locales can depend on the system configuration.
+
+ On Posix platforms, it usually has the format [1]_:
+
+ .. productionlist:: locale_name
+    : language ["_" territory] ["." charset] ["@" modifier]
+
+ where *language* is a two- or three-letter language code from `ISO 639`_,
+ *territory* is a two-letter country or region code from `ISO 3166`_,
+ *charset* is a locale encoding, and *modifier* is a script name,
+ a language subtag, a sort order identifier, or other locale modifier
+ (for example, "latin", "valencia", "stroke" and "euro").
+
+ On Windows, several formats are supported. [2]_ [3]_
+ A subset of `IETF BCP 47`_ tags:
+
+ .. productionlist:: locale_name
+    : language ["-" script] ["-" territory] ["." charset]
+    : language ["-" script] "-" territory "-" modifier
+
+ where *language* and *territory* have the same meaning as in Posix,
+ *script* is a four-letter script code from `ISO 15924`_,
+ and *modifier* is a language subtag, a sort order identifier
+ or custom modifier (for example, "valencia", "stroke" or "x-python").
+ Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
+ Only UTF-8 encoding is allowed for BCP 47 tags.
+
+ Windows also supports locale names in the format:
+
+ .. productionlist:: locale_name
+    : language ["_" territory] ["." charset]
+
+ where *language* and *territory* are full names, such as "English" and
+ "United States", and *charset* is either a code page number (for example, "1252")
+ or UTF-8.
+ Only the underscore separator is supported in this format.
+
+ The "C" locale is supported on all platforms.
+
+ .. _ISO 639: https://www.iso.org/iso-639-language-code
+ .. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
+ .. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
+ .. _ISO 15924: https://www.unicode.org/iso15924/
+
+ .. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02>`_
+ .. [2] `UCRT Locale names, Languages, and Country/Region strings <https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings>`_
+ .. [3] `Locale Names <https://learn.microsoft.com/en-us/windows/win32/intl/locale-names>`_
+
+
  .. _embedding-locale:

  For extension writers and programs that embed Python
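The Posix locale-name grammar added in this diff, and the tuple form returned by getlocale(), can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: split_posix_locale_name, _POSIX_NAME and read_back_c_locale are hypothetical names that are not part of the locale module, and the regular expression follows the documented grammar literally (two- or three-letter language, two-letter territory), so real systems may accept names it rejects.

```python
import locale
import re

# Hypothetical pattern (an assumption, not from the locale module) mirroring:
#   language ["_" territory] ["." charset] ["@" modifier]
_POSIX_NAME = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def split_posix_locale_name(name):
    """Return (language, territory, charset, modifier); absent parts are None."""
    if name == "C":
        # The "C" locale has no territory, charset or modifier.
        return ("C", None, None, None)
    match = _POSIX_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Posix-style locale name: {name!r}")
    return match.group("language", "territory", "charset", "modifier")


def read_back_c_locale():
    """Set LC_CTYPE to "C", read it with getlocale(), then restore the old value."""
    saved = locale.setlocale(locale.LC_CTYPE)  # no second argument: query only
    try:
        locale.setlocale(locale.LC_CTYPE, "C")
        return locale.getlocale(locale.LC_CTYPE)
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)
```

For example, split_posix_locale_name("ca_ES.UTF-8@valencia") separates the four components, while a Windows full-name locale such as "English_United States.1252" raises ValueError because it does not fit the Posix grammar. read_back_c_locale() temporarily changes a process-wide setting, so it is best kept out of multi-threaded code; per the text in this diff, its result is documented to be the (None, None) pair.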