@@ -34,16 +34,15 @@ The :mod:`locale` module defines the following exception and functions:
3434
3535   If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
3636   setting for the *category*. The available categories are listed in the data
37-    description below. *locale* may be a string, or a pair,
38-    language code and encoding. If it is a pair, it is converted to a locale
39-    name using the locale aliasing engine. An empty string specifies the user's
37+    description below. *locale* may be a :ref:`string <locale_name>`, or a pair,
38+    language code and encoding. An empty string specifies the user's
4039   default settings. If the modification of the locale fails, the exception
4140   :exc:`Error` is raised. If successful, the new locale setting is returned.
4241
43-    The format of the *locale* and the language code strings is platform
44-    dependent, but the forms ``language[_territory][.encoding][@modifier]``
45-    and ``language[_territory]`` respectively are typically accepted on all
46-    platforms.
42+    If *locale* is a pair, it is converted to a locale name using
43+    the locale aliasing engine.
44+    The language code has the same format as a :ref:`locale name <locale_name>`,
45+    but without encoding and ``@``-modifier.
4746   The language code and encoding can be ``None``.
4847
4948   If *locale* is omitted or ``None``, the current setting for *category* is
@@ -351,8 +350,8 @@ The :mod:`locale` module defines the following exception and functions:
351350   ``'LANG'``.  The GNU gettext search path contains ``'LC_ALL'``,
352351   ``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.
353352
354-    The format of the language code is platform depended, but on Posix
355-    platforms it usually looks like ``language[_territory]``.
353+    The language code has the same format as a :ref:`locale name <locale_name>`,
354+    but without encoding and ``@``-modifier.
356355   The language code and encoding may be ``None`` if their values cannot be
357356   determined.
358357   The "C" locale is represented as ``(None, None)``.
@@ -366,8 +365,8 @@ The :mod:`locale` module defines the following exception and functions:
366365   the language code and encoding. *category* may be one of the :const:`!LC_\*`
367366   values except :const:`LC_ALL`.  It defaults to :const:`LC_CTYPE`.
368367
369-    The format of the language code is platform dependent, but on Posix
370-    platforms it usually looks like ``language[_territory]``.
368+    The language code has the same format as a :ref:`locale name <locale_name>`,
369+    but without encoding and ``@``-modifier.
371370   The language code and encoding may be ``None`` if their values cannot be
372371   determined.
373372   The "C" locale is represented as ``(None, None)``.
@@ -625,6 +624,59 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
625624part of a character class such as letter or whitespace.
626625
627626
627+ .. _locale_name:
628+
629+ Locale names
630+ ------------
631+
632+ The format of the locale name is platform dependent, and the set of supported
633+ locales can depend on the system configuration.
634+
635+ On Posix platforms, it usually has the format
636+
637+ .. productionlist:: locale_name
638+    : language ["_" territory] ["." charset] ["@" modifier]
639+
640+ where *language* is a two- or three-letter language code from `ISO 639`_,
641+ *territory* is a two-letter country or region code from ISO 3166,
642+ *charset* is a locale encoding, and *modifier* is a script name,
643+ a language subtag, a sort order identifier, or other locale modifier
644+ (e.g. "latin", "valencia", "stroke" and "euro").
645+ 
646+ On Windows, several formats are supported.
647+ A subset of `IETF BCP 47`_ tags:
648+
649+ .. productionlist:: locale_name
650+    : language ["-" script] ["-" territory] ["." charset]
651+    : language ["-" script] "-" territory "-" modifier
652+
653+ where *language* and *territory* have the same meaning as in Posix,
654+ *script* is a four-letter script code from `ISO 15924`_,
655+ and *modifier* is a language subtag, a sort order identifier
656+ or custom modifier (e.g. "valencia", "stroke" or "x-python").
657+ Both hyphen ("``-``") and underscore ("``_``") separators are supported.
658+ Only UTF-8 encoding is allowed for BCP 47 tags.
660+ Windows also supports locale names in the format
661+
662+ .. productionlist:: locale_name
663+    : language ["_" territory] ["." charset]
664+
665+ where *language* and *territory* are long names, such as "English" and
666+ "United States", and *charset* is either a code page number (e.g. "1252")
667+ or UTF-8.
668+ Only the underscore separator is supported in this format.
669+ 
670+ The "C" locale is supported on all platforms.
671+ 
672+ .. _ISO 639: https://www.iso.org/iso-639-language-code
673+ .. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
674+ .. _ISO 15924: https://www.unicode.org/iso15924/
675+ 
676+ ..  https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02
677+ ..  https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings
678+ 
679+ 
628680 .. _embedding-locale:
629681
630682For extension writers and programs that embed Python
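
Note on the Posix format documented in the new "Locale names" section: the sketch below is illustrative only and is not part of this change or of the :mod:`locale` module. It assumes the strict reading of the production ``language ["_" territory] ["." charset] ["@" modifier]`` with a two- or three-letter language and a two-letter territory. The names ``LocaleName``, ``parse_posix_locale_name`` and ``language_code`` are hypothetical helpers, and real systems may accept names that this pattern rejects (including the "C" locale, which does not follow the grammar).

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Hypothetical container for the parts of a Posix-style locale name."""
    language: str
    territory: Optional[str]
    charset: Optional[str]
    modifier: Optional[str]

    def language_code(self) -> str:
        # "Language code" as described in the docs: the locale name
        # without the encoding and without the @-modifier.
        if self.territory is None:
            return self.language
        return f"{self.language}_{self.territory}"


# Strict pattern for: language ["_" territory] ["." charset] ["@" modifier]
# (assumption: letters only for language and territory).
_POSIX_NAME = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def parse_posix_locale_name(name: str) -> Optional[LocaleName]:
    """Split a Posix-style locale name, or return None if it does not match."""
    match = _POSIX_NAME.fullmatch(name)
    if match is None:
        return None
    return LocaleName(**match.groupdict())


if __name__ == "__main__":
    for candidate in ("de_DE.ISO8859-15@euro", "ca_ES@valencia", "en", "C"):
        print(candidate, "->", parse_posix_locale_name(candidate))
```

The parser only splits the string. Whether a given name is actually installed still depends on the system configuration, so a call such as ``locale.setlocale(locale.LC_ALL, name)`` can raise :exc:`locale.Error` for a name that parses cleanly.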