diff --git a/xml/System/Char.xml b/xml/System/Char.xml
index 42f7f3d3e16..2cd1431f142 100644
--- a/xml/System/Char.xml
+++ b/xml/System/Char.xml
@@ -75,9 +75,11 @@ structure to represent a Unicode character. The Unicode Standard identifies each Unicode character with a unique 21-bit scalar number called a code point, and defines the UTF-16 encoding form that specifies how a code point is encoded into a sequence of one or more 16-bit values. Each 16-bit value ranges from hexadecimal 0x0000 through 0xFFFF and is stored in a structure. The value of a object is its 16-bit numeric (ordinal) value.
+.NET uses the structure to represent Unicode code points by using UTF-16 encoding. The value of a object is its 16-bit numeric (ordinal) value.
- The following sections examine the relationship between a object and a character and discuss some common tasks performed with instances.
+If you aren't familiar with Unicode, scalar values, code points, surrogate pairs, UTF-16, and the type, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction).
+
+The following sections examine the relationship between a object and a character and discuss some common tasks performed with instances. We recommend that you consider the type, introduced in .NET Core 3.0, as an alternative to for performing some of these tasks.
 - [Char objects, Unicode characters, and strings](#Relationship)
 - [Characters and character categories](#Categories)
@@ -103,9 +105,9 @@ ## Characters and character categories
- Each Unicode character or valid surrogate pair belongs to a Unicode category. In the .NET Framework, Unicode categories are represented by members of the enumeration and include values such as , , and , for example.
+Each Unicode character or valid surrogate pair belongs to a Unicode category. In the .NET Framework, Unicode categories are represented by members of the enumeration and include values such as , , and , for example.
- To determine the Unicode category of a character, you call the method. For example, the following example calls the to display the Unicode category of each character in a string.
+To determine the Unicode category of a character, call the method. For example, the following example calls the to display the Unicode category of each character in a string. The example works correctly only if there are no surrogate pairs in the instance.
 [!code-csharp[System.Char.Class#6](~/samples/snippets/csharp/VS_Snippets_CLR_System/system.char.class/cs/GetUnicodeCategory3.cs#6)]
 [!code-vb[System.Char.Class#6](~/samples/snippets/visualbasic/VS_Snippets_CLR_System/system.char.class/vb/GetUnicodeCategory3.vb#6)]
@@ -123,6 +125,10 @@ - You can work with a object in its entirety instead of working with its individual characters to represent and analyze linguistic content.
+- You can use as shown in the following example:
+
+ :::code language="csharp" source="~/snippets/System.Text/Rune/csharp/CountLettersInString.cs" id="SnippetGoodExample":::
+
 - You can use the class to work with text elements instead of individual objects. The following example uses the object to count the number of text elements in a string that consists of the Aegean numbers zero through nine. Because it considers a surrogate pair a single character, it correctly reports that the string contains ten characters.
 [!code-csharp[System.Char.Class#4](~/samples/snippets/csharp/VS_Snippets_CLR_System/system.char.class/cs/textelements2a.cs#4)]
@@ -140,14 +146,14 @@ |To do this|Use these `System.Char` methods|
 |----------------|-------------------------------------|
 |Compare objects| and |
-|Convert a code point to a string||
-|Convert a object or a surrogate pair of objects to a code point|For a single character:

For a surrogate pair or a character in a string: |
-|Get the Unicode category of a character||
-|Determine whether a character is in a particular Unicode category such as digit, letter, punctuation, control character, and so on|, , , , , , , , , , , , , , and |
-|Convert a object that represents a number to a numeric value type||
+|Convert a code point to a string|

See also the type.|
+|Convert a object or a surrogate pair of objects to a code point|For a single character:

For a surrogate pair or a character in a string:

See also the type.|
+|Get the Unicode category of a character|

See also .|
+|Determine whether a character is in a particular Unicode category such as digit, letter, punctuation, control character, and so on|, , , , , , , , , , , , , , and

See also corresponding methods on the type.|
+|Convert a object that represents a number to a numeric value type|

See also .|
 |Convert a character in a string into a object| and |
 |Convert a object to a object||
-|Change the case of a object|, , , and |
+|Change the case of a object|, , , and

See also corresponding methods on the type.|
 ## Char values and interop
@@ -423,6 +429,7 @@ When a managed type, which is represented as a Unicode UTF-16 is not a valid 21-bit Unicode code point ranging from U+0 through U+10FFFF, excluding the surrogate pair range from U+D800 through U+DFFF.
+
@@ -503,6 +510,7 @@ When a managed type, which is represented as a Unicode UTF-16 is not in the range U+D800 through U+DBFF, or is not in the range U+DC00 through U+DFFF.
+
@@ -571,6 +579,7 @@ When a managed type, which is represented as a Unicode UTF-16 is not a position within . The specified index position contains a surrogate pair, and either the first character in the pair is not a valid high surrogate or the second character in the pair is not a valid low surrogate.
+
@@ -824,6 +833,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -1021,6 +1031,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -1114,6 +1125,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -1322,6 +1334,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -1475,6 +1488,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -1543,6 +1557,7 @@ When a managed type, which is represented as a Unicode UTF-16 is . is not a position within .
+
@@ -1644,6 +1659,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -1816,6 +1832,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -1966,6 +1983,8 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
+
@@ -2118,6 +2137,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -2186,7 +2206,8 @@ When a managed type, which is represented as a Unicode UTF-16 is . is not a position within .
-
+
+
@@ -2270,6 +2291,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -2494,6 +2516,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -2697,6 +2720,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -2857,6 +2881,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -2929,6 +2954,7 @@ When a managed type, which is represented as a Unicode UTF-16 is . is less than zero or greater than the last position in .
+
@@ -3007,6 +3033,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -3075,6 +3102,7 @@ When a managed type, which is represented as a Unicode UTF-16 is . is not a position within .
+
@@ -3194,6 +3222,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -3354,6 +3383,8 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
+
@@ -3525,6 +3556,7 @@ When a managed type, which is represented as a Unicode UTF-16 ]]>
+
@@ -4751,6 +4783,8 @@ This member is an explicit interface member implementation. It can be used only As explained in [Best Practices for Using Strings](~/docs/standard/base-types/best-practices-strings.md), we recommend that you avoid calling character-casing and string-casing methods that substitute default values. Instead, you should call methods that require parameters to be explicitly specified. To convert a character to lowercase by using the casing conventions of the current culture, call the method overload with a value of for its parameter.
+
+
@@ -4818,6 +4852,8 @@ This member is an explicit interface member implementation. It can be used only is .
+
+
@@ -4878,6 +4914,7 @@ This member is an explicit interface member implementation. It can be used only ]]>
+
@@ -5148,6 +5185,8 @@ This member is an explicit interface member implementation. It can be used only As explained in [Best Practices for Using Strings](~/docs/standard/base-types/best-practices-strings.md), we recommend that you avoid calling character-casing and string-casing methods that substitute default values. Instead, you should call methods that require parameters to be explicitly specified. To convert a character to uppercase by using the casing conventions of the current culture, call the method overload with a value of for its parameter.
+
+
@@ -5214,6 +5253,8 @@ This member is an explicit interface member implementation. It can be used only is .
+
+
@@ -5274,6 +5315,7 @@ This member is an explicit interface member implementation. It can be used only ]]>
+