|
98 | 98 |
|
99 | 99 | A string is a sequential collection of characters that is used to represent text. A <xref:System.String> object is a sequential collection of <xref:System.Char?displayProperty=nameWithType> objects that represent a string; a <xref:System.Char?displayProperty=nameWithType> object corresponds to a UTF-16 code unit. The value of the <xref:System.String> object is the content of the sequential collection of <xref:System.Char?displayProperty=nameWithType> objects, and that value is immutable (that is, it is read-only). For more information about the immutability of strings, see the [Immutability and the StringBuilder class](#Immutability) section later in this topic. The maximum size of a <xref:System.String> object in memory is 2GB, or about 1 billion characters.
|
100 | 100 |
|
| 101 | +For more information about Unicode, UTF-16, code units, code points, and the <xref:System.Char> and <xref:System.Text.Rune> types, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction). |
| 102 | + |
101 | 103 | In this section:
|
102 | 104 |
|
103 | 105 | [Instantiate a String object](#Instantiation)\
|
|
214 | 216 | [!code-csharp-interactive[System.String.Class#5](~/samples/snippets/csharp/VS_Snippets_CLR_System/system.String.Class/cs/index2.cs#5)]
|
215 | 217 | [!code-vb[System.String.Class#5](~/samples/snippets/visualbasic/VS_Snippets_CLR_System/system.String.Class/vb/index2.vb#5)]
|
216 | 218 |
|
217 | | - Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character. |
| 219 | + Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes, or the <xref:System.String.EnumerateRunes%2A?displayProperty=nameWithType> method and the <xref:System.Text.Rune> type. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character. |
218 | 220 |
|
219 | 221 | [!code-cpp[System.String.Class#6](~/samples/snippets/cpp/VS_Snippets_CLR_System/system.String.Class/cpp/string.index3.cpp#6)]
|
220 | 222 | [!code-csharp-interactive[System.String.Class#6](~/samples/snippets/csharp/VS_Snippets_CLR_System/system.String.Class/cs/index3.cs#6)]
|
221 | 223 | [!code-vb[System.String.Class#6](~/samples/snippets/visualbasic/VS_Snippets_CLR_System/system.String.Class/vb/index3.vb#6)]
|
222 | 224 |
|
223 | 225 | This example works with text elements by using the <xref:System.Globalization.StringInfo.GetTextElementEnumerator%2A?displayProperty=nameWithType> method and the <xref:System.Globalization.TextElementEnumerator> class to enumerate all the text elements in a string. You can also retrieve an array that contains the starting index of each text element by calling the <xref:System.Globalization.StringInfo.ParseCombiningCharacters%2A?displayProperty=nameWithType> method.
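
The three ways of counting that this passage contrasts (<xref:System.Char> values, <xref:System.Text.Rune> values, and text elements) can be sketched side by side. This is a minimal illustration, not part of the referenced snippets: the `TextUnitCounts` class and its method names are hypothetical, and it assumes .NET Core 3.0 or later, where `String.EnumerateRunes` is available.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class for illustration only; not part of the .NET API.
public static class TextUnitCounts
{
    // Number of UTF-16 code units (System.Char values) in the string.
    public static int CountChars(string s) => s.Length;

    // Number of Unicode scalar values (System.Text.Rune values).
    // A surrogate pair counts as one rune.
    public static int CountRunes(string s)
    {
        int count = 0;
        foreach (var rune in s.EnumerateRunes())
        {
            count++;
        }
        return count;
    }

    // Number of text elements: a base character plus its combining
    // characters, or a surrogate pair, counts as one element.
    public static int CountTextElements(string s) =>
        new StringInfo(s).LengthInTextElements;
}
```

For the string `"a\u0301\U0001F600"` (the letter "a" followed by a combining acute accent, then an emoji encoded as a surrogate pair), `CountChars` returns 4, `CountRunes` returns 3, and `CountTextElements` returns 2.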
|
224 | 226 |
|
225 | | - For more information about working with units of text rather than individual <xref:System.Char> values, see the <xref:System.Globalization.StringInfo> class. |
| 227 | + For more information about working with units of text rather than individual <xref:System.Char> values, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction). |
226 | 228 |
|
227 | 229 | <a name="Nulls"></a>
|
228 | 230 | ## Null strings and empty strings
|
|