Add Rune links to String class #4206

tdykstra · 2020-05-07T22:37:40Z

Follow-up to #4189

gewarren · 2020-05-08T22:17:31Z

xml/System/String.xml

@@ -98,6 +98,8 @@

 A string is a sequential collection of characters that is used to represent text. A <xref:System.String> object is a sequential collection of <xref:System.Char?displayProperty=nameWithType> objects that represent a string; a <xref:System.Char?displayProperty=nameWithType> object corresponds to a UTF-16 code unit. The value of the <xref:System.String> object is the content of the sequential collection of <xref:System.Char?displayProperty=nameWithType> objects, and that value is immutable (that is, it is read-only). For more information about the immutability of strings, see the [Immutability and the StringBuilder class](#Immutability) section later in this topic. The maximum size of a <xref:System.String> object in memory is 2GB, or about 1 billion characters.

+For more information about Unicode, UTF-16, code units, code points, and the <xref:System.Char> and <xref:System.Text.Rune> types, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction).


Is it better to have the link in the ~ format? I see other links in this file that use the ~ format.

Suggested change

For more information about Unicode, UTF-16, code units, code points, and the <xref:System.Char> and <xref:System.Text.Rune> types, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction).

For more information about Unicode, UTF-16, code units, code points, and the <xref:System.Char> and <xref:System.Text.Rune> types, see [Introduction to character encoding in .NET](~/docs/standard/base-types/character-encoding-introduction.md).

Have to include .md with that format.

gewarren · 2020-05-08T22:19:08Z

xml/System/String.xml

@@ -214,15 +216,15 @@
 [!code-csharp-interactive[System.String.Class#5](~/samples/snippets/csharp/VS_Snippets_CLR_System/system.String.Class/cs/index2.cs#5)]
 [!code-vb[System.String.Class#5](~/samples/snippets/visualbasic/VS_Snippets_CLR_System/system.String.Class/vb/index2.vb#5)]

- Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.
+ Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes, or the <xref:System.String.EnumerateRunes%2A?displayProperty=nameWithType> method and the <xref:System.Text.Rune> type. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.


Suggested change

Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes, or the <xref:System.String.EnumerateRunes%2A?displayProperty=nameWithType> method and the <xref:System.Text.Rune> type. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.

Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes, or the <xref:System.String.EnumerateRunes%2A?displayProperty=nameWithType> method and the <xref:System.Text.Rune> struct. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.

type -> class for consistency with the previous sentence. Not sure which is better.

Rune is actually a struct, so I'll use that here.

gewarren · 2020-05-08T22:19:23Z

xml/System/String.xml


 [!code-cpp[System.String.Class#6](~/samples/snippets/cpp/VS_Snippets_CLR_System/system.String.Class/cpp/string.index3.cpp#6)]
 [!code-csharp-interactive[System.String.Class#6](~/samples/snippets/csharp/VS_Snippets_CLR_System/system.String.Class/cs/index3.cs#6)]
 [!code-vb[System.String.Class#6](~/samples/snippets/visualbasic/VS_Snippets_CLR_System/system.String.Class/vb/index3.vb#6)]

 This example works with text elements by using the <xref:System.Globalization.StringInfo.GetTextElementEnumerator%2A?displayProperty=nameWithType> method and the <xref:System.Globalization.TextElementEnumerator> class to enumerate all the text elements in a string. You can also retrieve an array that contains the starting index of each text element by calling the <xref:System.Globalization.StringInfo.ParseCombiningCharacters%2A?displayProperty=nameWithType> method.

- For more information about working with units of text rather than individual <xref:System.Char> values, see the <xref:System.Globalization.StringInfo> class.
+ For more information about working with units of text rather than individual <xref:System.Char> values, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction).


Suggested change

For more information about working with units of text rather than individual <xref:System.Char> values, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction).

For more information about working with units of text rather than individual <xref:System.Char> values, see [Introduction to character encoding in .NET](~/docs/standard/base-types/character-encoding-introduction.md).

tdykstra · 2020-05-08T23:29:01Z

Thanks for the review @gewarren.

tdykstra · 2020-05-08T23:41:38Z

xml/System/String.xml

@@ -214,15 +216,15 @@
 [!code-csharp-interactive[System.String.Class#5](~/samples/snippets/csharp/VS_Snippets_CLR_System/system.String.Class/cs/index2.cs#5)]
 [!code-vb[System.String.Class#5](~/samples/snippets/visualbasic/VS_Snippets_CLR_System/system.String.Class/vb/index2.vb#5)]

- Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.
+ Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes, or the <xref:System.String.EnumerateRunes%2A?displayProperty=nameWithType> method and the <xref:System.Text.Rune> type. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.


Suggested change

Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes, or the <xref:System.String.EnumerateRunes%2A?displayProperty=nameWithType> method and the <xref:System.Text.Rune> type. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.

Consecutive index values might not correspond to consecutive Unicode characters, because a Unicode character might be encoded as more than one <xref:System.Char> object. In particular, a string may contain multi-character units of text that are formed by a base character followed by one or more combining characters or by surrogate pairs. To work with Unicode characters instead of <xref:System.Char> objects, use the <xref:System.Globalization.StringInfo?displayProperty=nameWithType> and <xref:System.Globalization.TextElementEnumerator> classes, or the <xref:System.String.EnumerateRunes%2A?displayProperty=nameWithType> method and the <xref:System.Text.Rune> struct. The following example illustrates the difference between code that works with <xref:System.Char> objects and code that works with Unicode characters. It compares the number of characters or text elements in each word of a sentence. The string includes two sequences of a base character followed by a combining character.

tdykstra added 2 commits May 7, 2020 15:23

add Rune links

8faca4d

revise wording

758d1e1

tdykstra requested a review from BillWagner May 7, 2020 22:37

dotnet-bot added this to the May 2020 milestone May 7, 2020

gewarren approved these changes May 8, 2020

View reviewed changes

tdykstra commented May 8, 2020

View reviewed changes

tdykstra removed the request for review from BillWagner May 8, 2020 23:46

tdykstra merged commit 0f4839d into dotnet:master May 8, 2020

tdykstra mentioned this pull request May 9, 2020

Apply feedback from PR 4206 #4211

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add Rune links to String class #4206

Add Rune links to String class #4206

Uh oh!

tdykstra commented May 7, 2020

Uh oh!

gewarren May 8, 2020 •

edited by tdykstra

Loading

Uh oh!

tdykstra May 8, 2020

Uh oh!

gewarren May 8, 2020 •

edited by tdykstra

Loading

Uh oh!

tdykstra May 8, 2020

Uh oh!

gewarren May 8, 2020 •

edited by tdykstra

Loading

Uh oh!

tdykstra commented May 8, 2020

Uh oh!

tdykstra May 8, 2020

Uh oh!

Uh oh!

		@@ -98,6 +98,8 @@

		A string is a sequential collection of characters that is used to represent text. A <xref:System.String> object is a sequential collection of <xref:System.Char?displayProperty=nameWithType> objects that represent a string; a <xref:System.Char?displayProperty=nameWithType> object corresponds to a UTF-16 code unit. The value of the <xref:System.String> object is the content of the sequential collection of <xref:System.Char?displayProperty=nameWithType> objects, and that value is immutable (that is, it is read-only). For more information about the immutability of strings, see the [Immutability and the StringBuilder class](#Immutability) section later in this topic. The maximum size of a <xref:System.String> object in memory is 2GB, or about 1 billion characters.

		For more information about Unicode, UTF-16, code units, code points, and the <xref:System.Char> and <xref:System.Text.Rune> types, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction).

	For more information about working with units of text rather than individual <xref:System.Char> values, see [Introduction to character encoding in .NET](/dotnet/standard/base-types/character-encoding-introduction).
	For more information about working with units of text rather than individual <xref:System.Char> values, see [Introduction to character encoding in .NET](~/docs/standard/base-types/character-encoding-introduction.md).

Add Rune links to String class #4206

Add Rune links to String class #4206

Uh oh!

Conversation

tdykstra commented May 7, 2020

Uh oh!

gewarren May 8, 2020 • edited by tdykstra Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tdykstra May 8, 2020

Choose a reason for hiding this comment

Uh oh!

gewarren May 8, 2020 • edited by tdykstra Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tdykstra May 8, 2020

Choose a reason for hiding this comment

Uh oh!

gewarren May 8, 2020 • edited by tdykstra Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tdykstra commented May 8, 2020

Uh oh!

tdykstra May 8, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gewarren May 8, 2020 •

edited by tdykstra

Loading

gewarren May 8, 2020 •

edited by tdykstra

Loading

gewarren May 8, 2020 •

edited by tdykstra

Loading