Update docs on strings not being null-terminated #28470
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -9,7 +9,7 @@ ms.assetid: 21580405-cb25-4541-89d5-037846a38b07 |
| | | --- |
| | | # Strings (C# Programming Guide) |
| | | - A string is an object of type <xref:System.String> whose value is text. Internally, the text is stored as a sequential read-only collection of <xref:System.Char> objects. There is no null-terminating character at the end of a C# string; therefore a C# string can contain any number of embedded null characters ('\0'). The <xref:System.String.Length%2A> property of a string represents the number of `Char` objects it contains, not the number of Unicode characters. To access the individual Unicode code points in a string, use the <xref:System.Globalization.StringInfo> object. |
| | | + A string is an object of type <xref:System.String> whose value is text. Internally, the text is stored as a sequential read-only collection of <xref:System.Char> objects. The <xref:System.String.Length%2A> property of a string represents the number of `Char` objects it contains, not the number of Unicode characters. To access the individual Unicode code points in a string, use the <xref:System.Globalization.StringInfo> object. The length of a C# string is stored in a dedicated field and it is not computed by iterating on the string data to find a null-terminator. Therefore, a C# string can contain any number of embedded null characters ('\0'). Note that C# strings are not just length-prefixed, but also internally null-terminated: this makes it safe to marshal them to native code expecting a null-terminated sequence of characters. |
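To make the behavior described in the changed paragraph concrete, here is a minimal sketch. The class and method names (`StringLengthDemo`, `LengthWithEmbeddedNulls`, `CountGrinningFace`) are hypothetical and not part of the PR. It assumes a runtime where `System.Globalization.StringInfo` is available, and note that `StringInfo` reports text elements, which is not always the same as code points.

```csharp
using System;
using System.Globalization;

public static class StringLengthDemo
{
    // Length is read from the string object itself, so embedded '\0'
    // characters are counted like any other Char.
    public static int LengthWithEmbeddedNulls()
    {
        string s = "ab\0cd\0"; // 'a', 'b', '\0', 'c', 'd', '\0'
        return s.Length;
    }

    // Length counts Char (UTF-16 code unit) values, not user-perceived characters.
    // U+1F600 is encoded as a surrogate pair, i.e. two Char values.
    public static (int Chars, int TextElements) CountGrinningFace()
    {
        string s = "\U0001F600";
        var info = new StringInfo(s);
        return (s.Length, info.LengthInTextElements);
    }
}
```

Under these assumptions, `LengthWithEmbeddedNulls()` should report all six `Char` values, and `CountGrinningFace()` should report two `Char` values but a single text element.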
Member
What is the target audience for this document? The description says "Learn about strings in C# programming," which feels like L100. It does not sound right to go into details about string interop in the first paragraph for an L100 audience.
Contributor
Author
If that's the case, should we remove that mention of the null-terminator at the end of the string entirely? As in, people getting started with C# likely wouldn't know or care about what embedded null characters even are anyway, right? 🤔
Member
I think that's the better change. (Historical note: This text has been around for a long time. I'm betting it exists because a large segment of the audience for the early docs were C++ developers. This note would have been important then.)
| | | ## string vs. System.String |
If we're going to include this in the docs, we should probably also include that `ReadOnlySpan<char>` is frequently used to represent strings or slices of strings and is not guaranteed to be null-terminated.
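A small sketch of the point about spans, assuming a runtime that provides `ReadOnlySpan<char>` and `MemoryExtensions.AsSpan` (for example .NET Core 2.1 or later). The names `SpanSliceDemo`, `SliceInfo`, and `CopyForInterop` are hypothetical. The slice views part of the original string without copying, so whatever follows it in memory is the rest of that string rather than a terminator.

```csharp
using System;

public static class SpanSliceDemo
{
    // Returns the slice length and the character that immediately
    // follows the slice in the original string.
    public static (int Length, char Next) SliceInfo()
    {
        string source = "hello world";
        ReadOnlySpan<char> slice = source.AsSpan(0, 5); // views "hello", no copy

        // The next character in the underlying string is ' ', not '\0',
        // so code that scans for a terminator would run past the slice.
        return (slice.Length, source[slice.Length]);
    }

    // One option when a standalone string is needed: copy the slice
    // into a new System.String.
    public static string CopyForInterop(ReadOnlySpan<char> slice) => slice.ToString();
}
```

Copying with `ToString()` allocates, so whether that trade-off is acceptable depends on the call site.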