docs: update string spec docs for Str-to-string migration
All three string spec files still referenced the removed Sharpy.Str
wrapper type. Updated to reflect the current architecture where str
maps directly to System.String with extension methods.
- string_type.md: rewrite intro, method table, iteration semantics;
fix incorrect split("") docs (now raises ValueError matching Python)
- string_literals.md: remove native string literals section (n"..."),
add String Type section clarifying all literals produce System.String
- string_operators.md: update implementation note for StringHelpers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
docs/language_specification/string_literals.md
+7 -38 (7 additions & 38 deletions)
@@ -35,7 +35,7 @@ multi-line string
 | `\UHHHHHHHH` | Unicode 32-bit |

 *Implementation*
-- *✅ Native - Single quotes become double quotes; escape sequences map directly.*
+- *✅ Single quotes become double quotes; escape sequences map directly to C# string literals.*

 ## Raw Strings

@@ -48,45 +48,14 @@ regex = r"\d+\.\d+"
 *Implementation*
 - *✅ Native - Maps to C# verbatim strings `@"..."`.*

-## Native String Literals
+## String Type

-Native string literals produce a `System.String` value instead of `Sharpy.Str`. Use them when interfacing with .NET APIs that expect `System.String`, or when you need to avoid the `Sharpy.Str` wrapper overhead.
+All string literals (regular, raw, multi-line) produce `System.String` values (Sharpy's `str` type):

```python
-# Single-quoted native strings
-path = n'hello'
-message = n"Hello, World!"
-
-# Triple-quoted native strings
-text = n"""
-Multi-line native string
-"""
-alt = n'''
-Also a native string
-'''
-
-# Raw native strings (no escape processing)
-regex = nr"\d+\.\d+"
-win_path = nr"C:\Users\Alice\Documents"
71
-
```
-
-### When to Use Native Strings
-
-| Scenario | Use |
-|----------|-----|
-| Normal Sharpy code | `"hello"` (regular string → `Sharpy.Str`) |
> **Historical note:** Sharpy previously supported native string literals (`n"..."`) to produce `System.String` instead of `Sharpy.Str`. Since `str` now maps directly to `System.String`, native string literals are no longer needed and have been removed. See [SRP-0007](../rejected_proposals/SRP-0007-str-wrapper-type.md).
-- *✅ Operators defined on `Sharpy.Str`: `+` maps to `Str.operator+`, `*` to `Str.operator*`, comparisons use ordinal comparison via `IComparable<Str>`, `in` maps to `Contains()`.*
+- *✅ `+` uses native C# string concatenation, `*` maps to `Sharpy.StringHelpers.Repeat()`, comparisons use native `System.String` ordinal comparison, `in` maps to `string.Contains()`.*
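
As a rough illustration of the operator semantics described in the updated implementation note, here is a plain-Python sketch (not Sharpy, and not the actual `Sharpy.StringHelpers` code). The `repeat` helper is a hypothetical stand-in: the real signature of `StringHelpers.Repeat()` is not shown in this diff, and returning an empty string for non-positive counts is an assumption borrowed from Python's `*`.

```python
def repeat(s: str, count: int) -> str:
    """Hypothetical stand-in for StringHelpers.Repeat(); signature assumed."""
    # Assumption: non-positive counts give "", as Python's `*` does.
    if count <= 0:
        return ""
    return "".join(s for _ in range(count))


concatenated = "ab" + "cd"      # `+`: plain concatenation
repeated = repeat("ab", 3)      # `*`: repetition via a helper
contained = "ell" in "hello"    # `in`: substring containment
uppercase_first = "Z" < "a"     # ordinal: 'Z' (U+005A) sorts before 'a' (U+0061)

print(concatenated, repeated, contained, uppercase_first)
```

Python compares strings by code point while .NET ordinal comparison works on UTF-16 code units; the two agree for text within the BMP, which is all this sketch exercises.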
docs/language_specification/string_type.md
+37 -61 (37 additions & 61 deletions)
@@ -1,27 +1,18 @@
 # String Type and UTF-16 Semantics

-Sharpy's `str` type maps to `Sharpy.Str`, an immutable readonly struct that wraps .NET's `System.String`. The `Str` type provides Python-compatible string semantics while maintaining full .NET interoperability through implicit conversion operators.
-
-## Sharpy.Str
-
-`Sharpy.Str` is a value type (readonly struct) that:
-
-- **Wraps `System.String`**: zero-allocation access to the underlying string via implicit conversions
+Sharpy's `str` type maps directly to `System.String` (C# `string`). Python-compatible string methods (`upper()`, `find()`, `split()`, etc.) are provided as extension methods on `string` via `Sharpy.StringExtensions`. Operations that `System.String` doesn't natively support (repetition, negative indexing, iteration as single-character strings) use static helper methods in `Sharpy.StringHelpers`.

 ```python
-s: str = "hello"  # Type is Sharpy.Str
-n: str = n"hello"  # Type is System.String (native string literal)
+s: str = "hello"  # Type is System.String (C# string)
 ```

-> **Note:** For .NET interop scenarios where a raw `System.String` is needed, use native string literals (`n"..."`) or the implicit conversion.
+This design follows the Kotlin model (Kotlin's `String` is `java.lang.String` with extension functions) and aligns with all three Sharpy axioms:

-## UTF-16 Code Units
+- **Axiom 1 (.NET):** `string` is the native .NET type. Zero interop friction.
+- **Axiom 2 (Python):** Extension methods provide `s.upper()`, `s.find()`, etc., the same surface as Python.
+- **Axiom 3 (Type Safety):** No implicit conversions, no boxing, no overload ambiguity.

-`Str` uses UTF-16 encoding internally (inherited from `System.String`). This has important implications for string operations.
+> **Historical note:** Sharpy originally used a `Sharpy.Str` readonly struct wrapper. This was removed; see [SRP-0007](../rejected_proposals/SRP-0007-str-wrapper-type.md) for rationale.

 ## UTF-16 Code Units

@@ -83,7 +74,7 @@ s[0:3] # "A😀" - correct

 ## Iterating Over Strings

-Iterating over a string yields individual `char` values (UTF-16 code units):
+Iterating over a string yields single-character `str` values (one UTF-16 code unit each), via `StringHelpers.Iterate()`:

 ```python
 for c in "Hi😀":
@@ -95,6 +86,8 @@ for c in "Hi😀":
 # � (low surrogate)
 ```

+Each iteration variable `c` is a `str` (not a `char`), matching Python's behavior where iterating a string yields single-character strings.
+
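
To make the code-unit granularity concrete, the following plain-Python sketch (CPython, not Sharpy) counts the UTF-16 code units of the same `"Hi😀"` string. Python itself iterates by code point, so it sees three items, whereas the spec above describes four iteration steps, one per UTF-16 code unit. The helper name `utf16_code_units` is an assumption made for this illustration.

```python
from typing import List


def utf16_code_units(s: str) -> List[int]:
    """Return the UTF-16 code units of s as integers (illustrative helper)."""
    data = s.encode("utf-16-le")  # little-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


text = "Hi😀"
units = utf16_code_units(text)

print(len(text))                # 3 code points in Python
print(len(units))               # 4 UTF-16 code units
print([hex(u) for u in units])  # ['0x48', '0x69', '0xd83d', '0xde00']
```

The last two values are the high and low surrogates that encode U+1F600, which is why a code-unit iteration yields two unpaired halves for the emoji.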
 ## Working with Unicode Correctly

 For applications that need to work with user-perceived characters (grapheme clusters) or Unicode code points, use the appropriate .NET APIs:
Sharpy provides **both** Pythonic method names and .NET method names for string operations. The Pythonic names are aliases that compile to the corresponding .NET methods.
| `s.find(sub)` | `s.IndexOf(sub)` | Find substring (returns -1 if not found) |
-| `s.rfind(sub)` | `s.LastIndexOf(sub)` | Find last occurrence |
+Sharpy provides Python-compatible string methods as **extension methods** on `string` in `Sharpy.StringExtensions`. The compiler's `NameMangler` converts snake_case method names to PascalCase (e.g., `upper` → `Upper`), and generated code includes `using global::Sharpy;` to bring these extensions into scope.
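
The snake_case to PascalCase conversion mentioned above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not the compiler's actual `NameMangler`; the function name `snake_to_pascal`, the `my_method` example, and the handling of empty segments (leading or doubled underscores are dropped) are all assumptions.

```python
def snake_to_pascal(name: str) -> str:
    """Convert a snake_case identifier to PascalCase (illustrative only)."""
    # Assumption: empty segments from leading/doubled underscores are skipped.
    parts = [part for part in name.split("_") if part]
    return "".join(part[0].upper() + part[1:] for part in parts)


print(snake_to_pascal("upper"))      # Upper
print(snake_to_pascal("my_method"))  # MyMethod
```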
 | `s[::2]` | Every other char | Slice syntax supported |

-**`s.split("")` behavior:**
-
-In Python, `"hello".split("")` raises `ValueError: empty separator`. In Sharpy/.NET, this returns a single-element array containing the original string:
-
-```python
-# Python
-"hello".split("") # ValueError: empty separator
-
-# Sharpy
-"hello".split("") # ["hello"] - no splitting occurs
-```
-
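
Per the commit message, `split("")` now raises `ValueError` to match Python, which is why the passage above was removed. The Python side of that behavior can be checked with a short sketch; this is ordinary CPython, and the `split_chars` helper name is an assumption introduced for illustration.

```python
def split_chars(s: str) -> list:
    """Split a string into single-character strings without using split()."""
    return [c for c in s]


try:
    "hello".split("")
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)           # empty separator
print(split_chars("hello"))    # ['h', 'e', 'l', 'l', 'o']
```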
 To split a string into individual characters in Sharpy, use:

```python
@@ -226,4 +202,4 @@ chars = [c for c in "hello"] # ['h', 'e', 'l', 'l', 'o']
 4. **Most common text works as expected:** ASCII text and most European/Asian scripts (within the BMP) have a 1:1 correspondence between characters and code units.

 *Implementation*
-- *✅ `Sharpy.Str` readonly struct wrapping `System.String` with Pythonic API and implicit conversions.*
+- *✅ `str` maps to `System.String`; Python methods via `Sharpy.StringExtensions`; operators/indexing/iteration via `Sharpy.StringHelpers`.*