
Commit 61ba240

antonsynd and claude committed
docs: update string spec docs for Str-to-string migration
All three string spec files still referenced the removed Sharpy.Str wrapper type. Updated to reflect the current architecture where str maps directly to System.String with extension methods.

- string_type.md: rewrite intro, method table, iteration semantics; fix incorrect split("") docs (now raises ValueError matching Python)
- string_literals.md: remove native string literals section (n"..."), add String Type section clarifying all literals produce System.String
- string_operators.md: update implementation note for StringHelpers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent e44b60f commit 61ba240

3 files changed

Lines changed: 45 additions & 100 deletions


docs/language_specification/string_literals.md

Lines changed: 7 additions & 38 deletions
@@ -35,7 +35,7 @@ multi-line string
3535
| `\UHHHHHHHH` | Unicode 32-bit |
3636

3737
*Implementation*
38-
- *Native - Single quotes become double quotes; escape sequences map directly.*
38+
- *✅ Single quotes become double quotes; escape sequences map directly to C# string literals.*
3939

4040
## Raw Strings
4141

@@ -48,45 +48,14 @@ regex = r"\d+\.\d+"
4848
*Implementation*
4949
- *✅ Native - Maps to C# verbatim strings `@"..."`.*
5050

51-
## Native String Literals
51+
## String Type
5252

53-
Native string literals produce a `System.String` value instead of `Sharpy.Str`. Use them when interfacing with .NET APIs that expect `System.String`, or when you need to avoid the `Sharpy.Str` wrapper overhead.
53+
All string literals (regular, raw, multi-line) produce `System.String` values (Sharpy's `str` type):
5454

5555
```python
56-
# Single-quoted native strings
57-
path = n'hello'
58-
message = n"Hello, World!"
59-
60-
# Triple-quoted native strings
61-
text = n"""
62-
Multi-line native string
63-
"""
64-
alt = n'''
65-
Also a native string
66-
'''
67-
68-
# Raw native strings (no escape processing)
69-
regex = nr"\d+\.\d+"
70-
win_path = nr"C:\Users\Alice\Documents"
71-
```
72-
73-
### When to Use Native Strings
74-
75-
| Scenario | Use |
76-
|----------|-----|
77-
| Normal Sharpy code | `"hello"` (regular string → `Sharpy.Str`) |
78-
| .NET interop requiring `System.String` | `n"hello"` (native string → `System.String`) |
79-
| Regex patterns for .NET Regex API | `nr"\d+"` (raw native string) |
80-
| Performance-critical code avoiding Str wrapper | `n"hello"` |
81-
82-
### Type Relationship
83-
84-
```python
85-
s: str = "hello" # Sharpy.Str
86-
ns: str = n"hello" # System.String
87-
# Implicit conversion allows assignment:
88-
mixed: str = n"native" # System.String implicitly converts to Str
56+
s: str = "hello" # System.String
57+
r: str = r"C:\path" # System.String (verbatim)
58+
m: str = """multi""" # System.String
8959
```
9060

91-
*Implementation*
92-
- *✅ Native - `n"..."` emits a C# `string` literal directly; `nr"..."` emits `@"..."`.*
61+
> **Historical note:** Sharpy previously supported native string literals (`n"..."`) to produce `System.String` instead of `Sharpy.Str`. Since `str` now maps directly to `System.String`, native string literals are no longer needed and have been removed. See [SRP-0007](../rejected_proposals/SRP-0007-str-wrapper-type.md).

docs/language_specification/string_operators.md

Lines changed: 1 addition & 1 deletion
@@ -50,4 +50,4 @@ f"Count: {42}" # F-strings handle conversion
5050
```
5151

5252
*Implementation*
53-
- *Operators defined on `Sharpy.Str`: `+` maps to `Str.operator+`, `*` to `Str.operator*`, comparisons use ordinal comparison via `IComparable<Str>`, `in` maps to `Contains()`.*
53+
- *`+` uses native C# string concatenation, `*` maps to `Sharpy.StringHelpers.Repeat()`, comparisons use native `System.String` ordinal comparison, `in` maps to `string.Contains()`.*
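
The updated note says `*` goes through a repeat helper and comparisons are ordinal. The following is a rough Python sketch of what that implies, not Sharpy's actual code: `repeat` and `ordinal_key` are hypothetical names, and the signature of `Sharpy.StringHelpers.Repeat()` is assumed. It also shows one place where ordinal comparison over UTF-16 code units can order strings differently from Python's code point comparison.

```python
def repeat(s: str, count: int) -> str:
    """Hypothetical stand-in for a repeat helper (real signature assumed)."""
    # Python's `*` yields "" for zero or negative counts; mirror that here.
    return "".join(s for _ in range(max(count, 0)))


def ordinal_key(s: str) -> bytes:
    """Sort key that orders strings by UTF-16 code units."""
    # Big-endian bytes compare in the same order as the 16-bit units they encode.
    return s.encode("utf-16-be", "surrogatepass")


bmp = "\uff5e"         # U+FF5E, a single UTF-16 code unit
astral = "\U0001F600"  # U+1F600, a surrogate pair starting with 0xD83D

print(repeat("ab", 3))                         # ababab
print(bmp < astral)                            # True: code point order
print(ordinal_key(bmp) < ordinal_key(astral))  # False: code unit order
```

If Sharpy comparisons are ordinal as the note states, strings containing characters outside the BMP may sort differently than they would in Python.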

docs/language_specification/string_type.md

Lines changed: 37 additions & 61 deletions
@@ -1,27 +1,18 @@
11
# String Type and UTF-16 Semantics
22

3-
Sharpy's `str` type maps to `Sharpy.Str`, an immutable readonly struct that wraps .NET's `System.String`. The `Str` type provides Python-compatible string semantics while maintaining full .NET interoperability through implicit conversion operators.
4-
5-
## Sharpy.Str
6-
7-
`Sharpy.Str` is a value type (readonly struct) that:
8-
9-
- **Wraps `System.String`** — zero-allocation access to the underlying string via implicit conversions
10-
- **Implicit conversions** — converts to/from `System.String` automatically, enabling seamless .NET interop
11-
- **Operator overloads**: `+` (concatenation), `*` (repetition), `==`, `!=`, `<`, `>`, `<=`, `>=`, `in` (containment)
12-
- **Implements interfaces**: `IEquatable<Str>`, `IComparable<Str>`, `IEnumerable<Str>`, `ISized`, `IBoolConvertible`
13-
- **Pythonic methods**: `upper()`, `lower()`, `strip()`, `split()`, `join()`, `find()`, `replace()`, `format()`, etc.
3+
Sharpy's `str` type maps directly to `System.String` (C# `string`). Python-compatible string methods (`upper()`, `find()`, `split()`, etc.) are provided as extension methods on `string` via `Sharpy.StringExtensions`. Operations that `System.String` doesn't natively support (repetition, negative indexing, iteration as single-character strings) use static helper methods in `Sharpy.StringHelpers`.
144

155
```python
16-
s: str = "hello" # Type is Sharpy.Str
17-
n: str = n"hello" # Type is System.String (native string literal)
6+
s: str = "hello" # Type is System.String (C# string)
187
```
198

20-
> **Note:** For .NET interop scenarios where a raw `System.String` is needed, use native string literals (`n"..."`) or the implicit conversion.
9+
This design follows the Kotlin model — Kotlin's `String` is `java.lang.String` with extension functions — and aligns with all three Sharpy axioms:
2110

22-
## UTF-16 Code Units
11+
- **Axiom 1 (.NET):** `string` is the native .NET type. Zero interop friction.
12+
- **Axiom 2 (Python):** Extension methods provide `s.upper()`, `s.find()`, etc. — same surface as Python.
13+
- **Axiom 3 (Type Safety):** No implicit conversions, no boxing, no overload ambiguity.
2314

24-
`Str` uses UTF-16 encoding internally (inherited from `System.String`). This has important implications for string operations.
15+
> **Historical note:** Sharpy originally used a `Sharpy.Str` readonly struct wrapper. This was removed — see [SRP-0007](../rejected_proposals/SRP-0007-str-wrapper-type.md) for rationale.
2516
2617
## UTF-16 Code Units
2718

@@ -83,7 +74,7 @@ s[0:3] # "A😀" - correct
8374

8475
## Iterating Over Strings
8576

86-
Iterating over a string yields individual `char` values (UTF-16 code units):
77+
Iterating over a string yields single-character `str` values (one UTF-16 code unit each), via `StringHelpers.Iterate()`:
8778

8879
```python
8980
for c in "Hi😀":
@@ -95,6 +86,8 @@ for c in "Hi😀":
9586
# � (low surrogate)
9687
```
9788

89+
Each iteration variable `c` is a `str` (not a `char`), matching Python's behavior where iterating a string yields single-character strings.
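
To make the code-unit behaviour concrete, here is a small Python sketch that approximates iteration over UTF-16 code units. It is only an illustration: `utf16_units` is a hypothetical helper, not `StringHelpers.Iterate()`, and plain Python iteration would yield code points instead.

```python
def utf16_units(s: str) -> list[str]:
    """Split s into one-code-unit strings, approximating UTF-16 iteration."""
    data = s.encode("utf-16-le", "surrogatepass")
    # Each UTF-16 code unit is exactly two bytes; turn each unit into a str.
    return [chr(int.from_bytes(data[i:i + 2], "little"))
            for i in range(0, len(data), 2)]


units = utf16_units("Hi😀")
print(len(units))                    # 4
print([hex(ord(u)) for u in units])  # ['0x48', '0x69', '0xd83d', '0xde00']
```

The emoji contributes two elements, a high and a low surrogate, which matches the four iterations shown in the example above.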
90+
9891
## Working with Unicode Correctly
9992

10093
For applications that need to work with user-perceived characters (grapheme clusters) or Unicode code points, use the appropriate .NET APIs:
@@ -130,57 +123,52 @@ escape_str = "\u0048\u0065\u006C\u006C\u006F" # "Hello"
130123

131124
## String Method Availability
132125

133-
Sharpy provides **both** Pythonic method names and .NET method names for string operations. The Pythonic names are aliases that compile to the corresponding .NET methods.
134-
135-
### Pythonic String Methods (Aliases)
136-
137-
| Sharpy Method | .NET Method | Notes |
138-
|---------------|-------------|-------|
139-
| `s.upper()` | `s.ToUpper()` | Uppercase |
140-
| `s.lower()` | `s.ToLower()` | Lowercase |
141-
| `s.strip()` | `s.Trim()` | Remove leading/trailing whitespace |
142-
| `s.lstrip()` | `s.TrimStart()` | Remove leading whitespace |
143-
| `s.rstrip()` | `s.TrimEnd()` | Remove trailing whitespace |
144-
| `s.startswith(prefix)` | `s.StartsWith(prefix)` | Check prefix |
145-
| `s.endswith(suffix)` | `s.EndsWith(suffix)` | Check suffix |
146-
| `s.find(sub)` | `s.IndexOf(sub)` | Find substring (returns -1 if not found) |
147-
| `s.rfind(sub)` | `s.LastIndexOf(sub)` | Find last occurrence |
126+
Sharpy provides Python-compatible string methods as **extension methods** on `string` in `Sharpy.StringExtensions`. The compiler's `NameMangler` converts snake_case method names to PascalCase (e.g., `upper` → `Upper`), and generated code includes `using global::Sharpy;` to bring these extensions into scope.
127+
128+
### Pythonic String Methods (Extension Methods)
129+
130+
| Sharpy Method | Extension Method | Notes |
131+
|---------------|-----------------|-------|
132+
| `s.upper()` | `s.Upper()` | Uppercase (invariant culture) |
133+
| `s.lower()` | `s.Lower()` | Lowercase (invariant culture) |
134+
| `s.strip()` | `s.Strip()` | Remove leading/trailing whitespace |
135+
| `s.lstrip()` | `s.Lstrip()` | Remove leading whitespace |
136+
| `s.rstrip()` | `s.Rstrip()` | Remove trailing whitespace |
137+
| `s.startswith(prefix)` | `s.Startswith(prefix)` | Check prefix |
138+
| `s.endswith(suffix)` | `s.Endswith(suffix)` | Check suffix |
139+
| `s.find(sub)` | `s.Find(sub)` | Find substring (returns -1 if not found) |
140+
| `s.rfind(sub)` | `s.Rfind(sub)` | Find last occurrence |
148141
| `s.replace(old, new)` | `s.Replace(old, new)` | Replace all occurrences |
149142
| `s.split()` | `s.Split()` | Split on whitespace |
150143
| `s.split(sep)` | `s.Split(sep)` | Split on separator |
151-
| `s.join(items)` | `string.Join(s, items)` | Join with separator |
152-
| `s.count(sub)` | Custom extension | Count occurrences |
153-
| `s.isdigit()` | Custom extension | Check if all digits |
154-
| `s.isalpha()` | Custom extension | Check if all alphabetic |
155-
| `s.isalnum()` | Custom extension | Check if alphanumeric |
156-
| `s.isspace()` | Custom extension | Check if all whitespace |
144+
| `s.join(items)` | `s.Join(items)` | Join with separator |
145+
| `s.count(sub)` | `s.Count(sub)` | Count occurrences |
146+
| `s.isdigit()` | `s.Isdigit()` | Check if all digits |
147+
| `s.isalpha()` | `s.Isalpha()` | Check if all alphabetic |
148+
| `s.isalnum()` | `s.Isalnum()` | Check if alphanumeric |
149+
| `s.isspace()` | `s.Isspace()` | Check if all whitespace |
150+
| `s.casefold()` | `s.Casefold()` | Full Unicode case folding |
157151

158152
### .NET Methods (Direct Access)
159153

160-
All `System.String` methods are directly available:
154+
Since `str` is `System.String`, all .NET string methods are directly available:
161155

162156
```python
163157
s = "Hello, World!"
164158

165159
# .NET methods work directly
166-
s.ToUpper() # "HELLO, WORLD!"
167160
s.Contains("World") # True
168161
s.Substring(0, 5) # "Hello"
169162
s.PadLeft(20) # "       Hello, World!"
170163
s.Insert(7, "Beautiful ") # "Hello, Beautiful World!"
171-
172-
# Static methods via str class
173-
str.IsNullOrEmpty(s) # False
174-
str.Join(", ", ["a", "b"]) # "a, b"
175164
```
176165

177166
### Method Resolution
178167

179-
When both a Pythonic alias and a .NET method could apply, the Pythonic alias takes precedence:
168+
When both a Sharpy extension method and a .NET method could apply, the Sharpy extension method takes precedence via the compiler's name mangling:
180169

181170
```python
182-
s.upper() # Calls ToUpper() - Pythonic preferred
183-
s.ToUpper() # Also works - explicit .NET name
171+
s.upper() # Mangled to s.Upper() — calls Sharpy extension method
184172
```
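
The snake_case to PascalCase conversion can be sketched in a few lines of Python. This is an assumption about the rule, inferred only from the table rows in this diff (`upper` becomes `Upper`, `startswith` becomes `Startswith`); `mangle_method_name` is a hypothetical name and the real `NameMangler` may handle more cases.

```python
def mangle_method_name(name: str) -> str:
    """Hypothetical snake_case -> PascalCase conversion (real rules may differ)."""
    # Capitalize the first letter of each underscore-separated part, then join.
    return "".join(part[:1].upper() + part[1:]
                   for part in name.split("_") if part)


for name in ("upper", "startswith", "rfind"):
    print(name, "->", mangle_method_name(name))
# upper -> Upper
# startswith -> Startswith
# rfind -> Rfind
```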
185173

186174
### Differences from Python
@@ -191,22 +179,10 @@ Some Python string methods have slightly different behavior due to .NET semantic
191179
|-----------|--------|-------------|
192180
| `"ab" * 3` | `"ababab"` | `"ababab"` (✅ same) |
193181
| `s.split()` | Splits on any whitespace | Splits on whitespace (✅ same) |
194-
| `s.split("")` | `ValueError` | Returns array with original string (see below) |
182+
| `s.split("")` | `ValueError` | `ValueError` (✅ same) |
195183
| `s.count(sub)` | Count non-overlapping | Count non-overlapping (✅ same) |
196184
| `s[::2]` | Every other char | Slice syntax supported |
197185

198-
**`s.split("")` behavior:**
199-
200-
In Python, `"hello".split("")` raises `ValueError: empty separator`. In Sharpy/.NET, this returns a single-element array containing the original string:
201-
202-
```python
203-
# Python
204-
"hello".split("") # ValueError: empty separator
205-
206-
# Sharpy
207-
"hello".split("") # ["hello"] - no splitting occurs
208-
```
209-
210186
To split a string into individual characters in Sharpy, use:
211187

212188
```python
@@ -226,4 +202,4 @@ chars = [c for c in "hello"] # ['h', 'e', 'l', 'l', 'o']
226202
4. **Most common text works as expected:** ASCII text and most European/Asian scripts (within the BMP) have a 1:1 correspondence between characters and code units.
227203

228204
*Implementation*
229-
- *`Sharpy.Str` readonly struct wrapping `System.String` with Pythonic API and implicit conversions.*
205+
- *`str` maps to `System.String`; Python methods via `Sharpy.StringExtensions`; operators/indexing/iteration via `Sharpy.StringHelpers`.*
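
For reference, the corrected `split("")` row describes the same behaviour Python already has. The sketch below shows that contract in plain Python; `split_checked` is a hypothetical helper written for illustration and says nothing about how `Sharpy.StringExtensions` implements the check.

```python
def split_checked(s: str, sep: str) -> list[str]:
    """Hypothetical helper: reject an empty separator, otherwise split on it."""
    if sep == "":
        raise ValueError("empty separator")
    return s.split(sep)


try:
    split_checked("hello", "")
except ValueError as exc:
    print("ValueError:", exc)  # ValueError: empty separator

print(split_checked("a,b,c", ","))  # ['a', 'b', 'c']
print(list("hello"))                # ['h', 'e', 'l', 'l', 'o']
```

Splitting into individual characters therefore goes through iteration rather than an empty separator.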
