Commit f99dadb

Rejig motivation/solution
1 parent 35d9132 commit f99dadb

1 file changed: +38 -15 lines changed

Documentation/Evolution/DelimiterSyntax.md

Lines changed: 38 additions & 15 deletions
@@ -4,17 +4,46 @@
 
 ## Introduction
 
-This proposal introduces regex literals to Swift source code. The proposed syntax mirrors literals in other programming languages such as Perl, JavaScript and Ruby. As in those languages, literals are delimited with the `/` character:
+This proposal helps complete the story told in [Regex Type and Overview][regex-type] and [elsewhere][pitch-status]. We propose the introduction of regex literals to Swift source code. The proposed syntax mirrors literals in other programming languages such as Perl, JavaScript and Ruby. As in those languages, literals are delimited with the `/` character:
 
 ```swift
 let re = /[0-9]+/
 ```
 
 ## Motivation
 
-This proposal helps complete the story told in [Regex Type and Overview][regex-type] and [elsewhere][pitch-status]. Literals are compiled directly, allowing errors to be found at compile time, rather than at run time. Using a literal also allows editors to support features such as syntax coloring inside the literal, highlighting sub-structure of the regex, and conversion of the literal to an equivalent result builder DSL (see [Regex builder DSL][regex-dsl]). It would be difficult to support all of this if regexes could only be defined inside a string.
+In [Regex Type and Overview][regex-type] we introduced the `Regex` type, which is able to dynamically compile a regex pattern:
 
-A regex literal also allows for seamless composition with the Regex DSL, enabling the intermixing of a regex syntax with other elements of the builder:
+```swift
+let pattern = #"(\w+)\s\s+(\S+)\s\s+((?:(?!\s\s).)*)\s\s+(.*)"#
+let regex = try! Regex(compiling: pattern)
+// regex: Regex<AnyRegexOutput>
+```
+
+The ability to compile regex patterns at runtime is useful for cases where the pattern is, e.g., provided as user input. However, it is suboptimal when the pattern is statically known, for a number of reasons:
+
+- Regex syntax errors aren't detected until runtime, and explicit error handling (e.g. `try!`) is required to deal with these errors.
+- No special source tooling support, such as syntax highlighting, code completion, and refactoring support, is available.
+- Capture types aren't known until runtime, and as such a dynamic `AnyRegexOutput` capture type must be used.
+- The syntax is overly verbose, especially for, e.g., an argument to a matching function.
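
The first and third drawbacks in the list above can be seen in a minimal sketch. It assumes the initializer spelling `Regex(_:)` that shipped in Swift 5.7 rather than the `Regex(compiling:)` spelling used in this pitch. The helper names `compiles` and `firstWord` and the sample strings are hypothetical, chosen only for illustration.

```swift
// Assumes Swift 5.7 or later, where the runtime initializer is spelled
// `Regex(_:)` (this pitch spells it `Regex(compiling:)`).

// A syntax error in a string pattern only surfaces when the code runs.
func compiles(_ pattern: String) -> Bool {
    do {
        _ = try Regex(pattern)
        return true
    } catch {
        return false
    }
}

// Captures of a runtime-compiled regex are type-erased (`AnyRegexOutput`),
// so they are looked up by position and come back as optional substrings.
func firstWord(in text: String) -> Substring? {
    guard let regex = try? Regex(#"(\w+)\s+(\w+)"#),
          let match = text.firstMatch(of: regex) else { return nil }
    return match.output[1].substring
}

print(compiles("(unclosed"))                       // false
print(firstWord(in: "hello world") ?? "no match")  // hello
```

Neither the malformed pattern nor a mistyped capture index is caught before the program runs, which is the gap a literal is meant to close.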
+
+## Proposed solution
+
+We propose introducing a new kind of literal for a regex. In Swift 5.7 mode, a regex literal may be written using `/.../` delimiters:
+
+```swift
+// Matches "<identifier> = <hexadecimal value>", extracting the identifier and hex number
+let regex = /(?<identifier>[[:alpha:]]\w*) = (?<hex>[0-9A-F]+)/
+// regex: Regex<(Substring, identifier: Substring, hex: Substring)>
+```
+
+Forward slashes are a regex term of art, and are used as the delimiters for regex literals in Perl, JavaScript and Ruby (though Perl and Ruby also provide alternatives). Their ubiquity and familiarity make them a compelling choice for Swift.
+
+A regex literal may also be spelled using an extended syntax `#/.../#`, which allows the placement of an arbitrary number of balanced `#` characters around a regex literal. This syntax allows regex literals to contain unescaped forward slashes, and may be used without needing to upgrade to Swift 5.7 mode.
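
As a rough illustration of the extended syntax described above, the sketch below matches a path that contains forward slashes without escaping them. It assumes a Swift 5.7 or later toolchain; the function name `tool(in:)` and the sample path are hypothetical.

```swift
// Returns the last path component of a `usr/lib/...` or `usr/bin/...` path.
func tool(in path: String) -> Substring? {
    // The forward slashes inside `#/.../#` need no escaping.
    let pathRegex = #/usr/(lib|bin)/(\w+)/#
    // pathRegex: Regex<(Substring, Substring, Substring)>
    return path.firstMatch(of: pathRegex)?.output.2
}

print(tool(in: "/usr/bin/swift") ?? "no match")  // swift
```

With bare `/.../` delimiters, each inner slash would have to be written as `\/`.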
+
+Within a regex literal, the compiler will parse the regex syntax outlined in [the Regex Syntax pitch][internal-syntax], and diagnose any errors at compile time. The capture types are automatically inferred based on the capture groups present in the regex. Using a literal allows editors to support features such as syntax coloring inside the literal, highlighting sub-structure of the regex, and conversion of the literal to an equivalent result builder DSL (see [Regex builder DSL][regex-dsl]).
+
+A regex literal also allows for seamless composition with the Regex DSL, enabling lightweight intermixing of a regex syntax with other elements of the builder:
 
 ```swift
 // A regex literal for parsing an amount of currency in dollars or pounds.
@@ -30,24 +59,18 @@ let regex = Regex {
 
 This flexibility allows for terse matching syntax to be used when it's suitable, and more explicit syntax where clarity and strong types are required.
 
-## Proposed solution
-
-A regex literal will be introduced in Swift 5.7 mode using `/.../` delimiters, within which the compiler will parse a regex (the details of which are outlined in [the Regex Syntax pitch][internal-syntax]):
+Due to the existing use of `/` in comment syntax and operators, there are some syntactic ambiguities to consider. While there are quite a few cases to consider, we do not feel that the impact of any individual case is sufficient to disqualify the syntax. Some of these ambiguities require a couple of source-breaking language changes, and as such the `/.../` syntax requires upgrading to a new language mode in order to use.
 
-```swift
-// Matches "<identifier> = <hexadecimal value>", extracting the identifier and hex number
-let regex = /([[:alpha:]]\w*) = ([0-9A-F]+)/
-```
+## Detailed design
 
-The above regex literal will be inferred to be [the regex type][regex-type] `Regex<(Substring, Substring, Substring)>`, where the capture types have been automatically inferred. Errors in the regex will be diagnosed by the compiler.
+### Named typed captures
 
-Forward slashes are a regex term of art, and are used as the delimiters for regex literals in Perl, JavaScript and Ruby (though Perl and Ruby also provide alternatives). Their ubiquity and familiarity make them a compelling choice for Swift.
+Regex literals have their capture types statically determined by the capture groups present. Each capture group adds an additional capture to the match tuple, with named capture groups receiving a corresponding tuple label.
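
One way the labeled match tuple described above might be used is sketched below. This is not the example the TODO in this diff refers to, only an illustration under stated assumptions: a Swift 5.7 or later toolchain, the `#/.../#` spelling so that the bare `/.../` syntax need not be enabled, and a hypothetical helper named `parseAssignment`.

```swift
// Parses "<identifier> = <hexadecimal value>" into a name and an integer.
func parseAssignment(_ line: String) -> (name: String, value: Int)? {
    // Two named groups with one unnamed group between them.
    let regex = #/(?<identifier>[[:alpha:]]\w*)( = )(?<hex>[0-9A-F]+)/#
    // regex: Regex<(Substring, identifier: Substring, Substring, hex: Substring)>
    guard let match = line.firstMatch(of: regex),
          // Named groups are read through their tuple labels.
          let value = Int(match.output.hex, radix: 16) else { return nil }
    return (String(match.output.identifier), value)
}

if let assignment = parseAssignment("count = 1F") {
    print(assignment.name, assignment.value)  // count 31
}
```

The unnamed group stays reachable by position (`match.output.2`), while a misspelled label such as `match.output.hexx` would be rejected at compile time.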
 
-Due to the existing use of `/` in comment syntax and operators, there are some syntactic ambiguities to consider. While there are quite a few cases to consider, we do not feel that the impact of any individual case is sufficient to disqualify the syntax. Some of these ambiguities require a couple of source-breaking language changes, and as such the `/.../` syntax requires upgrading to a new language mode in order to use.
+**TODO: Example**
 
-A regex literal may also be spelled using an extended syntax `#/.../#`, which allows the placement of an arbitrary number of balanced `#` characters around a regex literal. This syntax allows regex literals to contain unescaped forward slashes, and provides a delimiter option which does not require a new language mode to use.
 
-## Detailed design
+**TODO: Should we cover more general typed capture behavior here, e.g. quantifier types? It overlaps with the typed capture behavior of the DSL, though.**
 
 ### Extended delimiters `#/.../#`, `##/.../##`
 