Documentation/Evolution/DelimiterSyntax.md
Lines changed: 38 additions & 15 deletions
@@ -4,17 +4,46 @@
## Introduction
-
This proposal introduces regex literals to Swift source code. The proposed syntax mirrors literals in other programing languages such as Perl, JavaScript and Ruby. As in those languages, literals are delimited with the `/` character:
+
This proposal helps complete the story told in [Regex Type and Overview][regex-type] and [elsewhere][pitch-status]. We propose the introduction of regex literals to Swift source code. The proposed syntax mirrors literals in other programming languages such as Perl, JavaScript and Ruby. As in those languages, literals are delimited with the `/` character:
```swift
let re = /[0-9]+/
```
## Motivation
-
This proposal helps complete the story told in [Regex Type and Overview][regex-type] and [elsewhere][pitch-status]. Literals are compiled directly, allowing errors to be found at compile time, rather than at run time. Using a literal also allows editors to support features such as syntax coloring inside the literal, highlighting sub-structure of the regex, and conversion of the literal to an equivalent result builder DSL (see [Regex builder DSL][regex-dsl]). It would be difficult to support all of this if regexes could only be defined inside a string.
+
In [Regex Type and Overview][regex-type] we introduced the `Regex` type, which is able to dynamically compile a regex pattern:
-
A regex literal also allows for seamless composition with the Regex DSL, enabling the intermixing of a regex syntax with other elements of the builder:
+
```swift
+
let pattern = #"(\w+)\s\s+(\S+)\s\s+((?:(?!\s\s).)*)\s\s+(.*)"#
+
let regex = try! Regex(compiling: pattern)
+
// regex: Regex<AnyRegexOutput>
+
```
+
+
The ability to compile regex patterns at runtime is useful when the pattern is, for example, provided as user input. However, it is suboptimal when the pattern is statically known, for a number of reasons:
+
+
- Regex syntax errors aren't detected until runtime, and explicit error handling (e.g. `try!`) is required to deal with them.
+
- No special source tooling support, such as syntax highlighting, code completion, and refactoring, is available.
+
- Capture types aren't known until runtime, and as such a dynamic `AnyRegexOutput` capture type must be used.
+
- The syntax is overly verbose, especially when the regex is passed as an argument to a matching function.
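
The drawbacks listed above can be seen by putting the two forms side by side. The following is a minimal sketch, assuming a released Swift 5.7 or later toolchain, where the run-time initializer is spelled `Regex(_:)` rather than the `Regex(compiling:)` used in this pitch. The names `input`, `dynamicRegex`, and `literalRegex` are illustrative only, and the literal uses the extended `#/.../#` delimiters so that no language-mode change is needed.

```swift
let input = "count = FF"

// Run-time compilation: a malformed pattern would only be reported when this
// line executes, and the result is the type-erased Regex<AnyRegexOutput>.
let dynamicRegex = try! Regex(#"(\w+) = ([0-9A-F]+)"#)
if let match = input.firstMatch(of: dynamicRegex) {
    // Captures are looked up by position and have no static type.
    print(match.output[1].substring ?? "")  // prints "count"
}

// Literal: the pattern is checked by the compiler, and the captures are
// inferred as a tuple of Substrings.
let literalRegex = #/(\w+) = ([0-9A-F]+)/#
if let match = input.firstMatch(of: literalRegex) {
    let (_, name, hex) = match.output
    print(name, hex)  // prints "count FF"
}
```

The literal form needs neither `try!` nor positional lookups, which is the difference the list above describes.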
+
+
## Proposed solution
+
+
We propose introducing a new kind of literal for a regex. In Swift 5.7 mode, a regex literal may be written using `/.../` delimiters:
+
+
```swift
+
// Matches "<identifier> = <hexadecimal value>", extracting the identifier and hex number
+
let regex = /(?<identifier>[[:alpha:]]\w*) = (?<hex>[0-9A-F]+)/
+
```
+
+
Forward slashes are a regex term of art, and are used as the delimiters for regex literals in Perl, JavaScript and Ruby (though Perl and Ruby also provide alternatives). Their ubiquity and familiarity make them a compelling choice for Swift.
+
+
A regex literal may also be spelled using an extended syntax `#/.../#`, which allows the placement of an arbitrary number of balanced `#` characters around a regex literal. This syntax allows regex literals to contain unescaped forward slashes, and may be used without needing to upgrade to Swift 5.7 mode.
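
As a small illustration of the extended syntax, the sketch below matches a path that contains forward slashes without escaping them. It assumes Swift 5.7 or later; `pathRegex` and the sample path are hypothetical and not taken from the proposal.

```swift
// The forward slashes inside the pattern need no escaping, because the
// literal only ends at the closing `/#`.
let pathRegex = #/usr/lib/(\w+)\.dylib/#

if let match = "usr/lib/libz.dylib".firstMatch(of: pathRegex) {
    print(match.1)  // prints "libz"
}
```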
+
+
Within a regex literal, the compiler will parse the regex syntax outlined in [the Regex Syntax pitch][internal-syntax] and diagnose any errors at compile time. The capture types are automatically inferred based on the capture groups present in the regex. Using a literal allows editors to support features such as syntax coloring inside the literal, highlighting sub-structure of the regex, and conversion of the literal to an equivalent result builder DSL (see [Regex builder DSL][regex-dsl]).
+
+
A regex literal also allows for seamless composition with the Regex DSL, enabling lightweight intermixing of a regex syntax with other elements of the builder:
```swift
// A regex literal for parsing an amount of currency in dollars or pounds.
@@ -30,24 +59,18 @@ let regex = Regex {
This flexibility allows for terse matching syntax to be used when it's suitable, and more explicit syntax where clarity and strong types are required.
-
## Proposed solution
-
-
A regex literal will be introduced in Swift 5.7 mode using `/.../` delimiters, within which the compiler will parse a regex (the details of which are outlined in [the Regex Syntax pitch][internal-syntax]):
+
Due to the existing use of `/` in comment syntax and operators, there are some syntactic ambiguities to consider. While there are quite a few such cases, we do not feel that the impact of any individual case is sufficient to disqualify the syntax. Resolving some of these ambiguities requires a couple of source-breaking language changes, and as such the `/.../` syntax requires upgrading to a new language mode in order to use.
-
```swift
-
// Matches "<identifier> = <hexadecimal value>", extracting the identifier and hex number
-
let regex = /([[:alpha:]]\w*) = ([0-9A-F]+)/
-
```
+
## Detailed design
-
The above regex literal will be inferred to be [the regex type][regex-type] `Regex<(Substring, Substring, Substring)>`, where the capture types have been automatically inferred. Errors in the regex will be diagnosed by the compiler.
+
### Named typed captures
-
Forward slashes are a regex term of art, and are used as the delimiters for regex literals in Perl, JavaScript and Ruby (though Perl and Ruby also provide alternatives). Their ubiquity and familiarity makes them a compelling choice for Swift.
+
Regex literals have their capture types statically determined by the capture groups present. Each capture group adds an additional capture to the match tuple, with named capture groups receiving a corresponding tuple label.
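
A minimal sketch of this behavior, assuming Swift 5.7 or later and reusing the identifier/hex pattern from the proposed solution with the extended `#/.../#` delimiters. The names `namedRegex` and `typed` are illustrative only.

```swift
let namedRegex = #/(?<identifier>[[:alpha:]]\w*) = (?<hex>[0-9A-F]+)/#

// The explicit annotation only compiles if the inferred output type carries
// the tuple labels taken from the named capture groups.
let typed: Regex<(Substring, identifier: Substring, hex: Substring)> = namedRegex

if let match = "count = FF".wholeMatch(of: typed) {
    // Named captures are reachable through their labels.
    print(match.identifier)  // prints "count"
    print(match.hex)         // prints "FF"
}
```

The first tuple element is the whole match; each capture group then adds one element in order, labeled when the group is named.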
-
Due to the existing use of `/` in comment syntax and operators, there are some syntactic ambiguities to consider. While there are quite a few cases to consider, we do not feel that the impact of any individual case is sufficient to disqualify the syntax. Some of these ambiguities require a couple of source breaking language changes, and as such the `/.../` syntax requires upgrading to a new language mode in order to use.
+
**TODO: Example**
-
A regex literal may also be spelled using an extended syntax `#/.../#`, which allows the placement of an arbitrary number of balanced `#` characters around a regex literal. This syntax allows regex literals to contain unescaped forward slashes, and provides a delimiter option which does not require a new language mode to use.
-
## Detailed design
+
**TODO: Should we cover more general typed capture behavior here, e.g. quantifier types? It overlaps with the typed capture behavior of the DSL, though.**