Commit a236e7f

Reason V4 Feature [String Template Literals]
Summary: This diff implements string template literals. Test Plan: Reviewers: CC:
1 parent 8047d1d commit a236e7f

17 files changed: +977 -49 lines changed
docs/TEMPLATE_LITERALS.md

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
Contributors: Lexing and Parsing String Templates:
===================================================
Supporting string templates requires coordination between the lexer, parser and
printer. The lexer (as always) creates a token stream, but when it encounters a
backtick, it enters a special lexing mode that collects the (mostly) raw text
until it hits either a closing backtick or a `${`. If it encounters a `${`
(which opens an "interpolation region"), it temporarily resumes the "regular"
lexing approach instead of collecting raw text, until it hits the balancing
`}`, at which point it re-enters "raw text" mode until it hits another `${` or
the closing backtick.

- Parsing of raw text regions and regular tokenizing: Handled by
  `reason_declarative_lexer.ml`.
- Token balancing: Handled by `reason_lexer.ml`.

The output of lexing becomes tokens streamed into the parser, and the parser
`reason_parser.mly` turns those tokens into AST expressions.
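
The token-balancing idea can be illustrated with a small OCaml sketch. This is not the code in `reason_lexer.ml`; the `tok` type and the `split_at_balanced_rbrace` function are hypothetical names used only for illustration. The sketch assumes the `${` has already been consumed and shows how a depth counter picks out the `}` that closes the interpolation region, even when the expression itself contains braces.

```ocaml
(* Hypothetical, simplified token type: only braces matter for balancing. *)
type tok =
  | LBRACE
  | RBRACE
  | OTHER of string

(* Given the tokens that follow an already-consumed `${`, return the tokens
   inside the interpolation region and the tokens after its closing brace.
   Returns None when no balancing RBRACE exists. *)
let split_at_balanced_rbrace (toks : tok list) : (tok list * tok list) option =
  let rec go depth seen = function
    | [] -> None
    | RBRACE :: rest when depth = 0 -> Some (List.rev seen, rest)
    | RBRACE :: rest -> go (depth - 1) (RBRACE :: seen) rest
    | LBRACE :: rest -> go (depth + 1) (LBRACE :: seen) rest
    | t :: rest -> go depth (t :: seen) rest
  in
  go 0 [] toks
```

Counting tokens rather than characters means braces that appear inside string literals or comments never reach the counter, which is presumably why balancing is done on the token stream.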

## Lexing:

String templates are opened by:
- A backtick.
- Followed by any whitespace character (newline, or space/tab).

String templates are closed by:
- Any whitespace character (newline, or space/tab).
- Followed by a backtick.

```reason
let x = ` hi this is my string template `
let x = `
  The newline counts as a whitespace character both for opening and closing.
`
```
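
The opening and closing rules above can be written as two small predicates. This is only a sketch in OCaml: the names `is_template_ws`, `opens_template` and `closes_template` are hypothetical and do not come from the Reason lexer, and escaped backticks are ignored here.

```ocaml
(* Whitespace characters that may follow an opening backtick or precede a
   closing one: space, tab, or newline. *)
let is_template_ws c = c = ' ' || c = '\t' || c = '\n'

(* True when index [i] of [s] holds a backtick followed by whitespace. *)
let opens_template s i =
  i >= 0 && i + 1 < String.length s && s.[i] = '`' && is_template_ws s.[i + 1]

(* True when index [i] of [s] holds whitespace followed by a backtick. *)
let closes_template s i =
  i >= 0 && i + 1 < String.length s && is_template_ws s.[i] && s.[i + 1] = '`'
```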

Within the string template literal, there may be regions of non-string
"interpolation" where expressions are lexed/parsed.

```reason
let x = ` hi this is my ${expressionHere() ++ "!"} template `
```

Template strings are lexed into tokens. Some of those tokens contain a string
"payload" with portions of the string content.
The opening backtick, closing backtick, and `${` characters do not become
tokens that are fed to the parser, and are not included in the text payload of
any token. The right brace `}` closing an interpolation region `${` _does_
become a token that is fed to the parser. Three tokens are produced when
lexing string templates:

- `STRING_TEMPLATE_TERMINATED(string)`: A string region that is terminated by
  the closing backtick. It may be the entire string template contents if there
  are no interpolation regions `${}`, or it may be the final string segment
  after an interpolation region `${}`, as long as it closes the entire
  template.
- `STRING_TEMPLATE_SEGMENT_LBRACE(string)`: A string region occurring _before_
  an interpolation region `${`. The `string` payload of this token is the
  contents up until (but not including) the next `${`.
- `RBRACE`: A `}` character that terminates an interpolation region that
  started with `${`.
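
To make the three tokens concrete, here is a minimal OCaml sketch of a character-level lexer that emits them. It is not the implementation in `reason_declarative_lexer.ml`. The function name `lex_template` and the `EXPR_SOURCE` constructor are assumptions made for this example: the real lexer emits ordinary Reason tokens inside an interpolation region, whereas the sketch just keeps the raw expression text. Escapes (`` \` `` and `\${`), nested templates, and the whitespace requirement before the closing backtick are deliberately left out.

```ocaml
type token =
  | STRING_TEMPLATE_TERMINATED of string
  | STRING_TEMPLATE_SEGMENT_LBRACE of string
  | RBRACE
  | EXPR_SOURCE of string (* hypothetical stand-in for the regular tokens *)

let lex_template (src : string) : token list =
  let len = String.length src in
  let buf = Buffer.create 16 in
  (* Return the text collected so far and reset the buffer. *)
  let take () =
    let s = Buffer.contents buf in
    Buffer.clear buf;
    s
  in
  (* Raw-text mode: collect characters until a backtick or a `${`. *)
  let rec raw i acc =
    if i >= len then failwith "unterminated template"
    else if src.[i] = '`' then
      List.rev (STRING_TEMPLATE_TERMINATED (take ()) :: acc)
    else if src.[i] = '$' && i + 1 < len && src.[i + 1] = '{' then
      interp (i + 2) 0 (STRING_TEMPLATE_SEGMENT_LBRACE (take ()) :: acc)
    else begin
      Buffer.add_char buf src.[i];
      raw (i + 1) acc
    end
  (* Interpolation mode: track brace depth until the balancing `}`. *)
  and interp i depth acc =
    if i >= len then failwith "unterminated interpolation"
    else
      match src.[i] with
      | '}' when depth = 0 ->
        raw (i + 1) (RBRACE :: EXPR_SOURCE (take ()) :: acc)
      | '{' -> Buffer.add_char buf '{'; interp (i + 1) (depth + 1) acc
      | '}' -> Buffer.add_char buf '}'; interp (i + 1) (depth - 1) acc
      | c -> Buffer.add_char buf c; interp (i + 1) depth acc
  in
  let is_ws c = c = ' ' || c = '\t' || c = '\n' in
  (* The backtick and the first whitespace character are skipped, so neither
     appears in any payload. *)
  if len >= 2 && src.[0] = '`' && is_ws src.[1] then raw 2 []
  else failwith "expected a backtick followed by whitespace"
```

For example, lexing `` ` a${x}b` `` with this sketch yields a `STRING_TEMPLATE_SEGMENT_LBRACE` with payload `a`, the expression text `x`, an `RBRACE`, and a `STRING_TEMPLATE_TERMINATED` with payload `b`.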

Simple example:

      STRING_TEMPLATE_TERMINATED
      |                          |
    ` lorem ipsum lorem ipsum bla `
    ^                             ^
    |                             |
    |                             The closing backtick also doesn't show up in the token
    |                             stream, but the last white space is part of the lexed
    |                             STRING_TEMPLATE_TERMINATED token
    |                             (it is used to compute indentation, but is stripped from
    |                             the string constant, or re-inserted by refmt if not present)
    |
    The backtick doesn't show up anywhere in the token stream. The first
    single white space after backtick is also not part of the lexed tokens.

Multiline example:

    All of this leading line whitespace remains part of the tokens' payloads,
    but it is normalized and stripped when the parser converts the tokens
    into string expressions.
    |
    |      This newline is not part of any token
    |      |
    |      v
    |     `
    +->     lorem ipsum lorem
            ipsum bla
          `
        ^
        |
        All of this white space on the final line is part of the token as well.


For interpolation, the token `STRING_TEMPLATE_SEGMENT_LBRACE` represents the
string contents (minus the single first white space after the backtick), up to
the `${`. As with non-interpolated string templates, the opening and closing
backticks do not show up in the token stream, and the first white space
character after the opening backtick is not included in the lexed string
contents. The final white space character before the closing backtick *is*
part of the lexed string token (to compute indentation), but that final white
space character, along with leading line whitespace, is stripped from the
string expression when the parsing stage converts lexed tokens to AST string
expressions.

    ` lorem ipsum lorem ipsum bla${expression}lorem ipsum lorem ip lorem`
      |                         |            ||                        |
      STRING_TEMPLATE_SEGMENT_LBRACE         |STRING_TEMPLATE_TERMINATED
                                             RBRACE

## Parsing:

The string template tokens are turned into normal AST expressions.
The `STRING_TEMPLATE_SEGMENT_LBRACE` and `STRING_TEMPLATE_TERMINATED` lexed
tokens contain all of the string contents, plus leading line whitespace for
each line, including the final whitespace before the closing backtick. The
parser normalizes these by stripping that leading whitespace, including two
additional spaces for nice indentation, before turning them into some
combination of string constants with a special attribute on the AST, or string
concats with a special attribute on the concat AST node.

```reason
// This:
let x = `
  Hello there
`;
// Becomes:
let x = [@reason.template] "Hello there";

// This:
let x = `
  ${expr} Hello there
`;
// Becomes:
let x = [@reason.template] (expr ++ [@reason.template] "Hello there");
```
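
The whitespace normalization described in this section can be sketched in OCaml as follows. This is an illustration under stated assumptions, not the code in `reason_parser.mly`: `strip_prefix` and `normalize` are hypothetical helpers, the payload is assumed to be a single `STRING_TEMPLATE_TERMINATED` payload with no interpolation, and the last line of a multiline payload is assumed to hold only the spaces that precede the closing backtick.

```ocaml
(* Drop up to [n] leading spaces from [line]; never removes other characters. *)
let strip_prefix n line =
  let len = String.length line in
  let rec count i =
    if i < len && i < n && line.[i] = ' ' then count (i + 1) else i
  in
  let k = count 0 in
  String.sub line k (len - k)

(* Normalize a token payload into the final string constant.
   - Single line: drop the one trailing whitespace before the closing backtick.
   - Multiline: the last line gives the closing backtick's indentation; strip
     that amount plus two extra spaces from every content line and drop the
     last line itself. *)
let normalize payload =
  match List.rev (String.split_on_char '\n' payload) with
  | [] -> ""
  | [ only ] ->
    let len = String.length only in
    if len > 0 && only.[len - 1] = ' ' then String.sub only 0 (len - 1)
    else only
  | last :: rest ->
    let indent = String.length last + 2 in
    rest |> List.rev |> List.map (strip_prefix indent) |> String.concat "\n"
```

With the first example above, the payload is two spaces, `Hello there`, and a newline; `normalize` returns `Hello there`, which matches the string constant shown after "Becomes".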

User Documentation:
===================
> This section is the user documentation for string template literals, which
> will be published to the [official Reason Syntax
> documentation](https://reasonml.github.io/) when

TODO
Lines changed: 189 additions & 0 deletions
@@ -0,0 +1,189 @@
/**
 * Comments:
 */

let addTwo = (a, b) => string_of_int(a + b);
let singleLineConstant = `
  Single line template
`;
let singleLineInterpolate = `
  Single line ${addTwo(1, 2)}!
`;

let multiLineConstant = `
  Multi line template
  Multi %a{x, y}line template
  Multi line template
  Multi line template
`;

let printTwo = (a, b) => {
  print_string(a);
  print_string(b);
};

let templteWithAttribute =
  [@attrHere]
  `
    Passing line template
    Passing line template
    Passing line template
    Passing line template
  `;

let result =
  print_string(
    `
      Passing line template
      Passing line template
      Passing line template
      Passing line template
    `,
  );

let resultPrintTwo =
  printTwo(
    "short one",
    `
      Passing line template
      Passing line template
      Passing line template
      Passing line template
    `,
  );

let hasBackSlashes = `
  One not escaped: \
  Three not escaped: \ \ \
  Two not escaped: \\
  Two not escaped: \\\
  One not escaped slash, and one escaped tick: \\`
  Two not escaped slashes, and one escaped tick: \\\`
  Two not escaped slashes, and one escaped dollar-brace: \\\${
  One not escaped slash, then a close tick: \
`;

let singleLineInterpolateWithEscapeTick = `
  Single \`line ${addTwo(1, 2)}!
`;

let singleLineConstantWithEscapeDollar = `
  Single \${line template
`;

// The backslash here is a backslash literal.
let singleLineInterpolateWithBackslashThenDollar = `
  Single \$line ${addTwo(2, 3)}!
`;

let beforeExpressionCommentInNonLetty = `
  Before expression comment in non-letty interpolation:
  ${/* Comment */ string_of_int(1 + 2)}
`;

let beforeExpressionCommentInNonLetty2 = `
  Same thing but with comment on own line:
  ${
    /* Comment */
    string_of_int(10 + 8)
  }
`;
module StringIndentationWorksInModuleIndentation = {
  let beforeExpressionCommentInNonLetty2 = `
    Same thing but with comment on own line:
    ${
      /* Comment */
      string_of_int(10 + 8)
    }
  `;
};

let beforeExpressionCommentInNonLetty3 = `
  Same thing but with text after final brace on same line:
  ${
    /* Comment */
    string_of_int(20 + 1000)
  }TextAfterBrace
`;

let beforeExpressionCommentInNonLetty3 = `
  Same thing but with text after final brace on next line:
  ${
    /* Comment */
    string_of_int(100)
  }
  TextAfterBrace
`;

let x = 0;
let commentInLetSequence = `
  Comment in letty interpolation:
  ${
    /* Comment */
    let x = 200 + 49;
    string_of_int(x);
  }
`;

let commentInLetSequence2 = `
  Same but with text after final brace on same line:
  ${
    /* Comment */
    let x = 200 + 49;
    string_of_int(x);
  }TextAfterBrace
`;

let commentInLetSequence3 = `
  Same but with text after final brace on next line:
  ${
    /* Comment */
    let x = 200 + 49;
    string_of_int(x);
  }
  TextAfterBrace
`;

let reallyCompicatedNested = `
  Comment in non-letty interpolation:

  ${
    /* Comment on first line of interpolation region */

    let y = (a, b) => a + b;
    let x = 0 + y(0, 2);
    // Nested string templates
    let s = `
      asdf${addTwo(0, 0)}
      alskdjflakdsjf
    `;
    s ++ s;
  }same line as brace with one space
  and some more text at the footer no newline
`;

let reallyLongIdent = "!";
let backToBackInterpolations = `
  Two interpolations side by side:
  ${addTwo(0, 0)}${addTwo(0, 0)}
  Two interpolations side by side with leading and trailing:
  Before${addTwo(0, 0)}${addTwo(0, 0)}After

  Two interpolations side by side second one should break:
  Before${addTwo(0, 0)}${
    reallyLongIdent
    ++ reallyLongIdent
    ++ reallyLongIdent
    ++ reallyLongIdent
  }After

  Three interpolations side by side:
  Before${addTwo(0, 0)}${
    reallyLongIdent
    ++ reallyLongIdent
    ++ reallyLongIdent
    ++ reallyLongIdent
  }${
    ""
  }After
`;
