Commit 0bb279a

types-grammar, ch1: adding a bunch of text about strings
1 parent afe159a commit 0bb279a

1 file changed

+63
-1
lines changed

types-grammar/ch1.md

Lines changed: 63 additions & 1 deletion
@@ -190,12 +190,74 @@ The `!` operator negates/flips a boolean value to the other one: `false` becomes
190190
191191
### String Values
192192
193-
The `string` type contains any value which is a collection of one or more characters:
193+
The `string` type contains any value which is a collection of zero or more characters, delimited (surrounded on either side) by quote characters:
194194
195195
```js
196196
myName = "Kyle";
197197
```
198198
199+
Strings can be delimited by double-quotes (`"`), single-quotes (`'`), or back-ticks (`` ` ``). The ending delimiter must always match the starting delimiter.
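As a minimal sketch of the three delimiters (the variable names here are only illustrative), the same text written with each kind of quote produces the same string value:

```js
// The same four characters, written with each of the three delimiters.
const doubleQuoted = "Kyle";
const singleQuoted = 'Kyle';
const backTicked = `Kyle`;

// The delimiters are not part of the value, so all three are equal.
console.log(doubleQuoted === singleQuoted);   // true
console.log(singleQuoted === backTicked);     // true
```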
200+
201+
Strings have an intrinsic length, which is the number of UTF-16 code units they contain. This does not necessarily correspond to the number of visible characters you type between the start and end delimiters (aka, the string literal). It's easy to confuse a string literal with the underlying string value, so pay close attention to the difference.
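To illustrate the gap between a string literal and its value, here's a small sketch (the name `tabbed` is just for illustration) that uses one of the escape sequences described in the rest of this section:

```js
// The literal has four characters typed between the quotes: a \ t b
const tabbed = "a\tb";

// The string value holds only three: "a", a tab character, and "b".
console.log(tabbed.length);   // 3
```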
202+
203+
If `"` or `'` are used to delimit a string literal, the contents are only parsed for *character-escape sequences*: `\` followed by one or more characters that JS recognizes and parses with special meaning. Any other characters in a string that don't parse as escape-sequences (single-character or multi-character) are inserted as-is into the string value.
204+
205+
#### Single-Character Escapes
206+
207+
For single-character escape sequences, the following characters are recognized after a `\`: `bfnrtv0'"\`. For example, `\n` (new-line), `\t` (tab), etc.
208+
209+
If a `\` is followed by any other character (except `x` and `u` -- explained below), for example `\g`, the sequence is parsed as just the literal character itself (`g`), dropping the preceding `\`.
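A quick sketch of that fallback behavior (the names `plain` and `escaped` are only illustrative):

```js
const plain = "g";

// `\g` is not a recognized escape sequence, so the `\` is simply dropped.
const escaped = "\g";

console.log(escaped === plain);   // true
console.log(escaped.length);      // 1
```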
210+
211+
If you want to include a `"` in the middle of a `"`-delimited string literal, use the `\"` escape sequence. Similarly, if you're including a `'` character in the middle of a `'`-delimited string literal, use the `\'` escape sequence. By contrast, a `'` does *not* need to be escaped inside a `"`-delimited string, nor vice versa.
212+
213+
```js
214+
myName = "Kyle Simpson (aka, \"getify\")";
215+
216+
console.log(myName);
217+
// Kyle Simpson (aka, "getify")
218+
```
219+
220+
To include a literal `\` backslash character in a string literal, use the `\\` (two backslashes) character-escape sequence. So, then... what would `\\\` (three backslashes) parse as? The first two `\`'s would be a `\\` escape sequence, thereby inserting just a single `\` character in the string value. The remaining third `\` would just escape whatever character comes immediately after it.
221+
222+
```js
223+
windowsDriveLocation =
224+
"C:\\\"Program Files\\Common Files\\\"";
225+
226+
console.log(windowsDriveLocation);
227+
// C:\"Program Files\Common Files\"
228+
```
229+
230+
| TIP: |
231+
| :--- |
232+
| What about four backslashes `\\\\` in a string literal? Well, that's just two `\\` escape sequences next to each other, so it results in two adjacent backslashes (`\\`) in the underlying string value. If you're paying attention, you'll see there's an odd/even pattern rule here. You should thus be able to decipher any odd (`\\\\\`, `\\\\\\\\\`, etc) or even (`\\\\\\`, `\\\\\\\\\\`, etc) number of backslashes in a string literal. |
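
The odd/even rule can be checked with a short sketch like the following (the variable names are hypothetical, chosen only for this illustration):

```js
// Four backslashes: two `\\` escapes, so two backslashes in the value.
const four = "\\\\";

// Three backslashes then a quote: one `\\` escape, then a `\"` escape.
const threeThenQuote = "\\\"";

// Five backslashes then `n`: two `\\` escapes, then a `\n` (new-line) escape.
const fiveThenN = "\\\\\n";

console.log(four.length);        // 2
console.log(threeThenQuote);     // \"
console.log(fiveThenN.length);   // 3
```

With an even count, every backslash pairs up into a `\\` escape. With an odd count, the last backslash is left over to escape whatever character follows it.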
233+
234+
#### Multi-Character Escapes
235+
236+
Multi-character escape sequences may be hexadecimal or Unicode sequences.
237+
238+
Hexadecimal escape sequences are used to encode any of the characters with code-points 0-255 (the ASCII and Latin-1 Supplement ranges), and look like `\x` followed by exactly two hexadecimal characters (`0-9` and `a-f` / `A-F` -- case insensitive). For example, the escape-sequence `\xA9` (or `\xa9`) corresponds to the character with code-point `169`: `©` (copyright symbol).
239+
240+
Unicode escape sequences encode any of the characters in the Unicode set whose code-point values are from 0-65535, and look like `\u` followed by exactly four hexadecimal characters. For example, the escape-sequence `\u00A9` (or `\u00a9`) corresponds to that same `©` symbol, while `\u263A` (or `\u263a`) corresponds to the Unicode character with code-point `9786`: `☺` (smiley face symbol).
241+
242+
When any character-escape sequence (regardless of length) is recognized, the single character it represents is inserted into the string, rather than the original separate characters. So, in the string `"\u263A"`, there's only one (smiley) character, not six individual characters.
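Here's a brief sketch of both multi-character forms side by side (variable names are illustrative only), using the standard `charCodeAt(..)` string method to inspect the resulting code values:

```js
const hexCopyright = "\xA9";
const unicodeCopyright = "\u00a9";
const smiley = "\u263A";

// Both escapes produce the same single character.
console.log(hexCopyright === unicodeCopyright);   // true
console.log(hexCopyright.charCodeAt(0));          // 169

// Six characters in the literal, one character in the value.
console.log(smiley.length);                       // 1
console.log(smiley.charCodeAt(0));                // 9786
```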
243+
244+
#### Line Continuation
245+
246+
The `\` followed by an actual new-line character (not just literal `n`) is a special case, and it creates what's called a line-continuation:
247+
248+
```js
249+
greeting = "Hello \
250+
Friends!";
251+
252+
console.log(greeting);
253+
// Hello Friends!
255+
```
256+
257+
As you can see, the new-line at the end of the `greeting = ` line is immediately preceded by a `\`, which allows this string literal to continue onto the subsequent line. Without the escaping `\` before it, a new-line appearing in a `"` or `'` delimited string literal would actually produce a JS syntax parsing error.
258+
259+
The `\` and the new-line that follows it are both omitted from the string value, so `greeting` holds `Hello Friends!` as a single line.
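As a minimal check of what a line continuation contributes to the string value (the names `continued` and `singleLine` are just illustrative): per the language specification, the `\` and the new-line after it are both left out of the value, so the continued literal compares equal to its single-line form.

```js
const continued = "Hello \
Friends!";
const singleLine = "Hello Friends!";

// The continued literal and the single-line literal hold the same value.
console.log(continued === singleLine);    // true

// No new-line character ends up in the string value.
console.log(continued.includes("\n"));    // false
```

Note that any indentation at the start of the continued line *would* be included in the value, which is why `Friends!` starts at the beginning of its line here.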
260+
199261
// TODO
200262
201263
### Number Values
