110 | 110 | \indextext{line splicing}% |
111 | 111 | If the first translation character is \unicode{feff}{byte order mark}, |
112 | 112 | it is deleted. |
113 | | -Each sequence of a backslash character (\textbackslash) |
| 113 | +Each sequence of a backslash character (\unicode{005c}{reverse solidus}) |
114 | 114 | immediately followed by |
115 | | -zero or more whitespace characters other than new-line followed by |
| 115 | +zero or more \grammarterm{whitespace-character}s other than new-line followed by |
116 | 116 | a new-line character is deleted, splicing |
117 | 117 | physical source lines to form \defnx{logical source lines}{source line!logical}. Only the last |
118 | 118 | backslash on any physical source line shall be eligible for being part |
|
126 | 126 | shall be processed as if an additional new-line character were appended |
127 | 127 | to the file. |
128 | 128 |
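The splicing rule in this hunk can be illustrated with a minimal C++ sketch. The names `is_non_newline_whitespace` and `splice_lines` are hypothetical and do not come from the draft. The sketch assumes the source is already a sequence of `char` in which new-line is `'\n'`, and it models neither the byte order mark removal nor the implicit trailing new-line.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: true for the whitespace-characters other than
// new-line (character tabulation, line tabulation, form feed, space).
constexpr bool is_non_newline_whitespace(char c) {
    return c == '\t' || c == '\v' || c == '\f' || c == ' ';
}

// Hypothetical sketch of line splicing: a backslash followed only by
// non-new-line whitespace and then a new-line is deleted together with
// that whitespace and the new-line. Such a backslash is necessarily the
// last backslash on its physical source line.
std::string splice_lines(std::string_view src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src[i] == '\\') {
            std::size_t j = i + 1;
            while (j < src.size() && is_non_newline_whitespace(src[j])) {
                ++j;
            }
            if (j < src.size() && src[j] == '\n') {
                i = j + 1;  // skip backslash, trailing whitespace, new-line
                continue;
            }
        }
        out += src[i++];  // every other character is kept unchanged
    }
    return out;
}
```

Under these assumptions, a backslash that is followed by any non-whitespace character before the new-line is left in place, since it does not match the deleted sequence.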
|
129 | | -\item The source file is decomposed into preprocessing |
130 | | -tokens\iref{lex.pptoken} and sequences of whitespace characters |
131 | | -(including comments). A source file shall not end in a partial |
| 129 | +\item |
| 130 | +\indextext{whitespace}% |
| 131 | +\indextext{comment}% |
| 132 | +\indextext{token!preprocessing}% |
| 133 | +The source file is decomposed into preprocessing |
| 134 | +tokens\iref{lex.pptoken} and whitespace\iref{lex.whitespace} (sequences of \grammarterm{whitespace-character}s |
| 135 | +and comments). A source file shall not end in a partial |
132 | 136 | preprocessing token or in a partial comment. |
133 | 137 | \begin{footnote} |
134 | 138 | A partial preprocessing |
|
140 | 144 | would arise from a source file ending with an unclosed \tcode{/*} |
141 | 145 | comment. |
142 | 146 | \end{footnote} |
143 | | -Each comment\iref{lex.comment} is replaced by one space character. New-line characters are |
144 | | -retained. Whether each nonempty sequence of whitespace characters other |
145 | | -than new-line is retained or replaced by one space character is |
| 147 | +Each comment\iref{lex.comment} is replaced by one \unicode{0020}{space} character. New-line characters are |
| 148 | +retained. Whether each nonempty sequence of \grammarterm{whitespace-character}s other |
| 149 | +than new-line is retained or replaced by one \unicode{0020}{space} character is |
146 | 150 | unspecified. |
147 | 151 | As characters from the source file are consumed |
148 | 152 | to form the next preprocessing token |
|
178 | 182 | \item |
179 | 183 | Adjacent \grammarterm{string-literal} tokens are concatenated\iref{lex.string}. |
180 | 184 |
|
181 | | -\item Whitespace characters separating tokens are no longer |
| 185 | +\item |
| 186 | +Any \grammarterm{whitespace-character}s separating tokens are no longer |
182 | 187 | significant. Each preprocessing token is converted into a |
183 | 188 | token\iref{lex.token}. The resulting tokens |
184 | 189 | constitute a \defn{translation unit} and |
|
467 | 472 | None of these names or aliases have leading or trailing spaces. |
468 | 473 | \end{note} |
469 | 474 |
|
470 | | -\rSec1[lex.comment]{Comments} |
| 475 | +\rSec1[lex.whitespace]{Whitespace} |
| 476 | +\indextext{whitespace|(}% |
| 477 | + |
| 478 | +\rSec2[lex.whitechar]{Whitespace characters} |
| 479 | + |
| 480 | +\indextext{character!whitespace|(}% |
| 481 | +\begin{bnf} |
| 482 | +\nontermdef{whitespace-character}\br |
| 483 | + \unicode{0009}{character tabulation}\br |
| 484 | + \textnormal{new-line}\br |
| 485 | + \unicode{000b}{line tabulation}\br |
| 486 | + \unicode{000c}{form feed}\br |
| 487 | + \unicode{0020}{space} |
| 488 | +\end{bnf} |
| 489 | + |
| 490 | +\pnum |
| 491 | +\begin{note} |
| 492 | +Whitespace characters are used to separate elements of the \Cpp{} grammar. |
| 493 | +\end{note} |
| 494 | +\indextext{character!whitespace|)} |
| 495 | + |
| 496 | +\rSec2[lex.comment]{Comments} |
471 | 497 |
|
472 | 498 | \pnum |
473 | 499 | \indextext{comment|(}% |
|
477 | 503 | characters \tcode{*/}. These comments do not nest. |
478 | 504 | \indextext{comment!\tcode{//}}% |
479 | 505 | The characters \tcode{//} start a comment, which terminates immediately before the |
480 | | -next new-line character. If there is a form-feed or a vertical-tab |
481 | | -character in such a comment, only whitespace characters shall appear |
| 506 | +next new-line character. If there is a \unicode{000c}{form feed} or a \unicode{000b}{line tabulation} |
| 507 | +character in such a comment, only \grammarterm{whitespace-character}s shall appear |
482 | 508 | between it and the new-line that terminates the comment; no diagnostic |
483 | 509 | is required. |
484 | 510 | \begin{note} |
|
489 | 515 | \tcode{/*} comment. |
490 | 516 | \end{note} |
491 | 517 | \indextext{comment|)} |
| 518 | +\indextext{whitespace|)}% |
492 | 519 |
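As a rough illustration of the new \grammarterm{whitespace-character} production, the five alternatives can be written as a small C++ predicate. The name `is_whitespace_character` is hypothetical, and the sketch assumes that new-line is represented as U+000A after translation phase 1.

```cpp
// Hypothetical predicate mirroring the five alternatives of the
// whitespace-character production in this diff.
constexpr bool is_whitespace_character(char32_t c) {
    return c == U'\t'   // U+0009 character tabulation
        || c == U'\n'   // new-line (assumed here to be U+000A)
        || c == U'\v'   // U+000B line tabulation
        || c == U'\f'   // U+000C form feed
        || c == U' ';   // U+0020 space
}

// The predicate is usable in constant expressions.
static_assert(is_whitespace_character(U' '));
static_assert(!is_whitespace_character(U'a'));
```

A single predicate of this kind corresponds to the places in the diff where an enumerated list of characters is replaced by a reference to \grammarterm{whitespace-character}, such as the exclusions for \grammarterm{d-char}.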
|
493 | 520 | \rSec1[lex.pptoken]{Preprocessing tokens} |
494 | 521 |
|
|
506 | 533 | string-literal\br |
507 | 534 | user-defined-string-literal\br |
508 | 535 | preprocessing-op-or-punc\br |
509 | | - \textnormal{each non-whitespace character that cannot be one of the above} |
| 536 | + \textnormal{each non-\grammarterm{whitespace-character} that cannot be one of the above} |
510 | 537 | \end{bnf} |
511 | 538 |
|
512 | 539 | \pnum |
|
520 | 547 | (\grammarterm{import-keyword}, \grammarterm{module-keyword}, and \grammarterm{export-keyword}), |
521 | 548 | identifiers, preprocessing numbers, character literals (including user-defined character |
522 | 549 | literals), string literals (including user-defined string literals), preprocessing |
523 | | -operators and punctuators, and single non-whitespace characters that do not lexically |
| 550 | +operators and punctuators, and single non-\grammarterm{whitespace-character}s that do not lexically |
524 | 551 | match the other preprocessing token categories. |
525 | 552 | If a \unicode{0027}{apostrophe} or a \unicode{0022}{quotation mark} character |
526 | 553 | matches the last category, the program is ill-formed. |
527 | 554 | If any character not in the basic character set matches the last category, |
528 | 555 | the program is ill-formed. |
529 | 556 | Preprocessing tokens can be separated by |
530 | 557 | \indextext{whitespace}% |
531 | | -whitespace; |
| 558 | +whitespace\iref{lex.whitespace}; |
532 | 559 | \indextext{comment}% |
533 | | -this consists of comments\iref{lex.comment}, or whitespace characters |
534 | | -(\unicode{0020}{space}, |
535 | | -\unicode{0009}{character tabulation}, |
536 | | -new-line, |
537 | | -\unicode{000b}{line tabulation}, and |
538 | | -\unicode{000c}{form feed}), or both. |
| 560 | +this consists of comments, \grammarterm{whitespace-character}s, or both. |
539 | 561 | As described in \ref{cpp}, in certain |
540 | 562 | circumstances during translation phase 4, whitespace (or the absence |
541 | 563 | thereof) serves as more than preprocessing token separation. Whitespace |
|
826 | 848 | \end{footnote} |
827 | 849 | operators, and other separators. |
828 | 850 | \indextext{whitespace}% |
829 | | -Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments |
830 | | -(collectively, ``whitespace''), as described below, are ignored except |
831 | | -as they serve to separate tokens. |
| 851 | +Whitespace\iref{lex.whitespace} is ignored except to separate tokens. |
832 | 852 | \begin{note} |
833 | 853 | Whitespace can separate otherwise adjacent identifiers, keywords, numeric |
834 | 854 | literals, and alternative tokens containing alphabetic characters. |
|
1790 | 1810 | \begin{bnf} |
1791 | 1811 | \nontermdef{d-char}\br |
1792 | 1812 | \textnormal{any member of the basic character set except:}\br |
1793 | | - \bnfindent\textnormal{\unicode{0020}{space}, \unicode{0028}{left parenthesis}, \unicode{0029}{right parenthesis}, \unicode{005c}{reverse solidus},}\br |
1794 | | - \bnfindent\textnormal{\unicode{0009}{character tabulation}, \unicode{000b}{line tabulation}, \unicode{000c}{form feed}, and new-line} |
| 1813 | + \bnfindent\textnormal{a \grammarterm{whitespace-character}, \unicode{0028}{left parenthesis}, \unicode{0029}{right parenthesis},}\br |
| 1814 | + \bnfindent\textnormal{and \unicode{005c}{reverse solidus}} |
1795 | 1815 | \end{bnf} |
1796 | 1816 |
|
1797 | 1817 | \pnum |
|