Commit b5cdd57

committed
Explain why the letter d is the choice, rather than other letters
1 parent ae9a668 commit b5cdd57

1 file changed

+140
-11
lines changed

text/3830-dedented-string-literals.md

Lines changed: 140 additions & 11 deletions
@@ -954,14 +954,151 @@ let _ = d"
954954

955955
## Design
956956

957-
### The choice of `d"string"` specifically
957+
### The choice of the letter `d` for "dedent"
958+
959+
When picking a single letter for this feature, we want:
960+
961+
- A letter that represents a mnemonic
962+
- The mnemonic should make sense
963+
- And be memorable
964+
965+
The RFC picks `d` as a mnemonic for "dedent".
966+
967+
- Dedentation is a simple atomic operation which removes the leading indentation of the string
968+
- The transformation is always a dedentation
969+
970+
If there is no leading indentation, the operation is still accurately described as a "dedentation"; it simply removes nothing.
971+
- Thinking of the `d` as "**d**eleting" the leading indentation may also make the mnemonic more memorable.
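
The operation described above can be sketched as an ordinary function. This is a minimal illustration, not the RFC's specification: the helper name `dedent` is hypothetical, and the sketch assumes that the indentation to remove is the common leading whitespace of all non-blank lines (counted in bytes, so ASCII spaces or tabs used consistently), and that the newline after the opening quote and the whitespace-only line before the closing quote are dropped.

```rs
/// Hypothetical helper for illustration only; it is not part of the RFC
/// or of the standard library.
fn dedent(source: &str) -> String {
    // Drop the newline that directly follows the opening quote, if any.
    let body = source.strip_prefix('\n').unwrap_or(source);

    // Drop the final line when it holds only whitespace (the line with the
    // closing quote), together with the newline before it.
    let body = match body.rfind('\n') {
        Some(idx) if body[idx + 1..].trim().is_empty() => &body[..idx],
        _ => body,
    };

    // Assumption: the indentation is the smallest amount of leading
    // whitespace found on any non-blank line.
    let indent = body
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| line.len() - line.trim_start().len())
        .min()
        .unwrap_or(0);

    // Remove that many bytes from each line; lines shorter than the
    // indentation (blank lines) become empty.
    body.lines()
        .map(|line| line.get(indent..).unwrap_or(""))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let source = "
        create table student(
            id int primary key,
            name text
        )
        ";

    let expected = "create table student(\n    id int primary key,\n    name text\n)";
    assert_eq!(dedent(source), expected);

    // With no leading indentation, nothing is removed from the line itself.
    assert_eq!(dedent("\nhello world\n"), "hello world");

    println!("{}", dedent(source));
}
```

In both cases the transformation is the same single step, which is the property the `d` mnemonic is meant to capture.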
972+
973+
#### Why not `u` for "unindent"
974+
975+
Confusion can arise due to the way this string prefix has been used in other languages:
976+
977+
- In Python 2, `u` is a prefix for Unicode strings
978+
- In C++, `u` is used for UTF-16 strings
979+
980+
A single-letter mnemonic should be memorable and make sense.
981+
It can be argued that the word "Unindent" is more complex than the word "Dedent":
982+
983+
- Unindent contains a negation, consisting of two "parts": **un** + **indent**. Undoing an indentation.
984+
- Dedent represents an atomic operation: the removal of indentation. It is a synonym of unindent.
985+
986+
Using a negated word can be considered less desirable, because undoing the negation takes an extra mental "step".
987+
988+
Consider a negated `if` condition:
989+
990+
```rs
991+
if !string.is_empty() {
992+
    walk()
993+
} else {
994+
    run()
995+
}
996+
```
997+
998+
Writing the non-negated version first is often clearer:
999+
1000+
```rs
1001+
if string.is_empty() {
1002+
    run()
1003+
} else {
1004+
    walk()
1005+
}
1006+
```
1007+
1008+
Using a word with a lower cognitive complexity may make it easier to think about and more memorable.
1009+
1010+
#### Why not `i` for "indent"
1011+
1012+
Indent is the opposite of dedent. It could make sense, but from a completely different perspective.
1013+
1014+
The question is, which one do we value more:
1015+
1016+
- A word that describes what the string looks like in the source code.
1017+
- A word that describes the transformation that the string goes through when it is evaluated.
1018+
1019+
"Indent" describes what the string looks like in the source code:
1020+
1021+
```rs
1022+
fn main() {
1023+
    let table_name = "student";
1024+
1025+
    println!(
1026+
        d"
1027+
        create table {table_name}(
1028+
            id int primary key,
1029+
            name text
1030+
        )
1031+
        "
1032+
    );
1033+
}
1034+
```
1035+
1036+
But it does not describe the transformation that it goes through:
1037+
1038+
```sh
1039+
create table student(
1040+
    id int primary key,
1041+
    name text
1042+
)
1043+
```
1044+
1045+
When the string is evaluated, the leading indentation is removed. It is **dedented**.
1046+
1047+
In the source code, the string is **indented**.
1048+
1049+
- When viewing the string from the source code, the indentation is obvious.
1050+
1051+
However, it is *not* obvious what will happen to the string when it is evaluated. "Dedent" is clearer in this regard: the source code already shows that the string is indented, and the word "dedent" tells us what happens to that indentation.
1052+
1053+
- The string may not always be considered to be indented:
1054+
1055+
```rs
1056+
let _ = d"
1057+
hello world
1058+
";
1059+
```
1060+
1061+
In the above example, the string has no indentation. It would be inaccurate to describe it as indented.
1062+
1063+
It is still accurate to say that evaluating the string "dedents" it, even though the indentation being removed is empty.
1064+
1065+
#### Why not `m` for "multi-line"
1066+
1067+
- Dedented string literals do not necessarily represent a multi-line string:
1068+
1069+
```rs
1070+
let _ = d"
1071+
hello world
1072+
";
1073+
```
1074+
1075+
The above is equivalent to:
1076+
1077+
```rs
1078+
let _ = "hello world";
1079+
```
1080+
1081+
Confusion could arise, as people may expect it to evaluate to a string spanning multiple lines.
1082+
1083+
#### Why not `h` for "heredoc"
1084+
1085+
RFC #3450 uses `h` as the modifier instead of `d`, as an acronym for [Here document](https://en.wikipedia.org/wiki/Here_document).
1086+
1087+
- The term is likely to be less known, and may raise confusion, especially amongst
1088+
those that don't know what it is.
1089+
- Here documents are more associated with "code blocks", which may carry an "info string"
1090+
(such as in Markdown). This RFC does not propose an info string.
1091+
1092+
While the feature this RFC proposes (dedented string literals) is useful for code
1093+
blocks, it is not just for them.
1094+
1095+
### The choice of the form `d"string"`
9581096

9591097
The syntax of `d"string"` is chosen for the following reasons:
9601098

9611099
- Fits with existing string modifiers, such as `b"string"`, `r#"string"#` and `c"string"`
9621100
- Composes with existing string modifiers: `db"string"`, `dc"string"`, `dr#"string"#`, and `dbr#"string"#`.
9631101
- Does not introduce a lot of new syntax. Dedented string literals can be explained in terms of existing language features.
964-
- The acronym `d` for `dedent` is understandable, and not taken by any of the other string modifiers.
9651102
- Adding a single letter `d` before a string literal to turn it into a dedented string literal is an incredibly easy modification.
9661103
- Rust reserves space for additional string modifiers.
9671104

@@ -1066,15 +1203,7 @@ The [RFC #3450: Propose code string literals](https://github.com/rust-lang/rfcs/
10661203

10671204
Differences:
10681205

1069-
- #3450 uses `h` as the modifier instead of `d`.
1070-
1071-
proposes using `h` as acronym for [Here document](https://en.wikipedia.org/wiki/Here_document).
1072-
1073-
The term is likely to be less known, and may raise confusion.
1074-
1075-
Additionally, here documents are more associated with "code blocks". While this feature is useful for code blocks, it is not just for them.
1076-
1077-
While the `d` mnemonic for **dedent** clearly describes what actually happens to the strings.
1206+
- #3450 uses `h` as the modifier instead of `d`. Explained [earlier](#why-not-h-for-heredoc)
10781207

10791208
- #3450 allows writing an *info string*, like in markdown.
10801209
