|
1 |
| -Pythons string type `str` can be very powerful. At its core, a `str` is an immutable [text sequence](https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str) of [Unicode code points](https://stackoverflow.com/questions/27331819/whats-the-difference-between-a-character-a-code-point-a-glyph-and-a-grapheme). There is no separate "character" or "char" type in Python. |
| 1 | +Python's `str` (_string_) type can be very powerful. At its core, a `str` is an immutable [text sequence][text sequence] of [Unicode code points][unicode code points]. There is no separate "character" or "rune" type in Python. |
2 | 2 |
|
3 |
| -Like any [sequence type](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range), code points or "characters" within a string can be referenced by 0-based index number, and can be copied in whole or in part via _slice notation_. Since there is no separate “character” type, indexing a string produces a new `str` of length 1 (_for example "exercism"[0] == "exercism"[0:1] == "e"_). Strings support all [common sequence operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations), and individual code points or "characters" can be iterated through in a loop via `for item in`. |
| 3 | +Like any [sequence type][sequence type], code points within a `str` can be referenced by 0-based index number and can be copied in whole or in part via _slice notation_. Since there is no separate “character” type, indexing a string produces a new `str` of length 1: |
4 | 4 |
|
5 |
| -For a deep dive on what information a `str` encodes (or, _"how does the computer know how to translate zeroes and ones into letters?"_), [this blog post is enduringly helpful][joel-on-text]. |
| 5 | +```python |
| 6 | + |
| 7 | +>>> website = "exercism" |
| 8 | +>>> type(website[0]) |
| 9 | +<class 'str'> |
| 10 | + |
| 11 | +>>> len(website[0]) |
| 12 | +1 |
| 13 | + |
| 14 | +>>> website[0] == website[0:1] == 'e' |
| 15 | +True |
| 16 | +``` |
| 17 | + |
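Slice notation can be sketched the same way. This is a minimal illustration that reuses the `website` string from above; the names `first_three`, `last_four`, and `every_other` are only illustrative.

```python
website = "exercism"

first_three = website[:3]    # code points at index 0, 1, and 2
last_four = website[-4:]     # negative indexes count from the end
every_other = website[::2]   # a step of 2 skips every second code point

print(first_three, last_four, every_other)

# slicing copies; the original string is left untouched
print(website)
```

Each slice is a new `str` object, which follows from strings being immutable.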
| 18 | +Strings support all [common sequence operations][common sequence operations]. Individual code points can be iterated through in a loop via **`for item in`**. |
| 19 | + |
| 20 | +```python |
| 21 | + |
| 22 | +>>> exercise = 'လေ့ကျင့်' |
| 23 | + |
| 24 | +# note that there are more code points than perceived glyphs or characters |
| 25 | +>>> for code_point in exercise: |
| 26 | +... print(code_point) |
| 27 | +... |
| 28 | +လ |
| 29 | +ေ |
| 30 | +့ |
| 31 | +က |
| 32 | +ျ |
| 33 | +င |
| 34 | +် |
| 35 | +့ |
| 36 | +``` |
| 37 | + |
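The gap between code points and perceived characters can also be seen with a smaller example. This sketch assumes a decomposed "é" (the letter `e` followed by U+0301, a combining acute accent) and uses the standard library `unicodedata` module; the variable names are illustrative.

```python
import unicodedata

# 'e' followed by a combining acute accent: two code points, one visible character
decomposed = "e\u0301"

# NFC normalization composes the pair into the single code point U+00E9
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed))   # number of code points before normalization
print(len(composed))     # number of code points after normalization

# the official Unicode name of each code point in the decomposed form
print([unicodedata.name(code_point) for code_point in decomposed])
```

Because `len()` and iteration both work on code points, two strings that look identical on screen can still differ in length.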
| 38 | +For a deep dive on what information a string encodes (or, _"how does the computer know how to translate zeroes and ones into letters?"_), [this blog post is enduringly helpful][joel-on-text]. Additionally, the docs provide a [Unicode HOWTO][unicode how-to] that discusses Python's support for the Unicode specification in the `str` and `bytes` types and the `re` module, along with some common issues. |
6 | 39 |
|
7 |
| -Strings can be transformed by [various methods](https://docs.python.org/3/library/stdtypes.html#string-methods), split into letters/symbols, and joined together via [`.join()`](https://docs.python.org/3/library/stdtypes.html#str.join) or `+` to create larger strings . Due to their immutability, any transformations applied to a `str` return a new `str`. |
| 40 | +Strings can be transformed by [various methods][str-methods], split into a `list` of substrings via [`.split()`][str-split], or joined together into larger strings via [`.join()`][str-join] or `+`. Due to their _immutability_, any transformations applied to strings return new `str` objects. |
8 | 41 |
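As a rough sketch of these operations (the variable names are illustrative and not part of the original text):

```python
phrase = "green eggs and ham"

shouted = phrase.upper()          # returns a new str; phrase itself is unchanged
words = phrase.split()            # with no argument, splits on runs of whitespace
code_points = list(phrase[:5])    # list() yields one length-1 str per code point
rejoined = "-".join(words)        # joins the list items with "-" between them
longer = phrase + "!"             # + builds a new, larger str

try:
    phrase[0] = "G"               # strings do not support item assignment
except TypeError as error:
    print(error)

print(phrase)                     # still the original text
```

Note that `.split()` breaks a string on a separator; passing the string to `list()` is one way to get its individual code points.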
|
9 | 42 | ### Construction
|
10 | 43 |
|
11 | 44 | The simplest way to create a `str` literal is by delimiting it with `"` or `'`. Strings can also be written across multiple lines by using triple quotes (`"""` or `'''`).
|
12 | 45 |
|
13 |
| -````python |
14 |
| -single_quoted = 'Single quotes allow "double quoting" without "escape" characters.' |
| 46 | +```python |
| 47 | + |
| 48 | +>>> single_quoted = 'Single quotes allow "double quoting" without "escape" characters.' |
15 | 49 |
|
16 |
| -double_quoted = "Double quotes allow embedded 'single quoting' without 'escape' characters". |
| 50 | +>>> double_quoted = "Double quotes allow embedded 'single quoting' without 'escape' characters." |
17 | 51 |
|
18 |
| -triple_quoted = '''Three single quotes or double quotes in a row allow for multi-line string literals. You will most often encounter these as "doc strings" or "doc tests" written just below the first line of a function or class definition. They are often used with auto documentation tools.''' |
| 52 | +>>> triple_quoted = '''Three single quotes or double quotes in a row allow for multi-line string literals. You will most often encounter these as "doc strings" or "doc tests" written just below the first line of a function or class definition. They are often used with auto documentation tools.''' |
19 | 53 | # String literals that are part of a single expression and are separated only by white space are implicitly concatenated into a single string literal:
|
20 | 54 |
|
21 |
| -```python |
22 |
| -("I do not " |
23 |
| -"like " |
24 |
| -"green eggs and ham.") == "I do not like green eggs and ham."``` |
25 | 55 |
|
| 56 | +# if you put separate strings within parentheses, they will be *implicitly concatenated* by the interpreter |
| 57 | +>>> ("I do not " |
| 58 | +... "like " |
| 59 | +... "green eggs and ham.") == "I do not like green eggs and ham." |
| 60 | +True |
| 61 | +``` |
26 | 62 |
|
27 |
| -Additionally, [interpolated](https://en.wikipedia.org/wiki/String_interpolation) strings (`f-strings`) can be formed: |
| 63 | +Additionally, [interpolated][f-strings] strings (`f-strings`) can be formed: |
28 | 64 |
|
29 | 65 | ```python
|
30 | 66 | >>> my_name = "Praveen"
|
31 | 67 |
|
32 | 68 | >>> intro_string = f"Hi! My name is {my_name}, and I'm happy to be here!"
|
33 | 69 |
|
34 | 70 | >>> print(intro_string)
|
35 |
| -Hi! My name is Praveen, and I'm happy to be here!``` |
| 71 | +Hi! My name is Praveen, and I'm happy to be here! |
| 72 | +``` |
36 | 73 |
|
37 |
| -Finally, the [`str()` constructor](https://docs.python.org/3/library/stdtypes.html#str) can be used to create/coerce strings from other objects/types. However, the `str` constructor _**will not iterate**_ through an object , so if something like a `list` of elements needs to be connected, `.join()` is a better option: |
| 74 | +Finally, the [`str()` constructor][str-constructor] can be used to create or coerce strings from other objects and types. However, the `str` constructor _**will not iterate**_ through an object, so if something like a `list` of individual elements needs to be connected, `.join()` is a better option: |
38 | 75 |
|
39 | 76 | ```python
|
40 | 77 | >>> my_number = 675
|
41 | 78 | >>> str(my_number)
|
42 | 79 | '675'
|
43 | 80 |
|
| 81 | +# this is a bit surprising, as it turns the entire data structure, complete with the brackets, into a str |
44 | 82 | >>> my_list = ["hen", "egg", "rooster"]
|
45 | 83 | >>> str(my_list)
|
46 | 84 | "['hen', 'egg', 'rooster']"
|
47 | 85 |
|
48 |
| ->>> ' '.join(my_list) |
49 |
| -'hen egg rooster'``` |
50 | 86 |
|
| 87 | +# however, using .join() will iterate and form a string from the individual elements |
| 88 | +>>> ' '.join(my_list) |
| 89 | +'hen egg rooster' |
| 90 | +``` |
51 | 91 |
|
52 | 92 | ### Formatting
|
53 | 93 |
|
54 |
| -Python provides a rich set of tools for [formatting](https://docs.python.org/3/library/string.html#custom-string-formatting) and [templating](https://docs.python.org/3/library/string.html#template-strings) strings, as well as more sophisticated text processing through the [re (_regular expressions_)](https://docs.python.org/3/library/re.html), [difflib (_sequence comparison_)](https://docs.python.org/3/library/difflib.html), and [textwrap](https://docs.python.org/3/library/textwrap.html) modules. For a great introduction to string formatting in Python, see [this post at Real Python](https://realpython.com/python-string-formatting/). |
55 |
| - |
56 |
| -For more details on string methods, see [Strings and Character Data in Python](https://realpython.com/python-strings/) at the same site. |
57 |
| - |
58 |
| -### Related types and encodings |
59 |
| - |
60 |
| -In addition to `str` (a *text* sequence), Python has corresponding [binary sequence types](https://docs.python.org/3/library/stdtypes.html#binaryseq) `bytes` (a *binary* sequence), `bytearray` and `memoryview` for the efficient storage and handling of binary data. Additionally, [Streams](https://docs.python.org/3/library/asyncio-stream.html#streams) allow sending and receiving binary data over a network connection without using callbacks. |
61 |
| -```` |
| 94 | +Python provides a rich set of tools for [formatting][str-formatting] and [templating][template-strings] strings, as well as more sophisticated text processing through the [re (_regular expressions_)][re], [difflib (_sequence comparison_)][difflib], and [textwrap][textwrap] modules. For a great introduction to string formatting in Python, see [this post at Real Python][real python string formatting]. For more details on string methods, see [Strings and Character Data in Python][strings and characters] at the same site. |
| 95 | + |
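A brief sketch of three of these tools follows: `str.format()`, `string.Template`, and `textwrap.fill()`. The sample values and variable names are assumptions made for illustration only.

```python
from string import Template
import textwrap

name, score = "Praveen", 97.456

# str.format() fills replacement fields; :.1f rounds to one decimal place
formatted = "{} scored {:.1f} points".format(name, score)

# Template uses $-style placeholders and does no numeric formatting of its own
templated = Template("$name scored $score points").substitute(name=name, score=score)

# textwrap.fill() breaks text into lines no longer than the given width
wrapped = textwrap.fill("I do not like green eggs and ham.", width=16)

print(formatted)
print(templated)
print(wrapped)
```

`Template` is the more restrictive of the two formatting options, which can make it a reasonable choice when the template text comes from users.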
| 96 | +### Related types and encodings |
| 97 | + |
| 98 | +In addition to `str` (a _text_ sequence), Python has corresponding [binary sequence types][binary sequence types] summarized under [binary data services][binary data services] -- `bytes` (a _binary_ sequence), `bytearray`, and `memoryview` for the efficient storage and handling of binary data. Additionally, [Streams][streams] allow sending and receiving binary data over a network connection without using callbacks. |
| 99 | + |
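The relationship between the text and binary types can be sketched as below. The sample text and the choice of UTF-8 are assumptions made for illustration; other encodings produce different bytes.

```python
text = "caf\u00e9"                   # "café", written with an escape for the é

encoded = text.encode("utf-8")       # str -> bytes
decoded = encoded.decode("utf-8")    # bytes -> str

print(len(text))                     # counts code points
print(len(encoded))                  # counts bytes; é needs two bytes in UTF-8

# bytearray is the mutable counterpart to bytes
buffer = bytearray(encoded)
buffer[0:1] = b"C"
print(bytes(buffer))

# memoryview exposes the underlying bytes without copying them
view = memoryview(encoded)
print(view[0])                       # integer value of the first byte
```

Indexing a `bytes` object or a `memoryview` over one yields an `int`, unlike indexing a `str`, which yields another `str`.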
| 100 | +[text sequence]: https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str |
| 101 | +[unicode code points]: https://stackoverflow.com/questions/27331819/whats-the-difference-between-a-character-a-code-point-a-glyph-and-a-grapheme |
| 102 | +[sequence type]: https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range |
| 103 | +[common sequence operations]: https://docs.python.org/3/library/stdtypes.html#common-sequence-operations |
| 104 | +[joel-on-text]: https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/ |
| 105 | +[str-methods]: https://docs.python.org/3/library/stdtypes.html#string-methods |
| 106 | +[str-join]: https://docs.python.org/3/library/stdtypes.html#str.join |
| 107 | +[f-strings]: https://en.wikipedia.org/wiki/String_interpolation |
| 108 | +[str-constructor]: https://docs.python.org/3/library/stdtypes.html#str |
| 109 | +[str-formatting]: https://docs.python.org/3/library/string.html#custom-string-formatting |
| 110 | +[template-strings]: https://docs.python.org/3/library/string.html#template-strings |
| 111 | +[re]: https://docs.python.org/3/library/re.html |
| 112 | +[difflib]: https://docs.python.org/3/library/difflib.html |
| 113 | +[textwrap]: https://docs.python.org/3/library/textwrap.html |
| 114 | +[real python string formatting]: https://realpython.com/python-string-formatting/ |
| 115 | +[strings and characters]: https://realpython.com/python-strings/ |
| 116 | +[binary sequence types]: https://docs.python.org/3/library/stdtypes.html#binaryseq |
| 117 | +[streams]: https://docs.python.org/3/library/asyncio-stream.html#streams |
| 118 | +[unicode how-to]: https://docs.python.org/3/howto/unicode.html |
| 119 | +[str-split]: https://docs.python.org/3/library/stdtypes.html#str.split |
| 120 | +[binary data services]: https://docs.python.org/3/library/binary.html#binaryservices |