|
1 | 1 | defmodule String do
|
2 | 2 | @moduledoc %B"""
|
3 |
| - A string in Elixir is a UTF-8 encoded binary. |
| 3 | + A String in Elixir is a UTF-8 encoded binary. |
| 4 | +
|
| 5 | + ## String and binary operations |
4 | 6 |
|
5 | 7 | The functions in this module act according to the
|
6 |
| - Unicode Standard, version 6.2.0. A codepoint is a |
7 |
| - Unicode Character, which may be represented by one |
8 |
| - or more bytes. For example, the character "é" is |
9 |
| - represented with two bytes: |
| 8 | + Unicode Standard, version 6.2.0. For example, |
| 9 | + `titlecase`, `downcase`, and `strip` are provided by this |
| 10 | + module. |
| 11 | +
|
| 12 | + Besides this module, Elixir provides more low-level |
| 13 | + operations that work directly with binaries. Some |
| 14 | + of those can be found in the `Kernel` module, as: |
| 15 | +
|
| 16 | + * `binary_part/2` and `binary_part/3` - retrieves part of the binary |
| 17 | + * `bit_size/1` and `byte_size/1` - size related functions |
| 18 | + * `is_bitstring/1` and `is_binary/1` - type checking functions |
| 19 | + * Plus a number of conversion functions, like `binary_to_atom/2`, |
| 20 | + `binary_to_integer/2`, `binary_to_term/1` and their opposites, |
| 21 | + like `integer_to_binary/2` |
| 22 | +
|
| 23 | + Finally, [the `:binary` module](http://erlang.org/doc/man/binary.html) |
| 24 | + provides a few other functions that work at the byte level. |
| 25 | +
|
| 26 | + ## Codepoints and graphemes |
| 27 | +
|
| 28 | + As per the Unicode Standard, a codepoint is a Unicode |
| 29 | + Character, which may be represented by one or more bytes. |
| 30 | + For example, the character "é" is represented with two |
| 31 | + bytes: |
10 | 32 |
|
11 | 33 | string = "é"
|
12 | 34 | #=> "é"
|
@@ -40,7 +62,7 @@ defmodule String do
|
40 | 62 | ## Integer codepoints
|
41 | 63 |
|
42 | 64 | Although codepoints could be represented as integers, this
|
43 |
| - module represents all codepoints as binaries. For example: |
| 65 | + module represents all codepoints as strings. For example: |
44 | 66 |
|
45 | 67 | String.codepoints "josé" #=> ["j", "o", "s", "é"]
|
46 | 68 |
|
@@ -72,8 +94,10 @@ defmodule String do
|
72 | 94 | characters. For example, `String.length` is going to return
|
73 | 95 | a correct result even if an invalid codepoint is fed into it.
|
74 | 96 |
|
75 |
| - In the future, bang version of such functions may be |
76 |
| - provided which will rather raise on such invalid data. |
| 97 | + In other words, this module expects invalid data to be detected |
| 98 | + when retrieving data from an external source. For example, a |
| 99 | + driver that reads strings from a database will be the one |
| 100 | + responsible for checking the validity of the encoding. |
77 | 101 | """
|
78 | 102 |
|
79 | 103 | @type t :: binary
|
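The distinction the updated moduledoc draws between byte-level and character-level operations can be illustrated with a minimal sketch. It assumes a recent Elixir release and that `"é"` is stored in its precomposed form (U+00E9). `String.valid?/1` is not mentioned in this diff; it appears here only as one possible way a caller could check encoding at the boundary, not as the approach the commit prescribes.

```elixir
# Assumption: "é" is the precomposed codepoint U+00E9, encoded in UTF-8.
string = "é"

# Byte-level view (Kernel and :binary work on raw bytes).
IO.inspect(byte_size(string))            # 2
IO.inspect(:binary.bin_to_list(string))  # [195, 169]

# Character-level view (String works on codepoints and graphemes).
IO.inspect(String.length(string))        # 1
IO.inspect(String.codepoints("josé"))    # ["j", "o", "s", "é"]

# Validation is left to the code that receives external data.
# <<0xFF>> is a binary but not valid UTF-8.
IO.inspect(String.valid?(string))        # true
IO.inspect(String.valid?(<<0xFF>>))      # false
```

The same binary reports different sizes depending on whether it is measured in bytes or in characters, which is why the moduledoc lists the `Kernel` and `:binary` functions separately from those in `String`.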