| 1 | +In Python, a [list][list] is a mutable collection of items in _sequence_. Like most collections (_see the built-ins [`tuple`][tuple], [`dict`][dict] and [`set`][set]_), lists can hold references to any (or multiple) data type(s), including other lists. As in any [sequence][sequence type], items are referenced by 0-based index number, and can be copied in whole or in part via _slice notation_. Lists support all [common sequence operations][common sequence operations], as well as [mutable sequence operations][mutable sequence operations] like `.append()` and `.reverse()`. They can be iterated over in a loop by using the `for item in <list>` construct. |
| 2 | + |
| 3 | +Under the hood, lists are implemented as [dynamic arrays][dynamic array] -- similar to Java's [`ArrayList`][arraylist] type. Lists are most often used to store groups of similar data (_strings, numbers, sets etc._) of unknown length (_the number of entries may arbitrarily expand or shrink_). Accessing items by index and appending items to the "right-hand" side of a list are both very efficient. Checking for membership via `in` scans the list item by item, so it slows down as the list grows. Appending to the "left-hand" side or inserting into the middle of a list is much _less_ efficient because it requires shifting items to keep them in sequence. |
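
The cost difference between appending on the right and inserting on the left can be observed with a rough timing sketch. The item count and repeat count below are arbitrary values chosen for illustration, and absolute timings will vary by machine and Python version:

```python
from timeit import timeit

ITEMS = 2_000  # arbitrary size, chosen only for illustration


def append_right():
    result = []
    for number in range(ITEMS):
        result.append(number)      # adds to the end, no shifting needed
    return result


def insert_left():
    result = []
    for number in range(ITEMS):
        result.insert(0, number)   # every existing item shifts one slot right
    return result


append_seconds = timeit(append_right, number=20)
insert_seconds = timeit(insert_left, number=20)

print(f"append to the right: {append_seconds:.4f}s")
print(f"insert on the left:  {insert_seconds:.4f}s")
```

Both functions build a list of the same items (in opposite order); only the amount of shifting differs.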
| 4 | + |
| 5 | +Because lists are mutable and can contain references to arbitrary objects, they take up more memory space than a typed `array.array` holding the same number of values. Despite this, lists are an extremely flexible and useful data structure and many built-in methods and operations in Python produce lists as their output. |
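
As a rough illustration of the memory difference, the sketch below compares a list of integers with an `array.array` of the same values using `sys.getsizeof()`. The type code `"i"` and the count of 1000 are arbitrary choices, and the totals are only estimates (small integers are shared between objects in CPython, so the list figure is an upper bound):

```python
import sys
from array import array

numbers = list(range(1000))
packed = array("i", numbers)   # "i" stores each value as a signed C int

# getsizeof() on a list counts only the list's own reference slots,
# so the int objects it points to are added separately.
list_bytes = sys.getsizeof(numbers) + sum(sys.getsizeof(n) for n in numbers)
array_bytes = sys.getsizeof(packed)

print(f"list (with its int objects): ~{list_bytes} bytes")
print(f"array.array:                 ~{array_bytes} bytes")
```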
| 6 | + |
| 7 | +## Construction |
| 8 | + |
| 9 | +A list can be declared as a _literal_ with square `[]` brackets and commas between elements: |
| 10 | + |
| 11 | +```python |
| 12 | +>>> no_elements = [] |
| 13 | +[] |
| 14 | + |
| 15 | +>>> one_element = ["Guava"] |
| 16 | +['Guava'] |
| 17 | + |
| 18 | +>>> elements_separated_with_commas = ["Parrot", "Bird", 334782] |
| 19 | +['Parrot', 'Bird', 334782] |
| 20 | +``` |
| 21 | + |
| 22 | +For readability, line breaks can be used when there are many elements or nested data structures within a list: |
| 23 | + |
| 24 | +```python |
| 25 | +>>> lots_of_entries = [ |
| 26 | + "Rose", |
| 27 | + "Sunflower", |
| 28 | + "Poppy", |
| 29 | + "Pansy", |
| 30 | + "Tulip", |
| 31 | + "Fuchsia", |
| 32 | + "Cyclamen", |
| 33 | + "Lavender", |
| 34 | + "Daisy", |
| 35 | + "Jasmine", |
| 36 | + "Hydrangea", |
| 37 | + "Hyacinth", |
| 38 | + "Peony", |
| 39 | + "Dahlia", |
| 40 | + "Dandelion", |
| 41 | + "Tuberose", |
| 42 | + "Ranunculus" |
| 43 | + ] |
| 44 | +['Rose', 'Sunflower', 'Poppy', 'Pansy', 'Tulip', 'Fuchsia', 'Cyclamen', 'Lavender', 'Daisy', 'Jasmine', 'Hydrangea', 'Hyacinth', 'Peony', 'Dahlia', 'Dandelion', 'Tuberose', 'Ranunculus'] |
| 45 | + |
| 46 | +>>> nested_data_structures = [ |
| 47 | + {"fish": "gold", "monkey": "brown", "parrot" : "grey"}, |
| 48 | + ("fish", "mammal", "bird"), |
| 49 | + ['water', 'jungle', 'sky'] |
| 50 | + ] |
| 51 | +[{'fish': 'gold', 'monkey': 'brown', 'parrot': 'grey'}, ('fish', 'mammal', 'bird'), ['water', 'jungle', 'sky']] |
| 52 | +``` |
| 53 | + |
| 54 | +The `list()` constructor can be used empty or with an _iterable_ as an argument. Elements in the iterable are cycled through by the constructor and added to the list in order: |
| 55 | + |
| 56 | +```python |
| 57 | +>>> no_elements = list() |
| 58 | +[] |
| 59 | + |
| 60 | +# the tuple is unpacked and each element is added |
| 61 | +>>> multiple_elements_tuple = list(("Parrot", "Bird", 334782)) |
| 62 | +['Parrot', 'Bird', 334782] |
| 63 | + |
| 64 | +# the set is unpacked and each element is added (sets are unordered, so the resulting order is not guaranteed) |
| 65 | +>>> multiple_elements_set = list({2, 3, 5, 7, 11}) |
| 66 | +[2, 3, 5, 7, 11] |
| 67 | +``` |
| 68 | + |
| 69 | +Results when using a list constructor with a string or a dict may be surprising: |
| 70 | + |
| 71 | +```python |
| 72 | + |
| 73 | +# string elements (Unicode code points) are iterated through and added *individually* |
| 74 | +>>> multiple_elements_string = list("Timbuktu") |
| 75 | +['T', 'i', 'm', 'b', 'u', 'k', 't', 'u'] |
| 76 | + |
| 77 | + |
| 78 | +>>> multiple_code_points_string = list('अभ्यास') |
| 79 | +['अ', 'भ', '्', 'य', 'ा', 'स'] |
| 80 | + |
| 81 | +# the iteration default for dictionaries is over the keys, |
| 82 | +# so only the keys end up in the list |
| 83 | +>>> source_data = {"fish": "gold", "monkey": "brown"} |
| 84 | +>>> multiple_elements_dict_1 = list(source_data) |
| 85 | +['fish', 'monkey'] |
| 86 | +``` |
| 87 | + |
| 88 | +Because the constructor will only take _iterables_ (or nothing) as arguments, objects that are _not_ iterable will raise a `TypeError`. Consequently, it is much easier to create a one-item list via the literal method. |
| 89 | + |
| 90 | +```python |
| 91 | + |
| 92 | +>>> one_element = list(16) |
| 93 | +Traceback (most recent call last): |
| 94 | +  File "<stdin>", line 1, in <module> |
| 95 | +TypeError: 'int' object is not iterable |
| 96 | + |
| 97 | +>>> one_element_from_iterable = list((16,)) |
| 98 | +[16] |
| 99 | +``` |
| 100 | + |
| 101 | +## Accessing elements |
| 102 | + |
| 103 | +Items inside lists (_like items in the sequence types `str` and `tuple`_) can be accessed via 0-based index and _bracket notation_. Indexes can run from **`left`** --> **`right`** (_starting at zero_) or **`right`** --> **`left`** (_starting at -1_). |
| 104 | + |
| 105 | +| **0** | **1** | **2** | **3** | **4** | **5** |\ |
| 106 | + -------------------------\ |
| 107 | + | P | y | t | h | o | n |\ |
| 108 | + -------------------------\ |
| 109 | + |_**-6**_ |_**-5**_ |_**-4**_ |_**-3**_ |_**-2**_ |_**-1**_ | <---- |
| 110 | + |
| 111 | +```python |
| 112 | + |
| 113 | +>>> breakfast_foods = ["Oatmeal", "Fruit Salad", "Eggs", "Toast"] |
| 114 | + |
| 115 | +#Oatmeal is at index 0 or index -4 |
| 116 | +>>> first_breakfast_food = breakfast_foods[0] |
| 117 | +'Oatmeal' |
| 118 | + |
| 119 | +>>> first_breakfast_food = breakfast_foods[-4] |
| 120 | +'Oatmeal' |
| 121 | +``` |
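
Slice notation, mentioned in the introduction as a way to copy a list in whole or in part, uses the same bracket syntax with `start:stop:step` values. A minimal sketch, reusing the breakfast list from the example above (the other variable names are illustrative):

```python
breakfast_foods = ["Oatmeal", "Fruit Salad", "Eggs", "Toast"]

first_two = breakfast_foods[:2]       # ['Oatmeal', 'Fruit Salad']
last_two = breakfast_foods[-2:]       # ['Eggs', 'Toast']
every_other = breakfast_foods[::2]    # ['Oatmeal', 'Eggs']
full_copy = breakfast_foods[:]        # a new list holding the same items

full_copy.append("Coffee")            # changes the copy, not the original
print(breakfast_foods)                # ['Oatmeal', 'Fruit Salad', 'Eggs', 'Toast']
```

Each slice produces a new list, so changes made to the slice do not affect the list it was taken from.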
| 122 | + |
| 123 | +The method `.pop()` can be used to both remove and return a value at a given index: |
| 124 | + |
| 125 | +```python |
| 126 | + |
| 127 | +>>> breakfast_foods = ["Oatmeal", "Fruit Salad", "Eggs", "Toast"] |
| 128 | + |
| 129 | +# Fruit Salad is at index 1 or index -3 |
| 130 | +# .pop() removes the item from the list and returns it |
| 131 | +>>> fruit_on_the_side = breakfast_foods.pop(-3) |
| 132 | +'Fruit Salad' |
| 133 | + |
| 134 | +>>> print(breakfast_foods) |
| 135 | +['Oatmeal', 'Eggs', 'Toast'] |
| 136 | + |
| 137 | +``` |
| 138 | + |
| 139 | +The method `.insert()` can be used to add an element at a specific position. The index given is that of the element _before which to insert_. `list.insert(0, element)` will insert at the front of the list and `list.insert(len(list), element)` is the equivalent of calling `list.append(element)`. |
| 140 | + |
| 141 | +```python |
| 142 | + |
| 143 | +>>> breakfast_foods = ["Oatmeal", "Fruit Salad", "Eggs", "Toast"] |
| 144 | + |
| 145 | +#adding bacon to the mix before index 3 or index -1 |
| 146 | +>>> breakfast_foods.insert(3,"Bacon") |
| 147 | +>>> print(breakfast_foods) |
| 148 | +['Oatmeal', 'Fruit Salad', 'Eggs', 'Bacon', 'Toast'] |
| 149 | + |
| 150 | + |
| 151 | +#adding coffee in the first position |
| 152 | +>>> breakfast_foods.insert(0, "Coffee") |
| 153 | +>>> print(breakfast_foods) |
| 154 | +['Coffee', 'Oatmeal', 'Fruit Salad', 'Eggs', 'Bacon', 'Toast'] |
| 155 | +``` |
| 156 | + |
| 157 | +## Working with lists |
| 158 | + |
| 159 | +Lists supply an _iterator_, and can be looped through/over in the same manner as other _sequence types_: |
| 160 | + |
| 161 | +```python |
| 162 | + |
| 163 | +>>> colors = ["Orange", "Green", "Grey", "Blue"] |
| 164 | +>>> for item in colors: |
| 165 | +... print(item) |
| 166 | +... |
| 167 | +Orange |
| 168 | +Green |
| 169 | +Grey |
| 170 | +Blue |
| 171 | + |
| 172 | +>>> numbers_to_cube = [5, 13, 12, 16] |
| 173 | +>>> for number in numbers_to_cube: |
| 174 | +...     print(number**3) |
| 175 | +... |
| 176 | +125 |
| 177 | +2197 |
| 178 | +1728 |
| 179 | +4096 |
| 180 | + |
| 181 | +``` |
| 182 | + |
| 183 | +One common way to compose a list of values is to use `list.append()` with a loop: |
| 184 | + |
| 185 | +```python |
| 186 | + |
| 187 | +>>> cubes_to_1000 = [] |
| 188 | + |
| 189 | +>>> for number in range(11): |
| 190 | +... cubes_to_1000.append(number**3) |
| 191 | + |
| 192 | +>>> print(cubes_to_1000) |
| 193 | +[0, 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000] |
| 194 | +``` |
| 195 | + |
| 196 | +Lists can be combined via various techniques: |
| 197 | + |
| 198 | +```python |
| 199 | +# using the plus + operator unpacks each list and creates a new list, but it is not efficient. |
| 200 | +>>> new_via_concatenate = ["George", 5] + ["cat", "Tabby"] |
| 201 | +['George', 5, 'cat', 'Tabby'] |
| 202 | + |
| 203 | + |
| 204 | +#likewise, using the multiplication operator * is the equivalent of using + n times |
| 205 | +>>> first_group = ["cat", "dog", "elephant"] |
| 206 | + |
| 207 | +>>> multiplied_group = first_group * 3 |
| 208 | +['cat', 'dog', 'elephant', 'cat', 'dog', 'elephant', 'cat', 'dog', 'elephant'] |
| 209 | + |
| 210 | + |
| 211 | +# to avoid creating a new list, combine 2 lists by slice assignment or by appending in a loop (both mutate one of the originals) |
| 212 | +>>> first_one = ["cat", "Tabby"] |
| 213 | +>>> second_one = ["George", 5] |
| 214 | + |
| 215 | +# this inserts the items of the second list at index 0 of the first list |
| 216 | +>>> first_one[0:0] = second_one |
| 217 | +>>> first_one |
| 218 | +['George', 5, 'cat', 'Tabby'] |
| 219 | + |
| 220 | +# this resets the first list, then loops through it and appends its items to the end of the second list |
| 221 | +>>> first_one = ["cat", "Tabby"] |
| 222 | +>>> for item in first_one: |
| 223 | +...     second_one.append(item) |
| 224 | +... |
| 225 | +>>> print(second_one) |
| 226 | +['George', 5, 'cat', 'Tabby'] |
| 227 | +``` |
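
The append-in-a-loop pattern above can also be written with the built-in `list.extend()` method or the `+=` operator, both of which add the items of another iterable to an existing list in place. These two are not covered in the text above; the sketch below reuses the example data, and `"grey"` is an illustrative extra value:

```python
first_one = ["cat", "Tabby"]
second_one = ["George", 5]

# extend() appends every item of first_one to the end of second_one
second_one.extend(first_one)
print(second_one)    # ['George', 5, 'cat', 'Tabby']

# += on a list behaves like extend(): the list is modified in place
first_one += ["grey"]
print(first_one)     # ['cat', 'Tabby', 'grey']
```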
| 228 | + |
| 229 | +## Related data types |
| 230 | + |
| 231 | +Lists are often used as _stacks_ and _queues_ -- although their underlying implementation makes prepending and inserting slow. The [collections][collections] module offers a [deque][deque] variant, implemented as a [doubly linked list][doubly linked list], that is optimized for fast appends and pops from either end. Nested lists are also used to model small _matrices_ -- although the [NumPy][numpy] and [pandas][pandas] libraries are much more robust for efficient matrix and tabular data manipulation. The collections module also provides a `UserList` type that can be customized to fit specialized list needs. |
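
A minimal sketch of using `collections.deque` as a queue, with items added and removed at both ends. The variable names and values are illustrative only:

```python
from collections import deque

queue = deque(["Oatmeal", "Eggs"])

queue.append("Toast")          # add on the right, like list.append()
queue.appendleft("Coffee")     # add on the left without shifting items

first = queue.popleft()        # 'Coffee'
last = queue.pop()             # 'Toast'

print(list(queue))             # ['Oatmeal', 'Eggs']
```

With a plain list, the left-hand operations would be `insert(0, item)` and `pop(0)`, each of which shifts every remaining item.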
| 232 | + |
| 233 | +[list]: https://docs.python.org/3/library/stdtypes.html#list |
| 234 | +[tuple]: https://docs.python.org/3/library/stdtypes.html#tuple |
| 235 | +[dict]: https://docs.python.org/3/library/stdtypes.html#dict |
| 236 | +[set]: https://docs.python.org/3/library/stdtypes.html#set |
| 237 | +[sequence type]: https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range |
| 238 | +[common sequence operations]: https://docs.python.org/3/library/stdtypes.html#common-sequence-operations |
| 239 | +[mutable sequence operations]: https://docs.python.org/3/library/stdtypes.html#typesseq-mutable |
| 240 | +[dynamic array]: https://en.wikipedia.org/wiki/Dynamic_array |
| 241 | +[arraylist]: https://beginnersbook.com/2013/12/java-arraylist/ |
| 242 | +[doubly linked list]: https://en.wikipedia.org/wiki/Doubly_linked_list |
| 243 | +[collections]: https://docs.python.org/3/library/collections.html |
| 244 | +[deque]: https://docs.python.org/3/library/collections.html#collections.deque |
| 245 | +[numpy]: https://numpy.org/ |
| 246 | +[pandas]: https://pandas.pydata.org/ |