| 1 | +.. _basics.copies-and-views: |
| 2 | + |
| 3 | +**************** |
| 4 | +Copies and views |
| 5 | +**************** |
| 6 | + |
| 7 | +When operating on NumPy arrays, it is possible to access the internal data |
| 8 | +buffer directly using a :ref:`view <view>` without copying data around. This |
| 9 | +ensures good performance but can also cause unwanted problems if the user is |
| 10 | +not aware of how this works. Hence, it is important to know the difference |
| 11 | +between views and copies, and to know which operations return copies and |
| 12 | +which return views. |
| 13 | + |
| 14 | +The NumPy array is a data structure consisting of two parts: |
| 15 | +the :term:`contiguous` data buffer with the actual data elements and the |
| 16 | +metadata that contains information about the data buffer. The metadata |
| 17 | +includes data type, strides, and other important information that helps |
| 18 | +manipulate the :class:`.ndarray` easily. See the :ref:`numpy-internals` |
| 19 | +section for a detailed look. |
| 20 | + |
| 21 | +.. _view: |
| 22 | + |
| 23 | +View |
| 24 | +==== |
| 25 | + |
| 26 | +It is possible to access the array differently by just changing certain |
| 27 | +metadata like :term:`stride` and :term:`dtype` without changing the |
| 28 | +data buffer. This creates a new way of looking at the data and these new |
| 29 | +arrays are called views. The data buffer remains the same, so any changes made |
| 30 | +to a view are reflected in the original array. A view can be forced through the |
| 31 | +:meth:`.ndarray.view` method. |
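
As a rough illustration of the paragraph above, the sketch below creates three views of one array: a plain view, a strided slice, and a dtype reinterpretation. The variable names and the choice of ``int32`` are assumptions made for this example only.

```python
import numpy as np

x = np.arange(6, dtype=np.int32)

v = x.view()                  # new array object, same data buffer
v[0] = 100                    # writing through the view also changes x

strided = x[::2]              # same buffer, different strides
as_bytes = x.view(np.uint8)   # same buffer, different dtype

print(x[0])                         # 100
print(np.shares_memory(x, v))       # True
print(x.strides, strided.strides)   # (4,) (8,)
print(as_bytes.shape)               # (24,)
```

In each case only the metadata differs; ``np.shares_memory`` confirms that no element data was duplicated.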
| 32 | + |
| 33 | +Copy |
| 34 | +==== |
| 35 | + |
| 36 | +When a new array is created by duplicating the data buffer as well as the |
| 37 | +metadata, it is called a copy. Changes made to the copy |
| 38 | +are not reflected in the original array. Making a copy is slower and uses |
| 39 | +more memory, but it is sometimes necessary. A copy can be forced by using |
| 40 | +:meth:`.ndarray.copy`. |
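
A minimal sketch of the behaviour described above, using illustrative variable names: after an explicit copy, the two arrays no longer share a buffer, so a write to one is invisible to the other.

```python
import numpy as np

x = np.arange(5)
c = x.copy()      # duplicates the data buffer as well as the metadata
c[0] = 99         # only the copy is modified

print(x[0])                     # 0
print(c[0])                     # 99
print(np.shares_memory(x, c))   # False
print(c.base is None)           # True
```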
| 41 | + |
| 42 | +Indexing operations |
| 43 | +=================== |
| 44 | + |
| 45 | +.. seealso:: :ref:`basics.indexing` |
| 46 | + |
| 47 | +Views are created when elements can be addressed with offsets and strides |
| 48 | +in the original array. Hence, basic indexing always creates views. |
| 49 | +For example:: |
| 50 | + |
| 51 | +    >>> x = np.arange(10) |
| 52 | +    >>> x |
| 53 | +    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) |
| 54 | +    >>> y = x[1:3]  # creates a view |
| 55 | +    >>> y |
| 56 | +    array([1, 2]) |
| 57 | +    >>> x[1:3] = [10, 11] |
| 58 | +    >>> x |
| 59 | +    array([ 0, 10, 11,  3,  4,  5,  6,  7,  8,  9]) |
| 60 | +    >>> y |
| 61 | +    array([10, 11]) |
| 62 | + |
| 63 | +Here, ``y`` changes when ``x`` is changed because ``y`` is a view of ``x``. |
| 64 | + |
| 65 | +:ref:`advanced-indexing`, on the other hand, always creates copies. |
| 66 | +For example:: |
| 67 | + |
| 68 | +    >>> x = np.arange(9).reshape(3, 3) |
| 69 | +    >>> x |
| 70 | +    array([[0, 1, 2], |
| 71 | +           [3, 4, 5], |
| 72 | +           [6, 7, 8]]) |
| 73 | +    >>> y = x[[1, 2]] |
| 74 | +    >>> y |
| 75 | +    array([[3, 4, 5], |
| 76 | +           [6, 7, 8]]) |
| 77 | +    >>> y.base is None |
| 78 | +    True |
| 79 | + |
| 80 | +Here, ``y`` is a copy, as signified by the :attr:`base <.ndarray.base>` |
| 81 | +attribute. We can also confirm this by assigning new values to ``x[[1, 2]]`` |
| 82 | +which in turn will not affect ``y`` at all:: |
| 83 | + |
| 84 | +    >>> x[[1, 2]] = [[10, 11, 12], [13, 14, 15]] |
| 85 | +    >>> x |
| 86 | +    array([[ 0,  1,  2], |
| 87 | +           [10, 11, 12], |
| 88 | +           [13, 14, 15]]) |
| 89 | +    >>> y |
| 90 | +    array([[3, 4, 5], |
| 91 | +           [6, 7, 8]]) |
| 92 | + |
| 93 | +Note that assigning to ``x[[1, 2]]`` creates neither a view nor a copy, |
| 94 | +because the assignment happens in place. |
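
The difference between assigning to an advanced index and reading from one can be sketched as follows. The second array ``w`` and the chained expression are hypothetical, added only to contrast the two cases.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)
x[[1, 2]] = 0        # assignment: writes directly into x's buffer

w = np.arange(9).reshape(3, 3)
w[[1, 2]][:] = 0     # w[[1, 2]] is evaluated first and yields a temporary
                     # copy; the zeros go into that copy, which is discarded

print(x.tolist())    # [[0, 1, 2], [0, 0, 0], [0, 0, 0]]
print(w.tolist())    # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```

The chained form leaves ``w`` untouched, which is a common source of confusion when advanced indexing appears on the left of a further indexing step.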
| 95 | + |
| 96 | + |
| 97 | +Other operations |
| 98 | +================ |
| 99 | + |
| 100 | +The :func:`numpy.reshape` function creates a view where possible or a copy |
| 101 | +otherwise. In most cases, the strides can be modified to reshape the |
| 102 | +array with a view. However, in some cases where the array becomes |
| 103 | +non-contiguous (perhaps after a :meth:`.ndarray.transpose` operation), |
| 104 | +the reshaping cannot be done by modifying strides and requires a copy. |
| 105 | +To get an error in these cases instead of a silent copy, assign the new |
| 106 | +shape to the ``shape`` attribute of the array. For example:: |
| 107 | + |
| 108 | +    >>> x = np.ones((2, 3)) |
| 109 | +    >>> y = x.T  # makes the array non-contiguous |
| 110 | +    >>> y |
| 111 | +    array([[1., 1.], |
| 112 | +           [1., 1.], |
| 113 | +           [1., 1.]]) |
| 114 | +    >>> z = y.view() |
| 115 | +    >>> z.shape = 6 |
| 116 | +    Traceback (most recent call last): |
| 117 | +       ... |
| 118 | +    AttributeError: Incompatible shape for in-place modification. Use |
| 119 | +    `.reshape()` to make a copy with the desired shape. |
| 120 | + |
| 121 | +As another example, :func:`.ravel` returns a contiguous flattened view of |
| 122 | +the array wherever possible, whereas :meth:`.ndarray.flatten` always |
| 123 | +returns a flattened copy of the array. When a view is wanted in as many |
| 124 | +cases as possible, ``x.reshape(-1)`` may be preferable. |
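
The following sketch checks these statements with ``np.shares_memory`` on a small C-contiguous array and its transpose. The array shape and the variable names are assumptions chosen for illustration.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)   # C-contiguous
t = x.T                          # transposed, no longer C-contiguous

ravel_contig = np.shares_memory(x, x.ravel())        # view is possible
flatten_contig = np.shares_memory(x, x.flatten())    # flatten always copies
reshape_contig = np.shares_memory(x, x.reshape(-1))  # view is possible
ravel_transposed = np.shares_memory(t, t.ravel())    # copy is required

print(ravel_contig)       # True
print(flatten_contig)     # False
print(reshape_contig)     # True
print(ravel_transposed)   # False
```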
| 125 | + |
| 126 | +How to tell if the array is a view or a copy |
| 127 | +============================================ |
| 128 | + |
| 129 | +The :attr:`base <.ndarray.base>` attribute of the ndarray makes it easy |
| 130 | +to tell if an array is a view or a copy. The base attribute of a view returns |
| 131 | +the original array while it returns ``None`` for a copy. |
| 132 | + |
| 133 | +    >>> x = np.arange(9) |
| 134 | +    >>> x |
| 135 | +    array([0, 1, 2, 3, 4, 5, 6, 7, 8]) |
| 136 | +    >>> y = x.reshape(3, 3) |
| 137 | +    >>> y |
| 138 | +    array([[0, 1, 2], |
| 139 | +           [3, 4, 5], |
| 140 | +           [6, 7, 8]]) |
| 141 | +    >>> y.base  # .reshape() creates a view |
| 142 | +    array([0, 1, 2, 3, 4, 5, 6, 7, 8]) |
| 143 | +    >>> z = y[[2, 1]] |
| 144 | +    >>> z |
| 145 | +    array([[6, 7, 8], |
| 146 | +           [3, 4, 5]]) |
| 147 | +    >>> z.base is None  # advanced indexing creates a copy |
| 148 | +    True |
| 149 | + |
| 150 | +Note that the ``base`` attribute should not be used to determine |
| 151 | +if an ndarray object is *new*; only if it is a view or a copy |
| 152 | +of another ndarray. |
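
As a complementary check, and only as a sketch with illustrative names, ``np.shares_memory`` and the ``owndata`` flag can be inspected alongside ``base`` to see whether two arrays use the same buffer.

```python
import numpy as np

x = np.arange(9)
y = x.reshape(3, 3)   # view of x
z = y[1:]             # view of a view
c = y.copy()          # independent copy

print(y.base is x)                # True
print(z.base is not None)         # True
print(c.base is None)             # True
print(np.shares_memory(z, x))     # True
print(np.shares_memory(c, x))     # False
print(x.flags.owndata, y.flags.owndata, c.flags.owndata)   # True False True
```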