Commit b513957

Merge pull request numpy#19791 from Mukulikaa/view-copy-doc
DOC: Created an explanation document for copies and views
2 parents 28173db + 2b4563a commit b513957

2 files changed: +153 -0 lines changed

doc/source/user/basics.copies.rst

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
.. _basics.copies-and-views:

****************
Copies and views
****************

When operating on NumPy arrays, it is possible to access the internal data
buffer directly using a :ref:`view <view>` without copying data around. This
ensures good performance but can also cause unwanted problems if the user is
not aware of how this works. Hence, it is important to know the difference
between views and copies and to know which operations return copies and
which return views.

The NumPy array is a data structure consisting of two parts:
the :term:`contiguous` data buffer with the actual data elements and the
metadata that contains information about the data buffer. The metadata
includes data type, strides, and other important information that helps
manipulate the :class:`.ndarray` easily. See the :ref:`numpy-internals`
section for a detailed look.

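As a rough illustration of these two parts, the following sketch inspects
some of the metadata of a small array and shows that a transpose reuses the
same buffer with different strides. The array contents and the explicit
``int64`` dtype are assumptions chosen only to make the stride values
predictable.

```python
import numpy as np

# Explicit dtype so that each element occupies 8 bytes on every platform
x = np.arange(12, dtype=np.int64).reshape(3, 4)

# Metadata that tells NumPy how to interpret the buffer
print(x.dtype)     # int64
print(x.shape)     # (3, 4)
print(x.strides)   # (32, 8): bytes to step to the next row, next column

# The transpose describes the same buffer with swapped shape and strides
t = x.T
print(t.shape)     # (4, 3)
print(t.strides)   # (8, 32)
print(np.shares_memory(x, t))  # True
```
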
.. _view:

View
====

An array can be accessed differently by changing only certain metadata,
such as :term:`stride` and :term:`dtype`, without changing the data buffer.
This creates a new way of looking at the data, and these new arrays are
called views. The data buffer remains the same, so any changes made to a
view are reflected in the original array. A view can be forced through the
:meth:`.ndarray.view` method.

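A minimal sketch of forcing a view with ``ndarray.view``, first with the
same dtype and then with a different one. The ``int32`` and ``int16`` dtypes
are assumptions picked for illustration; the reinterpreted values are not
printed because they depend on the byte order of the machine.

```python
import numpy as np

x = np.arange(4, dtype=np.int32)

# Same dtype: a new array object over the same buffer
v = x.view()
v[0] = 100
print(x[0])                    # 100: the change is visible through x
print(v.base is x)             # True

# Different dtype: the same 16 bytes read as eight 2-byte integers
h = x.view(np.int16)
print(h.shape)                 # (8,)
print(np.shares_memory(x, h))  # True
```
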
Copy
====

When a new array is created by duplicating the data buffer as well as the
metadata, it is called a copy. Changes made to the copy
are not reflected in the original array. Making a copy is slower and
uses more memory, but it is sometimes necessary. A copy can be forced by
using :meth:`.ndarray.copy`.

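For comparison, a short sketch of forcing a copy. It uses
``np.shares_memory``, which this section does not otherwise mention, only as
a convenient way to confirm that the two arrays have separate buffers.

```python
import numpy as np

x = np.arange(5)
c = x.copy()

# Writing to the copy leaves the original untouched
c[0] = 99
print(x)                       # [0 1 2 3 4]
print(c)                       # [99  1  2  3  4]

# The copy owns its own buffer
print(c.base is None)          # True
print(np.shares_memory(x, c))  # False
```
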
Indexing operations
===================

.. seealso:: :ref:`basics.indexing`

Views are created when elements can be addressed with offsets and strides
in the original array. Hence, basic indexing always creates views.
For example::

    >>> x = np.arange(10)
    >>> x
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    >>> y = x[1:3]  # creates a view
    >>> y
    array([1, 2])
    >>> x[1:3] = [10, 11]
    >>> x
    array([ 0, 10, 11,  3,  4,  5,  6,  7,  8,  9])
    >>> y
    array([10, 11])

Here, ``y`` gets changed when ``x`` is changed because it is a view.

:ref:`advanced-indexing`, on the other hand, always creates copies.
For example::

    >>> x = np.arange(9).reshape(3, 3)
    >>> x
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> y = x[[1, 2]]
    >>> y
    array([[3, 4, 5],
           [6, 7, 8]])
    >>> y.base is None
    True

Here, ``y`` is a copy, as signified by the :attr:`base <.ndarray.base>`
attribute. We can also confirm this by assigning new values to ``x[[1, 2]]``,
which does not affect ``y`` at all::

    >>> x[[1, 2]] = [[10, 11, 12], [13, 14, 15]]
    >>> x
    array([[ 0,  1,  2],
           [10, 11, 12],
           [13, 14, 15]])
    >>> y
    array([[3, 4, 5],
           [6, 7, 8]])

Note that during the assignment to ``x[[1, 2]]`` no view or copy is
created, because the assignment happens in place.


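The difference between reading and assigning through an index array can be
sketched as follows. The variable name ``rows`` and the values written are
illustrative only.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)

# Reading with an index array yields a copy; modifying it leaves x alone
rows = x[[1, 2]]
rows[:] = -1
print(x[1])   # [3 4 5]

# Assigning through the same index writes into x directly
x[[1, 2]] = 0
print(x[1])   # [0 0 0]

# Augmented assignment through an index array also writes into x
x[[0]] += 10
print(x[0])   # [10 11 12]
```
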
Other operations
================

The :func:`numpy.reshape` function creates a view where possible or a copy
otherwise. In most cases, the strides can be modified to reshape the
array with a view. However, in some cases where the array is no longer
C-contiguous (perhaps after a :meth:`.ndarray.transpose` operation),
the reshaping cannot be done by modifying strides and requires a copy.
To have NumPy raise an error in these cases instead of copying, assign the
new shape to the ``shape`` attribute of the array. For example::

    >>> x = np.ones((2, 3))
    >>> y = x.T  # the transposed array is no longer C-contiguous
    >>> y
    array([[1., 1.],
           [1., 1.],
           [1., 1.]])
    >>> z = y.view()
    >>> z.shape = 6
    Traceback (most recent call last):
       ...
    AttributeError: Incompatible shape for in-place modification. Use
    `.reshape()` to make a copy with the desired shape.

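The silent fallback to a copy can be observed directly. This is a sketch
under the assumption that ``np.shares_memory`` is an acceptable way to check
whether the result of ``reshape`` still refers to the original buffer.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
y = x.T                        # shape (3, 2), not C-contiguous

# Flattening y cannot be expressed with strides alone, so reshape copies
r = y.reshape(6)
print(r)                       # [0 3 1 4 2 5]
print(np.shares_memory(y, r))  # False

# Reshaping the C-contiguous x only needs new metadata, so it is a view
v = x.reshape(3, 2)
print(np.shares_memory(x, v))  # True
```
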
As another example, :func:`.ravel` returns a contiguous flattened view of
the array wherever possible. On the other hand,
:meth:`.ndarray.flatten` always returns a flattened copy of the array.
However, to obtain a view in as many cases as possible, ``x.reshape(-1)``
may be preferable.

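A small sketch comparing the three calls. The strided one-dimensional array
``s`` is an assumption chosen to show a case where ``ravel`` has to copy in
order to return contiguous data, while ``reshape(-1)`` can still return a
view.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# On a C-contiguous array, ravel and reshape(-1) return views
print(np.shares_memory(x, x.ravel()))       # True
print(np.shares_memory(x, x.reshape(-1)))   # True
# flatten always copies
print(np.shares_memory(x, x.flatten()))     # False

# A strided 1-D slice: ravel copies to make the result contiguous,
# while reshape(-1) keeps the strides and returns a view
s = np.arange(10)[::2]
print(np.shares_memory(s, s.ravel()))       # False
print(np.shares_memory(s, s.reshape(-1)))   # True
```
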
How to tell if the array is a view or a copy
============================================

The :attr:`base <.ndarray.base>` attribute of the ndarray makes it easy
to tell if an array is a view or a copy. The base attribute of a view returns
the original array while it returns ``None`` for a copy.

    >>> x = np.arange(9)
    >>> x
    array([0, 1, 2, 3, 4, 5, 6, 7, 8])
    >>> y = x.reshape(3, 3)
    >>> y
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> y.base  # .reshape() creates a view
    array([0, 1, 2, 3, 4, 5, 6, 7, 8])
    >>> z = y[[2, 1]]
    >>> z
    array([[6, 7, 8],
           [3, 4, 5]])
    >>> z.base is None  # advanced indexing creates a copy
    True

Note that the ``base`` attribute should not be used to determine
if an ndarray object is *new*; it only indicates whether the object is a
view or a copy of another ndarray.
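
The caveat above can be illustrated with a short sketch. It also uses
``np.shares_memory``, which this document does not otherwise cover, as a
complementary check that compares two specific arrays directly.

```python
import numpy as np

# A freshly built array can still have a non-None base: reshape() returned
# a view of the temporary array created by arange()
a = np.arange(9).reshape(3, 3)
print(a.base is None)          # False

# A view of a view reports the array that owns the buffer
x = np.arange(10)
y = x[2:]
z = y[::2]
print(z.base is x)             # True
print(np.shares_memory(y, z))  # True

# A copy of the view no longer shares memory with x
c = z.copy()
print(np.shares_memory(x, c))  # False
```
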

doc/source/user/basics.rst

Lines changed: 1 addition & 0 deletions
@@ -19,3 +19,4 @@ fundamental NumPy ideas and philosophy.
    basics.dispatch
    basics.subclassing
    basics.ufuncs
+   basics.copies
