Commit 14f4cfa

add section about copy-on-write
1 parent 315f743 commit 14f4cfa

1 file changed: +51 -0 lines changed

doc/source/whatsnew/v3.0.0.rst

Lines changed: 51 additions & 0 deletions
@@ -65,6 +65,57 @@ for the ``.dtype`` being object dtype or checking the exact missing value sentin
TODO add link to migration guide for more details

.. seealso::

    `PDEP-14: Dedicated string data type for pandas 3.0 <https://pandas.pydata.org/pdeps/0014-string-dtype.html>`__

.. _whatsnew_300.enhancements.copy_on_write:

Copy-on-Write
^^^^^^^^^^^^^

The new "copy-on-write" behaviour in pandas 3.0 changes how pandas operates
with respect to copies and views. A summary of the changes:

1. The result of *any* indexing operation (subsetting a DataFrame or Series in any way,
   including accessing a DataFrame column as a Series) or any method returning a
   new DataFrame or Series always *behaves as if* it were a copy in terms of the user
   API.
2. As a consequence, if you want to modify an object (DataFrame or Series), the only way
   to do this is to directly modify that object itself.

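The two rules above can be illustrated with a minimal sketch. The DataFrame contents and variable names are made up for illustration, and the version check is an assumption added so that the snippet also runs on pandas 2.x, where Copy-on-Write has to be enabled explicitly:

```python
import pandas as pd

# Assumption: on pandas 2.x Copy-on-Write is opt-in, so enable it explicitly;
# on pandas 3.0 and later it is always active.
if int(pd.__version__.split(".")[0]) < 3:
    pd.set_option("mode.copy_on_write", True)

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Rule 1: a column accessed as a Series behaves as a copy,
# so writing to it does not change the parent DataFrame.
col = df["a"]
col.iloc[0] = 100
unchanged = df["a"].tolist()  # column "a" of df still holds its original values
print(unchanged, col.tolist())

# Rule 2: to change the DataFrame, modify the DataFrame itself.
df.loc[0, "a"] = 100
print(df["a"].tolist())
```

The derived Series ``col`` and the parent ``df`` are updated independently; neither write leaks into the other object.
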
The main goal of this change is to make the user API more consistent and
predictable. There is now a clear rule: *any* subset or returned
Series/DataFrame **always** behaves as a copy of the original, and thus never
modifies the original. Before pandas 3.0, whether a derived object was a
copy or a view depended on the exact operation performed, which was often
confusing.

Because every single indexing step now behaves as a copy, "chained assignment"
(updating a DataFrame with multiple setitem steps) no longer works: the
assignment only modifies a temporary object and never reaches the original.
Because chained assignment now consistently never works, the
``SettingWithCopyWarning`` is removed.

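As a hedged illustration of the chained-assignment change, the sketch below contrasts a two-step setitem with a single ``.loc`` assignment. The column names and values are hypothetical, and the version check is an assumption so the snippet also runs on pandas 2.x with Copy-on-Write enabled:

```python
import warnings

import pandas as pd

# Assumption: on pandas 2.x Copy-on-Write is opt-in, so enable it explicitly;
# on pandas 3.0 and later it is always active.
if int(pd.__version__.split(".")[0]) < 3:
    pd.set_option("mode.copy_on_write", True)

df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})

# Chained assignment: df["foo"] is one indexing step that behaves as a copy,
# so the second setitem step only updates a temporary Series.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about the chained assignment
    df["foo"][df["bar"] > 5] = 100
after_chained = df["foo"].tolist()  # df is unchanged

# A single setitem step on the DataFrame itself does update it.
df.loc[df["bar"] > 5, "foo"] = 100
after_loc = df["foo"].tolist()
print(after_chained, after_loc)
```

Selecting rows and column in one ``.loc`` call keeps the update a direct modification of ``df``, which is the only kind of modification that takes effect under the new rules.
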
The new behavioral semantics are explained in more detail in the
:ref:`user guide about Copy-on-Write <copy_on_write>`.

A secondary goal is to improve performance by avoiding unnecessary copies. As
mentioned above, every new DataFrame or Series returned from an indexing
operation or method *behaves* as a copy, but under the hood pandas uses
views as much as possible and only copies when needed to guarantee the "behaves
as a copy" behaviour. This is the actual "copy-on-write" mechanism, used as an
implementation detail.

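The deferred copy can be observed with ``numpy.shares_memory``. This is only a sketch: ``reset_index`` is used here as one example of a method that returns a new object, the data is made up, and which operations defer their copy is an implementation detail that may differ between versions:

```python
import numpy as np
import pandas as pd

# Assumption: on pandas 2.x Copy-on-Write is opt-in, so enable it explicitly;
# on pandas 3.0 and later it is always active.
if int(pd.__version__.split(".")[0]) < 3:
    pd.set_option("mode.copy_on_write", True)

df = pd.DataFrame({"a": [1, 2, 3]})

# reset_index returns a new DataFrame that behaves as a copy of df ...
df2 = df.reset_index(drop=True)
# ... but no data has been copied yet: both objects still share one buffer.
shared_before = np.shares_memory(df["a"].to_numpy(), df2["a"].to_numpy())

# The first write to df2 triggers the actual copy, leaving df untouched.
df2.iloc[0, 0] = 100
shared_after = np.shares_memory(df["a"].to_numpy(), df2["a"].to_numpy())

print(shared_before, shared_after)
print(df["a"].tolist(), df2["a"].tolist())
```

The copy is paid only when one of the objects is written to; code that merely reads from ``df2`` never duplicates the data.
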
Some of the behaviour changes described above are breaking changes in pandas
3.0. When upgrading to pandas 3.0, it is recommended to first upgrade to pandas
2.3 to get deprecation warnings for a subset of those changes. The
:ref:`migration guide <copy_on_write.migration_guide>` explains the upgrade
process in more detail.

.. seealso::

    `PDEP-7: Consistent copy/view semantics in pandas with Copy-on-Write <https://pandas.pydata.org/pdeps/0007-copy-on-write.html>`__

.. _whatsnew_300.enhancements.enhancement2:
