@@ -10,6 +10,100 @@ including other versions of pandas.
1010
1111.. ---------------------------------------------------------------------------
1212
13+ .. _whatsnew_220.upcoming_changes:
14+
15+ Upcoming changes in pandas 3.0
16+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17+
18+ pandas 3.0 will bring two major changes to the default behavior of pandas.
19+
20+ Dedicated string data type by default
21+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22+
23+ Historically, pandas represented string columns with the NumPy ``object`` data type.
24+ This representation has numerous problems: it is not specific to strings (any
25+ Python object can be stored in an ``object``-dtype array, not just strings) and
26+ it is often inefficient, in terms of both performance and memory usage.
27+
28+ Starting with the upcoming pandas 3.0 release, a dedicated string data type will
29+ be enabled by default (backed by PyArrow under the hood, if installed, otherwise
30+ falling back to NumPy). This means that pandas will start inferring columns
31+ containing string data as the new ``str`` data type when creating pandas
32+ objects, such as in constructors or IO functions.
33+
34+ Old behavior:
35+
36+ .. code-block:: python
37+     >>> pd.Series(["a", "b"])
38+     0    a
39+     1    b
40+     dtype: object
41+ New behavior:
42+
43+ .. code-block:: python
44+     >>> pd.Series(["a", "b"])
45+     0    a
46+     1    b
47+     dtype: str
48+
49+ The string data type that is used in these scenarios will mostly behave as the NumPy
50+ ``object`` dtype would, including missing value semantics and general operations on
51+ these columns.
52+
53+ However, the introduction of a new default dtype will also have some breaking
54+ consequences for your code (for example, when checking whether the ``.dtype`` is
55+ object dtype). To allow testing in advance of the pandas 3.0 release, this
56+ future dtype inference logic can be enabled in pandas 2.3 with:
57+
58+ .. code-block:: ipython
59+
60+     pd.options.future.infer_string = True
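
One of the breaking cases mentioned above, code that compares ``.dtype`` against ``object``, can be sketched as follows. This is a minimal illustration, not an official migration recipe: the helper name ``is_text_column`` is hypothetical, and it assumes that ``pd.api.types.is_string_dtype`` is an acceptable replacement for the equality check in your code.

```python
import pandas as pd


def is_text_column(ser: pd.Series) -> bool:
    """Hypothetical helper: report whether ``ser`` holds string data.

    A check such as ``ser.dtype == object`` stops matching once string
    columns are inferred as ``str``. ``is_string_dtype`` is used here
    instead; for an object-dtype Series, recent pandas versions decide
    by inspecting the values.
    """
    return pd.api.types.is_string_dtype(ser)


strings = pd.Series(["a", "b"])
numbers = pd.Series([1, 2])

print(is_text_column(strings))
print(is_text_column(numbers))
```

Written this way, the check does not depend on whether the string column was inferred as ``object`` or as ``str``.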
61+
62+ TODO add link to migration guide
63+
64+ Copy-on-Write
65+ ^^^^^^^^^^^^^
66+
67+ Copy-on-Write, currently an optional mode, will be enabled by default in pandas 3.0. There
68+ won't be an option to keep the current behavior enabled.
69+
70+ In summary, Copy-on-Write changes how pandas behaves with respect to copies
71+ and views:
72+
73+ 1. The result of *any* indexing operation (subsetting a DataFrame or Series in any way,
74+    i.e. including accessing a DataFrame column as a Series) or any method returning a
75+    new DataFrame or Series, always *behaves as if* it were a copy in terms of the user
76+    API.
77+ 2. As a consequence, if you want to modify an object (DataFrame or Series), the only way
78+    to do this is to directly modify that object itself.
79+
80+ Because every single indexing step now behaves as a copy, this also means that
81+ "chained assignment" (updating a DataFrame with multiple setitem steps) will
82+ stop working. Because this now consistently never works, the
83+ ``SettingWithCopyWarning`` will be removed.
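
The effect on chained assignment described above can be sketched with a small, made-up DataFrame (the column names ``a`` and ``b`` and the values are illustrative only). The chained form is left as a comment because its outcome depends on whether Copy-on-Write is active; the single ``.loc`` assignment modifies ``df`` directly in either case.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Chained assignment: two separate indexing steps. Under Copy-on-Write the
# first step behaves as a copy, so ``df`` itself would be left unchanged.
# df["a"][df["b"] > 4] = 100

# Single-step assignment: the row mask and the column are given together,
# so the update is applied to ``df`` itself.
df.loc[df["b"] > 4, "a"] = 100

print(df["a"].tolist())
```

Rewriting chained updates as a single ``.loc`` call is one way to keep code working both before and after the change in defaults.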
84+
85+ The new behavioral semantics are explained in more detail in the
86+ :ref:`user guide about Copy-on-Write <copy_on_write>`.
87+
88+ The new behavior has been available since pandas 2.0 and can be enabled with the following option:
89+
90+ .. code-block:: ipython
91+
92+     pd.options.mode.copy_on_write = True
93+
94+ Some of the behavior changes allow a clear deprecation, like the changes in
95+ chained assignment. Other changes are more subtle, so the corresponding warnings are
96+ hidden behind an option that has been available since pandas 2.2:
97+
98+ .. code-block:: ipython
99+
100+     pd.options.mode.copy_on_write = "warn"
101+
102+ This mode will warn in many different scenarios that aren't actually relevant to
103+ most queries. We recommend exploring this mode, but it is not necessary to get rid
104+ of all of these warnings. The :ref:`migration guide <copy_on_write.migration_guide>`
105+ explains the upgrade process in more detail.
106+
13107.. _whatsnew_230.enhancements:
14108
15109Enhancements