@@ -10,6 +10,100 @@ including other versions of pandas.
.. ---------------------------------------------------------------------------

+ .. _whatsnew_220.upcoming_changes:
+
+ Upcoming changes in pandas 3.0
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ pandas 3.0 will bring two major changes to the default behavior of pandas.
+
+ Dedicated string data type by default
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ Historically, pandas represented string columns with the NumPy ``object`` data type.
+ This representation has several problems: it is not specific to strings (any
+ Python object can be stored in an ``object``-dtype array, not just strings), and
+ it is often inefficient, both in performance and in memory usage.
+
+ Starting with the upcoming pandas 3.0 release, a dedicated string data type will
+ be enabled by default (backed by PyArrow under the hood, if installed, otherwise
+ falling back to NumPy). This means that pandas will start inferring columns
+ containing string data as the new ``str`` data type when creating pandas
+ objects, such as in constructors or IO functions.
+
+ Old behavior:
+
+ .. code-block:: python
+
+     >>> ser = pd.Series(["a", "b"])
+     >>> ser
+     0    a
+     1    b
+     dtype: object
+
+ New behavior:
+
+ .. code-block:: python
+
+     >>> ser = pd.Series(["a", "b"])
+     >>> ser
+     0    a
+     1    b
+     dtype: str
+
+ The string data type used in these scenarios will mostly behave as the NumPy
+ ``object`` dtype would, including missing value semantics and general operations
+ on these columns.
+
+ However, the introduction of a new default dtype will also have some breaking
+ consequences for your code (for example, when checking whether the ``.dtype`` is
+ object dtype). To allow testing it in advance of the pandas 3.0 release, this
+ future dtype inference logic can be enabled in pandas 2.3 with:
+
+ .. code-block:: ipython
+
+     pd.options.future.infer_string = True
+
+ TODO add link to migration guide
+
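One way to avoid the ``.dtype`` check mentioned above is to ask pandas whether a column holds strings instead of comparing against ``object``. The following is a minimal sketch, not an official migration recipe: the helper name ``is_text_column`` is hypothetical, and it assumes that ``pandas.api.types.is_string_dtype`` accepts a Series and recognizes both the ``object``-backed and the dedicated string representation.

```python
import pandas as pd


def is_text_column(ser: pd.Series) -> bool:
    """Return True if ``ser`` holds string data, whatever dtype stores it.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Passing the Series (rather than only its dtype) lets pandas 2.x inspect
    # the values of an object-dtype column instead of accepting any object column.
    return bool(pd.api.types.is_string_dtype(ser))


# Explicit object dtype: how string data is stored by default before pandas 3.0.
legacy = pd.Series(["a", "b"], dtype=object)
# Whatever the installed pandas infers by default (``str`` in pandas 3.0).
inferred = pd.Series(["a", "b"])
# A non-string column for contrast.
numbers = pd.Series([1, 2])

print(is_text_column(legacy), is_text_column(inferred), is_text_column(numbers))
```

A check written this way does not depend on which default dtype is active, so it should keep working whether or not ``future.infer_string`` is enabled.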
+
+ Copy-on-Write
+ ^^^^^^^^^^^^^
+
+ Copy-on-Write, currently an optional mode, will be enabled by default in pandas 3.0.
+ There won't be an option to keep the current behavior.
+
+ In summary, Copy-on-Write changes how pandas operates with respect to copies
+ and views:
+
+ 1. The result of *any* indexing operation (subsetting a DataFrame or Series in any way,
+    including accessing a DataFrame column as a Series) or of any method returning a
+    new DataFrame or Series always *behaves as if* it were a copy in terms of the user
+    API.
+ 2. As a consequence, if you want to modify an object (DataFrame or Series), the only way
+    to do this is to directly modify that object itself.
+
+ Because every single indexing step now behaves as a copy, this also means that
+ "chained assignment" (updating a DataFrame with multiple setitem steps) will
+ stop working. Because chained assignment will then consistently never work, the
+ ``SettingWithCopyWarning`` will be removed.
+
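The two rules above can be illustrated with a small sketch. It is written so that it should behave the same with and without Copy-on-Write enabled; the column names and values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Chained assignment such as ``df["b"][df["a"] > 1] = 0`` uses two setitem
# steps and relies on the intermediate ``df["b"]`` being a view, so it is
# avoided here. A single ``.loc`` call modifies ``df`` itself in one step.
df.loc[df["a"] > 1, "b"] = 0

# Under Copy-on-Write a subset already behaves like a copy; the explicit
# ``.copy()`` gives the same independence in the current default mode too.
subset = df[df["a"] > 1].copy()
subset["b"] = 99  # modifies only ``subset``, never ``df``

print(df["b"].tolist())
print(subset["b"].tolist())
```

Rewriting chained assignments into a single ``.loc`` call is one way to prepare code before switching the option on.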
+
+ The new behavioral semantics are explained in more detail in the
+ :ref:`user guide about Copy-on-Write <copy_on_write>`.
+
+ The new behavior has been available since pandas 2.0 and can be enabled with the
+ following option:
+
+ .. code-block:: ipython
+
+     pd.options.mode.copy_on_write = True
+
+ Some of the behavior changes allow a clear deprecation, like the changes in
+ chained assignment. Other changes are more subtle, so their warnings are
+ hidden behind an option that has been available since pandas 2.2:
+
+ .. code-block:: ipython
+
+     pd.options.mode.copy_on_write = "warn"
+
+ This mode will warn in many different scenarios that aren't actually relevant to
+ most queries. We recommend exploring this mode, but it is not necessary to get rid
+ of all of these warnings. The :ref:`migration guide <copy_on_write.migration_guide>`
+ explains the upgrade process in more detail.
+
.. _whatsnew_230.enhancements:

Enhancements