@@ -121,93 +121,94 @@ inplace (it will remove the values of the column being set, and insert new value
121121| `` bfill `` |
122122| `` clip `` |
123123
124- These methods don't operate inplace by default, but have the option to specify ` inlace =True` . All those methods leave
124+ These methods don't operate inplace by default, but can do so with ` inplace=True ` . All those methods leave
125125the structure of the DataFrame or Series intact (shape, row/column labels), but can mutate some elements of the data of
126126the DataFrame or Series.
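
As a rough illustration of this group, the sketch below uses `fillna` on a made-up frame (the column name, index labels, and values are assumptions for the example). It only checks that some elements change while the shape and labels stay the same.

```python
import pandas as pd

# Hypothetical data, chosen only for this illustration.
df = pd.DataFrame({"a": [1.0, None, 3.0]}, index=["x", "y", "z"])
shape_before = df.shape
labels_before = list(df.index)

# Group 2 method: the missing value is replaced in df itself.
df.fillna(0.0, inplace=True)

values_after = df["a"].tolist()
```

The return value of the call is deliberately not used here, since what it should be is one of the open questions of this PDEP.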
127127
128128**Group 3: Methods that modify the DataFrame/Series object, but not the pre-existing values**
129129
130- | Method Name |
131- | :----------------------------|
132- | `` drop `` (dropping columns) |
133- | `` eval `` |
134- | `` rename `` |
135- | `` rename_axis `` |
136- | `` reset_index `` |
137- | `` set_index `` |
138- | `` astype `` |
139- | `` infer_objects `` |
140- | `` set_axis `` |
141- | `` set_flags `` |
142- | `` to_period `` |
143- | `` to_timestamp `` |
144- | `` tz_localize `` |
145- | `` tz_convert `` |
146- | `` swaplevel `` |
147- | `` concat `` |
130+ | Method Name | Keyword |
131+ | :----------------------------| -----------------------|
132+ | `` drop `` (dropping columns) | `` inplace `` |
133+ | `` rename `` | `` inplace `` , `` copy `` |
134+ | `` rename_axis `` | `` inplace `` , `` copy `` |
135+ | `` reset_index `` | `` inplace `` |
136+ | `` set_index `` | `` inplace `` |
137+ | `` astype `` | `` copy `` |
138+ | `` infer_objects `` | `` copy `` |
139+ | `` set_axis `` | `` copy `` |
140+ | `` set_flags `` | `` copy `` |
141+ | `` to_period `` | `` copy `` |
142+ | `` to_timestamp `` | `` copy `` |
143+ | `` tz_localize `` | `` copy `` |
144+ | `` tz_convert `` | `` copy `` |
145+ | `` Series.swaplevel `` * | `` copy `` |
146+ | `` concat `` | `` copy `` |
147+
148+ \* The ` copy ` keyword is only available for ` Series.swaplevel ` and not for ` DataFrame.swaplevel ` .
148149
149150These methods can change the structure of the DataFrame or Series, such as changing the shape by adding or removing
150151columns, or changing the row/column labels (changing the index/columns attributes), but don't modify the existing
151152underlying data of the object.
153+
152154All those methods (except for ` set_flags ` ) make a copy of the full data by default, but can also be performed without
153155copying all data (currently enabled with the ` inplace ` or ` copy ` keyword).
154156
155157Some of these methods only have a ` copy ` keyword instead of an ` inplace `
156- keyword: ` astype ` , ` infer_objects ` , ` set_axis ` , ` set_flags ` , ` to_period ` , ` to_timestamp ` , ` tz_localize ` , ` tz_convert ` , ` swaplevel ` , ` concat `
157- and ` merge ` .
158- These allow the user to avoid a copy, but don't update the original object inplace and instead return a new object
159- referencing the same data.
158+ keyword. These allow the user to avoid a copy, but don't update the original object inplace and instead return a
159+ new object referencing the same data.
160160
161- Two methods also have both keywords: ` rename ` , ` rename_axis ` .
161+ Two methods also have both keywords: ` rename ` , ` rename_axis ` , with the ` inplace ` keyword overriding ` copy ` .
162162
163163**Group 4: Methods that can never operate inplace**
164164
165- | Method Name |
166- | :-------------------------|
167- | `` drop `` (dropping rows) |
168- | `` dropna `` |
169- | `` drop_duplicates `` |
170- | `` sort_values `` |
171- | `` sort_index `` |
172- | `` query `` |
173- | `` transpose `` |
174- | `` swapaxes `` |
175- | `` align `` |
176- | `` reindex `` |
177- | `` reindex_like `` |
178- | `` truncate `` |
179-
180- These methods can never operate inplace because the nature of the operation requires copying (such as reordering or
181- dropping rows). For those methods, ` inplace=True ` is essentially just synctactic sugar for reassigning the new result
182- to ` self ` (the calling DataFrame).
165+ | Method Name | Keyword |
166+ | :-------------------------| -------------|
167+ | ` drop ` (dropping rows) | ` inplace ` |
168+ | ` dropna ` | ` inplace ` |
169+ | ` drop_duplicates ` | ` inplace ` |
170+ | ` sort_values ` | ` inplace ` |
171+ | ` sort_index ` | ` inplace ` |
172+ | ` eval ` | ` inplace ` |
173+ | ` query ` | ` inplace ` |
174+ | ` transpose ` | ` copy ` |
175+ | ` swapaxes ` | ` copy ` |
176+ | ` align ` | ` copy ` |
177+ | ` reindex ` | ` copy ` |
178+ | ` reindex_like ` | ` copy ` |
179+ | ` truncate ` | ` copy ` |
180+
181+ Although all of these methods accept either ` inplace ` or ` copy ` , they can never operate inplace because the nature of the
182+ operation requires copying (such as reordering or dropping rows). For those methods, ` inplace=True ` is essentially just
183+ syntactic sugar for reassigning the new result to ` self ` (the calling DataFrame).
183184
184185Note: in the case of a "no-op" (for example when sorting an already sorted DataFrame), some of those methods might not
185- need to perform a copy. This currently happens with Copy-on-Write (regardless of `` inplace ` ` ), but this is considered an
186+ need to perform a copy. This currently happens with Copy-on-Write (regardless of ` inplace ` ), but this is considered an
186187implementation detail for the purpose of this PDEP.
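
To make the "syntactic sugar" point concrete, here is a minimal sketch with `sort_values` on a made-up frame (the names `a`, `b`, and `k` are illustrative only). It compares the keyword form with plain reassignment.

```python
import pandas as pd

# Two identical, independent frames (hypothetical data).
a = pd.DataFrame({"k": [3, 1, 2]})
b = a.copy()

a.sort_values("k", inplace=True)  # keyword form
b = b.sort_values("k")            # plain reassignment

# Both routes end with the same values and the same reordered index.
same_result = a.equals(b)
```

Both variables end up holding the same sorted frame, which is the sense in which the keyword adds nothing beyond reassignment for this group.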
187188
188189### Proposed changes and reasoning
189190
190191The methods from group 1 won't change behavior, and will remain always inplace.
191192
192- Methods in groups 3 and 4 will lose their `` copy `` and `` inplace ` ` keywords. Under Copy-on-Write, every operation will
193+ Methods in groups 3 and 4 will lose their ` copy ` and ` inplace ` keywords. Under Copy-on-Write, every operation will
193194potentially return a shallow copy of the input object, if the performed operation does not require a copy. This is
194- equivalent to behavior with `` copy=False `` and/or `` inplace=True ` ` for those methods. If users want to make a hard
195- copy(`` copy=True ` ` ), they can do:
195+ equivalent to behavior with ` copy=False ` and/or ` inplace=True ` for those methods. If users want to make a hard
196+ copy (` copy=True ` ), they can do:
196197
197198 :::python
198199 df = df.func().copy()
199200
200201Therefore, there is no benefit of keeping the keywords around for these methods.
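
A small sketch of the `.copy()` pattern above, using `rename` as a stand-in for `func` (the frame and the column names are assumptions for the example). Writing to the hard copy leaves the original frame untouched.

```python
import pandas as pd

df = pd.DataFrame({"foo": [1, 2, 3]})

# Explicit hard copy of the result, instead of a copy=True keyword.
hard = df.rename(columns={"foo": "bar"}).copy()

# Writing to the copy does not touch the original frame.
hard.iloc[0, 0] = 100
```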
201202
202- User can emulate behavior of the `` inplace ` ` keyword by assigning the result of an operation to the same variable:
203+ Users can emulate the behavior of the ` inplace ` keyword by assigning the result of an operation to the same variable:
203204
204205 :::python
205206 df = pd.DataFrame({"foo": [1, 2, 3]})
206207 df = df.reset_index()
207208 df.iloc[0, 1] = ...
208209
209- All references to the original object will go out of scope when the result of the `` reset_index ` ` operation is assigned
210- to `` df `` . As a consequence, `` iloc ` ` will continue to operate inplace, and the underlying data will not be copied.
210+ All references to the original object will go out of scope when the result of the ` reset_index ` operation is assigned
211+ to ` df ` . As a consequence, ` iloc ` will continue to operate inplace, and the underlying data will not be copied.
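
As a complementary sketch, consider the case where a second name still refers to the original object (the variable `original` is an assumption introduced for this example). The write through `iloc` then only affects the new object.

```python
import pandas as pd

original = pd.DataFrame({"foo": [1, 2, 3]})
df = original            # two names for the same object

df = df.reset_index()    # df now refers to the new result
df.iloc[0, 1] = 10       # writes to the "foo" column of the result

untouched = original["foo"].tolist()
```

Here `original` keeps its values, so reassignment emulates `inplace` only for the variable that is rebound.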
211212
212213The methods in group 2 behave differently compared to the first three groups. These methods are actually able to operate
213214inplace because they only modify the underlying data.
@@ -220,7 +221,7 @@ If we follow the rules of Copy-on-Write[^1] where "any subset or returned series
220221the original, and thus never modifies the original", then there is no way of doing this operation inplace by default.
221222The original object would be modified before the reference goes out of scope.
222223
223- To avoid triggering a copy when a value would actually get replaced, we will keep the `` inplace ` ` argument for those
224+ To avoid triggering a copy when a value would actually get replaced, we will keep the ` inplace ` argument for those
224225methods.
225226
226227### Open Questions
@@ -238,7 +239,7 @@ For example,
238239
239240can be performed inplace.
240241
241- This is only true if `` df ` ` does not share the values it stores with another pandas object. For example, the following
242+ This is only true if ` df ` does not share the values it stores with another pandas object. For example, the following
242243operations
243244
244245 :::python
@@ -255,8 +256,8 @@ would be incompatible with the Copy-on-Write rules when actually done inplace. I
255256
256257Raising an error here is problematic since oftentimes users do not have control over whether a method would cause a
257258"lazy copy" to be triggered under Copy-on-Write. It is also hard to fix: adding a ` copy() ` before calling a method
258- with `` inplace=True ` ` might actually be worse than triggering the copy under the hood. We would only copy columns that
259- share data with another object, not the whole object like `` .copy() ` ` would.
259+ with ` inplace=True ` might actually be worse than triggering the copy under the hood. We would only copy columns that
260+ share data with another object, not the whole object like ` .copy() ` would.
260261
261262There is another possible variant, which would be to trigger the copy (like the first option), but have an option to
262263raise a warning whenever this happens.
@@ -305,13 +306,13 @@ was not inplace, since it is possible to go out of memory because of this.
305306The downsides of keeping the ` inplace=True ` option for certain methods, are that the return type of those methods will
306307now depend on the value of ` inplace ` , and that method chaining will no longer work.
307308
308- One way around this is to have the method return the original object that was operated on inplace when `` inplace=True ` ` .
309+ One way around this is to have the method return the original object that was operated on inplace when ` inplace=True ` .
309310
310311Advantages:
311312
312313- It enables using inplace operations in a method chain
313314- It simplifies type annotations
314- - It enables to change the default for `` inplace ` ` to True under Copy-on-Write
315+ - It enables changing the default for ` inplace ` to True under Copy-on-Write
315316
316317Disadvantages:
317318
@@ -320,7 +321,7 @@ Disadvantages:
320321 returned (` df2 = df.method(inplace=True); assert df2 is df ` )
321322- It would change the behaviour of the current ` inplace=True `
322323
323- Given that `` inplace ` ` is already widely used by the pandas community, we would like to collect feedback about what the
324+ Given that ` inplace ` is already widely used by the pandas community, we would like to collect feedback about what the
324325expected return type should be. Therefore, we will defer a decision on this until a later revision of this PDEP.
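
The return-the-original variant discussed above can be approximated today with a thin wrapper. The sketch below is not a pandas API: `clip_inplace` is a hypothetical helper written for this illustration, and it ignores whatever the underlying call returns.

```python
import pandas as pd


def clip_inplace(df, lower, upper):
    """Hypothetical helper: clip df inplace, then hand back df itself."""
    df.clip(lower=lower, upper=upper, inplace=True)
    return df


df = pd.DataFrame({"a": [-1, 5, 2]})

# Because the helper returns the same object, the call can be chained.
total = clip_inplace(df, 0, 3)["a"].sum()
```

With such a wrapper, `clip_inplace(df, 0, 3) is df` holds, which is the identity property listed among the disadvantages.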
325326
326327## Backward compatibility
@@ -339,11 +340,11 @@ proposal[^1].
339340
340341### Remove the ` inplace ` keyword altogether
341342
342- In the past, it was considered to remove the `` inplace ` ` keyword entirely. This was because many operations that had
343- the `` inplace ` ` keyword did not actually operate inplace, but made a copy and re-assigned the underlying values under
343+ In the past, it was considered to remove the ` inplace ` keyword entirely. This was because many operations that had
344+ the ` inplace ` keyword did not actually operate inplace, but made a copy and re-assigned the underlying values under
344345the hood, causing confusion and providing no real benefit to users.
345346
346- Because a majority of the methods supporting `` inplace ` ` did not operate inplace, it was considered at the time to
347+ Because a majority of the methods supporting ` inplace ` did not operate inplace, it was considered at the time to
347348deprecate and remove inplace from all methods, and add back the keyword as necessary.[^3]
348349
349350For the subset of methods where the operation actually _can_ be done inplace (group 2), however, removing the ` inplace `
@@ -352,7 +353,7 @@ DataFrames. Therefore, we decided to keep the `inplace` keyword for this small s
352353
353354### Standardize on the ` copy ` keyword instead of ` inplace `
354355
355- It may seem more natural to standardize on the ` copy ` keyword instead of the ` inplace ` keyword, since the `` copy ` `
356+ It may seem more natural to standardize on the ` copy ` keyword instead of the ` inplace ` keyword, since the ` copy `
356357keyword already returns a new object instead of None (enabling method chaining) when it is set to ` True ` .
357358
358359However, the ` copy ` keyword is not supported in any of the values-mutating methods listed in Group 2 above
@@ -366,27 +367,27 @@ currently used.
366367
367368Currently, for methods where it is supported, when the ` copy ` keyword is ` False ` , a new pandas object (same
368369as ` copy=True ` ) is returned as the result of a method call, with the values backing the object being shared when
369- possible. With the proposed inplace behavior, current behavior of `` copy=False ` ` would return a new pandas object with
370+ possible. With the proposed inplace behavior, current behavior of ` copy=False ` would return a new pandas object with
370371identical values as the original object (that was modified inplace), which may be confusing for users, and lead to
371372ambiguity with Copy on Write rules.
372373
373374## History
374375
375- The future of the `` inplace ` ` keyword is something that has been debated a lot over the years.
376+ The future of the ` inplace ` keyword is something that has been debated a lot over the years.
376377
377378It may be helpful to review those discussions (see links) [^2] [^3] [^4] to better understand this PDEP.
378379
379380## Timeline
380381
381382Copy-on-Write is a relatively new feature (added in version 1.5) and some methods are missing the "lazy copy"
382- optimization (equivalent to `` copy=False ` ` ).
383+ optimization (equivalent to ` copy=False ` ).
383384
384- Therefore, we will start showing deprecation warnings for the `` copy `` and `` inplace ` ` parameters in pandas 2.1, to
385+ Therefore, we will start showing deprecation warnings for the ` copy ` and ` inplace ` parameters in pandas 2.1, to
385386allow for bugs with Copy-on-Write to be addressed and for more optimizations to be added.
386387
387388Hopefully, users will be able to switch to Copy-on-Write to keep the no-copy behavior and to silence the warnings.
388389
389- The full removal of the `` copy `` parameter and `` inplace ` ` (where necessary) is set for pandas 3.0, which will coincide
390+ The full removal of the ` copy ` parameter and ` inplace ` (where necessary) is set for pandas 3.0, which will coincide
390391with the enablement of Copy-on-Write for pandas by default.
391392
392393## PDEP History