web/pandas/pdeps/0008-inplace-methods-in-pandas.md
Lines changed: 15 additions & 11 deletions
Original file line number
Diff line number
Diff line change
@@ -255,7 +255,7 @@ Summarizing for the `inplace` keyword, we propose to:
255
255
(group 4) or only update the object (group 3, "object-inplace", which can be emulated
256
256
with reassigning).
257
257
258
-
### Open Questions
258
+
### Other design questions
259
259
260
260
#### With `inplace=True`, should we silently copy or raise an error if the data has references?
261
261
@@ -290,12 +290,13 @@ lazy copy" to be triggered under Copy-on-Write. It is also hard to fix, adding a
290
290
with `inplace=True` might actually be worse than triggering the copy under the hood. We would only copy columns that
291
291
share data with another object, not the whole object like `.copy()` would.
292
292
293
-
There is another possible variant, which would be to trigger the copy (like the first option), but have an option to
294
-
raise a warning whenever this happens.
293
+
**Therefore, we propose to silently copy when needed.** The `inplace=True` option would thus mean "try inplace whenever possible", and not guarantee it is actually done inplace.
294
+
295
+
In the future, if there is demand for it, it could still be possible to add an option to raise a warning whenever this happens.
295
296
This would be useful in an IPython shell/Jupyter Notebook setting, where the user would have the opportunity to delete
296
297
unused references that are causing the copying to be triggered.
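The "try inplace whenever possible" semantics can be illustrated with a toy model. This is only a sketch under stated assumptions: `Buffer`, `Column`, `view` and `fillna_inplace` are hypothetical names invented for illustration, and the explicit `refs` counter stands in for whatever reference tracking pandas uses internally under Copy-on-Write.

```python
class Buffer:
    """Hypothetical data buffer that may be shared by several columns."""

    def __init__(self, data):
        self.data = list(data)
        self.refs = 0  # number of Column objects sharing this buffer


class Column:
    """Hypothetical column; not part of the pandas API."""

    def __init__(self, buffer):
        self.buffer = buffer
        buffer.refs += 1

    def view(self):
        # A second object that shares the same underlying data.
        return Column(self.buffer)

    def fillna_inplace(self, value):
        # Returns True if a copy had to be made, False if truly inplace.
        copied = False
        if self.buffer.refs > 1:
            # The data is referenced elsewhere: silently copy this
            # column's data only, then modify the private copy.
            self.buffer.refs -= 1
            self.buffer = Buffer(self.buffer.data)
            self.buffer.refs = 1
            copied = True
        data = self.buffer.data
        for i, item in enumerate(data):
            if item is None:
                data[i] = value
        return copied


col = Column(Buffer([1, None, 3]))
alias = col.view()                      # creates a second reference
copied_when_shared = col.fillna_inplace(0)

solo = Column(Buffer([None, 2]))        # no other references
copied_when_unshared = solo.fillna_inplace(0)
```

In this sketch the shared case leaves `alias` untouched and copies behind the scenes, while the unshared case modifies the existing buffer directly. The returned flag plays the role that an optional warning could play.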
297
298
298
-
For example,
299
+
For example:
299
300
300
301
:::ipython
301
302
In [1]: import pandas as pd
@@ -334,16 +335,16 @@ was not inplace, since it is possible to go out of memory because of this.
334
335
335
336
#### Return the calling object (`self`) also when using `inplace=True`?
336
337
337
-
The downsides of keeping the `inplace=True` option for certain methods, are that the return type of those methods will
338
-
now depend on the value of `inplace`, and that method chaining will no longer work.
339
-
340
-
One way around this is to have the method return the original object that was operated on inplace when `inplace=True`.
338
+
One of the downsides of the `inplace=True` option is that the return type of those methods
339
+
depends on the value of `inplace`, and that method chaining does not work.
340
+
Those downsides are still relevant for the cases where we keep `inplace=True`.
341
+
To address this, we can have those methods return the object that was operated on
342
+
inplace when `inplace=True`.
341
343
342
344
Advantages:
343
345
344
346
- It enables the use of inplace operations in a method chain
345
347
- It simplifies type annotations
346
-
- It enables to change the default for `inplace` to True under Copy-on-Write
347
348
348
349
Disadvantages:
349
350
@@ -352,8 +353,11 @@ Disadvantages:
352
353
returned (`df2 = df.method(inplace=True); assert df2 is df`)
353
354
- It would change the behaviour of the current `inplace=True`
354
355
355
-
Given that `inplace` is already widely used by the pandas community, we would like to collect feedback about what the
356
-
expected return type should be. Therefore, we will defer a decision on this until a later revision of this PDEP.
356
+
We generally assume that changing to return `self` should not cause many problems for
357
+
existing usage (typically, the current return value of `None` is not actively used).
358
+
Further, we think the advantages of simplifying return types and enabling method chains
359
+
outweigh the special case of returning an identical object.
360
+
**Therefore, we propose that for those methods with an `inplace=True` option, the calling object (`self`) gets returned.**
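A minimal sketch of what returning `self` enables, using a hypothetical `Frame` class rather than pandas itself. The class and its `fill_missing` and `scale` methods are assumptions made purely for illustration. The point is that every method returns a `Frame` regardless of `inplace`, so chains work and the return type is uniform.

```python
class Frame:
    """Hypothetical stand-in for a DataFrame-like object; not pandas API."""

    def __init__(self, values):
        self.values = list(values)

    def fill_missing(self, value, inplace=False):
        # Operate on self when inplace=True, otherwise on a new object.
        # Either way a Frame is returned, so the return type does not
        # depend on the value of `inplace`.
        target = self if inplace else Frame(self.values)
        target.values = [value if v is None else v for v in target.values]
        return target

    def scale(self, factor, inplace=False):
        target = self if inplace else Frame(self.values)
        target.values = [v * factor for v in target.values]
        return target


df = Frame([1, None, 3])

# Inplace operations can now take part in a method chain.
result = df.fill_missing(0, inplace=True).scale(10, inplace=True)

# Without inplace=True a new object is returned and df is left alone.
doubled = df.scale(2)
```

Here `result is df` holds, which is the "identical object" special case mentioned among the disadvantages: the caller ends up with two names for the same object.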