
sinhrks
Member

@sinhrks sinhrks commented Apr 2, 2014

Same background as #6736.

Currently,

  • When a function is passed, any error raised by the function propagates out of rename.
  • When a dict is passed, a label that is not in the axis is silently skipped, even though drop raises ValueError in such a case.
>>> df = pd.DataFrame({1: [1, 2], 'B': pd.to_datetime(['2010-01-01', np.nan])})
>>> renamed_func = df.rename(columns=lambda x: x + 1)
TypeError: cannot concatenate 'str' and 'int' objects

>>> renamed_dict = df.rename(columns={'B':'C', 'D':'E'})
Index([1, u'C'], dtype='object')

I think it would be nice if rename also had an errors keyword to:

  • Suppress errors raised by the function and rename only the non-problematic labels (errors='ignore'), or re-raise whatever error the function raised (errors='raise').
  • Suppress the error if a label is not found in the target axis and rename only the labels that are present (errors='ignore'), or raise an error if any label is missing from the target axis (errors='raise').

I feel the default should become errors='raise' in a future version, based on how other functions behave. This doesn't affect the current behavior when a function is passed, but it does affect the behavior when a dict is passed.

In this version, rename raises a FutureWarning as a precaution if it is called with a non-existing label. It is also possible to force rename to raise ValueError in such a case by specifying errors='raise'.

>>> renamed_func = df.rename(columns=lambda x: x + 1, errors='ignore')
Index([2, u'B'], dtype='object')
>>> renamed_dict = df.rename(columns={'B':'C', 'D':'E', 'F':'G'}, errors='ignore')
Index([1, u'C'], dtype='object')

>>> renamed_func = df.rename(columns=lambda x: x + 1, errors='raise')
TypeError: cannot concatenate 'str' and 'int' objects
>>> renamed_dict = df.rename(columns={'B':'C', 'D':'E', 'F':'G'}, errors='raise')
ValueError: labels ['D' 'F'] not contained in axis
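
The proposed semantics can be sketched in plain Python, independent of pandas. The helper `rename_labels` below is hypothetical (it is not a pandas function and not the code in this PR); it only illustrates how `errors='ignore'` and `errors='raise'` would treat a function mapper and a dict mapper as described above.

```python
def rename_labels(labels, mapper, errors="raise"):
    """Hypothetical helper illustrating the proposed ``errors`` keyword.

    Not part of pandas; it operates on a plain list of labels.
    """
    if errors not in ("ignore", "raise"):
        raise ValueError("errors must be 'ignore' or 'raise'")

    if callable(mapper):
        result = []
        for label in labels:
            try:
                result.append(mapper(label))
            except Exception:
                if errors == "raise":
                    raise  # propagate whatever the function raised
                result.append(label)  # keep the problematic label unchanged
        return result

    # dict mapper: collect keys that are absent from the axis
    missing = [key for key in mapper if key not in labels]
    if missing and errors == "raise":
        raise ValueError("labels %r not contained in axis" % (missing,))
    return [mapper.get(label, label) for label in labels]


labels = [1, "B"]
print(rename_labels(labels, lambda x: x + 1, errors="ignore"))       # [2, 'B']
print(rename_labels(labels, {"B": "C", "D": "E"}, errors="ignore"))  # [1, 'C']
```

With `errors='raise'`, the same two calls would raise `TypeError` (from the lambda) and `ValueError` (for the missing label `'D'`) respectively.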

@jreback jreback added this to the 0.14.0 milestone Apr 2, 2014
@jreback
Contributor

jreback commented Apr 2, 2014

so effectively, before this PR, passing a function behaves like errors='raise' while passing a dict behaves like errors='ignore'?

@sinhrks
Member Author

sinhrks commented Apr 2, 2014

Yes, it looks so.

@jreback
Contributor

jreback commented Apr 2, 2014

ok...can you emulate that (with a deprecation warning) until it can be changed (in the next version)?
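
One way such a transition period could be emulated is a sentinel default that keeps the old silent-skip behavior but warns when it is relied on. This is a minimal sketch under stated assumptions: `rename_with_dict` is a hypothetical plain-Python helper (not a pandas function), `errors=None` is assumed to mean "not specified", and `FutureWarning` is used because that is the warning class named earlier in this thread.

```python
import warnings


def rename_with_dict(labels, mapping, errors=None):
    """Hypothetical sketch of the transition period; not a pandas function.

    ``errors=None`` means "not specified": keep the old silent-skip
    behavior, but warn that the default is expected to change.
    """
    missing = [key for key in mapping if key not in labels]
    if missing:
        if errors == "raise":
            raise ValueError("labels %r not contained in axis" % (missing,))
        if errors is None:
            warnings.warn(
                "labels %r not contained in axis; this is expected to raise "
                "in a future version, pass errors='ignore' to keep skipping "
                "them" % (missing,),
                FutureWarning,
                stacklevel=2,
            )
    # Labels without an entry in the mapping are left unchanged.
    return [mapping.get(label, label) for label in labels]
```

Passing `errors='ignore'` explicitly silences the warning, so existing callers can opt in to the current behavior before the default changes.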

@jtratner
Contributor

jtratner commented Apr 5, 2014

To be honest, I don't see a compelling case for this - it's fundamentally just sugar that hides errors from the user. If you pass a function, you should be responsible for handling errors and try/catching as necessary (e.g., deciding how to handle unclear values, etc.). The flip side - raising errors when you specify a label to rename that's not in the index - does not seem helpful and would just make the function more brittle.

So in the case you've cited, the better answer would be to do:

def add1(x):
    try:
        return x + 1
    except TypeError:
        # label cannot be incremented (e.g. a string); keep it unchanged
        return x

At this point you might modify the function to try to convert the string (or whatever object) to an integer, or default None to 0, etc.
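
A minimal sketch of that kind of modification follows. The conversion rules here (numeric strings are converted with `int`, `None` is treated as 0, anything else is left unchanged) are assumptions chosen for illustration, not something specified in this thread.

```python
def add1(x):
    """Illustrative variant; the conversion rules are assumptions."""
    if x is None:
        return 1  # treat None as 0, then add 1
    try:
        return int(x) + 1  # also converts numeric strings such as '7'
    except (TypeError, ValueError):
        return x  # leave labels that cannot be converted unchanged


print([add1(label) for label in [1, "7", "B", None]])  # [2, 8, 'B', 1]
```

Because the error handling lives in the function, it can be passed to rename as-is, e.g. `df.rename(columns=add1)`.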

I don't think this merits the extra complexity it would add, nor the break in backwards compatibility, so I'm going to close this PR for now.

@jtratner jtratner closed this Apr 5, 2014
@sinhrks sinhrks deleted the rename branch November 14, 2015 05:27

Labels

API Design Indexing Related to indexing on series/frames, not to indexes themselves
