Skip to content

Conversation

amilbourne
Copy link
Contributor

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

This is really an extension of #31435, which added index output to the messages produced when series (and DataFrames) are compared. I failed to notice that categorical values and datetime values were evaluated through a different code path and didn't add the index output for these data types. Since the original problem (index reordering) could just as easily happen for these types the functionality should be there as well. Also, it makes the output more consistent.

I added a couple of new tests to surface and check the additional output.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment but not a huge blocker. Otherwise lgtm @jreback

elif is_extension_array_dtype(left.dtype) and is_extension_array_dtype(right.dtype):
assert_extension_array_equal(left._values, right._values)
assert_extension_array_equal(
left._values, right._values, index_values=np.asarray(left.index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have to use asarray here or could you just pass back the DatetimeArray? Would make for nicer output if the date time values were shown

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point which I hadn't thought of. However, it isn't quite as simple as that because we only know that the data is an extension array at this point - the index may not be. Also, my understanding of extension array is very poor, but I thought it could be used for things other than datetimes?

The main reason though, is that the data is currently dislayed as numeric timestamps, so it seems reasonable to fix the index display at the same time as the data. And that sounds like a seperate PR to me. I am looking at doing it (when I get time), but as mentioned above, I need to figure out extension arrays first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And thanks for looking at the PR. Still not sure if I'm supposed to click 'Resolve conversation' now.

@WillAyd WillAyd added the Testing pandas testing functions or related to the test suite label Apr 22, 2020
@jreback jreback added this to the 1.1 milestone Apr 23, 2020
@jreback jreback merged commit c9faed4 into pandas-dev:master Apr 26, 2020
@jreback
Copy link
Contributor

jreback commented Apr 26, 2020

thanks @amilbourne

rhshadrach pushed a commit to rhshadrach/pandas that referenced this pull request May 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Testing pandas testing functions or related to the test suite

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants