@afonso-antunes afonso-antunes commented Apr 1, 2025

Fix Summary:

Previously, _make_concat_multiindex could silently downgrade extension dtypes (e.g., to object) when building the levels of the resulting MultiIndex. This PR makes the _concat_indexes helper use dtype-aware construction (array(..., dtype=...)) so that the concatenated index keeps the dtype of the first index.

Test added:

Added a test in pandas/tests/frame/methods/test_concat_arrow_index.py that checks extension dtypes are preserved when pd.concat is called with keys=, which triggers MultiIndex creation.

The test creates two DataFrames with timestamp[pyarrow] indices, then concatenates them with pd.concat(..., keys=...) and asserts that:

  • The resulting index is a MultiIndex
  • The second level (levels[1]) retains the timestamp[us][pyarrow] dtype (an ArrowDtype) instead of being downgraded to object.

This validates the dtype preservation fix and guards against future regressions.

Development

Successfully merging this pull request may close these issues.

BUG: Index[timestamp[pyarrow]].union with itself return object type
