Contributor

@eicchen eicchen commented Jul 26, 2025

This is just the implementation of respecting usecols order in read_csv. I wanted people to look at it before I apply it to other readers such as read_excel and read_clipboard. If it all looks good, I'll go back and add the necessary documentation about the future deprecation, along with a popup warning when usecols is used. This is mainly to check that the implementation doesn't have any glaring issues.

I ran the entire test suite just to be safe and it all looks good. The only things that errored were some datetime tests that, as far as I could find, had nothing to do with these changes.

Oh, it is also worth noting that pyarrow already uses the usecols order by default, so that's probably worth adding to the documentation regardless.
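For readers who want to see the behavior under discussion, here is a minimal sketch using only the existing public read_csv API. It assumes the default C engine and a small made-up CSV; the variable names are illustrative and are not taken from the PR.

```python
import io

import pandas as pd

# Made-up sample data for illustration only.
CSV_TEXT = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"

requested = ["d", "a", "c"]

# With the C engine, the order of usecols is ignored: the selected
# columns come back in the order they appear in the file.
df = pd.read_csv(io.StringIO(CSV_TEXT), usecols=requested, engine="c")
file_order = list(df.columns)

# Workaround that works today: reindex the result with the requested order.
reordered = df[requested]
requested_order = list(reordered.columns)

print(file_order)
print(requested_order)
```

The enhancement discussed here would make the second ordering the one returned directly, without the extra indexing step.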

@eicchen eicchen changed the title from "Enh #61386 usecols" to "ENH: usecols takes input order for read_csv implementation review" Jul 26, 2025
False,
": bool\n "
"Whether usecols parameter will use order of input when "
"making a DataFrame. \n This feature will be default in pandas 3.0"
Member

I think if this option is being introduced in 3.0, it won't be enforced until 4.0.

Contributor Author

Sounds good, is that the only concern with the implementation? If so I'll go ahead and apply it to other functions and update the docs.

I can also add modifying the flag to the 4.0 milestones. I don't know if there is a timeline for it just yet, but I figured it's better to add it while it's fresh.

Member

I think we said this should default to "warn"?

Contributor Author

Yeah, have it pop up and warn users that this is a future change coming in 4.0. Unless you're talking about something else?
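One way to read the "warn" default is as a three-state option: True, False, or "warn". The following is only a sketch of that idea using the standard library. The helper name resolve_usecols_order and the message wording are hypothetical and are not part of pandas or of this PR.

```python
import warnings


def resolve_usecols_order(option="warn"):
    """Hypothetical helper: decide whether usecols order should be respected.

    "warn" keeps the legacy behavior (file order) but tells the user that
    the default is expected to change in a future version.
    """
    if option == "warn":
        warnings.warn(
            "In a future version, read_csv will return columns in the order "
            "given in usecols. Set the option explicitly to silence this "
            "warning.",
            FutureWarning,
            stacklevel=2,
        )
        return False  # legacy behavior until the default flips
    if isinstance(option, bool):
        return option
    raise ValueError(f"option must be True, False, or 'warn', got {option!r}")
```

With this shape, users who set the option to True or False never see the warning, and only the untouched default triggers it.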

@pytest.mark.parametrize("usecols", [(3, 0, 2), ("d", "a", "c")])
@pytest.mark.parametrize("usecols_use_order", (True, False))
def test_usecols_order(all_parsers, usecols, usecols_use_order):
# TODOE add portion in doc for 3.0 transition
Member

Suggested change
- # TODOE add portion in doc for 3.0 transition
+ # TODO: add portion in doc for 3.0 transition
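Only the head of the test is visible in this thread, so the following is a self-contained sketch of what an order check over the same two usecols inputs could look like. It does not use the all_parsers fixture or the proposed option. Instead, read_csv_ordered is a hypothetical stand-in that emulates the order-respecting result with today's API, and the sample data is made up.

```python
import io

import pandas as pd

# Made-up sample data for illustration only.
CSV_TEXT = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"


def read_csv_ordered(text, usecols):
    """Hypothetical stand-in: return columns in the order given in usecols."""
    # Read only the header so positional indices can be mapped to labels.
    header = list(pd.read_csv(io.StringIO(text), nrows=0).columns)
    labels = [header[c] if isinstance(c, int) else c for c in usecols]
    df = pd.read_csv(io.StringIO(text), usecols=labels)
    # read_csv returns file order, so reindex to the requested order.
    return df[labels]


def check_usecols_order(usecols):
    result = read_csv_ordered(CSV_TEXT, usecols)
    expected = pd.DataFrame({"d": [4, 8], "a": [1, 5], "c": [3, 7]})
    pd.testing.assert_frame_equal(result, expected)


# Same inputs as the parametrization above: positions and labels.
for usecols in [(3, 0, 2), ("d", "a", "c")]:
    check_usecols_order(usecols)
```

Both the positional and the label form should produce the same frame, which is the property the parametrized test appears to target.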

Contributor

github-actions bot commented Sep 5, 2025

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Sep 5, 2025
Successfully merging this pull request may close these issues.

ENH: read_csv with usecols shouldn't change column order