Skip to content

Handle duplicate-column suffix collisions in pandas utils#347

Closed
neuralsorcerer wants to merge 2 commits intofacebookresearch:mainfrom
neuralsorcerer:dup
Closed

Handle duplicate-column suffix collisions in pandas utils#347
neuralsorcerer wants to merge 2 commits intofacebookresearch:mainfrom
neuralsorcerer:dup

Conversation

@neuralsorcerer
Copy link
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings February 25, 2026 06:28
@meta-cla meta-cla bot added the cla signed label Feb 25, 2026
@neuralsorcerer neuralsorcerer added this to the balance 0.17.0 milestone Feb 25, 2026
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug in _make_df_column_names_unique() where duplicate column renaming could collide with existing suffixed column names (e.g., when columns ["a", "a_1", "a"] exist, the second a would be renamed to a_1, colliding with the existing a_1). The fix introduces collision detection that increments the suffix until a unique name is found.

Changes:

  • Enhanced _make_df_column_names_unique() to track used names and avoid suffix collisions
  • Added comprehensive test coverage for collision scenarios, non-string columns, and the no-op path
  • Added test coverage for object dtype handling in _safe_fillna_and_infer()
  • Removed obsolete TODO comment about inconsistent behavior

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
balance/utils/pandas_utils.py Added used_names set tracking and collision-avoidance loop in _make_df_column_names_unique()
tests/test_util_pandas_utils.py Added three new tests for collision handling, non-string columns, and already-unique columns; removed TODO comment; added object dtype test for _safe_fillna_and_infer()
CHANGELOG.md Documented the bug fix under "Bug Fixes" section

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

@meta-codesync
Copy link

meta-codesync bot commented Feb 25, 2026

@talgalili has imported this pull request. If you are a Meta employee, you can view this in D94329024.

@meta-codesync
Copy link

meta-codesync bot commented Feb 25, 2026

@talgalili merged this pull request in bd1c5bc.

@neuralsorcerer neuralsorcerer deleted the dup branch February 25, 2026 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants