1 change: 1 addition & 0 deletions AUTHORS.md
@@ -109,3 +109,4 @@ Contributors
- [@ethompsy](https://github.com/ethompsy) | [contributions](https://github.com/pyjanitor-devs/pyjanitor/issues?q=is%3Aclosed+mentions%3Aethompsy)
- [@apatao](https://github.com/apatao) | [contributions](https://github.com/pyjanitor-devs/pyjanitor/issues?q=is%3Aclosed+mentions%3Aapatao)
- [@OdinTech3](https://github.com/OdinTech3) | [contributions](https://github.com/pyjanitor-devs/pyjanitor/pull/1094)
+ - [@Fu-Jie](https://github.com/Fu-Jie) | [contributions](https://github.com/pyjanitor-devs/pyjanitor/pulls?q=is%3Aclosed+mentions%3AFu-Jie)
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -21,6 +21,7 @@
- [ENH] Enable `encode_categorical` handle 2 (or more ) dimensions array. PR #1153 @Zeroto521
- [ENH] Faster computation for a single non-equi join, with a numba engine. Issue #1102 @samukweku
- [INF] Cancel old workflow runs via Github Action `concurrency`. PR #1161 @Zeroto521
+ - [BUG] Modify ignore_empty output in `concatenate_columns`. PR #1164 @Fu-Jie

## [v0.23.1] - 2022-05-03

6 changes: 5 additions & 1 deletion janitor/functions/concatenate_columns.py
@@ -51,8 +51,12 @@ def concatenate_columns(
if len(column_names) < 2:
raise JanitorError("At least two columns must be specified")

df = df.copy() # avoid mutating original data
df[new_column_name] = (
- df[column_names].astype(str).fillna("").agg(sep.join, axis=1)
+ df[column_names]
+ .astype("string")
+ .fillna("")
+ .agg(sep.join, axis=1)
)

if ignore_empty:
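The switch from `astype(str)` to `astype("string")` in the hunk above can be illustrated with a small sketch. The dataframe below is an assumed stand-in for the `missingdata_df` fixture, and `drop_empty_parts` is a hypothetical helper written for this example, not pyjanitor's actual `ignore_empty` implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `missingdata_df` fixture: a float column
# with one NaN next to a fully populated integer column.
df = pd.DataFrame({"a": [1.0, 2.0, np.nan], "b": [1, 2, 3]})
cols = ["a", "b"]
sep = "-"

# Old expression: on the pandas releases current when this PR was opened,
# astype(str) renders NaN as the text "nan" before fillna runs, so there is
# nothing left to fill. Newer pandas releases may behave differently.
old = df[cols].astype(str).fillna("").agg(sep.join, axis=1)

# New expression: the nullable "string" dtype keeps NaN as a missing value,
# so fillna("") can swap it for an empty string before joining.
new = df[cols].astype("string").fillna("").agg(sep.join, axis=1)
print(new.tolist())  # ['1.0-1', '2.0-2', '-3']


def drop_empty_parts(value: str, sep: str) -> str:
    """Illustrative helper (not pyjanitor's code): drop empty pieces."""
    return sep.join(part for part in value.split(sep) if part)


cleaned = [drop_empty_parts(v, sep) for v in new]
print(cleaned)  # ['1.0-1', '2.0-2', '3']
```

Once the empty piece is dropped, the last row matches the `"3"` that the updated test expects instead of `"nan-3"`.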
2 changes: 1 addition & 1 deletion tests/functions/test_concatenate_columns.py
@@ -28,7 +28,7 @@ def test_concatenate_columns_null_values(missingdata_df):
new_column_name="index",
ignore_empty=True,
)
- expected_values = ["1.0-1", "2.0-2", "nan-3"] * 3
+ expected_values = ["1.0-1", "2.0-2", "3"] * 3
Review comment (Contributor):

Please also update the docstrings for the test.

assert expected_values == df["index"].tolist()


Review comment (Contributor):

I also think it might be worth writing a test merging a custom dataframe with a float column (NaN), a datetime column (NaT) and a string column (None/NA?).

And assert the expected output accordingly.

Then, mention this PR or the attached issue in the test docstring as well, please.
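A test along the lines suggested here might look like the sketch below. It is only an illustration under stated assumptions: `concat_ignoring_empty` is a hypothetical stand-in that mirrors the expression in this diff so the example runs without pyjanitor, the column names are made up, and `|` is used as the separator so it cannot be confused with the hyphens in a date. A real test would call `concatenate_columns` with `ignore_empty=True` instead.

```python
import numpy as np
import pandas as pd


def concat_ignoring_empty(df, column_names, sep):
    """Hypothetical stand-in mirroring the expression in this diff."""
    joined = (
        df[column_names]
        .astype("string")
        .fillna("")
        .agg(sep.join, axis=1)
    )
    # Drop the empty pieces left behind by missing values.
    return joined.map(lambda s: sep.join(p for p in s.split(sep) if p))


def test_concatenate_columns_mixed_missing():
    """NaN, NaT and None should not leak into the output (see PR #1164)."""
    df = pd.DataFrame(
        {
            "float_col": [1.5, np.nan, np.nan],
            "date_col": pd.to_datetime(["2022-05-03", None, None]),
            "str_col": ["a", "b", None],
        }
    )
    result = concat_ignoring_empty(df, list(df.columns), sep="|").tolist()

    # Row with every value present keeps all three pieces.
    assert result[0].startswith("1.5|") and result[0].endswith("|a")
    # Rows with missing values keep only the pieces that exist.
    assert result[1] == "b"
    assert result[2] == ""
    # No textual rendering of a missing value should appear anywhere.
    for token in ("nan", "NaT", "None", "<NA>"):
        assert not any(token in value for value in result)


test_concatenate_columns_mixed_missing()
```

The assertions deliberately avoid pinning the exact text of the non-missing date, since how a datetime is rendered as a string is a pandas detail rather than something this change controls.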
