
BUG: mask in test_mask_stringdtype would always return the same result regardless of cond #15

@Nadav-Zilberberg


Reproducible Example
import pandas as pd

# test_mask_stringdtype

obj = pd.DataFrame(
    {"A": ["foo", "bar", "baz", pd.NA]},
    index=["id1", "id2", "id3", "id4"],
    dtype=pd.StringDtype(),
)
filtered_obj = pd.DataFrame(
    {"A": ["this", "that"]}, index=["id2", "id3"], dtype=pd.StringDtype()
)
expected = pd.DataFrame(
    {"A": [pd.NA, "this", "that", pd.NA]},
    index=["id1", "id2", "id3", "id4"],
    dtype=pd.StringDtype(),
)

filter_ser = pd.Series([False, True, True, False])
obj.mask(filter_ser, filtered_obj)

#         A
# id1  <NA>
# id2  this
# id3  that
# id4  <NA>

filter_ser = pd.Series([True, False, False, True])
obj.mask(filter_ser, filtered_obj)

#         A
# id1  <NA>
# id2  this
# id3  that
# id4  <NA>

filter_ser = pd.Series([False, False, False, False])
obj.mask(filter_ser, filtered_obj)

#         A
# id1  <NA>
# id2  this
# id3  that
# id4  <NA>

filter_ser = pd.Series([True, True, True, True])
obj.mask(filter_ser, filtered_obj)

#         A
# id1  <NA>
# id2  this
# id3  that
# id4  <NA>

Issue Description
Found during pandas-dev#60772.
I suppose the purpose of this test is to check that mask works as expected with pd.StringDtype() (see pandas-dev#40824), but the test returns the same result regardless of cond. filter_ser is created without an index, so it shares no labels with obj and fails to align in _where.

If we want to check that mask replaces values with other only where cond is True and keeps the original values where cond is False, I think filter_ser should have the same index as obj so that mask can recognize the corresponding other value.
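
The alignment problem can be seen without calling mask at all. The following is a minimal sketch, assuming mask aligns cond to the frame's index by label; the explicit `reindex` call only stands in for that internal step and is not the actual `_where` code.

```python
import pandas as pd

obj_index = ["id1", "id2", "id3", "id4"]

# cond as written in the test: default integer index (0..3),
# which has no labels in common with obj
unlabeled = pd.Series([False, True, True, False])
# stand-in for label alignment: no label matches, so every entry is missing
aligned_unlabeled = unlabeled.reindex(obj_index)

# cond with the same labels as obj: alignment keeps each value in place
labeled = pd.Series([False, True, True, False], index=obj_index)
aligned_labeled = labeled.reindex(obj_index)

print(aligned_unlabeled.isna().all())  # True
print(aligned_labeled.tolist())        # [False, True, True, False]
```

Once the unlabeled condition has been aligned, none of its original values are left, which is consistent with every choice of filter_ser producing the same output.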

Expected Behavior
filter_ser = pd.Series([False, True, True, False], index=["id1", "id2", "id3", "id4"])
obj.mask(filter_ser, filtered_obj)

#         A
# id1   foo
# id2  this
# id3  that
# id4  <NA>
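
The expected behavior could be turned into an assertion along these lines. This is a sketch only: the function name `check_mask_with_aligned_cond` is hypothetical and this is not the test in the pandas suite. It uses the public `pd.testing.assert_frame_equal` helper and takes the expected values from the output shown above.

```python
import pandas as pd


def check_mask_with_aligned_cond():  # hypothetical name, for illustration only
    index = ["id1", "id2", "id3", "id4"]
    obj = pd.DataFrame(
        {"A": ["foo", "bar", "baz", pd.NA]},
        index=index,
        dtype=pd.StringDtype(),
    )
    filtered_obj = pd.DataFrame(
        {"A": ["this", "that"]}, index=["id2", "id3"], dtype=pd.StringDtype()
    )
    # cond carries the same labels as obj, so it can be aligned row by row
    filter_ser = pd.Series([False, True, True, False], index=index)

    result = obj.mask(filter_ser, filtered_obj)

    # rows where cond is True take the value from other; the rest are unchanged
    expected = pd.DataFrame(
        {"A": ["foo", "this", "that", pd.NA]},
        index=index,
        dtype=pd.StringDtype(),
    )
    pd.testing.assert_frame_equal(result, expected)


check_mask_with_aligned_cond()
```

With an indexed cond, changing the boolean values changes the result, so a test written this way would depend on cond.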
