
ENH: .isin() method should use __contains__ rather than __iter__ for user-defined classes to determine presence. #59041

@f3ss1

Description


Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Right now, if you define a user class:

class MyClass:
    def __init__(self):
        self.collection = [1, 2, 3]
        self.another_collection = [4, 5, 6]
    
    def __contains__(self, item):
        return item in self.collection
    
    def __iter__(self):
        yield from self.another_collection

and then initialize a pandas DataFrame like this:

import pandas as pd

example_dataframe = pd.DataFrame(
    {
        'column_name': [3, 1, 4, 6, 13],
        'another_column_name': ['tolly', 'trolly', 'telly', 'belly', 'nelly']
    }
)

and then call the .isin() method like this:

class_instance = MyClass()
example_dataframe['column_name'].isin(class_instance)

you actually get this output:

False
False
True
True
False

That is, the values yielded by __iter__ (self.another_collection) are checked, rather than self.collection, which __contains__ uses. I do realize that this might stem from compatibility with other libraries, but it seems counter-intuitive.
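The discrepancy can be reproduced without pandas. This is a minimal sketch, assuming that .isin() first materializes its argument by iterating over it. The output above suggests this, but it has not been checked here against the pandas source:

```python
class MyClass:
    def __init__(self):
        self.collection = [1, 2, 3]
        self.another_collection = [4, 5, 6]

    def __contains__(self, item):
        return item in self.collection

    def __iter__(self):
        yield from self.another_collection


instance = MyClass()
values = [3, 1, 4, 6, 13]

# The `in` operator on the object itself dispatches to __contains__.
by_contains = [value in instance for value in values]

# Materializing the object first goes through __iter__, so membership
# is then tested against another_collection instead.
materialized = list(instance)
by_iter = [value in materialized for value in values]

print(by_contains)  # [True, True, False, False, False]
print(by_iter)      # [False, False, True, True, False]
```

The second list matches the .isin() output reported above, and the first is what I would have expected.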

Feature Description

One solution is to change the behavior (which might break some people's code, I believe); another is to add a flag (which would add more complexity, I guess).
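To illustrate the flag variant, here is a rough sketch written as a standalone helper. The name isin_with_flag and the use_contains parameter are hypothetical and not part of pandas. The __contains__ path uses Series.map, so it tests one element at a time and will likely be slower than the built-in .isin():

```python
import pandas as pd


class MyClass:
    def __init__(self):
        self.collection = [1, 2, 3]
        self.another_collection = [4, 5, 6]

    def __contains__(self, item):
        return item in self.collection

    def __iter__(self):
        yield from self.another_collection


def isin_with_flag(series, values, use_contains=False):
    """Hypothetical helper (not a pandas API) showing the proposed flag."""
    if use_contains:
        # Ask the object directly for each element via __contains__.
        return series.map(lambda item: item in values).astype(bool)
    # Default: keep today's behavior and delegate to Series.isin().
    return series.isin(values)


column = pd.Series([3, 1, 4, 6, 13], name='column_name')
class_instance = MyClass()

current = isin_with_flag(column, class_instance)
proposed = isin_with_flag(column, class_instance, use_contains=True)
```

With the flag off, the result is the same as the output shown in the problem description. With it on, the membership test follows self.collection.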

Alternative Solutions

See above.

Additional Context

No response
