This repository was archived by the owner on Jul 27, 2024. It is now read-only.

ProtoFromDataFrames fails for dataframes with categorical columns #237

@ysayeed

Description


Creating the proto for facets-overview fails with an AttributeError if any column of the dataframe has the pandas category dtype. I would expect the dataframe to be parsed normally, with the category dtype treated as a string and the column shown in the "Categorical Features" section the same way a string column is.

Below is example code to produce this error and the traceback:

from facets_overview.generic_feature_statistics_generator import GenericFeatureStatisticsGenerator
import pandas as pd
df = pd.DataFrame({'col1': pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])})
proto = GenericFeatureStatisticsGenerator().ProtoFromDataFrames([{'name': 'test', 'table': df}])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../facets_overview/base_generic_feature_statistics_generator.py", line 54, in ProtoFromDataFrames
    table_entries[col] = self.NdarrayToEntry(table[col])
  File ".../facets_overview/base_generic_feature_statistics_generator.py", line 119, in NdarrayToEntry
    data_type = self.DtypeToType(x.dtype)
  File ".../facets_overview/base_generic_feature_statistics_generator.py", line 66, in DtypeToType
    if dtype.char in np.typecodes['AllFloat']:
AttributeError: 'CategoricalDtype' object has no attribute 'char'

This is using facets-overview 1.0.0 and pandas 1.1.4.
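
A possible workaround, sketched under assumptions: the traceback shows DtypeToType reading `dtype.char`, which NumPy dtypes have and pandas' CategoricalDtype does not. Casting category columns to plain object columns before building the proto should sidestep that attribute lookup. The helper name `decategorize` is hypothetical and not part of facets-overview, and I am assuming (not confirmed here) that the generator then treats the resulting object columns as strings.

```python
import pandas as pd


def decategorize(df):
    """Return a copy of df with category columns cast to plain object columns.

    Hypothetical helper, not part of facets-overview.
    """
    out = df.copy()
    for col in out.columns:
        # CategoricalDtype is a pandas extension dtype and has no `.char`
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            out[col] = out[col].astype(object)
    return out


df = pd.DataFrame({'col1': pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])})
fixed = decategorize(df)

# The original column lacks the attribute that DtypeToType reads;
# the converted column is a NumPy object dtype, which has it.
has_char_before = hasattr(df['col1'].dtype, 'char')
has_char_after = hasattr(fixed['col1'].dtype, 'char')
```

The converted frame would then be passed in place of the original, i.e. `[{'name': 'test', 'table': fixed}]` as the argument to ProtoFromDataFrames. This only avoids the AttributeError on the caller's side; handling the category dtype inside DtypeToType would still be the proper fix.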
