DataFrame.from_dict sets DataFrame values to NaN if original dict key tuples contain None #28390

@dmparker0

Code Sample

import pandas as pd

# Nested dict keyed by (conference, round) tuples; the last key contains None.
d = {('East', 1): {'Home': 'MIL', 'Away': 'BOS', 'Winner': 'MIL'},
     ('East', 2): {'Home': 'PHI', 'Away': 'IND', 'Winner': 'PHI'},
     ('East', 3): {'Home': 'MIL', 'Away': 'PHI', 'Winner': 'PHI'},
     ('West', 1): {'Home': 'HOU', 'Away': 'UTA', 'Winner': 'HOU'},
     ('West', 2): {'Home': 'LAC', 'Away': 'LAL', 'Winner': 'LAC'},
     ('West', 3): {'Home': 'HOU', 'Away': 'LAC', 'Winner': 'HOU'},
     (None, 1): {'Home': 'HOU', 'Away': 'PHI', 'Winner': 'PHI'}}

df = pd.DataFrame.from_dict(d, orient='index')
print(df)

Output

       Home Away Winner
East 1  MIL  BOS    MIL
     2  PHI  IND    PHI
     3  MIL  PHI    PHI
West 1  HOU  UTA    HOU
     2  LAC  LAL    LAC
     3  HOU  LAC    HOU
NaN  1  NaN  NaN    NaN

Problem description

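When constructing a DataFrame from a dictionary with tuple keys, the values in every column are set to NaN for any row whose key tuple contains None. The index entry itself is kept (with None displayed as NaN), but the data for that row is lost.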

I only started encountering this bug when I upgraded from pandas 0.22 to pandas 0.25.

Possibly related: #19993
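
A possible workaround, offered as a sketch rather than a confirmed fix: build the index explicitly and pass the row values as a list, so that rows are matched to index entries by position instead of by key lookup. The assumption here is that the NaN rows come from looking up a key that contains None; this has not been verified against the pandas source. The data below is a shortened version of the sample above.

```python
import pandas as pd

# Shortened version of the sample data; the last key contains None.
d = {
    ("East", 1): {"Home": "MIL", "Away": "BOS", "Winner": "MIL"},
    ("West", 1): {"Home": "HOU", "Away": "UTA", "Winner": "HOU"},
    (None, 1): {"Home": "HOU", "Away": "PHI", "Winner": "PHI"},
}

# Build the MultiIndex from the keys directly. None becomes a missing
# level value, displayed as NaN.
index = pd.MultiIndex.from_tuples(list(d.keys()))

# Pass the row dicts as a list: each row is paired with its index entry
# by position, so no lookup on the None-containing key is needed.
df = pd.DataFrame(list(d.values()), index=index)
print(df)
```

This relies on `d.keys()` and `d.values()` iterating in the same order, which Python guarantees for a dict that is not modified in between.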

Expected Output

       Home Away Winner
East 1  MIL  BOS    MIL
     2  PHI  IND    PHI
     3  MIL  PHI    PHI
West 1  HOU  UTA    HOU
     2  LAC  LAL    LAC
     3  HOU  LAC    HOU
NaN  1  HOU  PHI    PHI
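
To check whether a given pandas version shows the difference between the actual and expected output, a small helper can list the index entries whose rows came back entirely NaN. This is a minimal sketch; `all_nan_rows` is a hypothetical name, not a pandas function, and it assumes the input rows contain no genuinely all-missing records.

```python
import pandas as pd


def all_nan_rows(data):
    """Return the index entries of rows that are NaN in every column
    after DataFrame.from_dict(data, orient='index')."""
    df = pd.DataFrame.from_dict(data, orient="index")
    mask = df.isna().all(axis=1).to_numpy()
    return [key for key, lost in zip(df.index.tolist(), mask) if lost]


sample = {
    ("East", 1): {"Home": "MIL", "Away": "BOS", "Winner": "MIL"},
    (None, 1): {"Home": "HOU", "Away": "PHI", "Winner": "PHI"},
}

# An empty list means no row lost its data; on an affected version the
# entry for the None-containing key would be listed.
lost = all_nan_rows(sample)
print(lost)
```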

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.6.7.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 45 Stepping 7, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : None.None

pandas : 0.25.1
numpy : 1.14.3
pytz : 2017.3
dateutil : 2.7.3
pip : 10.0.1
setuptools : 39.0.1
Cython : 0.28.2
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.2.4
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : 4.6.3
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.2.4
matplotlib : 3.0.3
numexpr : None
odfpy : None
openpyxl : 2.4.9
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.1.0
sqlalchemy : 1.1.15
tables : None
xarray : None
xlrd : 1.1.0
xlwt : None
xlsxwriter : None
None

Labels: Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)
