Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
import pandas as pd
df = pd.DataFrame({"a": [None, None]})
df.loc[0, "a"] = float(1)
df.loc[1, "a"] = float(2)
hdf = pd.HDFStore("test.h5", mode="w")
hdf.put("table", df, format="table")
This causes the following error:
...
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 1042, in put
errors=errors,
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 1709, in _write_to_group
data_columns=data_columns,
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 4143, in write
data_columns=data_columns,
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 3813, in _create_axes
errors=self.errors,
File "/opt/venv/lib64/python3.6/site-packages/pandas/io/pytables.py", line 4800, in _maybe_convert_for_string_atom
for i in range(len(block.shape[0])):
TypeError: object of type 'int' has no len()
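The last frame of the traceback calls len() on block.shape[0]. The first element of a shape tuple is a plain integer, so len() fails on it regardless of the data. A minimal sketch of that failure with a NumPy array, where `values` is only a stand-in for the block's data and not the actual pandas internals:

```python
import numpy as np

# Hypothetical stand-in for the values held by a single object-dtype block.
values = np.array([[1.0, 2.0]], dtype=object)

n_rows = values.shape[0]  # shape[0] is a plain Python int, not a sequence

try:
    len(n_rows)  # mirrors len(block.shape[0]) from the traceback
except TypeError as exc:
    message = str(exc)

print(message)
```

This only shows why the quoted expression raises; it does not say what the surrounding pandas code was meant to iterate over.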
Problem description
After the initial creation of the DataFrame, column a has object dtype. After assigning floats to column a, I would expect its dtype to change to float64, but it remains object. The problem is that the type of df.loc[0, "a"] is float when the DataFrame is saved, which causes the error pasted above.
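A short sketch of the dtype behaviour described here, together with an explicit conversion that could serve as a workaround before writing. The astype call is a suggestion, not something the report itself tried, and the name `converted` is introduced only for illustration:

```python
import pandas as pd

df = pd.DataFrame({"a": [None, None]})
df.loc[0, "a"] = float(1)
df.loc[1, "a"] = float(2)

# The report states that this is still object on pandas 1.0.4.
print(df["a"].dtype)
print(type(df.loc[0, "a"]))

# Explicitly convert the column so that it holds a real float64 dtype.
converted = df.astype({"a": "float64"})
print(converted["a"].dtype)
```

Whether the converted frame then writes cleanly with hdf.put() on the affected version has not been verified here.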
Expected Output
I would expect one of the following:
- Implicit conversion of the column to float dtype
- Conversion during hdf.put()
- A proper exception saying that I am saving a mixed-type column
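The third option can be approximated on the user side with pandas.api.types.infer_dtype. The sketch below assumes that only string content is acceptable in an object column that is about to be written. The helpers `describe_object_columns` and `check_before_put` are hypothetical names and not part of pandas:

```python
import pandas as pd
from pandas.api.types import infer_dtype


def describe_object_columns(df):
    """Return {column: inferred type} for every object-dtype column."""
    return {
        col: infer_dtype(df[col], skipna=True)
        for col in df.columns
        if df[col].dtype == object
    }


def check_before_put(df):
    """Raise a descriptive error if an object column does not hold strings."""
    inferred = describe_object_columns(df)
    bad = {col: kind for col, kind in inferred.items() if kind != "string"}
    if bad:
        raise TypeError(f"object columns with non-string content: {bad}")


# Hypothetical frame mimicking the report: floats stored in an object column.
frame = pd.DataFrame({"a": pd.Series([1.0, 2.0], dtype=object)})
print(describe_object_columns(frame))
```

Calling check_before_put(frame) before hdf.put() would then fail early with a message that names the offending column, instead of the TypeError raised inside pytables.py.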
There's a pretty big chance that I am wrong and this is expected behaviour. If that's the case, could you please explain why, or point me to something I can read about it?
This may be related to issue #34274.
Output of pd.show_versions()
commit : None
python : 3.6.8.final.0
python-bits : 64
OS : Linux
OS-release : 4.18.0-147.5.1.el8_1.x86_64
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.0.4
numpy : 1.16.4
pytz : 2018.7
dateutil : 2.8.1
pip : 19.3.1
setuptools : 46.4.0
Cython : 0.29.2
pytest : 5.1.2
hypothesis : None
sphinx : 1.8.4
blosc : None
feather : None
xlsxwriter : 1.1.2
lxml.etree : 4.5.0
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.9.0
pandas_datareader: None
bs4 : 4.8.0
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 2.0.0
numexpr : 2.6.8
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.1.2
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : 1.1.2
numba : None