
Loading of large pickled dataframes fails #2705

@daggre-gmu

Description


I tried pickling a very large DataFrame (20 GB or so). Writing it to disk succeeded, but when I try to read it back it fails with: ValueError: buffer size does not match array size

I did a bit of research and found the following:

http://stackoverflow.com/questions/12060932/unable-to-load-a-previously-dumped-pickle-file-of-large-size-in-python

http://bugs.python.org/issue13555

I suspect this is a numpy/Python issue rather than a pandas one, but it is a real pain when I want to back up a DataFrame that took a long time to join together and need all the dtypes preserved (namely which columns are datetimes). Perhaps a solution would be a CSV file that keeps the dtypes stored somewhere (otherwise I'll have to figure out which columns are serialized dates). Any workarounds would be appreciated.
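For reference, here is a rough sketch of the CSV-plus-dtypes workaround I have in mind, assuming the frame round-trips through CSV cleanly: write the dtypes to a small JSON sidecar file next to the CSV, then restore datetime columns via parse_dates on read. The helper names and file paths are only illustrative, not anything pandas provides.

```python
import json
import pandas as pd

def save_with_dtypes(df, csv_path, dtypes_path):
    """Write the frame to CSV and record column dtypes in a JSON sidecar."""
    df.to_csv(csv_path, index=False)
    with open(dtypes_path, "w") as fh:
        json.dump({col: str(dtype) for col, dtype in df.dtypes.items()}, fh)

def load_with_dtypes(csv_path, dtypes_path):
    """Read the CSV back, restoring datetime columns with parse_dates."""
    with open(dtypes_path) as fh:
        dtypes = json.load(fh)
    date_cols = [c for c, d in dtypes.items() if d.startswith("datetime")]
    # Non-datetime dtypes go through read_csv's dtype mapping; columns with
    # NaNs may still come back as float/object, so this is only approximate.
    other = {c: d for c, d in dtypes.items() if not d.startswith("datetime")}
    return pd.read_csv(csv_path, dtype=other, parse_dates=date_cols)

# Hypothetical usage:
# save_with_dtypes(df, "big_frame.csv", "big_frame.dtypes.json")
# df2 = load_with_dtypes("big_frame.csv", "big_frame.dtypes.json")
```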

Labels: Bug, IO Data
