Significant (10x +) Speedups Using extractall Rather than Extract_Stream #31

@zthatch

Description

The current implementation decodes the zip file line by line rather than extracting it in its entirety and then reading it as a CSV. This is a significant bottleneck in the code. It can be improved by using extractall to extract the zip file to a temporary directory and then using the csv reader to iterate through the rows. That approach requires changes to the line processing and the use of StringIO (rather than BytesIO) to load the table into a dataframe.
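A minimal sketch of the proposed approach, using only the standard library. The function names `read_zip_csv_rows` and `rows_to_text_buffer`, the `member_name` parameter, and the UTF-8 default are assumptions for illustration and are not taken from the project's code. Any actual speedup would need to be measured against the existing implementation.

```python
import csv
import io
import tempfile
import zipfile
from pathlib import Path


def read_zip_csv_rows(zip_path, member_name, encoding="utf-8"):
    """Extract the whole archive once, then read one CSV member from disk.

    `member_name` is a hypothetical parameter: the path of the CSV inside the zip.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        with zipfile.ZipFile(zip_path) as archive:
            # One bulk extraction instead of decoding the stream line by line.
            archive.extractall(tmp_dir)
        csv_path = Path(tmp_dir) / member_name
        # Open in text mode so csv.reader receives str, not bytes.
        with open(csv_path, newline="", encoding=encoding) as handle:
            return list(csv.reader(handle))


def rows_to_text_buffer(rows):
    """Write processed rows into a StringIO (text) buffer rather than BytesIO."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    buffer.seek(0)  # rewind so a consumer reads from the start
    return buffer
```

The returned buffer is a text stream, so it could be passed to something like `pandas.read_csv(buffer)` to build the dataframe. Note that `extractall` writes every member to disk, so it should only be used on archives from a trusted source.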
