ENH: Disable Numpy memory allocation while concat #59956

@sandeyshc

Description

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

We have sparse data with many null values. Reading it with pandas and PyArrow does not consume much memory because of pandas' internal compression logic. During concatenation, however, NumPy allocates memory that is never actually used, and our Python script fails with memory allocation errors. Could you provide an option to disable NumPy memory allocation when concatenating DataFrames along axis=1?
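To help narrow down where the extra memory comes from, here is a minimal diagnostic sketch. It is not taken from the report: `concat_memory_report` is a hypothetical helper, and the demo frames use pandas' nullable `Float64` dtype only so that the snippet runs without PyArrow. It compares row counts, dtypes, and deep memory usage before and after `pd.concat(..., axis=1)`. One possible cause it can reveal, unconfirmed for this case, is that frames with non-matching indexes are outer-joined, so the result has more rows than any input.

```python
import pandas as pd


def concat_memory_report(frames):
    """Hypothetical helper: compare inputs with the result of pd.concat(axis=1).

    Assumes column names are unique across all frames.
    """
    bytes_before = sum(int(f.memory_usage(deep=True).sum()) for f in frames)
    dtypes_before = {col: str(dt) for f in frames for col, dt in f.dtypes.items()}

    result = pd.concat(frames, axis=1)

    bytes_after = int(result.memory_usage(deep=True).sum())
    # Columns whose dtype differs after concatenation (e.g. a fallback to object).
    dtype_changes = {
        col: (dtypes_before[col], str(dt))
        for col, dt in result.dtypes.items()
        if dtypes_before.get(col) != str(dt)
    }
    return {
        "rows_before": [len(f) for f in frames],
        "rows_after": len(result),
        "bytes_before": bytes_before,
        "bytes_after": bytes_after,
        "dtype_changes": dtype_changes,
    }


# Demo with two mostly-null frames whose indexes only partly overlap.
a = pd.DataFrame({"x": pd.array([1.0, None, None], dtype="Float64")}, index=[0, 1, 2])
b = pd.DataFrame({"y": pd.array([None, 2.0, None], dtype="Float64")}, index=[2, 3, 4])
report = concat_memory_report([a, b])
print(report)
```

If `rows_after` is much larger than every entry in `rows_before`, or `dtype_changes` is non-empty, that output would be useful to attach to this issue.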

Feature Description

pd.concat(df_list, axis=1, numpy_allocation=False)

Alternative Solutions

At least, could you explain how we can change the C++ code internally and use it for our purpose?

Additional Context

Please let me know if I am wrong.

Metadata

Assignees

No one assigned

    Labels

    Enhancement; Needs Info (Clarification about behavior needed to assess issue); Performance (Memory or execution speed performance)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests