DataFrame dream / long term future #6

@maartenbreddels

What does the long-term future of the Python/PyData landscape look like (say, in 2025)? What would the ideal 'dream' dataframe library be? E.g., which issues do we need to tackle?

For instance, vaex solves most of the issues Wes McKinney raised in 2017: https://wesmckinney.com/blog/apache-arrow-pandas-internals/

Also think about:

  • Sizes of datasets (e.g. row and/or column counts) compared to current hardware and Moore's Law.
  • Kinds of data: will it become more unstructured?
  • Expectations of the hardware: more cores, more GPUs?
  • Distributed vs. cloud vs. single computer.
  • API (e.g. expose laziness or not?).
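
To make the laziness question concrete, here is a minimal sketch of what "exposing laziness" in an API could mean. `LazyColumn` and its methods are hypothetical names made up for illustration; this is not the API of vaex, pandas, or any other library. Arithmetic only records what to compute, and nothing runs until the user explicitly asks for the result.

```python
import operator


class LazyColumn:
    """Hypothetical lazy column: operations are recorded, not executed."""

    def __init__(self, compute):
        # compute is a zero-argument callable that returns a list of values
        self._compute = compute

    @classmethod
    def from_values(cls, values):
        data = list(values)
        return cls(lambda: data)

    def _binary(self, other, op):
        # Build a new column whose values are only produced on evaluate()
        def compute():
            left = self._compute()
            if isinstance(other, LazyColumn):
                right = other._compute()
                return [op(a, b) for a, b in zip(left, right)]
            return [op(a, other) for a in left]

        return LazyColumn(compute)

    def __add__(self, other):
        return self._binary(other, operator.add)

    def __mul__(self, other):
        return self._binary(other, operator.mul)

    def evaluate(self):
        """Explicit trigger: this is the point where work actually happens."""
        return self._compute()


x = LazyColumn.from_values([1, 2, 3])
y = LazyColumn.from_values([10, 20, 30])

expr = x * 2 + y          # no computation yet, only a recorded expression
result = expr.evaluate()  # computation happens here
print(result)
```

An eager API would instead compute `x * 2` into a full intermediate result right away. The open design question is whether users should see the explicit `evaluate()` step or whether the library should hide it.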

Are we going in the right direction, also taking into account the convergence/divergence of dataframe libraries?
