Apply filters to non-selected columns #8

@hombit

With PyArrow and Pandas, a user can select a set of columns to return and filter by a column that is not in this set. For example:

import pandas as pd

frame = pd.read_parquet('Npix=0.parquet', columns=['ra', 'dec'], filters=[('yMeanPSFMag', '<', 15)])

However, when I try to do the same with lsdb.read_hats and the HATS server, I always get an empty DataFrame.

Could we make column selection a two-step process: first load the union of the columns named in columns and those referenced in filters, apply the filter, and then drop the columns that appear only in filters?
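To make the proposal concrete, here is a minimal sketch of the two-step selection in plain Python. It is not how lsdb or the HATS server load data: `read_with_filters`, `load_columns`, and the toy `DATA` table are hypothetical stand-ins, and the filters are assumed to be a flat list of (column, op, value) tuples combined with AND, as in the pandas call above.

```python
import operator

# Comparison operators for (column, op, value) filter tuples.
OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}


def read_with_filters(load_columns, columns, filters):
    """Hypothetical helper; load_columns(names) returns {name: list_of_values}."""
    # Step 1: load the requested columns plus any referenced only by filters.
    filter_columns = [name for name, _, _ in filters]
    to_load = list(dict.fromkeys([*columns, *filter_columns]))
    table = load_columns(to_load)

    # Step 2: keep the row indices for which every filter holds (logical AND).
    n_rows = len(table[to_load[0]]) if to_load else 0
    keep = [
        i for i in range(n_rows)
        if all(OPS[op](table[name][i], value) for name, op, value in filters)
    ]

    # Step 3: return only the requested columns, dropping filter-only ones.
    return {name: [table[name][i] for i in keep] for name in columns}


# Toy in-memory data standing in for a Parquet file (values are made up).
DATA = {
    "ra": [10.0, 20.0, 30.0],
    "dec": [-5.0, 0.0, 5.0],
    "yMeanPSFMag": [14.2, 16.8, 12.9],
}


def load_columns(names):
    """Hypothetical loader that reads only the named columns."""
    return {name: DATA[name] for name in names}


result = read_with_filters(load_columns, ["ra", "dec"], [("yMeanPSFMag", "<", 15)])
print(result)
```

The filter column is loaded only for the duration of the call, so the caller still gets back exactly the columns it asked for.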
