Universal table transformer combining univariate transformations dispatched on schema #49

@ablaom

It has been proposed on Slack that we provide a single table transformer that transforms individual columns according to user-specified univariate transformations. This sounds like a good idea, and it would also enforce some uniformity that is a little lacking in the current collection of table transformers.

  1. In the most general case I can imagine implementing, the univariate transformer that applies to a particular column is determined by a function of both the name and the scitype of the column (as encoded in the table schema). This has the disadvantage that the user must specify a function with two arguments, or interact through some other complicated interface.

  2. The alternative would be a compositional approach. Each tabular transformer applies only a single univariate transformer, to all columns matching the specified names and scitypes (or the "not"-names and "not"-scitypes, through a Boolean `ignore` parameter). Columns not referred to are left alone. Composing such transformers would cover all conceivable use-cases. However, as we are currently locked into Tables.jl (whose tables are immutable in general), we get a lot more copying of data.
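To make the trade-off concrete, here is a rough, language-neutral sketch of both options in Python. It is not MLJ or Tables.jl code: tables are plain dicts of columns, the schema is a dict of scitype labels given as strings, and every name (`transform_by_selector`, `single_transform`, `names`, `scitypes`, `ignore`, `double`, `negate`) is hypothetical and chosen only for illustration.

```python
# Hypothetical sketch only: tables are dicts of columns, and the schema maps
# each column name to a scitype label (a plain string here).

def double(col):
    return [2 * x for x in col]

def negate(col):
    return [-x for x in col]

# Option 1: one transformer, driven by a two-argument selector function.
def transform_by_selector(table, schema, select):
    """`select(name, scitype)` returns a univariate function, or None to skip."""
    out = {}
    for name, col in table.items():
        f = select(name, schema[name])
        out[name] = list(col) if f is None else f(col)
    return out

# Option 2: each transformer applies a single univariate function to the
# columns matched by name or scitype; `ignore=True` inverts the match.
def single_transform(table, schema, f, names=(), scitypes=(), ignore=False):
    out = {}
    for name, col in table.items():
        hit = name in names or schema[name] in scitypes
        if ignore:
            hit = not hit
        # The input is treated as immutable, so untouched columns are copied too.
        out[name] = f(col) if hit else list(col)
    return out

table = {"age": [1.0, 2.0], "height": [3.0, 4.0], "id": [7, 8]}
schema = {"age": "Continuous", "height": "Continuous", "id": "Count"}

def select(name, scitype):
    if name == "height":
        return negate
    if scitype == "Continuous":
        return double
    return None

# Option 1: a single pass over the table.
result1 = transform_by_selector(table, schema, select)

# Option 2: two composed passes, each producing a full new table.
step = single_transform(table, schema, double, names={"age"})
result2 = single_transform(step, schema, negate, names={"height"})
```

Both routes give the same table in this toy case. Option 1 does it in one pass but needs the two-argument `select` function; option 2 keeps each step simple but builds an intermediate table per step, which is the extra copying mentioned above.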

Thoughts anyone?
