Description
It has been proposed on Slack that it be possible to have a single table transformer that transforms individual columns according to user-specified univariate transformations. This sounds like a good idea, which would also force some uniformity that's a little bit lacking in the current collection of table transformers.
- In the most general case I can imagine implementing, the univariate transformer that applies to a particular column is defined by a function that operates on both the `name` and `scitype` of the column (as encoded in the table `schema`). This has the disadvantage that the user must specify a function with two arguments, or interact through some other complicated interface.
- The alternative would be a compositional approach. Each tabular transformer carries out only a single univariate transformation, applied to all specified `names` and `scitypes` (or "not"-names and "not"-scitypes, through an `ignore` Boolean parameter); columns not referred to are left alone. Composing such transformers would cover all conceivable use-cases. However, as we are currently locked into Tables.jl (whose tables are non-mutable in general), we get a lot more copying of data.
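To make the two options concrete, here is a rough sketch in plain Julia. Everything in it is hypothetical: `toy_scitype`, `transform_general`, `ColumnwiseTransformer` and `apply_transformer` are illustrative names, not existing API. The table is a `NamedTuple` of vectors, the scitype is faked from the element type rather than read from the schema, and I have assumed that a column is selected if it matches *either* `names` or `scitypes`, with `ignore` inverting that selection.

```julia
# Toy stand-in for a scitype lookup (assumption: a real implementation would
# read scitypes from the table schema, not infer them from element types).
toy_scitype(::AbstractVector{<:AbstractFloat}) = :Continuous
toy_scitype(::AbstractVector{<:Integer}) = :Count
toy_scitype(::AbstractVector) = :Unknown

# Option 1 (general case): `selector(name, scitype)` returns the univariate
# function to apply to that column.
function transform_general(selector, table::NamedTuple)
    newcols = map(keys(table), values(table)) do name, col
        f = selector(name, toy_scitype(col))
        f(col)
    end
    return NamedTuple{keys(table)}(newcols)
end

# Option 2 (compositional): one transformer applies one univariate function
# to the selected columns and leaves the rest alone.
struct ColumnwiseTransformer{F}
    f::F                      # univariate function applied to selected columns
    names::Vector{Symbol}     # column names to select
    scitypes::Vector{Symbol}  # toy scitype labels to select
    ignore::Bool              # if true, invert the selection
end

ColumnwiseTransformer(f; names=Symbol[], scitypes=Symbol[], ignore=false) =
    ColumnwiseTransformer(f, names, scitypes, ignore)

function is_selected(t::ColumnwiseTransformer, name, scitype)
    hit = name in t.names || scitype in t.scitypes
    return t.ignore ? !hit : hit
end

function apply_transformer(t::ColumnwiseTransformer, table::NamedTuple)
    newcols = map(keys(table), values(table)) do name, col
        is_selected(t, name, toy_scitype(col)) ? t.f(col) : col
    end
    return NamedTuple{keys(table)}(newcols)  # a new table; the input is not mutated
end

# Two example univariate transformations.
center(col) = col .- sum(col) / length(col)
logcount(col) = log1p.(col)

X = (id = [1, 2, 3], height = [1.0, 2.0, 3.0], visits = [0, 1, 3])

# Option 1: a single two-argument function encodes every rule.
function selector(name, scitype)
    name == :id && return identity
    scitype == :Continuous && return center
    scitype == :Count && return logcount
    return identity
end
X1 = transform_general(selector, X)

# Option 2: one transformer per rule, applied in sequence.
t_center = ColumnwiseTransformer(center; scitypes = [:Continuous])
t_log = ColumnwiseTransformer(logcount; names = [:visits])
X2 = apply_transformer(t_log, apply_transformer(t_center, X))
```

In this toy case both routes give the same table. Under the second option, something like `ColumnwiseTransformer(center; names = [:id], ignore = true)` would center every column except `id`. Note that each stage in the second option builds a new table rather than modifying its input.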
Thoughts, anyone?