In pipelines, the output of a supervised model that gets propagated to the next component in the pipeline is the output of `predict`. However, some supervised models also learn a transformation. For example, MLJFlux's `NeuralNetworkClassifier` and `NeuralNetworkRegressor` learn entity embeddings to handle categorical inputs, and `transform` gives access to just these embeddings. We want to use these embeddings as a preprocessing step for some other supervised learner, as in

```julia
NeuralNetworkClassifier |> LogisticClassifier
```

but of course this doesn't work, because the first model propagates the output of `predict` instead of `transform`: the pipeline apparatus identifies `NeuralNetworkClassifier` as a `Supervised` model.
We actually solved this problem in MLJFlux by introducing the `EntityEmbedder` wrapper, so that the following works:

```julia
EntityEmbedder(NeuralNetworkClassifier) |> LogisticClassifier
```
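For concreteness, a minimal end-to-end usage sketch might look like the following. This is illustrative only: the toy data and feature names are made up, and the exact model-loading incantations may differ in your environment (it assumes MLJ, MLJFlux and MLJLinearModels are installed and that `EntityEmbedder` is exported by MLJFlux):

```julia
using MLJ, MLJFlux  # assumes MLJ, MLJFlux and MLJLinearModels are available

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels

# Toy data with a categorical feature for the network to embed:
X = (duration = rand(100),
     device = coerce(rand(["phone", "tablet", "desktop"], 100), Multiclass))
y = coerce(rand(["churn", "stay"], 100), Multiclass)

# The wrapper makes the pipeline propagate the learned embeddings
# (the output of `transform`) rather than the output of `predict`:
pipe = EntityEmbedder(NeuralNetworkClassifier()) |> LogisticClassifier()
mach = machine(pipe, X, y) |> fit!
yhat = predict(mach, X)
```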
However, it has struck me rather late that this wrapper likely works (or should in principle work) for any `Supervised` model with a `transform`. So we should really call `EntityEmbedder` something like `Transformer`, and perhaps make it immediately available (e.g. by moving it to MLJTransforms.jl).
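To make the proposal concrete, a generic wrapper might, in rough outline, look like the sketch below. This is hypothetical only, not the actual MLJFlux implementation: the type name, the exact trait name, and the handling of hyperparameter forwarding, metadata, and scitype declarations are all assumptions or glossed over.

```julia
using MLJModelInterface
const MMI = MLJModelInterface

# Hypothetical generic wrapper: expose any Supervised model's `transform`
# as the output a pipeline propagates downstream.
mutable struct Transformer{M<:MMI.Supervised} <: MMI.Unsupervised
    model::M
end

# Although Unsupervised, the wrapper needs the target during training
# (trait name `target_in_fit` is an assumption here):
MMI.target_in_fit(::Type{<:Transformer}) = true

# Fitting the wrapper just fits the atomic supervised model:
MMI.fit(t::Transformer, verbosity, X, y) = MMI.fit(t.model, verbosity, X, y)

# ...and `transform` delegates to the atomic model's `transform`:
MMI.transform(t::Transformer, fitresult, Xnew) =
    MMI.transform(t.model, fitresult, Xnew)
```

Since the wrapper presents as `Unsupervised`, a pipeline would propagate its `transform` output to the next component, which is exactly the behaviour wanted above.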
Thoughts anyone?