In pipelines, the output of a supervised model that gets propagated to the next component in the pipeline is the output of `predict`. However, some supervised models also learn a transformation. For example, MLJFlux's `NeuralNetworkClassifier` and `NeuralNetworkRegressor` learn entity embeddings to handle categorical inputs, and `transform` gives access to just these embeddings. We want to use these embeddings as a preprocessing step for some other supervised learner, as in
```julia
NeuralNetworkClassifier |> LogisticClassifier
```

but of course this doesn't work: the pipeline apparatus identifies `NeuralNetworkClassifier` as a `Supervised` model, so it propagates the output of `predict` instead of `transform`.
We actually solved this problem in MLJFlux by introducing the `EntityEmbedder` wrapper, so that the following works:
```julia
EntityEmbedder(NeuralNetworkClassifier) |> LogisticClassifier
```

However, it has struck me rather late that this wrapper likely works (or should in principle work) for *any* `Supervised` model that implements `transform`. So we should really call `EntityEmbedder` something more generic, like `Transformer`, and perhaps make it more immediately available (e.g. by moving it to MLJTransforms.jl).
Thoughts anyone?