|
1 | | ---- |
2 | | -title: Model Pipes |
3 | | ---- |
| 1 | +!!! summary |
| 2 | + Model pipes define pipelines which involve predictive models. |
4 | 3 |
|
| 4 | +!!! note |
| 5 | + The classes described here exist in the [`dynaml.modelpipe`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.modelpipe.package) package of the [`dynaml-core`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#package) module. Although they are not strictly part of the pipes module, they are included here for clarity and continuity. |
5 | 6 |
|
6 | | -## DynaML Model Pipes |
| 7 | +The pipes module lets the user create workflows of arbitrary complexity. End-to-end machine learning needs pipelines which involve predictive models. These pipelines are of two types.
7 | 8 |
|
| 9 | + - Pipelines which take data as input and output a predictive model. |
8 | 10 |
|
9 | | -We saw in the previous section that certain operations like training/tuning of models are expressed as pipes which take input the relevant model and perform an operation on it. But it is evident that the model creation itself is a common step in the data analysis workflow, therefore one needs library pipes which instantiate DynaML machine learning models given the training data and other relevant inputs. Model creation pipes are not in the `#!scala DynaMLPipe` object but exist as an independent class hierarchy. Below we explore a section of it. |
| 11 | + Model creation is a common step in the data analysis workflow, so the library needs pipes which create machine learning models from the training data and other relevant inputs.
10 | 12 |
|
| 13 | + - Pipelines which encapsulate predictive models and generate predictions for test data splits. |
11 | 14 |
|
12 | | -### Generalized Linear Model pipe |
| 15 | + Once a model has been tuned/trained, it can be a part of a pipeline which generates predictions for previously unobserved data. |
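
The two pipeline types can be illustrated with a minimal, self-contained sketch. It does not use the DynaML API: `Pipe`, `LineModel`, `createModel` and `predictWith` are hypothetical stand-ins introduced only to show the shape of a model creation pipe (data in, model out) and a prediction pipe (test inputs in, predictions out).

```scala
object ModelPipeSketch {
  // Hypothetical minimal pipe abstraction (a stand-in, not DynaML's DataPipe).
  final case class Pipe[A, B](run: A => B) {
    def apply(a: A): B = run(a)
    // Compose two pipes left to right.
    def >[C](next: Pipe[B, C]): Pipe[A, C] = Pipe[A, C](a => next.run(run(a)))
  }

  // Toy predictive model: a line through the origin, y = slope * x.
  final case class LineModel(slope: Double) {
    def predict(x: Double): Double = slope * x
  }

  // Type 1: takes data as input and outputs a predictive model
  // (least squares fit of the slope).
  val createModel: Pipe[Seq[(Double, Double)], LineModel] =
    Pipe[Seq[(Double, Double)], LineModel] { data =>
      val num = data.map { case (x, y) => x * y }.sum
      val den = data.map { case (x, _) => x * x }.sum
      LineModel(num / den)
    }

  // Type 2: encapsulates a trained model and generates predictions for test inputs.
  def predictWith(model: LineModel): Pipe[Seq[Double], Seq[Double]] =
    Pipe[Seq[Double], Seq[Double]](xs => xs.map(model.predict))
}
```

In this sketch, `ModelPipeSketch.createModel(Seq((1.0, 2.0), (2.0, 4.0)))` fits a slope of `2.0`, and `predictWith` applied to that model maps `Seq(3.0)` to `Seq(6.0)`.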
| 16 | + |
| 17 | + |
| 18 | +## Model Creation |
| 19 | + |
| 20 | +All pipelines which return predictive models as outputs extend the [`#!scala ModelPipe`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.modelpipe.ModelPipe) trait. |
| 21 | + |
| 22 | +### Generalized Linear Model Pipe |
13 | 23 |
|
14 | 24 | ```scala |
15 | | -GLMPipe[T, Source]( |
16 | | - pre: (Source) => Stream[(DenseVector[Double], Double)], |
17 | | - map: (DenseVector[Double]) => (DenseVector[Double]) = identity _, |
18 | | - task: String = "regression", modelType: String = "") |
| 25 | +//Pre-process data |
| 26 | +val pre: (Source) => Stream[(DenseVector[Double], Double)] = _ |
| 27 | +val feature_map: (DenseVector[Double]) => (DenseVector[Double]) = _ |
| 28 | + |
| 29 | +val glm_pipe = |
| 30 | + GLMPipe[(DenseMatrix[Double], DenseVector[Double]), Source]( |
| 31 | + pre, feature_map, task = "regression",
| 32 | + modelType = "") |
| 33 | + |
| 34 | +val dataSource: Source = _ |
| 35 | + |
| 36 | +val glm_model = glm_pipe(dataSource) |
19 | 37 | ``` |
20 | 38 |
|
21 | 39 | * _Type_: `#!scala DataPipe[Source, GeneralizedLinearModel[T]]` |
22 | | -* _Result_: Takes as input a data of type `#!scala Source` and outputs a _Generalized Linear Model_. |
| 40 | +* _Result_: Takes as input data of type `#!scala Source` and outputs a [_Generalized Linear Model_](/core/core_glm.md).
| 41 | + |
| 42 | +### Generalized Least Squares Model Pipe |
| 43 | + |
| 44 | +```scala |
| 45 | +val kernel: LocalScalarKernel[DenseVector[Double]] = _
| 46 | +val gls_pipe2 = GeneralizedLeastSquaresPipe2(kernel) |
| 47 | + |
| 48 | +val featuremap: (DenseVector[Double]) => (DenseVector[Double]) = _ |
| 49 | +val data: Stream[(DenseVector[Double], Double)] = _ |
| 50 | + |
| 51 | +val gls_model = gls_pipe2(data, featuremap) |
| 52 | +``` |
| 53 | + |
| 54 | +* _Type_: `#!scala DataPipe2[Stream[(DenseVector[Double], Double)], DataPipe[DenseVector[Double], DenseVector[Double]], GeneralizedLeastSquaresModel]`
| 55 | +* _Result_: Takes as inputs data and a feature mapping and outputs a [_Generalized Least Squares Model_](/core/core_gls.md). |
| 56 | + |
23 | 57 |
|
24 | 58 | ### Gaussian Process Regression Model Pipe |
25 | 59 |
|
26 | 60 | ```scala |
27 | | -GPRegressionPipe[ |
28 | | -M <: AbstractGPRegressionModel[Seq[(DenseVector[Double], Double)], DenseVector[Double]], |
29 | | -Source]( |
30 | | - pre: (Source) => Seq[(DenseVector[Double], Double)], |
31 | | - cov: LocalScalarKernel[DenseVector[Double]], |
32 | | - n: LocalScalarKernel[DenseVector[Double]], |
| 61 | + |
| 62 | +//Pre-process data |
| 63 | +val pre: (Source) => Stream[(DenseVector[Double], Double)] = _ |
| 64 | +//Declare kernel and noise |
| 65 | +val kernel: LocalScalarKernel[DenseVector[Double]] = _ |
| 66 | +val noise: LocalScalarKernel[DenseVector[Double]] = _ |
| 67 | + |
| 68 | +GPRegressionPipe( |
| 69 | + pre, kernel, noise, |
33 | 70 | order = 0, ex = 0)
34 | 71 | ``` |
35 | 72 |
|
36 | 73 | * _Type_: `#!scala DataPipe[Source, M]` |
37 | | -* _Result_: Takes as input data of type `#!scala Source` and intializes a _Gaussian Process_ regression model as the output. |
| 74 | +* _Result_: Takes as input data of type `#!scala Source` and outputs a [_Gaussian Process_ regression](/core/core_gp.md) model.
38 | 75 |
|
39 | 76 | ### Dual LS-SVM Model Pipe |
40 | 77 |
|
41 | 78 | ```scala |
42 | | -DLSSVMPipe[Source]( |
43 | | - pre: (Source) => Stream[(DenseVector[Double], Double)], |
44 | | - cov: LocalScalarKernel[DenseVector[Double]], |
45 | | - task: String = "regression") |
| 79 | +//Pre-process data |
| 80 | +val pre: (Source) => Stream[(DenseVector[Double], Double)] = _ |
| 81 | +//Declare kernel |
| 82 | +val kernel: LocalScalarKernel[DenseVector[Double]] = _ |
| 83 | + |
| 84 | +DLSSVMPipe(pre, kernel, task = "regression") |
46 | 85 | ``` |
47 | 86 |
|
48 | 87 | * _Type_: `#!scala DataPipe[Source, DLSSVM]` |
49 | | -* _Result_: Takes as input data of type `#!scala Source` and intializes a _LS-SVM_ regression/classification model as the output. |
| 88 | +* _Result_: Takes as input data of type `#!scala Source` and outputs a [_LS-SVM_](/core/core_lssvm.md) regression/classification model.
| 89 | + |
| 90 | + |
| 91 | +## Model Prediction |
| 92 | + |
| 93 | +Prediction pipelines encapsulate predictive models. The [`#!scala ModelPredictionPipe`](https://transcendent-ai-labs.github.io/api_docs/DynaML/recent/dynaml-core/#io.github.mandar2812.dynaml.modelpipe.ModelPredictionPipe) class provides an expressive API for creating them.
| 94 | + |
| 95 | +```scala |
| 96 | + |
| 97 | +//Any model |
| 98 | +val model: Model[T, Q, R] = _ |
| 99 | + |
| 100 | +//Data pre and post processing |
| 101 | +val preprocessing: DataPipe[P, Q] = _ |
| 102 | +val postprocessing: DataPipe[R, S] = _ |
| 103 | + |
| 104 | +val prediction_pipeline = ModelPredictionPipe( |
| 105 | + preprocessing, |
| 106 | + model, |
| 107 | + postprocessing) |
| 108 | + |
| 109 | +//In case no pre or post processing is done. |
| 110 | +val prediction_pipeline2 = ModelPredictionPipe(model) |
| 111 | + |
| 112 | +//In case feature and target scaling is performed
| 113 | + |
| 114 | +val featureScaling: ReversibleScaler[Q] = _ |
| 115 | +val targetScaling: ReversibleScaler[R] = _ |
| 116 | + |
| 117 | +val prediction_pipeline3 = ModelPredictionPipe( |
| 118 | + featureScaling, |
| 119 | + model, |
| 120 | + targetScaling) |
| 121 | + |
| 122 | +``` |
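
To make the scaling variant concrete, here is a small sketch in plain Scala. It does not call DynaML: `Scaler`, `predictionPipe` and `predictScaled` are hypothetical names, and the assumption that features are scaled forward while predictions are mapped back through the inverse target scaling is an illustration of the idea rather than a statement about how `ModelPredictionPipe` is implemented.

```scala
object PredictionPipeSketch {
  // Hypothetical reversible scaler: a forward transform plus its inverse.
  final case class Scaler(mean: Double, sd: Double) {
    def apply(x: Double): Double = (x - mean) / sd
    def inverse(z: Double): Double = z * sd + mean
  }

  // Hypothetical prediction pipeline: pre-process, predict, post-process.
  def predictionPipe[P, Q, R, S](
    pre: P => Q, predict: Q => R, post: R => S): P => S =
    pre andThen predict andThen post

  // Toy model assumed to be trained on standardised features and targets.
  val predictScaled: Double => Double = z => 0.5 * z

  val featureScaling = Scaler(mean = 10.0, sd = 2.0)
  val targetScaling = Scaler(mean = 100.0, sd = 20.0)

  // Scale the raw feature, predict in scaled space,
  // then map the prediction back to the original target units.
  val pipeline: Double => Double =
    predictionPipe(
      (x: Double) => featureScaling(x),
      predictScaled,
      (z: Double) => targetScaling.inverse(z))
}
```

With these toy numbers, a raw input of `14.0` is standardised to `2.0`, the model predicts `1.0` in scaled units, and the inverse target scaling returns `120.0`.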