# Model Pipes

Summary

Model pipes define pipelines which involve predictive models.

Note

The classes described here exist in the dynaml.modelpipe package of the dynaml-core module. Although they are not strictly part of the pipes module, they are included here for clarity and continuity.

The pipes module gives the user the ability to create workflows of arbitrary complexity. To enable end-to-end machine learning, we need pipelines which involve predictive models. These pipelines can be of two types.

• Pipelines which take data as input and output a predictive model.

Model creation is a common step in data analysis workflows, so the library needs pipes which create machine learning models from the training data and other relevant inputs.

• Pipelines which encapsulate predictive models and generate predictions for test data splits.

Once a model has been tuned/trained, it can be a part of a pipeline which generates predictions for previously unobserved data.
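The two kinds of pipeline can be illustrated with a minimal plain-Scala sketch. Nothing here is DynaML code: `Pipe`, `MeanModel`, `createModel` and `predictWith` are hypothetical stand-ins introduced only to show the shape of "data in, model out" versus "model wrapped, predictions out".

```scala
// Hypothetical stand-ins for illustration; these are NOT the DynaML classes.
trait Pipe[A, B] {
  def run(a: A): B
  def apply(a: A): B = run(a)
}

object Pipe {
  // Lift an ordinary function into a pipe.
  def apply[A, B](f: A => B): Pipe[A, B] = new Pipe[A, B] {
    def run(a: A): B = f(a)
  }
}

// A toy model which always predicts the mean of the training targets.
final case class MeanModel(mean: Double) {
  def predict(x: Double): Double = mean
}

// Type 1: takes data as input and outputs a predictive model.
val createModel: Pipe[Seq[(Double, Double)], MeanModel] =
  Pipe((data: Seq[(Double, Double)]) => MeanModel(data.map(_._2).sum / data.size))

// Type 2: encapsulates a trained model and generates predictions for test inputs.
def predictWith(model: MeanModel): Pipe[Seq[Double], Seq[Double]] =
  Pipe((xs: Seq[Double]) => xs.map(model.predict))

val training = Seq((1.0, 2.0), (2.0, 4.0), (3.0, 6.0))
val trained = createModel(training)
val predictions = predictWith(trained)(Seq(10.0, 20.0))
```

The first pipe is run once on the training data. The second can then be applied to any number of previously unobserved inputs.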

## Model Creation

All pipelines which return predictive models as outputs extend the ModelPipe trait.

### Generalized Linear Model Pipe

```scala
// Pre-process data (??? marks a placeholder to be supplied by the user)
val pre: (Source) => Stream[(DenseVector[Double], Double)] = ???

val feature_map: (DenseVector[Double]) => (DenseVector[Double]) = ???

val glm_pipe = GLMPipe[(DenseMatrix[Double], DenseVector[Double]), Source](
  pre, feature_map, task = "regression",
  modelType = "")

val dataSource: Source = ???

val glm_model = glm_pipe(dataSource)
```

• Type: DataPipe[Source, GeneralizedLinearModel[T]]
• Result: Takes data of type Source as input and outputs a Generalized Linear Model.
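To make the "pre-process, map features, then train" shape of such a pipe concrete, here is a self-contained sketch in plain Scala. It does not use DynaML or Breeze: `LineModel`, `LinePipe` and `parse` are hypothetical names, features are plain scalars rather than vectors, the input format is an assumption, and the training step is ordinary least squares for a single feature rather than whatever GLMPipe does internally.

```scala
// Hypothetical stand-in for a model-creation pipe; NOT the DynaML GLMPipe.
final case class LineModel(slope: Double, intercept: Double) {
  // Expects an already feature-mapped input.
  def predict(feature: Double): Double = slope * feature + intercept
}

final case class LinePipe[Source](
    pre: Source => Seq[(Double, Double)],
    featureMap: Double => Double) {

  def apply(source: Source): LineModel = {
    // Pre-process the raw source, then apply the feature map to each input.
    val data = pre(source).map { case (x, y) => (featureMap(x), y) }
    val n = data.size.toDouble
    val meanX = data.map(_._1).sum / n
    val meanY = data.map(_._2).sum / n
    // Closed-form ordinary least squares for a single feature.
    val sxy = data.map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = data.map { case (x, _) => (x - meanX) * (x - meanX) }.sum
    val slope = sxy / sxx
    LineModel(slope, meanY - slope * meanX)
  }
}

// Assumed input format: one "x,y" pair per line.
val parse: Seq[String] => Seq[(Double, Double)] =
  _.map { line =>
    val parts = line.split(",").map(_.trim.toDouble)
    (parts(0), parts(1))
  }

// Squaring the input makes the targets below linear in the mapped feature.
val linePipe = LinePipe[Seq[String]](parse, x => x * x)
val lineModel = linePipe(Seq("1, 1", "2, 4", "3, 9"))
```

As with the pipe above, the caller supplies only the data source. Parsing, feature mapping and training all happen inside the pipe.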

### Generalized Least Squares Model Pipe

```scala
// ??? marks a placeholder to be supplied by the user
val kernel: LocalScalarKernel[DenseVector[Double]] = ???

val gls_pipe2 = GeneralizedLeastSquaresPipe2(kernel)

val featuremap: (DenseVector[Double]) => (DenseVector[Double]) = ???

val data: Stream[(DenseVector[Double], Double)] = ???

val gls_model = gls_pipe2(data, featuremap)
```

• Type: DataPipe2[Stream[(DenseVector[Double], Double)], DataPipe[DenseVector[Double], DenseVector[Double]], GeneralizedLeastSquaresModel]
• Result: Takes data and a feature mapping as inputs and outputs a Generalized Least Squares Model.

### Gaussian Process Regression Model Pipe

```scala
// Pre-process data (??? marks a placeholder to be supplied by the user)
val pre: (Source) => Stream[(DenseVector[Double], Double)] = ???

// Declare kernel and noise
val kernel: LocalScalarKernel[DenseVector[Double]] = ???
val noise: LocalScalarKernel[DenseVector[Double]] = ???

GPRegressionPipe(
  pre, kernel, noise,
  order = 0, ex = 0)
```

• Type: DataPipe[Source, M]
• Result: Takes data of type Source as input and outputs a Gaussian Process regression model.

### Dual LS-SVM Model Pipe

```scala
// Pre-process data (??? marks a placeholder to be supplied by the user)
val pre: (Source) => Stream[(DenseVector[Double], Double)] = ???

// Declare kernel
val kernel: LocalScalarKernel[DenseVector[Double]] = ???

DLSSVMPipe(pre, kernel, task = "regression")
```

• Type: DataPipe[Source, DLSSVM]
• Result: Takes data of type Source as input and outputs an LS-SVM regression/classification model.

## Model Prediction

Prediction pipelines encapsulate predictive models. The ModelPredictionPipe class provides an expressive API for creating them.

```scala
// Any model (??? marks a placeholder to be supplied by the user)
val model: Model[T, Q, R] = ???

// Data pre- and post-processing
val preprocessing: DataPipe[P, Q] = ???
val postprocessing: DataPipe[R, S] = ???

val prediction_pipeline = ModelPredictionPipe(
  preprocessing,
  model,
  postprocessing)

// In case no pre- or post-processing is done
val prediction_pipeline2 = ModelPredictionPipe(model)

// In case feature and target scaling is performed
val featureScaling: ReversibleScaler[Q] = ???
val targetScaling: ReversibleScaler[R] = ???

val prediction_pipeline3 = ModelPredictionPipe(
  featureScaling,
  model,
  targetScaling)
```
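The idea behind a prediction pipeline can be sketched in plain Scala as "pre-process, predict, post-process". The sketch below is not the DynaML implementation: `Predictor`, `AffineScaler`, `PredictionPipe` and `scaledPipe` are hypothetical names. It also assumes that, in the scaling variant, features are scaled forward before prediction and the target scaling is inverted afterwards so that predictions return in the original units.

```scala
// Hypothetical stand-ins for illustration; these are NOT the DynaML classes.
trait Predictor[Q, R] {
  def predict(q: Q): R
}

// A scaling that can be undone, in the spirit of a reversible scaler.
final case class AffineScaler(shift: Double, scale: Double) {
  def forward(x: Double): Double = (x - shift) / scale
  def inverse(z: Double): Double = z * scale + shift
}

// Pre-processing, then the model, then post-processing.
final case class PredictionPipe[P, Q, R, S](
    pre: P => Q,
    model: Predictor[Q, R],
    post: R => S) {
  def apply(p: P): S = post(model.predict(pre(p)))
}

// Assumption: scale features forward, then undo the target scaling on the output.
def scaledPipe(
    features: AffineScaler,
    model: Predictor[Double, Double],
    targets: AffineScaler): PredictionPipe[Double, Double, Double, Double] =
  PredictionPipe[Double, Double, Double, Double](
    x => features.forward(x), model, z => targets.inverse(z))

// A toy model which operates entirely in the scaled space.
val doubling = new Predictor[Double, Double] {
  def predict(q: Double): Double = 2.0 * q
}

val pipeline = scaledPipe(AffineScaler(10.0, 5.0), doubling, AffineScaler(100.0, 50.0))

// 20.0 scales to 2.0, the model returns 4.0, and the inverse target scaling gives 300.0.
val prediction = pipeline(20.0)
```

Keeping the scalers inside the pipeline means callers work in original units on both ends and never handle the scaled representation directly.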