Performance Evaluation

Model evaluation is the litmus test for knowing whether your modeling effort is headed in the right direction, and for comparing alternative models (or hypotheses) that attempt to explain a phenomenon. The evaluation package contains classes and traits for calculating performance metrics for DynaML models.

Classes that implement model performance calculation can extend the Metrics[P] trait. The Metrics trait requires its sub-classes to implement the following methods or behaviors.

  • Print the performance metrics (whatever they may be) to the screen, i.e. the print method.
  • Return the key performance indicators as a breeze DenseVector[Double], i.e. the kpi() method.
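
The two behaviors listed above can be pictured with a small stand-in trait. This is only a sketch of the idea, not DynaML's actual Metrics[P] definition: the names ToyMetrics and ToyMaeMetrics are hypothetical, and a plain Scala Vector[Double] is used in place of breeze's DenseVector[Double] so the snippet has no dependencies.

```scala
// Simplified stand-in for the idea behind Metrics[P]; NOT DynaML's actual trait.
// Vector[Double] replaces breeze's DenseVector[Double] to keep this dependency free.
trait ToyMetrics[P] {
  // (prediction, actual) pairs being evaluated
  def scoresAndLabels: List[(P, P)]

  // Key performance indicators as a flat numeric vector
  def kpi(): Vector[Double]

  // Print the indicators to the screen
  def print(): Unit
}

// Hypothetical implementation that reports only the mean absolute error.
class ToyMaeMetrics(val scoresAndLabels: List[(Double, Double)])
    extends ToyMetrics[Double] {

  private val mae: Double =
    scoresAndLabels.map { case (p, y) => math.abs(p - y) }.sum / scoresAndLabels.length

  def kpi(): Vector[Double] = Vector(mae)

  def print(): Unit = println(s"MAE: $mae")
}
```

Returning the indicators as a numeric vector, separately from printing them, lets calling code compare models programmatically instead of parsing console output.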

Regression Models

Regression models are generally evaluated on a few standard metrics such as mean square error, mean absolute error, coefficient of determination (R^2), etc. DynaML provides implementations for both single-output and multi-output regression models.

Single Output

Small Test Set

The RegressionMetrics class takes as input a Scala List of (prediction, actual output) pairs and calculates the following metrics.

  • Mean Absolute Error (mae)
  • Root Mean Square Error (rmse)
  • Correlation Coefficient (\rho_{y \hat{y}})
  • Coefficient of Determination (R^2)
//Predictions computed by any model.
val predictionAndOutputs: List[(Double, Double)] = ...

val metrics = new RegressionMetrics(predictionAndOutputs, predictionAndOutputs.length)

//Print results on screen
metrics.print
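
For reference, the four metrics listed above can be computed by hand from the same kind of List[(Double, Double)]. The sketch below uses the textbook formulas and is not DynaML's implementation; the object name RegressionMetricsSketch and the Summary case class are hypothetical, and the library may differ in details such as normalization.

```scala
// Illustrative only: textbook formulas, not DynaML's implementation.
object RegressionMetricsSketch {
  final case class Summary(mae: Double, rmse: Double, corr: Double, rsq: Double)

  def summarize(predictionAndOutputs: List[(Double, Double)]): Summary = {
    require(predictionAndOutputs.nonEmpty, "need at least one (prediction, actual) pair")
    val n = predictionAndOutputs.length.toDouble
    val (preds, actuals) = predictionAndOutputs.unzip

    // Residuals: prediction minus actual output
    val errors = predictionAndOutputs.map { case (p, y) => p - y }
    val mae = errors.map(e => math.abs(e)).sum / n
    val sse = errors.map(e => e * e).sum
    val rmse = math.sqrt(sse / n)

    val meanP = preds.sum / n
    val meanY = actuals.sum / n
    val cov = predictionAndOutputs.map { case (p, y) => (p - meanP) * (y - meanY) }.sum
    val varP = preds.map(p => (p - meanP) * (p - meanP)).sum
    val sst = actuals.map(y => (y - meanY) * (y - meanY)).sum

    // Pearson correlation between predictions and actual outputs.
    // Evaluates to NaN when either series is constant.
    val corr = cov / math.sqrt(varP * sst)

    // Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
    val rsq = 1.0 - sse / sst

    Summary(mae, rmse, corr, rsq)
  }
}
```

Calling RegressionMetricsSketch.summarize(predictionAndOutputs) on a small list is a quick way to sanity-check the numbers a metrics object prints.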

Large Test Set

The RegressionMetricsSpark class takes as input an Apache Spark RDD containing the predictions and actual outputs and calculates the same metrics as above.

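import org.apache.spark.rdd.RDD

//Predictions computed by any model.
val predictionAndOutputs: RDD[(Double, Double)] = ...

//An RDD has no length member; count() returns the number of records as a Long.
val metrics = new RegressionMetricsSpark(predictionAndOutputs, predictionAndOutputs.count())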

//Print results on screen
metrics.print

Multiple Outputs

The MultiRegressionMetrics class calculates regression performance for multi-output models.

//Predictions computed by any model.
val predictionAndOutputs: List[(DenseVector[Double], DenseVector[Double])] = ...

val metrics = new MultiRegressionMetrics(predictionAndOutputs, predictionAndOutputs.length)

//Print results on screen
metrics.print

Classification Models

As of v1.4, performance metrics are implemented only for binary classification.

Binary Classification

Small Test Sets

The BinaryClassificationMetrics class calculates the following performance indicators.

  • Classification accuracy
  • F-measure
  • Precision-Recall Curve (and area under it)
  • Receiver Operating Characteristic (and area under it)
  • Matthews Correlation Coefficient
val scoresAndLabels: List[(Double, Double)] = ...

//Set logisticFlag = true in case outputs are produced via logistic regression
val metrics = new BinaryClassificationMetrics(
          scoresAndLabels,
          scoresAndLabels.length,
          logisticFlag = true)

metrics.print
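
To make the indicators listed above concrete, the sketch below computes accuracy, precision, recall, F-measure and the Matthews Correlation Coefficient at a single threshold, plus the area under the ROC curve via its rank interpretation. It is an illustration of the standard definitions, not DynaML's code. The object name BinaryMetricsSketch, the 0.5 threshold, and the convention that label 1.0 marks the positive class are all assumptions made for this example.

```scala
// Illustrative only: textbook definitions, not DynaML's implementation.
object BinaryMetricsSketch {
  final case class Summary(accuracy: Double, precision: Double, recall: Double,
                           f1: Double, mcc: Double)

  // Assumption: label 1.0 marks the positive class; the threshold is a free choice.
  def summarize(scoresAndLabels: List[(Double, Double)],
                threshold: Double = 0.5): Summary = {
    // (predicted positive?, actually positive?) for every test point
    val outcomes = scoresAndLabels.map { case (score, label) =>
      (score >= threshold, label == 1.0)
    }
    val tp = outcomes.count { case (pred, actual) => pred && actual }.toDouble
    val fp = outcomes.count { case (pred, actual) => pred && !actual }.toDouble
    val fn = outcomes.count { case (pred, actual) => !pred && actual }.toDouble
    val tn = outcomes.count { case (pred, actual) => !pred && !actual }.toDouble

    val accuracy = (tp + tn) / outcomes.length
    val precision = tp / (tp + fp)
    val recall = tp / (tp + fn)
    val f1 = 2 * precision * recall / (precision + recall)
    // Evaluates to NaN when any row or column of the confusion matrix is empty.
    val mcc = (tp * tn - fp * fn) /
      math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    Summary(accuracy, precision, recall, f1, mcc)
  }

  // Area under the ROC curve as the probability that a randomly chosen positive
  // scores higher than a randomly chosen negative (ties count one half).
  // Quadratic in the test set size, so suitable only for small test sets.
  def areaUnderRoc(scoresAndLabels: List[(Double, Double)]): Double = {
    val pos = scoresAndLabels.collect { case (s, l) if l == 1.0 => s }
    val neg = scoresAndLabels.collect { case (s, l) if l != 1.0 => s }
    val wins = for (p <- pos; n <- neg)
      yield if (p > n) 1.0 else if (p == n) 0.5 else 0.0
    wins.sum / (pos.length.toLong * neg.length)
  }
}
```

The thresholded metrics describe one operating point, while the area under the ROC curve summarizes ranking quality across all thresholds, which is why the two are usually reported together.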

Large Test Sets

The BinaryClassificationMetricsSpark class takes as input an Apache Spark RDD containing the predictions and actual labels and calculates the same metrics as above.
