It is important to quantify the performance of a model so that the result can serve as feedback and as a basis for comparison. In this tutorial we have used one of the most popular error metrics, root mean squared error. Various other error metrics are available, and this chapter discusses them in brief.
Mean squared error (MSE) is the average of the squared differences between the predicted values and the true values. scikit-learn provides it as the function `sklearn.metrics.mean_squared_error`. Its units are the square of the units of the true and predicted values, and it is never negative.
$$MSE = \frac{1}{n} \displaystyle\sum\limits_{t=1}^n \lgroup y'_{t}\:-y_{t}\rgroup^{2}$$
Where $y'_{t}$ is the predicted value,
$y_{t}$ is the actual value, and
n is the total number of values in test set.
Because each error is squared before averaging, MSE penalizes large errors, such as those caused by outliers, more heavily than small ones.
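The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the tutorial's own code: the helper name `mse` and the sample values are assumptions chosen only to make the arithmetic easy to follow.

```python
def mse(y_true, y_pred):
    """Average of the squared differences between predictions and true values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical test-set values, for illustration only.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mse_value = mse(y_true, y_pred)
print(mse_value)  # 0.875
```

The squared errors here are 0.25, 0, 2.25 and 1.0, so the single largest error (1.5) accounts for 2.25 of the total 3.5, which shows how squaring lets large errors dominate the metric.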
Root mean squared error (RMSE) is the square root of the mean squared error. It is never negative and is expressed in the same units as the data.
$$RMSE = \sqrt{\frac{1}{n} \displaystyle\sum\limits_{t=1}^n \lgroup y'_{t}-y_{t}\rgroup ^2}$$
Where $y'_{t}$ is the predicted value,
$y_{t}$ is the actual value, and
n is the total number of values in the test set.
Its units are those of the data raised to the first power rather than squared, so it is more interpretable than MSE. Like MSE, RMSE penalizes larger errors more heavily. We have used the RMSE metric in this tutorial.
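As a rough sketch of the RMSE formula, the following uses only the Python standard library. The function name `rmse` and the sample values are hypothetical, chosen for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Square root of the mean of the squared prediction errors."""
    n = len(y_true)
    mean_sq = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n
    return math.sqrt(mean_sq)


# Hypothetical test-set values, for illustration only.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

rmse_value = rmse(y_true, y_pred)
print(round(rmse_value, 4))  # 0.9354
```

The result is in the same units as the values themselves, so it can be read directly as a typical size of the prediction error.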
Mean absolute error (MAE) is the average of the absolute differences between the predicted values and the true values. It has the same units as the predicted and true values and is never negative.
$$MAE = \frac{1}{n}\displaystyle\sum\limits_{t=1}^{n} \lvert y'_{t}-y_{t}\rvert$$
Where $y'_{t}$ is the predicted value,
$y_{t}$ is the actual value, and
n is the total number of values in the test set.
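The MAE definition can be sketched the same way in plain Python. The helper name `mae` and the sample values are assumptions made for this example.

```python
def mae(y_true, y_pred):
    """Average of the absolute differences between predictions and true values."""
    n = len(y_true)
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n


# Hypothetical test-set values, for illustration only.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mae_value = mae(y_true, y_pred)
print(mae_value)  # 0.75
```

Each error contributes in proportion to its size, so the largest error (1.5) makes up half of the total absolute error of 3.0 instead of dominating it as it would after squaring.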
Mean percentage error (MPE) is the average of the signed differences between the predicted values and the true values, each divided by the true value, expressed as a percentage.
$$MPE = \frac{1}{n}\displaystyle\sum\limits_{t=1}^n\frac{y'_{t}-y_{t}}{y_{t}}*100\: \%$$
Where $y'_{t}$ is the predicted value,
$y_{t}$ is the actual value, and
n is the total number of values in the test set.
However, the disadvantage of this metric is that positive and negative errors can offset each other. Hence, mean absolute percentage error is used instead.
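The offsetting effect is easy to see in a small sketch. The function name `mpe` and the two sample points are hypothetical, picked so that one prediction is 10% too high and the other 10% too low.

```python
def mpe(y_true, y_pred):
    """Average signed percentage error; assumes no true value is zero."""
    n = len(y_true)
    return sum((p - t) / t for t, p in zip(y_true, y_pred)) / n * 100


# Hypothetical values: one over-prediction and one under-prediction of equal size.
y_true = [100.0, 100.0]
y_pred = [110.0, 90.0]

mpe_value = mpe(y_true, y_pred)
print(mpe_value)  # 0.0
```

Both predictions are off by 10%, yet the signed errors cancel and the metric reports zero error, which is misleading as a measure of accuracy.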
Mean absolute percentage error (MAPE) is the average of the absolute differences between the predicted values and the true values, each divided by the true value, expressed as a percentage.
$$MAPE = \frac{1}{n}\displaystyle\sum\limits_{t=1}^n\frac{\lvert y'_{t}-y_{t}\rvert}{y_{t}}*100\: \%$$
Where $y'_{t}$ is the predicted value,
$y_{t}$ is the actual value, and
n is the total number of values in the test set.
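A minimal sketch of the MAPE formula follows, assuming strictly positive true values as the formula above does. The helper name `mape` and the sample values are hypothetical.

```python
def mape(y_true, y_pred):
    """Average absolute percentage error.

    Assumes every true value is positive; a true value of zero would
    raise ZeroDivisionError because the ratio is undefined there.
    """
    n = len(y_true)
    return sum(abs(p - t) / t for t, p in zip(y_true, y_pred)) / n * 100


# Hypothetical values: one prediction 10% too high, one 10% too low.
y_true = [100.0, 100.0]
y_pred = [110.0, 90.0]

mape_value = mape(y_true, y_pred)
print(round(mape_value, 2))  # 10.0
```

Taking the absolute value keeps over- and under-predictions from cancelling, so two errors of 10% in opposite directions are reported as an average error of 10%.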