Classification Metrics
Confusion matrix
The confusion matrix gives a more complete picture of a model's performance than a single score, and is especially useful with skewed (class-imbalanced) data. It is defined as follows:

Actual \ Predicted | Positive | Negative |
---|---|---|
Positive | TP (true positive) | FN (false negative) |
Negative | FP (false positive) | TN (true negative) |
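A minimal sketch of how the four counts can be obtained in practice, here with scikit-learn's `confusion_matrix`; the library choice and the toy labels are assumptions for illustration, not something the cheatsheet prescribes:

```python
# Sketch: extracting TN, FP, FN, TP with scikit-learn (illustrative data).
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # predicted labels

# With labels=[0, 1], rows are actual classes and columns are predictions:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```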
Main metrics
Metric | Formula | Interpretation |
---|---|---|
Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ | Overall performance of the model |
Precision | $\frac{TP}{TP + FP}$ | How accurate the positive predictions are |
Recall / Sensitivity | $\frac{TP}{TP + FN}$ | Coverage of actual positive samples |
Specificity | $\frac{TN}{TN + FP}$ | Coverage of actual negative samples |
F1 score | $\frac{2TP}{2TP + FP + FN}$ | Hybrid metric useful for unbalanced classes |
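The table translates directly into code. Below is a minimal sketch computing each metric from the four confusion-matrix counts; the function name and the example counts are illustrative assumptions:

```python
# Sketch: main classification metrics from confusion-matrix counts.
def classification_metrics(tp, tn, fp, fn):
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    precision   = tp / (tp + fp)
    recall      = tp / (tp + fn)            # sensitivity / TPR
    specificity = tn / (tn + fp)
    f1          = 2 * tp / (2 * tp + fp + fn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity, "f1": f1}

print(classification_metrics(tp=3, tn=3, fp=1, fn=1))
```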
ROC
The receiver operating characteristic curve, also noted ROC, is the plot of TPR versus FPR obtained by varying the decision threshold. These metrics are summed up in the table below:
Metric | Formula | Equivalent |
---|---|---|
True Positive Rate, TPR | $\frac{TP}{TP+FN}$ | Recall, sensitivity |
False Positive Rate, FPR | $\frac{FP}{TN+FP}$ | 1-specificity |
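A minimal sketch of the threshold sweep that traces the ROC curve; the scores and labels are made-up illustrative values, and the loop simply recomputes TPR and FPR at each candidate threshold:

```python
# Sketch: ROC points by sweeping a decision threshold over predicted scores.
import numpy as np

y_true   = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.55])

tpr_list, fpr_list = [], []
for threshold in np.sort(np.unique(y_scores))[::-1]:
    y_pred = (y_scores >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    tpr_list.append(tp / (tp + fn))   # recall / sensitivity
    fpr_list.append(fp / (tn + fp))   # 1 - specificity

# (fpr_list, tpr_list) are the points of the ROC curve, from the strictest
# threshold to the most permissive one.
```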
AUC
The area under the receiver operating characteristic curve, also noted AUC or AUROC, is the area below the ROC curve.
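A minimal sketch of computing the AUC directly from labels and scores, here with scikit-learn's `roc_auc_score`; the library choice and the toy data are assumptions for illustration:

```python
# Sketch: AUC from labels and predicted scores (illustrative data).
from sklearn.metrics import roc_auc_score

y_true   = [0, 0, 1, 1, 0, 1, 1, 0]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.55]

auc = roc_auc_score(y_true, y_scores)
print(auc)  # 1.0 means perfect ranking, 0.5 means random ranking
```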