Confusion matrix

Terminology and derivations from a confusion matrix

condition positive (P): the number of real positive cases in the data
condition negative (N): the number of real negative cases in the data
true positive (TP): equivalent to a hit
true negative (TN): equivalent to a correct rejection
false positive (FP): equivalent to a false alarm, Type I error
false negative (FN): equivalent to a miss, Type II error
sensitivity, recall, hit rate, or true positive rate (TPR): $\mathrm {TPR} ={\frac {\mathrm {TP} }{P}}={\frac {\mathrm {TP} }{\mathrm {TP} +\mathrm {FN} }}$
specificity or true negative rate (TNR): $\mathrm {TNR} ={\frac {\mathrm {TN} }{N}}={\frac {\mathrm {TN} }{\mathrm {FP} +\mathrm {TN} }}$
precision or positive predictive value (PPV): $\mathrm {PPV} ={\frac {\mathrm {TP} }{\mathrm {TP} +\mathrm {FP} }}$
negative predictive value (NPV): $\mathrm {NPV} ={\frac {\mathrm {TN} }{\mathrm {TN} +\mathrm {FN} }}$
fall-out or false positive rate (FPR): $\mathrm {FPR} ={\frac {\mathrm {FP} }{N}}={\frac {\mathrm {FP} }{\mathrm {FP} +\mathrm {TN} }}=1-\mathrm {TNR}$
false discovery rate (FDR): $\mathrm {FDR} ={\frac {\mathrm {FP} }{\mathrm {FP} +\mathrm {TP} }}=1-\mathrm {PPV}$
miss rate or false negative rate (FNR): $\mathrm {FNR} ={\frac {\mathrm {FN} }{P}}={\frac {\mathrm {FN} }{\mathrm {FN} +\mathrm {TP} }}=1-\mathrm {TPR}$
accuracy (ACC): $\mathrm {ACC} ={\frac {\mathrm {TP} +\mathrm {TN} }{P+N}}$
F1 score (the harmonic mean of precision and sensitivity): $F_{1}={\frac {2\mathrm {TP} }{2\mathrm {TP} +\mathrm {FP} +\mathrm {FN} }}$
Matthews correlation coefficient (MCC): $\mathrm {MCC} ={\frac {\mathrm {TP} \times \mathrm {TN} -\mathrm {FP} \times \mathrm {FN} }{\sqrt {(\mathrm {TP} +\mathrm {FP} )(\mathrm {TP} +\mathrm {FN} )(\mathrm {TN} +\mathrm {FP} )(\mathrm {TN} +\mathrm {FN} )}}}$
Informedness or Bookmaker Informedness (BM): $\mathrm {BM} =\mathrm {TPR} +\mathrm {TNR} -1$
Markedness (MK): $\mathrm {MK} =\mathrm {PPV} +\mathrm {NPV} -1$

Sources: Fawcett (2006), Powers (2011), and Ting (2011) ^[1] ^[2] ^[3]
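
The closed form given for the F1 score can be recovered from the definitions of PPV and TPR. The intermediate steps below are a sketch added for illustration (they are not quoted from the cited sources) and assume TP > 0 so that both PPV and TPR are non-zero:

$F_{1}={\frac {2}{{\frac {1}{\mathrm {PPV} }}+{\frac {1}{\mathrm {TPR} }}}}={\frac {2}{{\frac {\mathrm {TP} +\mathrm {FP} }{\mathrm {TP} }}+{\frac {\mathrm {TP} +\mathrm {FN} }{\mathrm {TP} }}}}={\frac {2\mathrm {TP} }{2\mathrm {TP} +\mathrm {FP} +\mathrm {FN} }}$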

In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix,^[4] is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each column of the matrix represents the instances in a predicted class while each row represents the instances in an actual class (or vice versa).^[2] The name stems from the fact that it makes it easy to see if the system is confusing two classes (i.e. commonly mislabelling one as another).

It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table).

Example

If a classification system has been trained to distinguish between cats, dogs and rabbits, a confusion matrix will summarize the results of testing the algorithm for further inspection. Assuming a sample of 27 animals — 8 cats, 6 dogs, and 13 rabbits, the resulting confusion matrix could look like the table below:

		Predicted
		Cat	Dog	Rabbit
Actual class	Cat	5	3	0
	Dog	2	3	1
	Rabbit	0	2	11

In this confusion matrix, of the 8 actual cats, the system predicted that 3 were dogs; of the 6 actual dogs, it predicted that 2 were cats and 1 was a rabbit; and of the 13 actual rabbits, it predicted that 2 were dogs. The matrix shows that the system has trouble distinguishing between cats and dogs, but separates rabbits from the other animals fairly well. All correct predictions lie on the diagonal of the table, so errors are easy to spot: they are the non-zero values off the diagonal.
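
As a minimal sketch of how such a matrix can be tallied, the Python snippet below counts (actual, predicted) pairs, with rows as actual classes and columns as predicted classes. The label lists are hypothetical, constructed only to reproduce the counts in the table above, and the function name `confusion_matrix` is chosen for illustration.

```python
from collections import Counter

CLASSES = ["cat", "dog", "rabbit"]


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


# Hypothetical labels, arranged to reproduce the example table.
actual = ["cat"] * 8 + ["dog"] * 6 + ["rabbit"] * 13
predicted = (
    ["cat"] * 5 + ["dog"] * 3                     # the 8 actual cats
    + ["cat"] * 2 + ["dog"] * 3 + ["rabbit"] * 1  # the 6 actual dogs
    + ["dog"] * 2 + ["rabbit"] * 11               # the 13 actual rabbits
)

matrix = confusion_matrix(actual, predicted, CLASSES)

# Correct predictions sit on the diagonal.
correct = sum(matrix[i][i] for i in range(len(CLASSES)))

# Every non-zero off-diagonal cell is a kind of confusion.
errors = {
    (CLASSES[i], CLASSES[j]): matrix[i][j]
    for i in range(len(CLASSES))
    for j in range(len(CLASSES))
    if i != j and matrix[i][j] > 0
}

for row_name, row in zip(CLASSES, matrix):
    print(row_name, row)
print("correct:", correct, "of", len(actual))
print("errors (actual, predicted):", errors)
```

The `errors` dictionary lists only the off-diagonal cells, which is the same information a reader picks out when scanning the table by eye.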

Table of confusion

In predictive analytics, a table of confusion (sometimes also called a confusion matrix) is a table with two rows and two columns that reports the number of false positives, false negatives, true positives, and true negatives. This allows more detailed analysis than the mere proportion of correct guesses (accuracy). Accuracy alone is not a reliable measure of a classifier's real performance, because it yields misleading results if the data set is unbalanced (that is, when the number of samples in different classes varies greatly). For example, if there were 95 cats and only 5 dogs in the data set, the classifier could easily be biased into classifying all the samples as cats. The overall accuracy would be 95%, but the classifier would have a 100% recognition rate for the cat class and a 0% recognition rate for the dog class.
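
The 95-cat, 5-dog scenario can be checked with a few lines of Python. This is an illustrative sketch: the always-"cat" classifier is a hypothetical worst case, and `recognition_rate` is a helper name introduced here for the per-class rate of correct predictions.

```python
def recognition_rate(actual, predicted, label):
    """Fraction of samples of class `label` that were predicted as `label`."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == label and p == label)
    total = sum(1 for a in actual if a == label)
    return hits / total


actual = ["cat"] * 95 + ["dog"] * 5
predicted = ["cat"] * 100  # hypothetical classifier that always answers "cat"

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
cat_rate = recognition_rate(actual, predicted, "cat")
dog_rate = recognition_rate(actual, predicted, "dog")

print(f"accuracy: {accuracy:.0%}")          # 95%
print(f"cat recognition: {cat_rate:.0%}")   # 100%
print(f"dog recognition: {dog_rate:.0%}")   # 0%
```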

Assuming the confusion matrix above, its corresponding table of confusion, for the cat class, would be:

5 true positives (actual cats that were correctly classified as cats)	3 false negatives (cats that were incorrectly marked as dogs)
2 false positives (dogs that were incorrectly labeled as cats)	17 true negatives (all the remaining animals, correctly classified as non-cats)

The final table of confusion would contain the average values for all classes combined.
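
A per-class table of confusion can be derived from the multi-class matrix by treating one class as "positive" and all others as "negative". The sketch below does this for each class of the example and then takes the arithmetic mean of the four counts; reading "average values" as a simple mean over classes is an assumption made here, and `table_of_confusion` is an illustrative name.

```python
CLASSES = ["cat", "dog", "rabbit"]

# Rows are actual classes, columns are predicted classes (example above).
MATRIX = [
    [5, 3, 0],
    [2, 3, 1],
    [0, 2, 11],
]


def table_of_confusion(matrix, k):
    """One-vs-rest counts for the class at index k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # rest of row k
    fp = sum(row[k] for row in matrix) - tp  # rest of column k
    tn = total - tp - fn - fp
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


tables = {name: table_of_confusion(MATRIX, i) for i, name in enumerate(CLASSES)}

# Assumed interpretation: unweighted mean of each count over the classes.
average = {
    key: sum(t[key] for t in tables.values()) / len(tables)
    for key in ("TP", "FN", "FP", "TN")
}

for name, table in tables.items():
    print(name, table)
print("average", average)
```

For the cat class this reproduces the counts listed above (5, 3, 2 and 17).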

Let us define an experiment from P positive instances and N negative instances for some condition. The four outcomes can be formulated in a 2×2 confusion matrix, as follows:

		Predicted condition
	Total population	Predicted condition positive	Predicted condition negative	Prevalence = $Σ Condition positive / Σ Total population$
True condition	Condition positive	True positive	False negative (Type II error)	True positive rate (TPR), Sensitivity, Recall, probability of detection = $Σ True positive / Σ Condition positive$	False negative rate (FNR), Miss rate = $Σ False negative / Σ Condition positive$
	Condition negative	False positive (Type I error)	True negative	False positive rate (FPR), Fall-out, probability of false alarm = $Σ False positive / Σ Condition negative$	True negative rate (TNR), Specificity (SPC) = $Σ True negative / Σ Condition negative$
	Accuracy (ACC) = $(Σ True positive + Σ True negative) / Σ Total population$	Positive predictive value (PPV), Precision = $Σ True positive / Σ Predicted condition positive$	False omission rate (FOR) = $Σ False negative / Σ Predicted condition negative$	Positive likelihood ratio (LR+) = $TPR / FPR$	Diagnostic odds ratio (DOR) = $LR+ / LR−$
		False discovery rate (FDR) = $Σ False positive / Σ Predicted condition positive$	Negative predictive value (NPV) = $Σ True negative / Σ Predicted condition negative$	Negative likelihood ratio (LR−) = $FNR / TNR$
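
The quantities in the table can all be computed from the four cell counts. The following Python sketch applies the formulas above to the cat-class counts from the earlier example; `binary_metrics` is an illustrative name, and the function assumes every denominator is non-zero (it does not handle degenerate cases such as a class with no positive predictions).

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Derived rates for a 2x2 table; assumes all denominators are non-zero."""
    p = tp + fn   # condition positive
    n = fp + tn   # condition negative
    total = p + n

    tpr, fnr = tp / p, fn / p
    fpr, tnr = fp / n, tn / n
    ppv, fdr = tp / (tp + fp), fp / (tp + fp)
    npv, f_o_r = tn / (tn + fn), fn / (tn + fn)
    lr_pos = tpr / fpr
    lr_neg = fnr / tnr

    return {
        "prevalence": p / total,
        "ACC": (tp + tn) / total,
        "TPR": tpr, "FNR": fnr, "FPR": fpr, "TNR": tnr,
        "PPV": ppv, "FDR": fdr, "NPV": npv, "FOR": f_o_r,
        "LR+": lr_pos, "LR-": lr_neg, "DOR": lr_pos / lr_neg,
        "F1": 2 * tp / (2 * tp + fp + fn),
        "MCC": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Cat-class counts from the table of confusion above.
metrics = binary_metrics(tp=5, fn=3, fp=2, tn=17)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Complementary pairs such as TPR and FNR, or PPV and FDR, sum to 1, which gives a quick sanity check on any implementation.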

References

1. Fawcett, Tom (2006). "An Introduction to ROC Analysis". Pattern Recognition Letters. 27 (8): 861–874. doi:10.1016/j.patrec.2005.10.010.
2. Powers, David M W (2011). "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation". Journal of Machine Learning Technologies. 2 (1): 37–63.
3. Ting, Kai Ming (2011). Encyclopedia of machine learning. Springer. ISBN 978-0-387-30164-8.
4. Stehman, Stephen V. (1997). "Selecting and interpreting measures of thematic classification accuracy". Remote Sensing of Environment. 62 (1): 77–89. doi:10.1016/S0034-4257(97)00083-7.

This article is issued from Wikipedia - version of the 10/11/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.