Evaluation of a classifier by confusion matrix in data mining

How to evaluate a classifier?

A classifier can be evaluated by building a confusion matrix. The confusion matrix shows the number of correct and incorrect predictions for each class.

The confusion matrix for the class labels positive (+VE) and negative (-VE) is shown below. Rows are the predicted class (model) and columns are the actual class (target):

                 Actual +VE        Actual -VE
Predicted +VE    A = True +VE      B = False +VE     +VE predictions = A / (A + B)
Predicted -VE    C = False -VE     D = True -VE      -VE predictions = D / (C + D)
                 Sensitivity =     Specificity =     Accuracy =
                 A / (A + C)       D / (B + D)       (A + D) / (A + B + C + D)
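As a minimal sketch of how the four cells might be filled in, the Python snippet below counts A, B, C and D from two lists of labels. The function name `confusion_counts` and the sample labels are hypothetical, chosen only for illustration; they are not part of the tutorial.

```python
def confusion_counts(actual, predicted, positive="+VE"):
    """Return (A, B, C, D) using the cell names of the matrix above."""
    a = b = c = d = 0
    for truth, guess in zip(actual, predicted):
        if guess == positive and truth == positive:
            a += 1  # A: predicted +VE, actually +VE
        elif guess == positive:
            b += 1  # B: predicted +VE, actually -VE
        elif truth == positive:
            c += 1  # C: predicted -VE, actually +VE
        else:
            d += 1  # D: predicted -VE, actually -VE
    return a, b, c, d


# Hypothetical labels for eight tuples (illustration only).
actual    = ["+VE", "+VE", "+VE", "-VE", "-VE", "-VE", "-VE", "+VE"]
predicted = ["+VE", "+VE", "-VE", "-VE", "+VE", "-VE", "-VE", "+VE"]

print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
```

Every tuple falls into exactly one cell, so A + B + C + D always equals the number of tuples evaluated.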

Accuracy:

Accuracy is the proportion of all predictions that are correct.

Accuracy = (A + D) / (A + B + C + D)

Error-Rate:

Error rate is the proportion of all predictions that are incorrect.

Error Rate = 1 - Accuracy = (B + C) / (A + B + C + D)
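The two measures above can be sketched in Python as follows. The function names and the counts (A = 40, B = 10, C = 5, D = 45) are assumptions made up for this example, not values from the tutorial.

```python
def accuracy(a, b, c, d):
    """Accuracy = (A + D) / (A + B + C + D)."""
    total = a + b + c + d
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (a + d) / total


def error_rate(a, b, c, d):
    """Error Rate = 1 - Accuracy."""
    return 1 - accuracy(a, b, c, d)


# Hypothetical counts (illustration only).
A, B, C, D = 40, 10, 5, 45

print(f"accuracy   = {accuracy(A, B, C, D):.2f}")
print(f"error rate = {error_rate(A, B, C, D):.2f}")

# The parentheses around (A + D) matter: without them, operator precedence
# gives A + (D / total), which is not a proportion at all.
print(A + D / (A + B + C + D))
```

The last line is only there to show the precedence pitfall; the numerator must be grouped before dividing.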

+VE predictions:

+VE predictions is the proportion of +VE predictions that are correct, i.e. the tuples predicted as +VE that are actually +VE.

+VE predictions = A / (A+B)

-VE predictions:

-VE predictions is the proportion of -VE predictions that are correct, i.e. the tuples predicted as -VE that are actually -VE.

-VE predictions = D / (C + D)
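A short Python sketch of both proportions follows, assuming the same cell names. The helper names and counts are hypothetical, and returning NaN when a class is never predicted is a design choice of this sketch rather than something the tutorial prescribes.

```python
import math


def positive_predictions(a, b):
    """+VE predictions = A / (A + B); NaN if nothing was predicted +VE."""
    return a / (a + b) if (a + b) else math.nan


def negative_predictions(c, d):
    """-VE predictions = D / (C + D); NaN if nothing was predicted -VE."""
    return d / (c + d) if (c + d) else math.nan


# Hypothetical counts (illustration only).
A, B, C, D = 40, 10, 5, 45

print(f"+VE predictions = {positive_predictions(A, B):.2f}")
print(f"-VE predictions = {negative_predictions(C, D):.2f}")
```

Each measure uses only one row of the matrix: the +VE row (A, B) or the -VE row (C, D).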

Precision:

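Precision is a measure of correctness: out of the tuples the classifier predicted for a class, how many actually belong to that class, i.e. tuples that are

  • +VE and classifier predicted them as +VE
  • -VE and classifier predicted them as -VE

For the +VE class, precision is the same as the +VE predictions measure:

Precision = A / (A + B)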

Recall:

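Recall is the proportion of the actual +VE tuples that the classifier predicted as +VE.

Recall = A / (actual +VE tuples) = A / (A + C)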

Sensitivity (Recall):

Sensitivity is the true +VE rate.

Sensitivity is the proportion of the actual +VE cases that are correctly identified.

Sensitivity (Recall) = A / (A + C)

F-Measure:

F-Measure is the harmonic mean of precision and recall.

F-Measure = (2 * Precision * Recall) / (Precision + Recall)
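To illustrate how precision, recall and the F-Measure fit together, here is a minimal Python sketch for the +VE class. The function names and counts are hypothetical, and returning 0.0 when both precision and recall are zero is an assumption of this sketch.

```python
def precision(a, b):
    """Precision for the +VE class = A / (A + B)."""
    return a / (a + b)


def recall(a, c):
    """Recall (sensitivity) = A / (A + C)."""
    return a / (a + c)


def f_measure(p, r):
    """Harmonic mean of precision p and recall r."""
    if p + r == 0:
        return 0.0  # assumption: report 0 when both inputs are 0
    return (2 * p * r) / (p + r)


# Hypothetical counts (illustration only).
A, B, C, D = 40, 10, 5, 45

p = precision(A, B)
r = recall(A, C)
print(f"precision = {p:.3f}, recall = {r:.3f}, F-Measure = {f_measure(p, r):.3f}")
```

The harmonic mean is never larger than the arithmetic mean of the two values, so a classifier cannot get a high F-Measure by maximizing only one of them. In terms of the matrix cells, the same value can be written as 2A / (2A + B + C).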

Specificity:

Specificity is true -VE rate.

Specificity is the proportion of the actual -VE cases that are correctly identified.

Specificity = D / (B + D)

Note: The specificity of one class is the same as the sensitivity of the other class.
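The note can be checked with a small Python sketch. Treating -VE as the class of interest mirrors the matrix, so the cells swap roles (A with D, B with C). The function names and counts below are hypothetical, chosen only for illustration.

```python
def sensitivity(a, c):
    """Sensitivity = A / (A + C)."""
    return a / (a + c)


def specificity(b, d):
    """Specificity = D / (B + D)."""
    return d / (b + d)


# Hypothetical counts with +VE as the class of interest (illustration only).
A, B, C, D = 40, 10, 5, 45

# The same predictions with -VE treated as the class of interest:
# the matrix is mirrored, so A <-> D and B <-> C.
A2, B2, C2, D2 = D, C, B, A

print(specificity(B, D) == sensitivity(A2, C2))  # True
print(sensitivity(A, C) == specificity(B2, D2))  # True
```

Both comparisons hold because each pair of expressions reduces to the same fraction: D / (B + D) in the first case and A / (A + C) in the second.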
