Issue
I have a model that can predict 10 classes. I have predicted 26 samples, but none of them belongs to class 3 (= 'jumping_jacks'), so neither my labels y_test nor my predictions y_pred contain this class. In this case I would expect the confusion matrix to show both the row "True label jumping_jacks" and the column "Predicted label jumping_jacks" full of zeros.
However, it does show predictions for class 3. Those are actually the predictions for class 4 (= 'lateral_shoulder_raises'). So everything is shifted by one position, starting from the row/column for class 3 up until the end. This is also why the matrix does not contain results for class 9 (= 'tricep_extensions'), although y_test and y_pred contain this class.
How can I fix this?
Reproducible Code:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.preprocessing import LabelEncoder
ex_classes = {'Classes': ['bicep_curls', 'dumbbell_rows', 'dumbbell_shoulder_press', 'jumping_jacks',
                          'lateral_shoulder_raises', 'lunges', 'pushups', 'situps', 'squats',
                          'tricep_extensions']}
df_classes = pd.DataFrame(data=ex_classes)
label_enc = LabelEncoder()
label_enc.fit(df_classes['Classes'])
y_test = np.asarray([8, 8, 8, 6, 6, 6, 2, 2, 2, 5, 5, 5, 1, 1, 1, 7, 7, 7, 9, 9, 9, 0, 0, 0, 0, 4])
y_pred = np.asarray([8, 4, 4, 6, 6, 6, 2, 2, 2, 5, 5, 5, 1, 1, 1, 9, 7, 7, 9, 9, 9, 0, 0, 1, 0, 4])
cm = confusion_matrix(y_test, y_pred)
display = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels = label_enc.classes_)
fig, ax = plt.subplots(figsize=(10,10))
display.plot(ax=ax, xticks_rotation='vertical')
plt.show()
Solution
You need to specify labels when calculating the confusion matrix:
cm = confusion_matrix(y_test, y_pred, labels=np.arange(len(df_classes)))
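Neither the predictions nor the ground truth labels contain label 3. When labels is not given, sklearn builds the matrix only from the labels that actually occur in y_test or y_pred, so it internally shifts the labels. The sklearn source notes: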
# If labels are not consecutive integers starting from zero, then
# y_true and y_pred must be converted into index form
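To make the shift concrete, here is a simplified pure-Python sketch of the index mapping. It is an illustration only, not sklearn's actual implementation, and the helper name count_matrix is hypothetical. It shows that when only the observed labels are used, label 4 lands at index 3 and the matrix shrinks to 9x9:

```python
def count_matrix(y_true, y_pred, labels=None):
    """Toy confusion matrix (hypothetical helper, not sklearn code).

    Rows are true labels, columns are predicted labels.
    """
    if labels is None:
        # Mimic the default: use only the labels that actually occur.
        labels = sorted(set(y_true) | set(y_pred))
    # Map each label to its row/column position in the matrix.
    label_to_index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        matrix[label_to_index[t]][label_to_index[p]] += 1
    return labels, matrix


y_test = [8, 8, 8, 6, 6, 6, 2, 2, 2, 5, 5, 5, 1, 1, 1, 7, 7, 7, 9, 9, 9, 0, 0, 0, 0, 4]
y_pred = [8, 4, 4, 6, 6, 6, 2, 2, 2, 5, 5, 5, 1, 1, 1, 9, 7, 7, 9, 9, 9, 0, 0, 1, 0, 4]

# Without explicit labels: label 3 never occurs, so label 4 takes index 3.
seen_labels, cm_seen = count_matrix(y_test, y_pred)
# With explicit labels: every class keeps its own row and column.
all_labels, cm_all = count_matrix(y_test, y_pred, labels=list(range(10)))

print(seen_labels)                # labels that occur in the data
print(len(cm_seen), len(cm_all))  # matrix sizes for the two cases
```

Plotting the smaller matrix with all 10 class names then attaches the names to the wrong rows and columns, which matches the behaviour described in the question.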
Results with labels specified via confusion_matrix(..., labels=...): [figure: confusion matrix plot]
Full example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.preprocessing import LabelEncoder
ex_classes = {
    "Classes": [
        "bicep_curls",
        "dumbbell_rows",
        "dumbbell_shoulder_press",
        "jumping_jacks",
        "lateral_shoulder_raises",
        "lunges",
        "pushups",
        "situps",
        "squats",
        "tricep_extensions",
    ]
}
df_classes = pd.DataFrame(data=ex_classes)
label_enc = LabelEncoder()
label_enc.fit(df_classes["Classes"])
y_test = np.asarray(
    [8, 8, 8, 6, 6, 6, 2, 2, 2, 5, 5, 5, 1, 1, 1, 7, 7, 7, 9, 9, 9, 0, 0, 0, 0, 4]
)
y_pred = np.asarray(
    [8, 4, 4, 6, 6, 6, 2, 2, 2, 5, 5, 5, 1, 1, 1, 9, 7, 7, 9, 9, 9, 0, 0, 1, 0, 4]
)
cm = confusion_matrix(y_test, y_pred, labels=np.arange(len(df_classes)))
display = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=label_enc.classes_)
fig, ax = plt.subplots(figsize=(10, 10))
display.plot(ax=ax, xticks_rotation="vertical")
plt.show()
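As a quick sanity check without plotting, the two matrices can be compared directly. This is a minimal sketch that assumes numpy and scikit-learn are installed. The variable names cm_default and cm_fixed are only illustrative:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_test = np.asarray(
    [8, 8, 8, 6, 6, 6, 2, 2, 2, 5, 5, 5, 1, 1, 1, 7, 7, 7, 9, 9, 9, 0, 0, 0, 0, 4]
)
y_pred = np.asarray(
    [8, 4, 4, 6, 6, 6, 2, 2, 2, 5, 5, 5, 1, 1, 1, 9, 7, 7, 9, 9, 9, 0, 0, 1, 0, 4]
)

# Default: only labels that occur in the data get a row/column.
cm_default = confusion_matrix(y_test, y_pred)
# Fixed: one row/column per class, whether it occurs or not.
cm_fixed = confusion_matrix(y_test, y_pred, labels=np.arange(10))

print(cm_default.shape, cm_fixed.shape)
# Row and column 3 ('jumping_jacks') should contain only zeros.
print(cm_fixed[3].sum(), cm_fixed[:, 3].sum())
```

If the first shape has fewer rows than there are class names, the display labels will not line up with the matrix.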
Answered By - u1234x1234