Issue
I want to evaluate with accuracy, precision, recall, and F1 as in the code below, but it shows the same result for all of them.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# First 10 columns are features, column 10 is the target
df = pd.read_csv(r'test.csv')
X = df.iloc[:, :10]
Y = df.iloc[:, 10]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
clf = DecisionTreeClassifier()
clf = clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions, average='micro')
recall = recall_score(y_test, predictions, average='micro')
f1 = f1_score(y_test, predictions, average='micro')
print("Accuracy: ", accuracy)
print("precision: ", precision)
print("recall: ", recall)
print("f1: ", f1)
It shows this output:
Accuracy: 0.8058823529411765
precision: 0.8058823529411765
recall: 0.8058823529411765
f1: 0.8058823529411765
The output is the same value for every metric. How can I fix it?
Solution
According to sklearn's documentation, this behavior is expected when using micro as the average in a multiclass setting:
Note that if all labels are included, “micro”-averaging in a multiclass setting will produce precision, recall and F that are all identical to accuracy.
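As a quick sanity check, here is a minimal sketch with made-up multiclass labels (y_true and y_pred below are hypothetical, not from the question's data) showing all four scores collapsing to the same number:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# 8 samples, 3 classes, 6 correct predictions
y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 2, 2, 2, 1, 0, 1, 1]

print(accuracy_score(y_true, y_pred))                    # 0.75
print(precision_score(y_true, y_pred, average='micro'))  # 0.75
print(recall_score(y_true, y_pred, average='micro'))     # 0.75
print(f1_score(y_true, y_pred, average='micro'))         # 0.75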
Here is a nice blog article describing why these scores can be equal (also with an intuitive example).
TL;DR

- F1 equals recall and precision if recall == precision.
- In the case of micro averaging, the number of false positives always equals the number of false negatives. Thus, recall == precision.
- Finally, note that micro F1 always equals accuracy. See here.
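If you want precision, recall, and F1 values that are not forced to equal accuracy, one common option (not part of the original answer) is to change the averaging strategy, e.g. to macro (unweighted mean over classes) or weighted. A minimal sketch, assuming the same predictions and y_test as in the question's code:

precision = precision_score(y_test, predictions, average='macro')
recall = recall_score(y_test, predictions, average='macro')
f1 = f1_score(y_test, predictions, average='macro')
print("precision (macro): ", precision)
print("recall (macro): ", recall)
print("f1 (macro): ", f1)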
Answered By - Jannik