Sunday, January 16, 2022

[FIXED] Meaning of accuracy for sklearn

Labels: machine-learning, python, scikit-learn

Issue

I am working on a project that predicts the outcomes of sporting events. For each event, I predict the winner and the loser and then place a bet based on that prediction. Across all events, my strategy makes a positive return on 59% of them.

I want to place bets only on events where I expect to win. To do this, I used sklearn to classify events into those on which I can expect to make a profit and those on which I would make a loss. I will then bet only on events classified as profitable. My model has an accuracy of 0.60 and is built and tested with the following code:

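from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# df_classifier holds the features; data_indicator holds the profitable/unprofitable labels
X = df_classifier.values
y = data_indicator.values
# Hold out 30% of the events for testing, keeping the class ratio the same in both splits
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=21, stratify=y)
knn = KNeighborsClassifier(n_neighbors=300)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
print(knn.score(X_test, y_test))  # mean accuracy on the held-out test set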

From my understanding, accuracy is the fraction of events the model predicts correctly. Therefore, if the correct and incorrect predictions are distributed equally over the two outcomes (profitable and unprofitable events), then 20% of the profitable events would be incorrectly categorized, and the same would hold for the losing events.

Would this mean that if I place bets only on events predicted to be profitable, I would increase my rate of making a return on a bet from 59% to (59+20)% = 79%?

Furthermore, if my reasoning is correct, is it possible to see how the correct and incorrect predictions are distributed over the winning and losing events?

Solution

I don't follow your logic, but it doesn't sound right. This is more of a math problem than a programming problem. However, you can see the distribution of correct and incorrect predictions for each outcome by adding these two lines:

from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, y_pred))
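
As a rough sketch of how to read that output: for labels 0 (unprofitable) and 1 (profitable), confusion_matrix puts true labels in rows and predicted labels in columns. The counts below are made up for illustration and are not from the question's data; they were chosen only so that 59% of events are profitable and the accuracy is 0.60. The quantity that matters for the betting strategy is the precision, that is, the share of events predicted profitable that really are profitable. sklearn.metrics.precision_score(y_test, y_pred) should give the same number directly.

```python
# Hypothetical counts (not from the question), laid out the way
# confusion_matrix prints them: rows = true label, columns = predicted label.
cm = [[50, 73],    # true 0: 50 predicted 0 (TN), 73 predicted 1 (FP)
      [47, 130]]   # true 1: 47 predicted 0 (FN), 130 predicted 1 (TP)

(tn, fp), (fn, tp) = cm
total = tn + fp + fn + tp

accuracy = (tp + tn) / total    # share of all events classified correctly
precision = tp / (tp + fp)      # share of placed bets that turn out profitable
recall = tp / (tp + fn)         # share of profitable events that receive a bet

print(f"accuracy  = {accuracy:.3f}")
print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
```

With these made-up counts the accuracy is 0.60, yet only about 64% of the bets placed would be profitable, so accuracy alone does not determine the win rate on predicted-profitable events.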

For more detail, see: https://scikit-learn.org/stable/modules/model_evaluation.html#confusion-matrix Hope this helps.
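
To make the "math problem" concrete, here is a small calculation of the win rate on predicted-profitable events. It rests on one reading of the question's assumption, namely that each class is classified correctly at the same rate as the overall accuracy (60% of profitable and 60% of unprofitable events). That reading is an assumption, and the real split has to come from the confusion matrix.

```python
base_rate = 0.59   # share of all events that are profitable (from the question)
accuracy = 0.60    # test accuracy reported in the question

# Assumed: both classes are classified correctly at the same 60% rate.
true_pos = base_rate * accuracy                # profitable, predicted profitable
false_pos = (1 - base_rate) * (1 - accuracy)   # unprofitable, predicted profitable

# Win rate when betting only on events predicted to be profitable
win_rate = true_pos / (true_pos + false_pos)
print(f"{win_rate:.3f}")
```

Under that assumption the win rate comes out at roughly 68%, not 79%. Misclassification rates cannot simply be added to the base rate; the result depends on how many unprofitable events end up in the predicted-profitable group.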

Answered By - jason wong

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
