Sunday, May 22, 2022

[FIXED] Accuracy not increasing when running multiple LinearRegressions tests

linear-regression, machine-learning, numpy, pandas, python

Issue

I made a very simple program that takes columns of data from a CSV file. Here is a short preview of the file:

,matchId,blue_win,blueGold,blueMinionsKilled,blueJungleMinionsKilled,blueAvgLevel,redGold,redMinionsKilled,redJungleMinionsKilled,redAvgLevel,blueChampKills,blueHeraldKills,blueDragonKills,blueTowersDestroyed,redChampKills,redHeraldKills,redDragonKills,redTowersDestroyed
0,3493250918.0,0,24575.0,349.0,89.0,8.6,25856.0,346.0,80.0,9.2,6.0,1.0,0.0,1.0,12.0,2.0,0.0,1.0
1,3464936341.0,0,27210.0,290.0,36.0,9.0,28765.0,294.0,92.0,9.4,20.0,0.0,0.0,0.0,19.0,2.0,0.0,0.0
2,3428425921.0,1,32048.0,346.0,92.0,9.4,25305.0,293.0,84.0,9.4,17.0,3.0,0.0,0.0,11.0,0.0,0.0,4.0
3,3428347390.0,0,20261.0,223.0,60.0,8.2,30429.0,356.0,107.0,9.4,7.0,0.0,0.0,3.0,16.0,3.0,0.0,0.0
4,3428350940.0,1,30217.0,376.0,110.0,9.8,23889.0,334.0,60.0,8.8,16.0,3.0,0.0,0.0,8.0,0.0,0.0,2.0
5,3494458885.0,1,25470.0,362.0,82.0,9.2,22856.0,319.0,86.0,8.8,9.0,1.0,0.0,0.0,7.0,1.0,0.0,0.0
6,3463320642.0,1,25391.0,350.0,96.0,9.2,23236.0,345.0,80.0,8.6,8.0,2.0,0.0,0.0,5.0,1.0,0.0,1.0
...

I drop the unnecessary columns, then hold out 30% of the data as a test set to measure how accurately the model predicts whether the blue team wins:

import pandas as pd
import numpy as np
import sklearn.model_selection  # import the submodule explicitly so train_test_split is available
from sklearn import linear_model

df = pd.read_csv('MatchTimelinesFirst15.csv', delimiter=',')

predict = "blue_win"

# drop the saved index column and the two dragon-kill columns in one call
df = df.drop(columns=['Unnamed: 0', 'redDragonKills', 'blueDragonKills'])
# print(df.describe())

x = np.array(df.drop([predict], axis=1))
y = np.array(df[predict])


for _ in range(500):
    x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y, test_size=0.30)

    # print('{0}, {1}'.format(type(x_train), x_train))

    linear = linear_model.LinearRegression()

    # trains model
    linear.fit(x_train, y_train)

    acc = linear.score(x_test, y_test)

    print('Accuracy: {0}'.format(acc))

But my accuracy won't increase, even though I train it in a loop 500 times. I keep getting the same range of results:

Accuracy: 0.39030223064480596
Accuracy: 0.3980014684661366
Accuracy: 0.3840247556358104
Accuracy: 0.3939949181269252
Accuracy: 0.38657487661026535
Accuracy: 0.3950506154649621
Accuracy: 0.3925506648304995
...

Any help would be greatly appreciated, as would suggestions for improvement, since I am very new to Python and machine learning.

Solution

Your loop does not train the model any further. Each of the 500 iterations creates a new model and fits it from scratch; the only difference between iterations is the random train-test split.
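
To see why the loop does not help, here is a minimal sketch on synthetic data (the real CSV is not used; make_data and its three features are hypothetical stand-ins). Ordinary least squares has a closed-form solution, so fitting a fresh LinearRegression on the same data always yields the same coefficients. The only thing that changes between iterations is which rows land in the test set, so repeated splits tell you how much the score varies, not how to raise it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def make_data(n_samples=1000, seed=0):
    """Hypothetical stand-in for the match data: 3 features, binary target."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_samples, 3))
    noise = rng.normal(scale=1.0, size=n_samples)
    y = (x[:, 0] - x[:, 1] + noise > 0).astype(int)
    return x, y


x, y = make_data()

# Fitting twice on identical data gives identical coefficients:
# no state is carried over, so there is nothing to "train further".
first = LinearRegression().fit(x, y)
second = LinearRegression().fit(x, y)
same_coefficients = np.allclose(first.coef_, second.coef_)

# Repeating the split only changes which rows end up in the test set.
scores = []
for seed in range(50):
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.30, random_state=seed
    )
    model = LinearRegression().fit(x_train, y_train)
    scores.append(model.score(x_test, y_test))  # R^2, not accuracy

scores = np.array(scores)
print("Same coefficients on refit:", same_coefficients)
print("R^2 over 50 splits: mean={:.3f}, std={:.3f}".format(scores.mean(), scores.std()))
```

Reporting the mean and standard deviation over the splits is a more useful summary than printing 500 individual scores.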

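As for improving your classifier, I would steer away from linear regression. Regression is not the same thing as classification: classification predicts categorical class labels, whereas regression predicts a continuous quantity. For the same reason, the value returned by LinearRegression.score is the coefficient of determination (R²), not an accuracy.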
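
The difference shows up directly in the predictions. In this sketch (synthetic data, with hypothetical variable names), LinearRegression.predict returns real-valued numbers rather than class labels, and LinearRegression.score reports R². Thresholding the outputs at 0.5 is one crude way to turn them into labels so that an actual accuracy can be computed; the threshold is an assumption for illustration, not something from the question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary target driven by two of three features (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 3))
y = (x[:, 0] - x[:, 1] + rng.normal(size=1000) > 0).astype(int)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.30, random_state=0
)

reg = LinearRegression().fit(x_train, y_train)
raw = reg.predict(x_test)       # continuous values, not restricted to 0 or 1
r2 = reg.score(x_test, y_test)  # coefficient of determination (R^2)

# Convert the continuous outputs into class labels with a 0.5 cut-off.
labels = (raw >= 0.5).astype(int)
accuracy = accuracy_score(y_test, labels)

print("First raw predictions:", np.round(raw[:5], 3))
print("R^2: {:.3f}".format(r2))
print("Accuracy after thresholding: {:.3f}".format(accuracy))
```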

Since you want to find out when the blue team wins, you have a binary classification problem. Either the blue team wins or it doesn't.

Try a classification model such as an SVM.
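
A minimal sketch of that suggestion, using synthetic data in place of the CSV (gold_diff and kill_diff are hypothetical features). The scaling step and the use of cross_val_score are additions that are not in the original answer: SVMs are sensitive to feature scale, and columns such as gold and kill counts differ by orders of magnitude.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two hypothetical features on very different scales.
gold_diff = rng.normal(scale=3000.0, size=600)
kill_diff = rng.normal(scale=5.0, size=600)
x = np.column_stack([gold_diff, kill_diff])
y = (gold_diff / 3000.0 + kill_diff / 5.0 + rng.normal(size=600) > 0).astype(int)

# Standardise the features, then fit a support vector classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# 5-fold cross-validation; for a classifier the default score is accuracy.
cv_scores = cross_val_score(clf, x, y, cv=5)
mean_accuracy = cv_scores.mean()

print("Accuracy per fold:", np.round(cv_scores, 3))
print("Mean accuracy: {:.3f}".format(mean_accuracy))
```

With the real data, x and y would be the arrays built from the DataFrame in the question instead of the synthetic ones here.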

Good luck!

Answered By - Ethan Van den Bleeken

This answer was collected from Stack Overflow and tested by PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
