Sunday, March 27, 2022

[FIXED] Can't get SVC Score function to work

Labels: machine-learning, numpy, python

Issue

I am trying to run this machine learning platform and I get the following error:

ValueError: X.shape[1] = 574 should be equal to 11, the number of features at training time

My Code:

from pylab import *
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
import numpy as np

X = list ()
Y = list ()
validationX = list ()
validationY = list ()
file = open ('C:\\Users\\User\\Desktop\\csci4113\\project1\\whitewineTraining.txt','r')
for eachline in file:
    strArray = eachline.split(";")
    row = list ()
    for i in range(len(strArray) - 1):
        row.append(float(strArray[i])) 
    X.append(row)
    if (int(strArray[-1]) > 6):
        Y.append(1)
    else:
        Y.append(0)
file2 = open ('C:\\Users\\User\\Desktop\\csci4113\\project1\\whitewineValidation.txt', 'r')
for eachline in file2:
    strArray = eachline.split(";")
    row2 = list ()
    for i in range(len(strArray) - 1):
        row2.append(float(strArray[i])) 
    validationX.append(row2)      
    if (int(strArray[-1]) > 6):
        validationY.append(1)
    else:
        validationY.append(0)

X = np.array(X)
print (X)
Y = np.array(Y)
print (Y)
validationX = np.array(validationX)
validationY = np.array(validationY)

clf = SVC()  # SVC is imported directly above; the name `svm` is never imported
clf.fit(X,Y)
result = clf.predict(validationX)
clf.score(result, validationY)

The goal of the program is to build a model with fit() and then compare its predictions against the validation labels in validationY to see how well the model performs. Here is the rest of the console output. Keep in mind that X is, confusingly, an 11x574 array!

[[  7.           0.27         0.36       ...,   3.           0.45         8.8       ]
 [  6.3          0.3          0.34       ...,   3.3          0.49         9.5       ]
 [  8.1          0.28         0.4        ...,   3.26         0.44        10.1       ]
 ..., 
 [  6.3          0.28         0.22       ...,   3.           0.33        10.6       ]
 [  7.4          0.16         0.33       ...,   3.04         0.68        10.5       ]
 [  8.4          0.27         0.3        ...,   2.89         0.3
   11.46666667]]
[0 0 0 ..., 0 1 0]
C:\Users\User\Anaconda3\lib\site-packages\sklearn\utils\validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
Traceback (most recent call last):

  File "<ipython-input-68-31c649fe24b3>", line 1, in <module>
    runfile('C:/Users/User/Desktop/csci4113/project1/program1.py', wdir='C:/Users/User/Desktop/csci4113/project1')

  File "C:\Users\User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
    execfile(filename, namespace)

  File "C:\Users\User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 89, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/User/Desktop/csci4113/project1/program1.py", line 43, in <module>
    clf.score(result, validationY)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\base.py", line 310, in score
    return accuracy_score(y, self.predict(X), sample_weight=sample_weight)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 568, in predict
    y = super(BaseSVC, self).predict(X)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 305, in predict
    X = self._validate_for_predict(X)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\base.py", line 474, in _validate_for_predict
    (n_features, self.shape_fit_[1]))

ValueError: X.shape[1] = 574 should be equal to 11, the number of features at training time



Solution

You are simply passing the wrong object to the score function. The documentation clearly states:

score(X, y, sample_weight=None)

X : array-like, shape = (n_samples, n_features) Test samples.

You pass the predictions instead, so

result = clf.predict(validationX)
clf.score(result, validationY)

is invalid, and should be just

clf.score(validationX, validationY)

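What you tried to do would be fine with a scorer (a metric function such as sklearn.metrics.accuracy_score, which takes the true labels and the predictions), but not with a classifier. A classifier's .score method calls .predict on its own, so you pass it the raw data as an argument. This also explains the error message: result is a 1-D array of 574 predictions, which scikit-learn apparently treated as a single sample with 574 features (hence the DeprecationWarning about 1d arrays), while the model was trained on 11 features.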
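
A minimal sketch of the two valid ways to get the accuracy, under stated assumptions: the wine files are not available here, so the data comes from make_classification with 11 features as a synthetic stand-in, and the 300/100 split is arbitrary. The variable names follow the question. It is only meant to illustrate that clf.score on the raw validation data and accuracy_score on your own predictions give the same number.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

# Assumption: synthetic stand-in for the wine data (11 features, binary labels).
X_all, y_all = make_classification(n_samples=400, n_features=11, random_state=0)
X, validationX = X_all[:300], X_all[300:]
Y, validationY = y_all[:300], y_all[300:]

clf = SVC()
clf.fit(X, Y)

# Option 1: pass the raw validation features; score() calls predict() itself.
score_direct = clf.score(validationX, validationY)

# Option 2: predict yourself, then hand labels and predictions to a metric.
result = clf.predict(validationX)
score_manual = accuracy_score(validationY, result)

print(score_direct, score_manual)
```

Both calls compute the mean accuracy on the same predictions, so either form works; the first is shorter, the second is handy when you also need result for something else.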

Answered By - lejlot

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
