Issue
My code doesn't work, and I think it is because X and Y are not defined. I got the code from a book, and it doesn't say how they are defined.
import pandas as pd
from matplotlib import pyplot
import seaborn as sns
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2
from sklearn.datasets import load_digits
from pandas import read_csv
from pandas.plotting import scatter_matrix
filename = '/Users/rahulparmeshwar/Documents/Algo Bots/Data/Live Data/Tester.csv'
data = read_csv(filename)
correlation = data.corr()
bestfeatures = SelectKBest(k=5)
fit = bestfeatures.fit(X,Y)
dfscores = pd.DataFrame(fit.scores_)
dfcolumns = pd.DataFrame(X.columns)
featurescores = pd.concat([dfcolumns,dfscores],axis=1)
pd.set_option('display.width',100)
data.head(1)
print(data)
scatter_matrix(data)
pyplot.show()
print(featurescores.nlargest('2,score'))
I've checked the scikit-learn documentation, but it was not very helpful. Any help will be greatly appreciated.
Solution
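X and y should be the feature set and target variable that you loaded from your data file. Python names are case-sensitive, so pass the same name to fit that you define here (the code in the question uses Y). This is one typical way to define them: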
data = read_csv(filename)
y = data['target variable name']
X = data.drop('target variable name', axis=1)
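To see the whole flow end to end, here is a minimal sketch that runs without the CSV file. The small DataFrame and its column names (feature_a, feature_b, feature_c, target) are made up for illustration and stand in for whatever Tester.csv contains. It also assumes chi2 as the scoring function, since the question imports it. chi2 only accepts non-negative feature values, so a different score function may suit your data better.

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical stand-in for the CSV data; the column names are assumptions.
data = pd.DataFrame({
    "feature_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "feature_b": [5, 3, 6, 2, 7, 1, 8, 4],
    "feature_c": [0, 0, 1, 0, 9, 8, 9, 10],
    "target":    [0, 0, 0, 0, 1, 1, 1, 1],
})

# Split the frame into the target column (y) and everything else (X).
y = data["target"]
X = data.drop("target", axis=1)

# Score every feature against the target and keep the best two.
bestfeatures = SelectKBest(score_func=chi2, k=2)
fit = bestfeatures.fit(X, y)

# Pair each column name with its score, using named columns so that
# nlargest can refer to the score column by name.
featurescores = pd.DataFrame({"feature": X.columns, "score": fit.scores_})
top_features = featurescores.nlargest(2, "score")
print(top_features)
```

Note that nlargest takes the row count and the column name as two separate arguments, so nlargest(2, "score") works where nlargest('2,score') from the question would not.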
Answered By - Bill the Lizard