Issue
Could anyone help me debug this error? I appreciate it!
   Col_1  Col_2  Col_3
0      8      0      0
1      0      1      0
2      0      0      1
3      8      0      0
'''
import pandas as pd
import numpy as np
import sklearn
from sklearn import linear_model
from sklearn.utils import shuffle
data = pd.read_csv("simple_data_2.csv")
print(data.shape)
data.dropna()
data = data[["Col_1","Col_2","Col_3"]]
predict ="Col_1"
gene1 = "Col_2"
gene2 = "Col_3"
data = data.dropna()
data = data.reset_index(drop=True)
print(data)
x = np.array(data[gene1,gene2])
y = np.array(data[predict])
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x,y, test_size=0.1)
linear = linear_model.LinearRegression()
linear.fit(x_train,y_train)
acc = linear.score(x_test, y_test)
print(acc)
'''
Traceback (most recent call last):
  File "/Users/Chris/opt/anaconda3/envs/tf/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: ('Col_2', 'Col_3')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/Chris/PycharmProjects/TensorEnv/debug.py", line 16, in <module>
    x = np.array(data[gene1,gene2])
  File "/Users/Chris/opt/anaconda3/envs/tf/lib/python3.7/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Users/Chris/opt/anaconda3/envs/tf/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: ('Col_2', 'Col_3')
Solution
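The error is in how you select the columns, not in how you named them. data[gene1,gene2] passes the tuple ('Col_2', 'Col_3') as a single key, so pandas looks for one column with that name and raises the KeyError you see. To select several columns, pass a list of names inside the brackets:
x = np.array(data[[gene1, gene2]])
I wouldn't even bother with the gene1 and gene2 variables, though. Just use the actual column names when feeding data into the algorithm, or rename the columns; otherwise you will confuse yourself.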
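As a minimal sketch of the KeyError from the traceback and one way around it, the snippet below rebuilds the four rows from the question inline instead of reading simple_data_2.csv (the inline DataFrame and the `tuple_key_failed` flag are assumptions for illustration, not part of the original script). It shows that a tuple key fails while a list of column names returns the two feature columns.

```python
import numpy as np
import pandas as pd

# Inline copy of the four rows shown in the question
# (assumed stand-in for simple_data_2.csv).
data = pd.DataFrame({
    "Col_1": [8, 0, 0, 8],
    "Col_2": [0, 1, 0, 0],
    "Col_3": [0, 0, 1, 0],
})

predict = "Col_1"
gene1 = "Col_2"
gene2 = "Col_3"

# data[gene1, gene2] is the same as data[("Col_2", "Col_3")]:
# pandas treats the tuple as ONE column label, which does not exist.
try:
    data[gene1, gene2]
    tuple_key_failed = False
except KeyError:
    tuple_key_failed = True

# A list of labels selects several columns and returns a DataFrame.
x = np.array(data[[gene1, gene2]])
y = np.array(data[predict])

print(tuple_key_failed)   # True
print(x.shape, y.shape)   # (4, 2) (4,)
```

With `x` built this way, the rest of the script can stay as it is. Importing `train_test_split` explicitly with `from sklearn.model_selection import train_test_split` is also a bit safer than calling `sklearn.model_selection.train_test_split` after a bare `import sklearn`, since it does not rely on the submodule having been loaded by another import.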
Answered By - user4718221