Issue
I'm writing code to implement k-fold cross-validation.
data = pd.read_csv('Data_assignment1.csv')
k=10
np.random.shuffle(data.values) # Shuffle all rows
folds = np.array_split(data, k) # split the data into k folds
for i in range(k):
    x_cv = folds[i][:, 0] # Set ith fold for testing
    y_cv = folds[i][:, 1]
    new_folds = np.row_stack(np.delete(folds, i, 0)) # Remove ith fold for training
    x_train = new_folds[:, 0] # Set the remaining folds for training
    y_train = new_folds[:, 1]
When trying to set the values for x_cv and y_cv, I get the error:
TypeError: '(slice(None, None, None), 0)' is an invalid key
In an attempt to solve this, I tried using folds.iloc[i][:, 0].values etc:
for i in range(k):
    x_cv = folds.iloc[i][:, 0].values # Set ith fold for testing
    y_cv = folds.iloc[i][:, 1].values
    new_folds = np.row_stack(np.delete(folds, i, 0)) # Remove ith fold for training
    x_train = new_folds.iloc[:, 0].values # Set the remaining folds for training
    y_train = new_folds.iloc[:, 1].values
I then got the error:
AttributeError: 'list' object has no attribute 'iloc'
How can I get around this?
Solution
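folds = np.array_split(data, k)
returns a list of DataFrames, so type(folds) == list.
- This is why you got
AttributeError: 'list' object has no attribute 'iloc'
List objects don't have an iloc method.
- So you first need to index into the list to get each DataFrame object:
folds[i]
Here type(folds[i]) == pandas.DataFrame.
- Now use iloc on the DataFrame object:
folds[i].iloc[:, 0].values
The first error has the same root: folds[i] is a DataFrame, and plain [] on a DataFrame selects columns by label, so [:, 0] is not a valid key there. Positional row/column slicing has to go through iloc.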
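Putting the steps above together, here is a minimal sketch of the whole loop. It makes a few assumptions that are not in the original post: a small synthetic two-column DataFrame stands in for Data_assignment1.csv, rows are shuffled with data.sample(frac=1) rather than np.random.shuffle(data.values), the folds are built by splitting row positions so that folds is explicitly a list of DataFrames, and the training folds are joined with pd.concat instead of np.row_stack(np.delete(...)). Treat it as one possible way to apply the fix, not the only one.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for Data_assignment1.csv (assumption: two numeric columns).
data = pd.DataFrame({
    "x": np.arange(25, dtype=float),
    "y": np.arange(25, dtype=float) * 2.0,
})
k = 10

# Shuffle all rows; sample(frac=1) returns a shuffled copy of the frame.
data = data.sample(frac=1, random_state=0).reset_index(drop=True)

# Split the row positions into k nearly equal groups, then slice the
# DataFrame with each group, so `folds` is a plain list of DataFrames.
positions = np.array_split(np.arange(len(data)), k)
folds = [data.iloc[idx] for idx in positions]

fold_sizes = []
for i in range(k):
    # Index the list first, then use iloc on the resulting DataFrame.
    x_cv = folds[i].iloc[:, 0].values
    y_cv = folds[i].iloc[:, 1].values

    # Concatenate every fold except the i-th one for training.
    train = pd.concat(folds[:i] + folds[i + 1:])
    x_train = train.iloc[:, 0].values
    y_train = train.iloc[:, 1].values

    fold_sizes.append((len(x_cv), len(x_train)))
```

Slicing the list (folds[:i] + folds[i + 1:]) avoids handing a list of unequal-length DataFrames to np.delete, which is where the original training-set line would likely run into trouble next.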
Answered By - Inyoung Kim 김인영