Issue
After fitting a decision tree, I want to make some predictions.
How can I make the predicted output more readable? As you can see, the output echoes the input feature values, but how can I show the corresponding column names above the values? For example, the first value is 0.28945, which corresponds to a column named time_ms, but since I have multiple columns I can't tell them apart. How can I display the names above the input values?
These input values are classified as [1], which corresponds to the value "yes". How can I display yes instead of 1?
Finally, is it useful (or the best way) to make predictions like this when there are many features, and not just two as in the Iris dataset (petal length, petal width)?
clf_dt_grid = grid_search_cv.fit(X_train, y_train)
y_pred_grid = grid_search_cv.predict(X_test)
# Predictions on training set
yhat = clf_dt_grid.predict(X_train)
acc = accuracy_score(y_train, yhat)  # returns a fraction between 0 and 1
print(f'Train predictions accuracy : {acc:.2%}')
#New input for predictions
new_input = [[0.28945,6593,0,178,2,154.5,True,6593.0,4.0,0,1,2.0,0.00,862.0,0.21,524103.0,2]]
new_output = clf_dt_grid.predict(new_input)
print(new_input,new_output)
>>> [[0.28945, 6593, 0, 178, 2, 154.5, True, 6593.0, 4.0, 0, 1, 2.0, 0.0, 862.0, 0.21, 524103.0, 2]] [1]
Solution
You can put your train/test data into a pandas dataframe and then everything would be more readable.
For example, list out the column names and then use the dataframe constructor like this:
col_names = ['time_ms', 'col2', 'col3',..., 'colN']
train_df = pd.DataFrame(columns=col_names, data=X_train)
test_df = pd.DataFrame(columns=col_names, data=X_test)
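As a rough, self-contained sketch of that constructor call: the array below is a made-up stand-in for X_train, and every column name except time_ms is a placeholder, since the question does not list the real names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for X_train: 4 rows, 3 features (made-up values).
X_train = np.array([
    [0.28945, 6593, 0],
    [0.51200, 1200, 1],
    [0.10300, 4800, 0],
    [0.77000, 300, 1],
])

# Only 'time_ms' comes from the question; the other names are placeholders.
col_names = ['time_ms', 'col2', 'col3']

# Attach the names so every value is printed under its column header.
train_df = pd.DataFrame(data=X_train, columns=col_names)
print(train_df.head())
```

The list of names must have exactly as many entries as the array has columns, otherwise the constructor raises an error.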
Then, for your new predictions, you can structure the input the same way in a data frame (e.g. call it new_input), then call predict:
new_output = clf_dt_grid.predict(new_input.iloc[[0]].values)  # first row only; double brackets keep it 2D, as predict expects
print(new_input.iloc[0], new_output)
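To also show yes instead of 1, one option is to map the predicted class through a small dictionary. The sketch below is only an illustration under stated assumptions: the toy training data, the placeholder column names, and the 0/1 to no/yes mapping are all made up here, and a plain DecisionTreeClassifier stands in for the grid-searched model.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Placeholder names; only 'time_ms' appears in the question.
col_names = ['time_ms', 'col2', 'col3']

# Toy training data (made up) with a clean split between the two classes.
train_df = pd.DataFrame(
    [[0.1, 100, 0], [0.2, 200, 0], [0.8, 900, 1], [0.9, 950, 1]],
    columns=col_names,
)
y_train = [0, 0, 1, 1]

clf = DecisionTreeClassifier(random_state=0).fit(train_df, y_train)

# One new sample, stored as a one-row data frame with the same columns.
new_input = pd.DataFrame([[0.85, 920, 1]], columns=col_names)
pred = clf.predict(new_input)

# Assumed encoding: 1 means "yes", 0 means "no".
label_map = {0: 'no', 1: 'yes'}
labels = [label_map[int(p)] for p in pred]

print(new_input.to_string(index=False))  # values under their column names
print('Prediction:', labels[0])
```

If the target column had been left as the strings "yes" and "no" when fitting, predict would return those strings directly and no mapping would be needed.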
Answered By - TC Arlen