Issue
I recently wrote some trial code for a decision tree. It works fine except for one thing: the plotted tree does not contain the class names. Am I doing something wrong?
Please see the code below and a picture of the data set.
#Import Data#
import pandas as pd
data_set = pd.read_excel(r"C:\Users\User\Desktop\Tree.xlsx")
print(data_set.head())
#Set Features and Training Targets#
features_names=["Money","Debt"]
target_names=["Mood1", "Mood2", "Mood3"]
features = data_set[features_names]
targets = data_set[target_names]
print(features)
print(targets)
#Set Training Set and Test Set#
train_features = features[:10]
train_targets = targets[:10]
test_features = features[10:]
test_targets = targets[10:]
print(train_features)
print(train_targets)
print(test_features)
print(test_targets)
#Estimating Tree#
from sklearn.tree import DecisionTreeRegressor
dt = DecisionTreeRegressor(max_depth = 3)
dt = dt.fit(train_features, train_targets)
print(dt.score(train_features, train_targets))
print(dt.score(test_features, test_targets))
#Plotting the Tree#
from sklearn import tree
import matplotlib.pyplot as plt
tree.plot_tree(dt, feature_names=features_names, class_names=target_names, filled = True)
plt.show()
Solution
In regression tasks, class labels are not drawn: the documentation states that the class_names
parameter is "Only relevant for classification". Your code fits a DecisionTreeRegressor, so class_names is ignored.
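To make the difference visible, here is a minimal sketch (not taken from the question) that fits a regressor and a classifier on the iris data and checks whether any plotted node contains a "class = ..." line. The variable names and the choice of max_depth=2 are illustrative assumptions.

```python
import matplotlib
# Agg renders off-screen so the sketch runs without a display;
# remove this line and call plt.show() to view the plots.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor, plot_tree

X, y = load_iris(return_X_y=True)
names = ["setosa", "versicolor", "virginica"]

# Same data, same depth; only the estimator type differs.
reg = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

fig, (ax_reg, ax_clf) = plt.subplots(1, 2, figsize=(12, 4))
# plot_tree returns the list of text annotations it drew, one per node.
reg_nodes = plot_tree(reg, class_names=names, ax=ax_reg)
clf_nodes = plot_tree(clf, class_names=names, ax=ax_clf)

# Check whether any node label carries a "class = ..." line.
reg_has_class = any("class = " in node.get_text() for node in reg_nodes)
clf_has_class = any("class = " in node.get_text() for node in clf_nodes)
print("regressor shows classes: ", reg_has_class)
print("classifier shows classes:", clf_has_class)
```

Passing class_names to the regressor raises no error here, which is why the problem is easy to miss.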
In your case, the target variable Mood
can be treated as categorical, with its values stored in a single column, and the model fitted with DecisionTreeClassifier instead of DecisionTreeRegressor. Once this is done, you can set
tree.plot_tree(clf, class_names=True)
for symbolic representation of class names
or
class_names = ['setosa', 'versicolor', 'virginica']
tree.plot_tree(clf, class_names=class_names)
for the specific class names.
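For the data in the question, collapsing the targets into a single column might look like the sketch below. It assumes that Mood1, Mood2 and Mood3 are one-hot indicator columns (exactly one of them is 1 per row), which the question does not confirm, and it uses a small made-up DataFrame in place of Tree.xlsx.

```python
import matplotlib
# Agg renders off-screen so the sketch runs without a display;
# remove this line and call plt.show() to view the plot.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Made-up stand-in for Tree.xlsx (assumption: the moods are one-hot encoded).
data_set = pd.DataFrame({
    "Money": [10, 20, 30, 40, 50, 60],
    "Debt": [5, 3, 8, 1, 9, 2],
    "Mood1": [1, 0, 0, 1, 0, 0],
    "Mood2": [0, 1, 0, 0, 1, 0],
    "Mood3": [0, 0, 1, 0, 0, 1],
})

features_names = ["Money", "Debt"]
target_names = ["Mood1", "Mood2", "Mood3"]

# Collapse the three indicator columns into one label per row:
# idxmax(axis=1) returns the name of the column holding the 1.
mood = data_set[target_names].idxmax(axis=1)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(data_set[features_names], mood)

# clf.classes_ lists the labels in the order the tree uses internally,
# so passing it as class_names keeps names and classes aligned.
fig, ax = plt.subplots(figsize=(8, 5))
annotations = plot_tree(
    clf,
    feature_names=features_names,
    class_names=list(clf.classes_),
    filled=True,
    ax=ax,
)
```

If the mood columns are not one-hot (for example, if they hold scores), a different rule for deriving the single label would be needed.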
Full Example
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree
iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(max_leaf_nodes=3, random_state=0)
clf.fit(X_train, y_train)
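# Symbolic class name representation (drawn in its own figure)
plt.figure()
tree.plot_tree(clf, class_names=True)
# Specific class name representation (second figure, so the plots do not overlap)
plt.figure()
class_names = list(iris['target_names'])
tree.plot_tree(clf, class_names=class_names)
plt.show()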
Answered By - Miguel Trejo