Issue
I watched many videos, read the Seaborn documentation, checked many websites, but I still haven't found the answer to a question.
This is from the Seaborn documentation:
iris = sns.load_dataset("iris")
ax = sns.boxplot(data=iris, orient="h", palette="Set2")
This code creates boxplots for each numerical variable in a single graph.
When I tried to add hue="species", I got ValueError: Cannot use hue without x and y. Is there a way to do this with Seaborn? I want to see boxplots of all the numerical variables, split by a categorical variable, so the graph shows every numerical variable for each species. Since there are 3 species, the total number of boxplots will be 12 (3 species times 4 numerical variables).
I am learning about EDA (exploratory data analysis). I think the above graph will help me explore many variables at once.
Thank you for taking the time to read my question!
Solution
To apply "hue", seaborn needs the dataframe in "long" form. df.melt() is a pandas function that can help here. It converts the numeric columns into 2 new columns: one called "variable" with the old name of the column, and one called "value" with the values. The resulting dataframe will be 4 times as long (one row per original row and numeric column), so that "value" can be used for x= and "variable" for y=.
The long form looks like:
| | species | variable | value |
|---|---|---|---|
| 0 | setosa | sepal_length | 5.1 |
| 1 | setosa | sepal_length | 4.9 |
| 2 | setosa | sepal_length | 4.7 |
| 3 | setosa | sepal_length | 4.6 |
| 4 | setosa | sepal_length | 5.0 |
| ... | ... | ... | ... |
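To see what melt() does without loading the full dataset, here is a minimal sketch on a tiny hand-made dataframe. The name wide and its values are hypothetical stand-ins for iris, not real measurements; only the column layout matters.

```python
import pandas as pd

# Hypothetical miniature stand-in for the iris data (values are illustrative only).
wide = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "sepal_width": [3.5, 3.2, 3.3],
    "species": ["setosa", "versicolor", "virginica"],
})

# Columns listed in id_vars are kept as-is; every other column is stacked
# into the default "variable" (old column name) and "value" columns.
long_df = wide.melt(id_vars=["species"])

print(long_df.shape)          # (6, 3): 3 rows times 2 numeric columns
print(list(long_df.columns))  # ['species', 'variable', 'value']

# With hue="species", one box is drawn per (variable, species) pair.
n_boxes = long_df.groupby(["variable", "species"]).ngroups
print(n_boxes)                # 6 here; 4 variables x 3 species = 12 for iris
```

The same count on the real iris data should give the 12 boxes asked for in the question.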
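import seaborn as sns
from matplotlib import pyplot as plt
iris = sns.load_dataset("iris")
# Reshape to long form: keep "species", stack the 4 numeric columns into "variable"/"value"
iris_long = iris.melt(id_vars=['species'])
# One box per (variable, species) pair: 4 variables x 3 species = 12 boxes
ax = sns.boxplot(data=iris_long, x="value", y="variable", orient="h", palette="Set2", hue="species")
plt.tight_layout()
plt.show()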
Answered By - JohanC