Issue
I have a dataframe which looks like below:
df:
RY MAJ_CAT Value
2016 Cause Unknown 0.00227
2016 Vegetation 0.04217
2016 Vegetation 0.04393
2016 Vegetation 0.07878
2016 Defective Equip 0.00137
2018 Cause Unknown 0.00484
2018 Defective Equip 0.01546
2020 Defective Equip 0.05169
2020 Defective Equip 0.00515
2020 Cause Unknown 0.00050
I want to plot the distribution of the value over the given years. So I used distplot of seaborn by using following code:
year_2016 = df[df['RY']==2016]
year_2018 = df[df['RY']==2018]
year_2020 = df[df['RY']==2020]
sns.distplot(year_2016['value'].values, hist=False,rug=True)
sns.distplot(year_2018['value'].values, hist=False,rug=True)
sns.distplot(year_2020['value'].values, hist=False,rug=True)
In the next step I want to plot the same value distribution over the given year w.r.t MAJ_CAT. So I decided to use Facetgrid of seaborn, below is the code :
g = sns.FacetGrid(df,col='MAJ_CAT')
g = g.map(sns.distplot,df[df['RY']==2016]['value'].values, hist=False,rug=True))
g = g.map(sns.distplot,df[df['RY']==2018]['value'].values, hist=False,rug=True))
g = g.map(sns.distplot,df[df['RY']==2020]['value'].values, hist=False,rug=True))
However, when it ran the above command, it throws the following error:
KeyError: "None of [Index([(0.00227, 0.04217, 0.043930000000000004, 0.07877999999999999, 0.00137, 0.0018800000000000002, 0.00202, 0.00627, 0.00101, 0.07167000000000001, 0.01965, 0.02775, 0.00298, 0.00337, 0.00088, 0.04049, 0.01957, 0.01012, 0.12065, 0.23699, 0.03639, 0.00137, 0.03244, 0.00441, 0.06748, 0.00035, 0.0066099999999999996, 0.00302, 0.015619999999999998, 0.01571, 0.0018399999999999998, 0.03425, 0.08046, 0.01695, 0.02416, 0.08975, 0.0018800000000000002, 0.14743, 0.06366000000000001, 0.04378, 0.043, 0.02997, 0.0001, 0.22799, 0.00611, 0.13960999999999998, 0.38871, 0.018430000000000002, 0.053239999999999996, 0.06702999999999999, 0.14103, 0.022719999999999997, 0.011890000000000001, 0.00186, 0.00049, 0.13947, 0.0067, 0.00503, 0.00242, 0.00137, 0.00266, 0.38638, 0.24068, 0.0165, 0.54847, 1.02545, 0.01889, 0.32750999999999997, 0.22526, 0.24516, 0.12791, 0.00063, 0.0005200000000000001, 0.00921, 0.07665, 0.00116, 0.01042, 0.27046, 0.03501, 0.03159, 0.46748999999999996, 0.022090000000000002, 2.2972799999999998, 0.69021, 0.22529000000000002, 0.00147, 0.1102, 0.03234, 0.05799, 0.11744, 0.00896, 0.09556, 0.03202, 0.01347, 0.00923, 0.0034200000000000003, 0.041530000000000004, 0.04848, 0.00062, 0.0031100000000000004, ...)], dtype='object')] are in the [columns]"
I am not sure where am I making the mistake. Could anyone please help me in fixing the issue?
Solution
setup the dataframe
import pandas as pd
import numpy as np
import seaborn as sns
# setup dataframe of synthetic data
np.random.seed(365)
data = {'RY': np.random.choice([2016, 2018, 2020], size=400),
'MAJ_CAT': np.random.choice(['Cause Unknown', 'Vegetation', 'Defective Equip'], size=400),
'Value': np.random.random(size=400) }
df = pd.DataFrame(data)
Updated Answer
- From
seaborn v0.11
- Use
sns.displot
withkind='kde'
andrug=True
- Is a figure-level interface for drawing distribution plots onto a FacetGrid.
Plotting all 'MAJ_CAT'
together
sns.displot(data=df, x='Value', hue='RY', kind='kde', palette='tab10', rug=True)
Plotting 'MAJ_CAT'
separately
sns.displot(data=df, col='MAJ_CAT', x='Value', hue='RY', kind='kde', palette='tab10', rug=True)
Original Answer
- In
seaborn v0.11
,distplot
is deprecated
distplot
- Consolidate the original code to generate the distplot
for year in df.RY.unique():
values = df.Value[df.RY == year]
sns.distplot(values, hist=False, rug=True)
facetgrid
- properly configure the mapping and add
hue
toFacetGrid
g = sns.FacetGrid(df, col='MAJ_CAT', hue='RY')
p1 = g.map(sns.distplot, 'Value', hist=False, rug=True).add_legend()
Answered By - Trenton McKinney
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.