Issue
This is a follow-up question to this solution. Different colors are assigned automatically when kind='line',
but that is not the case for scatter plots.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# random df
df = pd.DataFrame(np.random.randint(0,10,size=(25, 3)), columns=['label','x','y'])
# plot groupby results on the same canvas
fig, ax = plt.subplots(figsize=(8,6))
df.groupby('label').plot(kind='scatter', x = "x", y = "y", ax=ax)
There is a connected issue here. Is there any simple workaround for this?
Update:
When I try the solution recommended by @ImportanceOfBeingErnest with a label
column that contains strings, it does not work:
df = pd.DataFrame(np.random.randint(0,10,size=(5, 2)), columns=['x','y'])
df['label'] = ['yes','no','yes','yes','no']
fig, ax = plt.subplots(figsize=(8,6))
ax.scatter(x='x', y='y', c='label', data=df)
It throws the following error:
ValueError: Invalid RGBA argument: 'yes'
During handling of the above exception, another exception occurred:
Solution
You can use seaborn (imported as sns):
import seaborn as sns
df = pd.DataFrame(np.random.randint(0,10,size=(100, 2)), columns=['x','y'])
df['label'] = np.random.choice(['yes','no','yes','yes','no'], 100)
fig, ax = plt.subplots(figsize=(8,6))
# hue assigns one color per distinct label and adds a legend
sns.scatterplot(x='x', y='y', hue='label', data=df, ax=ax)
plt.show()
Output:
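If seaborn is not available, a similar result can be sketched in plain matplotlib by calling ax.scatter once per group. Each call that does not specify a color takes the next color from the axes' color cycle, so the groups end up colored differently. The helper name scatter_by_group and the seeded random data are assumptions made for this illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def scatter_by_group(df, x, y, label, ax=None):
    """Draw one scatter series per distinct value in the `label` column."""
    if ax is None:
        _, ax = plt.subplots(figsize=(8, 6))
    for name, group in df.groupby(label):
        # No explicit color is passed, so each call picks the next
        # color from the axes' property cycle.
        ax.scatter(group[x], group[y], label=name)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.legend(title=label)
    return ax


# Illustrative data shaped like the example in the question
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(100, 2)), columns=["x", "y"])
df["label"] = rng.choice(["yes", "no"], 100)

ax = scatter_by_group(df, "x", "y", "label")
```

Call plt.show() afterwards to display the figure. Because the labels are only used as legend entries and never passed as colors, string labels do not trigger the "Invalid RGBA argument" error.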
Another option, as suggested in the comments, is to map each label to a number via a categorical type:
fig, ax = plt.subplots(figsize=(8,6))
ax.scatter(df.x, df.y, c = pd.Categorical(df.label).codes, cmap='tab20b')
plt.show()
Output:
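Unlike the seaborn version, the categorical-codes plot has no legend, so the meaning of each color is not visible. A minimal sketch of one way to add it, assuming only a handful of distinct labels, uses the legend_elements() method of the collection returned by ax.scatter and pairs its handles with the category names. The seeded random data is an assumption made for this illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative data shaped like the example above
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(100, 2)), columns=["x", "y"])
df["label"] = rng.choice(["yes", "no"], 100)

# Categories are sorted, and codes are their integer positions
cat = pd.Categorical(df["label"])

fig, ax = plt.subplots(figsize=(8, 6))
sc = ax.scatter(df["x"], df["y"], c=cat.codes, cmap="tab20b")

# One handle per distinct code, in ascending order, which matches
# the order of cat.categories
handles, _ = sc.legend_elements()
ax.legend(handles, list(cat.categories), title="label")
```

Call plt.show() afterwards to display the figure. With many distinct labels, legend_elements() may not return one handle per code, so this pairing should be checked before relying on it.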
Answered By - Quang Hoang