This is the code I have been writing, but I am unable to add labels to the data points. I have tried several approaches, and each one fails with a different error. The `country` column, loaded into the `country` series below, should be used for the labels. I want to label the first and last data points. Please help!
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
data = pd.read_csv('happy_income1.csv')
happy = data['happyScore']
satis = data['avg_satisfaction']
country = data['country']
# Stacking the 2 arrays together as columns
satis_happy = np.column_stack((satis,happy))
# Sorting
data.sort_values('avg_satisfaction', inplace=True) #Sorting Data Column
# Filtering
satisfied = data[data['avg_satisfaction']>4] #Making Section as per requirement
# Making clusters as required
k_res = KMeans(n_clusters=3).fit(satis_happy)
cluster = k_res.cluster_centers_
# Plotting
fig, week4 = plt.subplots()
week4.scatter(x=happy, y=satis)
# Cluster centres are (satisfaction, happiness) pairs, so column 1 is x and column 0 is y
week4.scatter(x=cluster[:,1], y=cluster[:,0], s=9999, alpha=0.25)
week4.set_title('Happiness versus Satisfaction')
# Labelling
# ----------------------------------------------
You can add the following lines after the scatter calls. They place the country name next to the first and last entries. Here `offset` is a small number of your choosing that shifts the text away from the marker so the two do not overlap. You can also add extras such as a background box if required; the matplotlib documentation for `annotate` has examples.
offset = 0.05  # assumed value: small shift in data units, tune to your axis scale
week4.annotate(country.iat[0], (happy.iat[0]+offset, satis.iat[0]+offset), color='red', weight='bold')
week4.annotate(country.iat[-1], (happy.iat[-1]+offset, satis.iat[-1]+offset), color='blue', weight='bold')
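For anyone without the CSV, here is a minimal self-contained sketch of the same idea. The small DataFrame is made-up stand-in data, and the helper name `label_first_and_last` is hypothetical (it is not part of matplotlib). It uses `.iat` so the first and last points are picked by position, and it shifts the text with `xytext` and `textcoords="offset points"` rather than adding an offset in data units. That is a swapped-in alternative to the `offset` variable above, not the original answer's method. The `bbox` argument adds the background box mentioned earlier.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to view the plot with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for happy_income1.csv (values are made up for illustration)
data = pd.DataFrame({
    "country": ["A-land", "B-land", "C-land", "D-land"],
    "avg_satisfaction": [3.1, 4.5, 5.9, 7.2],
    "happyScore": [3.4, 4.8, 6.1, 7.5],
})


def label_first_and_last(ax, x, y, labels):
    """Annotate the first and last points, selected by position with .iat."""
    annotations = []
    for pos, colour in ((0, "red"), (-1, "blue")):
        ann = ax.annotate(
            labels.iat[pos],
            xy=(x.iat[pos], y.iat[pos]),   # the data point being labelled
            xytext=(5, 5),                 # shift the text 5 points right and up
            textcoords="offset points",
            color=colour,
            weight="bold",
            bbox=dict(boxstyle="round", facecolor="white", alpha=0.7),
        )
        annotations.append(ann)
    return annotations


fig, ax = plt.subplots()
ax.scatter(x=data["happyScore"], y=data["avg_satisfaction"])
ax.set_title("Happiness versus Satisfaction")
labelled = label_first_and_last(
    ax, data["happyScore"], data["avg_satisfaction"], data["country"]
)
```

Because the text offset is given in points, it stays the same size on screen however the axes are scaled. If you sort or filter the DataFrame first, pass the columns of the sorted frame so that "first" and "last" refer to the order you expect.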
Output graph
Answered By - Redox