Issue
I have a list of lists:
my_list = [['UV'],
           ['SB'],
           ['NMR'],
           ['ISSN'],
           ['UK', 'USA'],
           ['MT'],
           ['UK'],
           ['UK'],
           ['ESP'],
           ['UK'],
           ['UK'],
           ['UK'],
           ['UK'],
           ['UK'],
           ['UK']]
that I would like to plot by frequency (from the most frequent term to the least frequent).
I am having trouble counting the items. The first thing I did was flatten the list of lists:
flattened = []
for sublist in my_list:
    for val in sublist:
        flattened.append(val)
Then I tried to count the items:
from collections import Counter
import pandas as pd
counts = Counter(flattened)
df_ver = pd.DataFrame.from_dict(counts, orient='index')
df_ver.plot(kind='bar')
However, it does not work. I also suspect the result would not be sorted.
Solution
Let's try with pure Python:
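import matplotlib.pyplot as plt

# Count every term across all sublists
counts = {}
for countries in my_list:
    for country in countries:
        counts[country] = counts.get(country, 0) + 1
# Sort by count (descending), then alphabetically to break ties
sorted_counts = sorted(counts.items(), key=lambda i: (-i[1], i[0]))
# ktop = 10
# sorted_counts = sorted_counts[:ktop]
# Unzip into labels and bar heights; note that counts is rebound to a tuple here
countries, counts = list(zip(*sorted_counts))
plt.bar(countries, counts)
plt.show()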
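The same counting and sorting can probably also be written with `collections.Counter`, which the question already imports. The following is only a sketch under that assumption: the names `labels` and `values` are illustrative, and the sample data is copied from the question.

```python
from collections import Counter
from itertools import chain

# Sample data copied from the question
my_list = [['UV'], ['SB'], ['NMR'], ['ISSN'], ['UK', 'USA'], ['MT'],
           ['UK'], ['UK'], ['ESP'], ['UK'], ['UK'], ['UK'], ['UK'],
           ['UK'], ['UK']]

# Flatten the list of lists and count each term in one pass
term_counts = Counter(chain.from_iterable(my_list))

# Sort by count (descending), then alphabetically to break ties
ordered = sorted(term_counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Split into bar labels and bar heights (illustrative names)
labels, values = zip(*ordered)

print(labels)
print(values)
```

`labels` and `values` can then be passed to `plt.bar(labels, values)` in place of `countries` and `counts`. `Counter.most_common()` would also sort by count, but it leaves ties in first-seen order rather than alphabetical order.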
Answered By - Sergey Bushmanov