Issue
I have two columns of comma-separated values, and I want to reshape the table into a cross-count of every fruit against every place. How can I achieve this with pandas?
import pandas as pd

data = {
    "fruits": ["orange, apple, banana", "orange, apple, banana",
               "apple, banana", "orange, apple, banana", "others"],
    "places": ["New York, London, Boston", "New York, Manchester",
               "Tokyo", "Hong Kong, Boston", "London"],
}
df = pd.DataFrame(data)
                  fruits                    places
0  orange, apple, banana  New York, London, Boston
1  orange, apple, banana      New York, Manchester
2          apple, banana                     Tokyo
3  orange, apple, banana         Hong Kong, Boston
4                 others                    London
Expected output:
        New York  London  Boston  Hong Kong  Manchester  Tokyo
orange         2       1       2          1           1      0
apple          2       1       2          1           1      1
banana         2       1       2          1           1      1
others         0       1       0          0           0      0
Solution
You can use pandas.crosstab on the split and exploded columns:
df2 = (df.apply(lambda c: c.str.split(', '))  # split all columns
         .explode('fruits').explode('places')  # explode to new rows
      )
pd.crosstab(df2['fruits'], df2['places'])  # compute crosstab
Output:
places  Boston  Hong Kong  London  Manchester  New York  Tokyo
fruits
apple        2          1       1           1         2      1
banana       2          1       1           1         2      1
orange       2          1       1           1         2      0
others       0          0       1           0         0      0
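pandas.crosstab sorts its row and column labels alphabetically, so the result does not come out in the order shown in the question's expected output. If that order matters, one possible approach is to reindex the crosstab afterwards. The sketch below is self-contained and assumes the column names from the question's DataFrame ("fruits" and "places"). The names row_order, col_order, counts and ordered are only illustrative and do not come from the original answer.

```python
import pandas as pd

data = {
    "fruits": ["orange, apple, banana", "orange, apple, banana",
               "apple, banana", "orange, apple, banana", "others"],
    "places": ["New York, London, Boston", "New York, Manchester",
               "Tokyo", "Hong Kong, Boston", "London"],
}
df = pd.DataFrame(data)

# Split every column on ", " and explode each list into its own rows.
df2 = (df.apply(lambda c: c.str.split(", "))
         .explode("fruits")
         .explode("places"))

counts = pd.crosstab(df2["fruits"], df2["places"])

# Illustrative ordering lists (assumed), copied from the question's expected output.
row_order = ["orange", "apple", "banana", "others"]
col_order = ["New York", "London", "Boston", "Hong Kong", "Manchester", "Tokyo"]

# reindex reorders the labels; fill_value=0 covers any label missing from the crosstab.
ordered = counts.reindex(index=row_order, columns=col_order, fill_value=0)
print(ordered)
```

Reindexing only changes the order of the labels, so the counts stay the same as in the alphabetical crosstab.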
Answered By - mozway