Issue
I have a DataFrame like this:
Index | Product | Purchase_Address | Order_Date |
---|---|---|---|
0 | A | 604 Cherry st, Dallas | 2019-10-28 |
1 | B | 225 5th st, LA | 2019-10-29 |
2 | C | 604 Cherry st, Dallas | 2019-10-28 |
3 | D | 225 5th st, LA | 2019-10-29 |
4 | E | 967 12th st, NY | 2019-10-27 |
5 | F | 967 12th st, NY | 2019-10-27 |
6 | A | 628 Jefferson St, NY | 2019-10-20 |
7 | B | 628 Jefferson St, NY | 2019-10-20 |
8 | A | 694 Meadow St, Atlanta | 2019-10-25 |
9 | B | 694 Meadow St, Atlanta | 2019-10-25 |
10 | C | 27 Wilson St, Austin | 2019-10-26 |
11 | D | 27 Wilson St, Austin | 2019-10-26 |
I need to build a new DataFrame in which the products are merged into a single column when the address and order date are the same (meaning they were ordered at the same time).
The df should look something like this:
Index | Product | Purchase_Address |
---|---|---|
0 | A, C | 604 Cherry st, Dallas |
1 | B, D | 225 5th st, LA |
2 | E, F | 967 12th st, NY |
3 | A, B | 628 Jefferson St, NY |
4 | A, B | 694 Meadow St, Atlanta |
5 | C, D | 27 Wilson St, Austin |
Then, from that df, I want another df that counts how many times each combination occurs:
Index | Product_Combination | Nr_Of_Times |
---|---|---|
0 | A, C | 1 |
1 | B, D | 1 |
2 | E, F | 1 |
4 | A, B | 2 |
5 | C, D | 1 |
How would I achieve something like this? Thanks!
Solution
Use Groupby.agg with Groupby.count and Series.to_frame:
In [1783]: out = df.groupby(['Purchase_Address', 'Order_Date']).agg({'Product': ','.join}).groupby('Product')['Product'].count().to_frame('Nr_Of_Times').reset_index()
In [1784]: out
Out[1784]:
  Product  Nr_Of_Times
0     A,B            2
1     A,C            1
2     B,D            1
3     C,D            1
4     E,F            1
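For anyone who wants to see the intermediate table as well, here is a minimal sketch that splits the one-liner into two steps. It rebuilds the sample data from the question, so it should run on its own. The names `merged` and `counts`, the `", "` separator, and the `sorted()` call are my own choices, not part of the answer above. Sorting is only there so that "A, B" and "B, A" would be counted as the same combination; drop it if order matters to you.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Product": list("ABCDEFABABCD"),
    "Purchase_Address": [
        "604 Cherry st, Dallas", "225 5th st, LA",
        "604 Cherry st, Dallas", "225 5th st, LA",
        "967 12th st, NY", "967 12th st, NY",
        "628 Jefferson St, NY", "628 Jefferson St, NY",
        "694 Meadow St, Atlanta", "694 Meadow St, Atlanta",
        "27 Wilson St, Austin", "27 Wilson St, Austin",
    ],
    "Order_Date": [
        "2019-10-28", "2019-10-29", "2019-10-28", "2019-10-29",
        "2019-10-27", "2019-10-27", "2019-10-20", "2019-10-20",
        "2019-10-25", "2019-10-25", "2019-10-26", "2019-10-26",
    ],
})

# Step 1: one row per (address, date) pair, with its products joined.
# sort=False keeps the groups in order of first appearance.
# sorted() is an assumption: it normalizes the order inside a combination.
merged = (
    df.groupby(["Purchase_Address", "Order_Date"], sort=False)["Product"]
      .agg(lambda s: ", ".join(sorted(s)))
      .reset_index()
)

# Step 2: count how often each joined combination occurs.
counts = (
    merged.groupby("Product")
          .size()
          .reset_index(name="Nr_Of_Times")
          .rename(columns={"Product": "Product_Combination"})
)

print(merged[["Product", "Purchase_Address"]])
print(counts)
```

`merged` corresponds to the first table asked for in the question, and `counts` to the second, with the combinations listed in alphabetical order.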
Answered By - Mayank Porwal