Issue
I have a table:customer_ids, order_ids,product_id and order_dates and I want to add a column to my table that contains last order date of each customer that bought this product(in python).
customerid orderid productid orderdate
-----------------------------------------------------
1 1 1 2018/01/01
1 1 2 2018/01/01
1 2 3 2018/01/04
1 3 1 2018/01/10
2 5 1 2018/01/14
1 7 3 2018/01/17
2 12 2 2018/01/12
1 20 1 2018/01/23
and i want to have a table like this:
customerid orderid productid orderdate lastorderdate
----------------------------------------------------------------------
1 1 1 2018/01/01 NA
1 1 2 2018/01/01 NA
1 2 3 2018/01/04 NA
1 3 1 2018/01/10 2018/01/01
2 5 1 2018/01/14 NA
1 7 3 2018/01/17 2018/01/04
2 12 2 2018/01/12 NA
2 20 1 2018/01/23 2018/01/14
what should i do?
Solution
IIUC, you can use:
df=df.sort_values(['customerid','productid'])
df['last_order']=df.groupby(['productid','customerid'])['orderdate'].\
apply(lambda x: x.shift())
print(df)
And the output is:
customerid orderid productid orderdate last_order
0 1 1 1 2018-01-01 NaT
3 1 3 1 2018-01-10 2018-01-01
7 1 20 1 2018-01-23 2018-01-10
1 1 1 2 2018-01-01 NaT
2 1 2 3 2018-01-04 NaT
5 1 7 3 2018-01-17 2018-01-04
4 2 5 1 2018-01-14 NaT
6 2 12 2 2018-01-12 NaT
you can also use df = df.sort_index()
to get the index aligned as original.
Output based on your data:
df=df.sort_values(['customer_id','product_id'])
df['last_order']=df.groupby(['product_id','customer_id'])['date'].\
apply(lambda x: x.shift())
print(df.sort_index().head(20))
row_id date customer_id product_id last_order
0 1 2018-04-07 4 1 NaT
1 2 2018-04-07 4 1 2018-04-07
2 3 2018-04-07 4 1 2018-04-07
3 4 2018-04-07 4 1 2018-04-07
4 5 2018-04-07 4 1 2018-04-07
5 6 2018-04-07 4 1 2018-04-07
6 7 2018-04-07 4 1 2018-04-07
7 8 2018-04-07 4 1 2018-04-07
8 13 2018-04-09 4 1 2018-04-07
9 49 2018-04-13 4 1 2018-04-09
10 106 2018-04-20 4 1 2018-04-13
11 115 2018-04-20 4 1 2018-04-20
12 142 2018-04-27 4 2 NaT
13 143 2018-04-27 4 2 2018-04-27
14 149 2018-04-29 4 2 2018-04-27
15 168 2018-05-02 4 1 2018-04-20
16 169 2018-05-02 4 1 2018-05-02
17 229 2018-05-08 4 5 NaT
18 230 2018-05-08 4 5 2018-05-08
19 231 2018-05-08 4 5 2018-05-08
Answered By - anky
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.