Issue
I have a data frame as below.
Index Date Group Name
0 Oct-10 A Sam
1 Oct-10 A Sam
2 Dec-12 A Sam
3 Dec-12 A Sam
4 May-13 A Peter
5 May-14 A Peter
6 May-15 A Chris
7 Apr-17 A Chris
8 Apr-17 A Sam
9 Jul-23 A Sam
10 Dec-12 B Peter
11 Dec-12 B Peter
12 May-13 B Peter
13 May-14 B Peter
14 Apr-18 B Sam
15 Jul-19 B Sam
16 Jul-20 B Sam
17 Jul-22 B Sam
18 Jul-23 B Sam
I want to forward fill the date values for every "Group" in such a way that every person start date should be forward filled.
Example : Group A, Sam started from Oct 2010, hence till from index 0 to 3, it should be Oct 2010 for him. Peter started on May 2013, so same May 2013 for index 4 & 5. However, again Sam comes in April 2017 so it should be April 2017 for index 8 & 9.
I am finding this is very difficult because Sam is appearing twice in group A. Your help will be highly appriciated to solve this. I am trying someting like below but not giving any results.
for every group in df["Group"]:
df["date2"] = (df["Date"].groupby(["Name"]).transform(lambda x: x.ffill()))
the result I want is below
Index Date Group Name date2
0 Oct-10 A Sam Oct-10
1 Oct-10 A Sam Oct-10
2 Dec-12 A Sam Oct-10
3 Dec-12 A Sam Oct-10
4 May-13 A Peter May-13
5 May-14 A Peter May-13
6 May-15 A Chris May-15
7 Apr-17 A Chris May-15
8 Apr-17 A Sam Apr-17
9 Jul-23 A Sam Apr-17
10 Dec-12 B Peter Dec-12
11 Dec-12 B Peter Dec-12
12 May-13 B Peter Dec-12
13 May-14 B Peter Dec-12
14 Apr-18 B Sam Apr-18
15 Jul-19 B Sam Apr-18
16 Jul-20 B Sam Apr-18
17 Jul-22 B Sam Apr-18
18 Jul-23 B Sam Apr-18
Solution
You can compute a mask with shift
and any
, then use it to select the first values and ffill
:
cols = ['Group', 'Name']
m = df[cols].ne(df[cols].shift()).any(axis=1)
df['date2'] = df['Date'].where(m).ffill()
Output:
Index Date Group Name date2
0 0 Oct-10 A Sam Oct-10
1 1 Oct-10 A Sam Oct-10
2 2 Dec-12 A Sam Oct-10
3 3 Dec-12 A Sam Oct-10
4 4 May-13 A Peter May-13
5 5 May-14 A Peter May-13
6 6 May-15 A Chris May-15
7 7 Apr-17 A Chris May-15
8 8 Apr-17 A Sam Apr-17
9 9 Jul-23 A Sam Apr-17
10 10 Dec-12 B Peter Dec-12
11 11 Dec-12 B Peter Dec-12
12 12 May-13 B Peter Dec-12
13 13 May-14 B Peter Dec-12
14 14 Apr-18 B Sam Apr-18
15 15 Jul-19 B Sam Apr-18
16 16 Jul-20 B Sam Apr-18
17 17 Jul-22 B Sam Apr-18
18 18 Jul-23 B Sam Apr-18
Intermediates:
Index Date Group Name date2 Group_shift Name_shift Group_eq_shift Name_eq_shift m
0 0 Oct-10 A Sam Oct-10 None None True True True
1 1 Oct-10 A Sam Oct-10 A Sam False False False
2 2 Dec-12 A Sam Oct-10 A Sam False False False
3 3 Dec-12 A Sam Oct-10 A Sam False False False
4 4 May-13 A Peter May-13 A Sam False True True
5 5 May-14 A Peter May-13 A Peter False False False
6 6 May-15 A Chris May-15 A Peter False True True
7 7 Apr-17 A Chris May-15 A Chris False False False
8 8 Apr-17 A Sam Apr-17 A Chris False True True
9 9 Jul-23 A Sam Apr-17 A Sam False False False
10 10 Dec-12 B Peter Dec-12 A Sam True True True
11 11 Dec-12 B Peter Dec-12 B Peter False False False
12 12 May-13 B Peter Dec-12 B Peter False False False
13 13 May-14 B Peter Dec-12 B Peter False False False
14 14 Apr-18 B Sam Apr-18 B Peter False True True
15 15 Jul-19 B Sam Apr-18 B Sam False False False
16 16 Jul-20 B Sam Apr-18 B Sam False False False
17 17 Jul-22 B Sam Apr-18 B Sam False False False
18 18 Jul-23 B Sam Apr-18 B Sam False False False
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.