Issue
I have a large dataframe like the one below:
vehicle  id  delta
      0   0      0
      1   0     20
      2   0     40
      3   0    400
      4   0     10
      5   1      0
      6   1     10
      7   1    500
      8   1     10
      9   1     10
     10   1    100
     11   1     10
I want to add a new column, 'Trip', that starts at trip_1 for each vehicle id and increments the trip number whenever delta is greater than 50, so the result would be as follows:
vehicle  id  delta    Trip
      0   0      0  trip_1
      1   0     20  trip_1
      2   0     40  trip_1
      3   0    400  trip_2
      4   0     10  trip_2
      5   1      0  trip_1
      6   1     10  trip_1
      7   1    500  trip_2
      8   1     10  trip_2
      9   1     10  trip_2
     10   1    100  trip_3
     11   1     10  trip_3
I'm thinking about using iterrows(), but I want to avoid it since the dataframe is huge. Any suggestions?
Solution
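Try this. Group by id, flag the rows where delta is greater than 50, and take the cumulative sum of those flags within each group: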
import numpy as np

# Per id: running count of rows with delta > 50, plus 1 so trips start at 1.
df['Trip'] = 'trip_' + df.groupby('id')['delta'].transform(
    lambda grp: np.where(grp > 50, 1, 0).cumsum() + 1).astype(str)
print(df)
    vehicle  id  delta    Trip
0         0   0      0  trip_1
1         1   0     20  trip_1
2         2   0     40  trip_1
3         3   0    400  trip_2
4         4   0     10  trip_2
5         5   1      0  trip_1
6         6   1     10  trip_1
7         7   1    500  trip_2
8         8   1     10  trip_2
9         9   1     10  trip_2
10       10   1    100  trip_3
11       11   1     10  trip_3
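A possible variation, not part of the original answer: since the question is about speed on a huge dataframe, the same numbering can likely be done without a Python-level lambda by using the built-in grouped cumsum. The sketch below rebuilds the sample data from the question so it runs on its own; the names new_trip and trip_no are only illustrative, and the threshold of 50 is taken from the question.

```python
import pandas as pd

# Sample data copied from the question (columns: vehicle, id, delta).
df = pd.DataFrame({
    "vehicle": range(12),
    "id": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
    "delta": [0, 20, 40, 400, 10, 0, 10, 500, 10, 10, 100, 10],
})

# 1 where a new trip starts (delta above 50), 0 otherwise.
new_trip = df["delta"].gt(50).astype(int)

# Running count of trip starts within each id; +1 so numbering begins at 1.
trip_no = new_trip.groupby(df["id"]).cumsum() + 1

df["Trip"] = "trip_" + trip_no.astype(str)
print(df)
```

This should produce the same Trip column as above. Whether it is noticeably faster depends on the number of groups, so it is worth timing both versions on the real data.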
Answered By - I'mahdi