Issue
I ultimately want to count the number of months in a given range for each user. For example, in the data below, user 1 has one range of data, from April 2021 to June 2021. Where I'm struggling is counting for users that have multiple ranges (see users 3 and 4).
I have a pandas DataFrame with columns that look like this:
username Jan_2021 Feb_2021 March_2021 April_2021 May_2021 June_2021 July_2021 Sum_of_Months
user 1 0 0 0 1 1 1 0 3
user 2 0 0 0 0 0 0 1 1
user 3 1 1 1 0 1 1 0 5
user 4 0 1 1 1 0 1 1 5
I'd like to get summary columns giving the number of groups and the length of each group. By number of groups I mean the number of runs of consecutive 1s, and by length of group I mean the number of months in one run, as if I were to draw a circle around each run of 1s. For example, user 1 has one group of length 3 because there is a 1 in each of the columns April_2021 through June_2021:
username Num_of_groups Lenth_of_group
user 1 1 3
user 2 1 1
user 3 2 3,2
user 4 2 3,2
Solution
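You can use the groupby function from itertools, which groups consecutive equal values in each row. Keeping only the groups whose key is 1 and summing each one gives the length of every run of 1s: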
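from itertools import groupby
# Keep only the month columns (this skips username and Sum_of_Months).
df1 = df[[col for col in df.columns if "2021" in col]]
# For each row, sum every run of consecutive 1s to get its length.
df["Lenth_of_group"] = df1.apply(lambda x: [sum(g) for i, g in groupby(x) if i == 1], axis=1)
# The number of groups is the number of run lengths found.
df["Num_of_groups"] = df["Lenth_of_group"].apply(len)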
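To check the approach against the sample data from the question, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table above, and the helper name run_lengths and the Lengths_as_text column are only illustrative (they are not part of the original answer). The extra column joins the lengths with commas, assuming the "3,2" style shown in the question is the desired output.

```python
from itertools import groupby

import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame(
    {
        "username": ["user 1", "user 2", "user 3", "user 4"],
        "Jan_2021": [0, 0, 1, 0],
        "Feb_2021": [0, 0, 1, 1],
        "March_2021": [0, 0, 1, 1],
        "April_2021": [1, 0, 0, 1],
        "May_2021": [1, 0, 1, 0],
        "June_2021": [1, 0, 1, 1],
        "July_2021": [0, 1, 0, 1],
        "Sum_of_Months": [3, 1, 5, 5],
    }
)


def run_lengths(row):
    """Return the length of each run of consecutive 1s in a row (illustrative helper)."""
    # groupby yields (value, run) pairs for consecutive equal values;
    # summing a run of 1s gives its length.
    return [int(sum(run)) for value, run in groupby(row) if value == 1]


# Month columns only: username and Sum_of_Months do not contain "2021".
month_cols = [col for col in df.columns if "2021" in col]

df["Lenth_of_group"] = df[month_cols].apply(run_lengths, axis=1)
df["Num_of_groups"] = df["Lenth_of_group"].apply(len)

# Optional: comma-separated text, as in the expected output of the question.
df["Lengths_as_text"] = df["Lenth_of_group"].apply(
    lambda lengths: ",".join(str(n) for n in lengths)
)

print(df[["username", "Num_of_groups", "Lengths_as_text"]])
```

For the four sample users this gives 1, 1, 2 and 2 groups with lengths 3; 1; 3,2; and 3,2, which matches the expected table in the question. Note that the column filter relies on every month column containing "2021", so it would need adjusting for data spanning other years.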
Hope this Helps...
Answered By - Sachin Kohli