Issue
I start with a file containing daily data for a group of people, and I would like to capture when the value in one column changes, if it changes at all.
The dataframe's structure looks like the one below:
id | name | startdate | filedate | value |
---|---|---|---|---|
1 | Sta | 10-12-2019 | 24-04-2021 | 1 |
1 | Sta | 10-12-2019 | 25-04-2021 | 0.5 |
1 | Sta | 10-12-2019 | 26-04-2021 | 0.5 |
1 | Sta | 10-12-2019 | 27-04-2021 | 0.9 |
2 | Danny | 20-03-2020 | 24-04-2021 | 1 |
2 | Danny | 20-03-2020 | 25-04-2021 | 1 |
2 | Danny | 20-03-2020 | 26-04-2021 | 0.3 |
2 | Danny | 20-03-2020 | 27-04-2021 | 0.3 |
3 | Elle | 14-08-2020 | 24-04-2021 | 1 |
3 | Elle | 14-08-2020 | 25-04-2021 | 1 |
3 | Elle | 14-08-2020 | 26-04-2021 | 1 |
3 | Elle | 14-08-2020 | 27-04-2021 | 1 |
My goal is to set each person's first effective date to the startdate, and then set the effective date to the filedate whenever the value changes,
getting a dataframe like this one:
id | name | effective date | value |
---|---|---|---|
1 | Sta | 10-12-2019 | 1 |
1 | Sta | 25-04-2021 | 0.5 |
1 | Sta | 27-04-2021 | 0.9 |
2 | Danny | 20-03-2020 | 1 |
2 | Danny | 26-04-2021 | 0.3 |
3 | Elle | 14-08-2020 | 1 |
Solution
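Compare each value with the previous one within its group using DataFrameGroupBy.shift, keep the rows that differ by boolean indexing, then for every row except the first per name replace startdate with filedate using Series.mask with DataFrame.duplicated, and finally rename the column and remove filedate: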
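# keep only rows whose value differs from the previous row of the same name
df = df[df['value'].ne(df.groupby('name')['value'].shift())].copy()
# every row after the first per name takes filedate as its effective date
df['startdate'] = df['startdate'].mask(df.duplicated('name'), df['filedate'])
df = df.rename(columns={'startdate':'effective date'}).drop('filedate', axis=1)
print(df)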
   id   name effective date  value
0   1    Sta     10-12-2019    1.0
1   1    Sta     25-04-2021    0.5
3   1    Sta     27-04-2021    0.9
4   2  Danny     20-03-2020    1.0
6   2  Danny     26-04-2021    0.3
8   3   Elle     14-08-2020    1.0
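For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same steps. The helper name `effective_dates` is made up for illustration and is not part of pandas. It assumes the rows are already ordered by filedate within each name, as in the question; if they are not, the data would need sorting first.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "id": [1] * 4 + [2] * 4 + [3] * 4,
    "name": ["Sta"] * 4 + ["Danny"] * 4 + ["Elle"] * 4,
    "startdate": ["10-12-2019"] * 4 + ["20-03-2020"] * 4 + ["14-08-2020"] * 4,
    "filedate": ["24-04-2021", "25-04-2021", "26-04-2021", "27-04-2021"] * 3,
    "value": [1, 0.5, 0.5, 0.9, 1, 1, 0.3, 0.3, 1, 1, 1, 1],
})


def effective_dates(frame):
    """Hypothetical helper: one row per person per change in value.

    Assumes rows are already sorted by filedate within each name.
    """
    # Previous value within the same name; NaN on each person's first row,
    # so the first row always compares as "not equal" and is kept.
    previous = frame.groupby("name")["value"].shift()
    out = frame[frame["value"].ne(previous)].copy()

    # The first remaining row per name keeps startdate; later rows use filedate.
    later_rows = out.duplicated("name")
    out["startdate"] = out["startdate"].mask(later_rows, out["filedate"])

    return out.rename(columns={"startdate": "effective date"}).drop(columns="filedate")


result = effective_dates(df)
print(result)
```

Because each row is compared only with the one directly before it, a value that changes and later returns to an earlier value (for example 1, 0.5, 1) would produce a row for each change rather than being collapsed.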
Answered By - jezrael