Issue
I have a dataframe where if a certain condition is met, I'd like to essentially create a duplicate of that row. Row should be duplicated IF 'Date' = Q4.22 or > AND type = 'live' Also, for every duplicate created the 'unit' count should be updated to reflect this (grouped by id and Date) Once the duplicate is established, the unit count should reflect the new count based on the same id and Date.
Data
id Date set type unit energy
bb Q4.22 l live l01 20
bb Q4.22 l live l02 20
ba Q3.22 l non l01 20
aa Q4.22 l non l01 20
aa Q4.22 l live l01 20
cc Q3.22 l non l01 20
aa Q4.22 l live l02 20
Desired
id Date set type unit energy
bb Q4.22 l live l01 20
bb Q4.22 l live l02 20
bb Q4.22 l live l03 20
bb Q4.22 l live l04 20
ba Q3.22 l non l01 20
aa Q4.22 l non l01 20
aa Q4.22 l live l01 20
aa Q4.22 l live l02 20
cc Q3.22 l non l01 20
aa Q4.22 l live l03 20
aa Q4.22 l live l04 20
Doing
pd.concat([df, df.loc[(df['Date'] == > 'Q4.22') & (df['live'] == 'live')]])
However, I still need to add counter to the new duplicates that are created. Any suggestion is appreciated.
Solution
Try:
- Convert your date column to timestamps
concat
your original data with the filtered datagroupby
to get thecumcount
of "id" and "Date" and set the "unit" accordingly
df["Date"] = pd.to_datetime(df["Date"].str.replace(r"(Q\d).(\d+)", r"\2-\1",regex=True))
output = pd.concat([df, df[df["Date"].ge(pd.Timestamp("2022-10-01"))&df["type"].eq("live")]], ignore_index=True)
output["unit"] = output["set"]+output.groupby(["id", "Date"]).cumcount().add(1).astype(str).str.zfill(2)
output = output.sort_values("id", ignore_index=True)
#convert Date back to original format if needed
output["Date"] = output["Date"].dt.to_period("Q").astype(str).str.replace(r"\d\d(\d+)(Q\d)",r"\2.\1",regex=True)
>>> output
id Date set type unit energy
0 aa Q4.22 l non l01 20
1 aa Q4.22 l live l02 20
2 aa Q4.22 l live l03 20
3 aa Q4.22 l live l04 20
4 aa Q4.22 l live l05 20
5 ba Q3.22 l non l01 20
6 bb Q4.22 l live l01 20
7 bb Q4.22 l live l02 20
8 bb Q4.22 l live l03 20
9 bb Q4.22 l live l04 20
10 cc Q3.22 l non l01 20
Answered By - not_speshal
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.