Issue
Consider the dataset below.
INPUT DATASET
id status type location bb_count vo_count tv_count
123 open r hongkong 1 0 4
456 open r hongkong 1 7 2
456 closed p India 0 6 1
OUTPUT DATASET
I need to insert one row per product type for every input row where that product's count (bb_count, vo_count, tv_count) is greater than 0.
id status type location product
123 open r hongkong bb
123 open r hongkong tv
456 open r hongkong bb
456 open r hongkong vo
456 open r hongkong tv
456 closed p India vo
456 closed p India tv
What I tried:
def insert_row(df):
    if df["bb_count"] > 0:
        print("inserting bb row")
    if df["tv_count"] > 0:
        print("inserting tv row")
    if df["vo_count"] > 0:
        print("inserting vo row")

df.apply(insert_row, axis=1)
But I'm not getting the expected output.
Solution
You aren't changing your dataframe in the function at all. You are simply printing some statements. You don't really need a custom function for what you want to do.
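To make that concrete, here is a minimal sketch on a made-up two-row frame (the frame and the names `before` and `result` are illustrative, not from the question). Because the function only prints and returns nothing, `apply` returns a Series of `None` and the dataframe is left exactly as it was:

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({"id": [123, 456], "bb_count": [1, 0]})
before = df.copy()

def insert_row(row):
    # Only prints; returns None implicitly and never touches df.
    if row["bb_count"] > 0:
        print("inserting bb row")

# apply collects the return value for each row, which is None here.
result = df.apply(insert_row, axis=1)
```

After this runs, `df` still has its original two rows and `result` holds only missing values.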
Try:
- melt the dataframe to create the required structure.
- Filter to keep rows where the value is greater than 0.
- Re-format the "product" column as required (removing the "_count" suffix and, if you want full names, mapping the short codes).
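melted = df.melt(id_vars=["id", "status", "type", "location"],
                 value_vars=["bb_count", "vo_count", "tv_count"],
                 var_name="product")
output = melted[melted["value"].gt(0)].drop("value", axis=1)
# Strip the "_count" suffix, then map the short codes to full names.
# The parentheses let the chained call continue on the next line.
output["product"] = (output["product"].str.replace("_count", "")
                     .replace({"bb": "broadband",
                               "vo": "fixedvoice",
                               "tv": "television"}))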
>>> output
id status type location product
0 123 open r hongkong broadband
1 456 open r hongkong broadband
4 456 open r hongkong fixedvoice
5 456 closed p India fixedvoice
6 123 open r hongkong television
7 456 open r hongkong television
8 456 closed p India television
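If you would rather keep the short codes (bb, vo, tv) and the row order shown in the question, one possible variation is sketched below. It rebuilds the input from the question's table; the names `id_cols` and `count_cols` are mine. It assumes pandas 1.1 or later for `ignore_index=False`, which keeps each melted row's original index so that a stable sort regroups the rows by source row.

```python
import pandas as pd

# Input frame rebuilt from the question's table.
df = pd.DataFrame({
    "id": [123, 456, 456],
    "status": ["open", "open", "closed"],
    "type": ["r", "r", "p"],
    "location": ["hongkong", "hongkong", "India"],
    "bb_count": [1, 1, 0],
    "vo_count": [0, 7, 6],
    "tv_count": [4, 2, 1],
})

id_cols = ["id", "status", "type", "location"]      # illustrative names
count_cols = ["bb_count", "vo_count", "tv_count"]

# ignore_index=False keeps the source row's index on every melted row.
melted = df.melt(id_vars=id_cols, value_vars=count_cols,
                 var_name="product", ignore_index=False)

output = (
    melted[melted["value"] > 0]
    .drop(columns="value")
    # Keep the short codes: only strip the "_count" suffix.
    .assign(product=lambda d: d["product"].str.replace("_count", "", regex=False))
    # mergesort is stable, so bb/vo/tv order is preserved within each source row.
    .sort_index(kind="mergesort")
    .reset_index(drop=True)
)
print(output)
```

The stable sort is the design choice that matters here: the default sort algorithm does not guarantee the order of rows that share an index value.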
Answered By - not_speshal