Issue
This is driving me insane. I am using:
Python 3.10.13 (main, Aug 25 2023, 13:20:03) [GCC 9.4.0]
pandas version 2.1.1
numpy version 1.26.2
pyarrow version 14.0.1
I want to examine df_entry_log['AM_PM'] and, depending on a test, populate a new column, df_entry_log['bag_weight'].
If df_entry_log['AM_PM'] == "AM", look up df_details['bag_weight'] using df_entry_log['prev_wrk_day'] and put the value in df_entry_log['bag_weight'] (e.g. 97.10).
Otherwise (df_entry_log['AM_PM'] == "PM"), put in the value of df_details['bag_weight'] found by looking up df_entry_log['entry_date'].
I wanted to use a vectorized approach, but my mind has gone to mush.
df_entry_log:
   guest  entry_date AM_PM prev_wrk_day next_wrk_day
1  janet  2007-01-17    PM   2007-01-16   2007-01-18
2  janet  2007-04-25    AM   2007-04-24   2007-04-26
3  janet  2007-07-25    AM   2007-07-24   2007-07-26
df_details:
    guest   gate_date  bag_weight
8   janet  2007-01-16       97.10
9   janet  2007-01-17       94.95
10  janet  2007-01-18       89.07
Solution
You can try:
import numpy as np


def get_weight(row):
    try:
        if row["AM_PM"] == "AM":
            # AM entries: look up the weight for the previous working day
            return df_details.loc[(row["guest"], row["prev_wrk_day"]), "bag_weight"]
        else:
            # PM entries: look up the weight for the entry date itself
            return df_details.loc[(row["guest"], row["entry_date"]), "bag_weight"]
    except KeyError:
        # no matching guest/date pair in df_details
        return np.nan


# set index to guest/gate_date for easy searching:
df_details = df_details.set_index(["guest", "gate_date"])

df_entry_log["bag_weight"] = df_entry_log.apply(get_weight, axis=1)
print(df_entry_log)
Prints:
   guest  entry_date AM_PM prev_wrk_day next_wrk_day  bag_weight
1  janet  2007-01-17    PM   2007-01-16   2007-01-18       94.95
2  janet  2007-04-25    AM   2007-04-24   2007-04-26         NaN
3  janet  2007-07-25    AM   2007-07-24   2007-07-26         NaN
Answered By - Andrej Kesely
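Since the question asks for a vectorized approach, here is a minimal sketch of one alternative to the row-wise apply above. It is not part of the original answer. It assumes the date columns in both frames share a dtype (plain strings here) and that each guest/gate_date pair appears at most once in df_details. The sample frames are rebuilt from the question's data so the block runs on its own, and the names lookup_date, lookup and merged are illustrative only.

```python
import pandas as pd

# Sample data rebuilt from the question (dates kept as plain strings).
df_entry_log = pd.DataFrame(
    {
        "guest": ["janet", "janet", "janet"],
        "entry_date": ["2007-01-17", "2007-04-25", "2007-07-25"],
        "AM_PM": ["PM", "AM", "AM"],
        "prev_wrk_day": ["2007-01-16", "2007-04-24", "2007-07-24"],
        "next_wrk_day": ["2007-01-18", "2007-04-26", "2007-07-26"],
    },
    index=[1, 2, 3],
)
df_details = pd.DataFrame(
    {
        "guest": ["janet", "janet", "janet"],
        "gate_date": ["2007-01-16", "2007-01-17", "2007-01-18"],
        "bag_weight": [97.10, 94.95, 89.07],
    },
    index=[8, 9, 10],
)

# Choose the lookup date per row: keep prev_wrk_day where AM_PM is "AM",
# otherwise fall back to entry_date.
lookup_date = df_entry_log["prev_wrk_day"].where(
    df_entry_log["AM_PM"] == "AM", df_entry_log["entry_date"]
)

# Left-merge on guest + chosen date. A left merge keeps the row order of the
# left frame; validate="many_to_one" raises if df_details has duplicate keys,
# which would otherwise add extra rows.
lookup = df_entry_log[["guest"]].assign(gate_date=lookup_date)
merged = lookup.merge(
    df_details, on=["guest", "gate_date"], how="left", validate="many_to_one"
)

# merge() resets the index, so assign positionally rather than by label.
df_entry_log["bag_weight"] = merged["bag_weight"].to_numpy()
print(df_entry_log)
```

Rows with no matching guest/date pair end up as NaN, as they do with the apply version. If the real columns are datetimes in one frame and strings in the other, convert them to a common type (for example with pd.to_datetime) before merging.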