Issue
I have a dataframe a with 14 rows and another dataframe comp1sum with 7 rows. a has a date column covering 7 days at 12-hour intervals, which makes 14 rows. comp1sum also has a column covering 7 days.
This is the comp1sum dataframe
And this is the a dataframe
I want to map every 2 rows of dataframe a to a single row of dataframe comp1sum, so that one day of a is mapped to one day of comp1sum.
I have the following code for that:
j = 0
for i in range(0, 7):
    a.loc[i, 'comp1_sum'] = comp_sum.iloc[j]['comp1sum']
    a.loc[i, 'comp2_sum'] = comp_sum.iloc[j]['comp2sum']
    j = j + 1
And its output is:
    dt_truncated         comp1_sum
3   2015-02-01 00:00:00  142.0
10  2015-02-01 12:00:00  144.0
12  2015-02-03 00:00:00  145.0
2   2015-02-05 00:00:00  141.0
14  2015-02-05 12:00:00  NaN
The code maps the values from comp1sum based on the index of a and not based on the dates of a. I want 2015-02-01 00:00:00 to have the value 139.0, 2015-02-02 00:00:00 to have the value 140.0, and so on, so that increasing dates get increasing values.
I am not able to map the values this way. Please help.
Edit 1: With @Ssayan's answer, I am getting this error:
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-255-77e55efca5f9> in <module>
      3 # use the sorted index to iterate through the sorted dataframe
      4 for i, idx in enumerate(a.index):
----> 5     a.loc[idx, 'comp1_sum'] = b.iloc[i//2]['comp1sum']
      6     a.loc[idx,'comp2_sum'] = b.iloc[i//2]['comp2sum']

IndexError: single positional indexer is out-of-bounds
Solution
Your issue is that your DataFrame a is not sorted by date, so index 0 does not correspond to the earliest date. loc selects rows by index label, not by their position in the table, so sorting the DataFrame alone does not fix the problem.
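To make the label-versus-position distinction concrete, here is a minimal sketch. The frame and its values are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical frame whose index labels are not in date order
df = pd.DataFrame(
    {"dt_truncated": pd.to_datetime(["2015-02-03", "2015-02-01", "2015-02-02"])},
    index=[0, 1, 2],
)

by_date = df.sort_values("dt_truncated")

# loc finds the row whose index *label* is 0, wherever it sits after sorting
label_zero = by_date.loc[0, "dt_truncated"]

# iloc finds the row at *position* 0, which is the earliest date after sorting
first_row = by_date["dt_truncated"].iloc[0]
```

Here label_zero is still 2015-02-03, while first_row is 2015-02-01, which is why a loop over range(0, 7) with loc keeps following the original labels even after a sort.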
One way out is to sort the DataFrame a by date and then iterate over the sorted index to assign the values in the order you need.
# sort the dataframe by date
a = a.sort_values("dt_truncated")
# use the sorted index to iterate through the sorted dataframe
for i, idx in enumerate(a.index):
    a.loc[idx, 'val_1'] = b.iloc[i//2]['val1']
    a.loc[idx, 'val_2'] = b.iloc[i//2]['val2']
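If the goal is to match on the calendar day rather than on row position, a possible alternative is to drop the time of day from each timestamp and look the day up in the daily frame. This is a different technique from the loop above, not the answerer's method. The sketch assumes the daily frame has a date column, called day here (a hypothetical name, since the question does not show it), and all values are made up.

```python
import pandas as pd

# Hypothetical 12-hourly frame, deliberately out of date order
a = pd.DataFrame({
    "dt_truncated": pd.to_datetime([
        "2015-02-02 00:00", "2015-02-01 12:00",
        "2015-02-01 00:00", "2015-02-02 12:00",
    ])
})

# Hypothetical daily frame; the column name "day" is an assumption
comp_sum = pd.DataFrame({
    "day": pd.to_datetime(["2015-02-01", "2015-02-02"]),
    "comp1sum": [139.0, 140.0],
    "comp2sum": [10.0, 20.0],
})

# Normalize to midnight so both 12-hour rows of a day share one key
day_key = a["dt_truncated"].dt.normalize()

# Index the daily frame by day, then look up each row's day
daily = comp_sum.set_index("day")
a["comp1_sum"] = day_key.map(daily["comp1sum"])
a["comp2_sum"] = day_key.map(daily["comp2sum"])
```

Because the lookup is keyed on the date, the row order of a does not matter, and a day that is missing from the daily frame yields NaN instead of raising an IndexError like a positional lookup that runs past the end.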
Answered By - Ssayan