Saturday, August 13, 2022

[FIXED] addition of two dataframes using the same timestamps

Labels: jupyter-notebook, pandas, python

Issue

So I have two data frames. Energy:

                       Affluent  Adversity  Affluent  Comfortable  Adversity  \
Time                                                                         
2019-01-01 01:00:00     0.254      0.244     0.155        0.215      0.274   
2019-01-01 02:00:00     0.346      0.154     0.083        0.246      0.046   
2019-01-01 03:00:00     0.309      0.116     0.085        0.220      0.139   
2019-01-01 04:00:00     0.302      0.158     0.083        0.226      0.186   
2019-01-01 05:00:00     0.181      0.171     0.096        0.246      0.051   
...                       ...        ...       ...          ...        ...   
2019-12-31 20:00:00     1.102      0.263     2.157        0.209      2.856   
2019-12-31 21:00:00     0.712      0.269     1.409        0.212      0.497   
2019-12-31 22:00:00     0.398      0.274     0.073        0.277      0.199   
2019-12-31 23:00:00     0.449      0.452     0.072        0.252      0.183   
2020-01-01 00:00:00     0.466      0.291     0.110        0.203      0.117

loadshift:

Time                 load_difference
2019-01-01 01:00:00             0.10
2019-01-01 02:00:00             0.10
2019-01-01 03:00:00             0.15
2019-01-01 04:00:00             0.10
2019-01-01 05:00:00             0.10
...                              ...
2019-12-31 20:00:00            -0.10
2019-12-31 21:00:00             0.10
2019-12-31 22:00:00             0.15
2019-12-31 23:00:00             0.10
2020-01-01 00:00:00            -0.10

All I want to do is add the load difference to every column of Energy, row by row, so that, for example, the first Affluent house at 1 am changes from 0.254 to 0.354. I have been able to use concat to multiply in my other models, but I am really struggling with this one.

Expected output (but for all 8760 hours):

                     Affluent  Adversity  Affluent  Comfortable  Adversity  \
Time
2019-01-01 01:00:00     0.354      0.344     0.255        0.315      0.374
2019-01-01 02:00:00     0.446      0.254     0.183        0.346      0.146
2019-01-01 03:00:00     0.459      0.266     0.235        0.370      0.289
2019-01-01 04:00:00     0.402      0.258     0.183        0.326      0.286
2019-01-01 05:00:00     0.281      0.271     0.196        0.346      0.151

I have tried: Energy.add(loadshift, fill_value=0) but I get

Concatenation operation is not implemented for NumPy arrays, use np.concatenate() instead. Please do not rely on this error; it may not be given on all Python implementations.
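
A note on why this attempt cannot give the expected output even when it runs: adding one DataFrame to another aligns on column labels as well as on the index, and load_difference matches none of the Energy columns. The sketch below uses a small stand-in for the two frames (two hours, two uniquely named columns, Time as the index; all of that is an assumption for illustration) and does not try to reproduce the exact error message quoted above.

```python
import pandas as pd

# Hypothetical stand-in data: first two hours, unique column names, Time as index.
idx = pd.DatetimeIndex(["2019-01-01 01:00:00", "2019-01-01 02:00:00"], name="Time")
energy = pd.DataFrame(
    {"Affluent": [0.254, 0.346], "Comfortable": [0.215, 0.246]}, index=idx
)
loadshift = pd.DataFrame({"load_difference": [0.10, 0.10]}, index=idx)

# DataFrame + DataFrame aligns on row labels AND column labels.
# No column is shared, so fill_value=0 just adds 0 to each side:
# the Energy columns come back unchanged and load_difference is
# carried along as an extra column.
aligned = energy.add(loadshift, fill_value=0)
print(aligned)
```

So the values are never added across columns; the result is only the union of the two frames' columns.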

also tried:

df_merged = pd.concat([Energy,loadshift], ignore_index=True, sort=False)
df_merged =Energy.append(loadshift)

This raises:

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
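
This error most likely comes from the repeated column labels in Energy (Affluent and Adversity each appear more than once): stacking it with a frame that has different columns forces pandas to reindex the columns, which needs unique labels. Note also that concat along the default axis, like append, stacks rows rather than adding values. A minimal sketch for spotting the duplicates and building unique labels follows; the suffix scheme is a hypothetical choice that happens to mirror the Affluent.1 style labels seen in the answer's output.

```python
import pandas as pd

# Column labels as they appear in the visible part of the Energy frame.
columns = pd.Index(["Affluent", "Adversity", "Affluent", "Comfortable", "Adversity"])

print(columns.is_unique)                       # False
print(columns[columns.duplicated()].tolist())  # labels that repeat an earlier one

# Build unique labels by suffixing repeats with a running counter
# (illustrative scheme, not something pandas applies to an existing frame).
seen = {}
unique_labels = []
for name in columns:
    n = seen.get(name, 0)
    unique_labels.append(name if n == 0 else f"{name}.{n}")
    seen[name] = n + 1

print(unique_labels)
```

Assigning such a list to Energy.columns would make the labels unique, but it would not by itself turn concat into an addition.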

How do I go about fixing these errors? Thanks.

Solution

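Here is one way to do it, using add and then update. The code assumes df is the Energy frame and df2 is the loadshift frame, each with Time as its first column rather than as the index.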

# To add two DataFrames element-wise, both should have the same number of columns.
# So we duplicate the single value column in loadshift (df2) into as many
# columns as the energy frame (df) has, minus the datetime column.
#
# Next we concat the datetime column in front to get a new frame (df3)
# whose column count matches the energy frame (df).

df3=pd.concat([df2.iloc[:,0],
               pd.concat(
                   [df2.iloc[:,1]]*(len(df.columns)-1), axis=1
               )]
              , axis=1 )
#update column names
df3.columns = df.columns

# add DF and DF3 values and then update the original energy (df ) columns
df.update(df.iloc[:,1:].add(df3.iloc[:,1:]))

df

                  Time  Affluent  Adversity  Affluent.1  Comfortable  Adversity.1
0  2019-01-01 01:00:00     0.354      0.344       0.255        0.315        0.374
1  2019-01-01 02:00:00     0.446      0.254       0.183        0.346        0.146
2  2019-01-01 03:00:00     0.459      0.266       0.235        0.370        0.289
3  2019-01-01 04:00:00     0.402      0.258       0.183        0.326        0.286
4  2019-01-01 05:00:00     0.281      0.271       0.196        0.346        0.151
5  2019-12-31 20:00:00     1.002      0.163       2.057        0.109        2.756
6  2019-12-31 21:00:00     0.812      0.369       1.509        0.312        0.597
7  2019-12-31 22:00:00     0.548      0.424       0.223        0.427        0.349
8  2019-12-31 23:00:00     0.549      0.552       0.172        0.352        0.283
9  2020-01-01 00:00:00     0.366      0.191       0.010        0.103        0.017
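
If Time is the index of both frames, as the printout in the question suggests, a shorter variant may also work: pass the load_difference column as a Series to add with axis=0, so that it is matched on the row labels and broadcast across every column. This is a sketch under that assumption, not the method used above; the frames below are small stand-ins built from the first three rows of the question's data.

```python
import pandas as pd

# Stand-in frames: first three hours of the question's data, Time as index.
idx = pd.DatetimeIndex(
    ["2019-01-01 01:00:00", "2019-01-01 02:00:00", "2019-01-01 03:00:00"],
    name="Time",
)
energy = pd.DataFrame(
    [[0.254, 0.244, 0.155, 0.215, 0.274],
     [0.346, 0.154, 0.083, 0.246, 0.046],
     [0.309, 0.116, 0.085, 0.220, 0.139]],
    index=idx,
    columns=["Affluent", "Adversity", "Affluent", "Comfortable", "Adversity"],
)
loadshift = pd.DataFrame({"load_difference": [0.10, 0.10, 0.15]}, index=idx)

# axis=0 matches the Series on the row index (Time) and adds the same
# value to every column of that row; column labels are left untouched.
shifted = energy.add(loadshift["load_difference"], axis=0)
print(shifted.round(3))
```

Because only the row index takes part in the alignment, the column labels are not touched, and no widened copy of loadshift is needed.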

Answered By - Naveed

This answer was collected from Stack Overflow, tested by PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
