Issue
I have a dataframe which has two date
columns. I am doing difference of those two date
columns to find number of days between them. The problem is the time in both of them starts from 06:00:00
so when the actual day gets over, the difference remains the same.
Here is the dataframe
ID date1 comp date2 daydiff
30 2021-01-01 23:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 00:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 01:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 02:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 03:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 04:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 05:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 06:00:00 1 2020-07-31 06:00:00 155
30 2021-01-02 07:00:00 1 2020-07-31 06:00:00 155
Here you can see when we have date as 2021-01-02
and time as 00:00:00
, there is no increment in daydiff
(ideally it should have been incremented as date as increased by 1). It is only increasing when time is 06:00:00
.
I want the increment to happen when the date changes.
Solution
if I understand you right, you want to calculate the difference between dates and not between timestamps:
df['sincelastComp1'] = (df['datetime_tel'].apply(pd.Timestamp).dt.date -
df['datetime_maint'].apply(pd.Timestamp).dt.date).dt.days
output:
machineID datetime_tel comp1 datetime_maint sincelastComp1
0 30 2021-01-01 23:00:00 1 2020-07-31 06:00:00 154
1 30 2021-01-02 00:00:00 1 2020-07-31 06:00:00 155
2 30 2021-01-02 01:00:00 1 2020-07-31 06:00:00 155
3 30 2021-01-02 02:00:00 1 2020-07-31 06:00:00 155
4 30 2021-01-02 03:00:00 1 2020-07-31 06:00:00 155
5 30 2021-01-02 04:00:00 1 2020-07-31 06:00:00 155
6 30 2021-01-02 05:00:00 1 2020-07-31 06:00:00 155
7 30 2021-01-02 06:00:00 1 2020-07-31 06:00:00 155
8 30 2021-01-02 07:00:00 1 2020-07-31 06:00:00 155
Answered By - Ze'ev Ben-Tsvi
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.