Wednesday, December 13, 2023

[FIXED] Grouping and counting time intervals by hours with a marker for overlapping days

Labels: dataframe, datetime, pandas, python

Issue

I need to count how many intervals fall into each hourly slot over a monthly period. The grouping should use only the time of day, not the date. For example, the rows

Date	Start	End
23-02-2023	12:10:00	12:34:00
24-02-2023	12:15:00	12:45:00

would count as 2 for the 12:00:00 to 12:59:59 slot.
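A minimal sketch of that slot idea, using only the two example rows above. The names `example`, `start_slot`, `end_slot` and `slot_counts` are illustrative and do not come from the original post. Each time is parsed as an offset from midnight and floored to its hour, so the date plays no part in the grouping.

```python
import pandas as pd

# The two example rows; the dates differ, but only the time of day matters.
example = pd.DataFrame(
    {
        "Date": ["23-02-2023", "24-02-2023"],
        "Start": ["12:10:00", "12:15:00"],
        "End": ["12:34:00", "12:45:00"],
    }
)

# Parse the times as offsets from midnight and floor them to the hour slot.
start_slot = pd.to_timedelta(example["Start"]).dt.floor("1h")
end_slot = pd.to_timedelta(example["End"]).dt.floor("1h")

# In this simple case every interval starts and ends inside one slot,
# so counting the start slots is enough.
slot_counts = start_slot.value_counts().sort_index()
print(slot_counts)
```

Intervals that span several slots, or that run past midnight, need more than this, which is what the rest of the question is about.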

My sample data looks like this (I can change the format if needed):

first_appear	last_appear
12:10:00	12:31:00
12:33:49	13:29:12
15:30:20	18:40:30
20:12:20	23:10:20
23:34:20	6:11:00

Note that the last entry runs past midnight into the next day. My code so far:

import pandas as pd
import numpy as np

import staircase as sc

df = pd.read_csv('Overlapping Schedule - Sheet2.csv')
df["first_appear"] = pd.to_timedelta(df["first_appear"].map(str))
df["last_appear"] = pd.to_timedelta(df["last_appear"].map(str))

df["first_appear"] = df["first_appear"].dt.floor("h")
df["last_appear"] = df["last_appear"].dt.ceil("h")

sf = sc.Stairs(df, start="first_appear", end="last_appear")

sample_times = pd.timedelta_range("00:00:00", "24:00:00", freq=pd.Timedelta("1hr"))
sf(sample_times, include_index=True)

The output

0 days 00:00:00	0
0 days 01:00:00	0
0 days 02:00:00	0
0 days 03:00:00	0
0 days 04:00:00	0
0 days 05:00:00	0
0 days 06:00:00	0
0 days 07:00:00	0
0 days 08:00:00	0
0 days 09:00:00	0
0 days 10:00:00	0
0 days 11:00:00	0
0 days 12:00:00	2
0 days 13:00:00	1
0 days 14:00:00	0
0 days 15:00:00	1
0 days 16:00:00	1
0 days 17:00:00	1
0 days 18:00:00	1
0 days 19:00:00	0
0 days 20:00:00	1
0 days 21:00:00	1
0 days 22:00:00	1
0 days 23:00:00	1
1 days 00:00:00	0

Ideally, I would like to see the following entries as well:

1 days 01:00:00	1
1 days 02:00:00	1
1 days 03:00:00	1
1 days 04:00:00	1
1 days 05:00:00	1
1 days 06:00:00	1
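One way to obtain such next-day entries is to detect rows whose end time is earlier than their start time and shift the end by one day. The sketch below assumes that an earlier end always means a midnight crossing, which the data itself cannot confirm, and the names `sample`, `wraps` and `last_unwrapped` are illustrative.

```python
import pandas as pd

# The sample data from the question.
sample = pd.DataFrame(
    {
        "first_appear": ["12:10:00", "12:33:49", "15:30:20", "20:12:20", "23:34:20"],
        "last_appear": ["12:31:00", "13:29:12", "18:40:30", "23:10:20", "6:11:00"],
    }
)
first = pd.to_timedelta(sample["first_appear"])
last = pd.to_timedelta(sample["last_appear"])

# Assumption: an end earlier than its start means the interval ran past midnight.
wraps = last < first

# Keep the end where the row does not wrap; otherwise push it to the next day.
last_unwrapped = last.where(~wraps, last + pd.Timedelta("1 day"))

print(pd.DataFrame({"first": first, "last": last_unwrapped, "wraps": wraps}))
```

After this step every interval has a start that is not later than its end, so the last row ends at `1 days 06:11:00` rather than `06:11:00`.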

I referred to multiple Stack Overflow answers to come up with this, but now I am stuck. @riley's answer to "Group and count by time interval - Python" helped me get started.

Solution

Try expanding each row into the hourly slots it covers, then counting the slots:

import pandas as pd

# df is the DataFrame read from the CSV in the question, with the
# string columns "first_appear" and "last_appear".


def get_hours(row):
    """Return every hourly slot that the row's interval touches."""
    out = []

    if row["last_appear"] < row["first_appear"]:
        # The interval crosses midnight. Stop the first part at 23:00 so
        # that the midnight slot is counted only once, by the range below.
        out.extend(
            pd.timedelta_range(
                row["first_appear"].floor("1h"),
                "23:00:00",
                freq="1h",
            )
        )
        # Slots on the next day, marked with a one-day offset.
        out.extend(
            pd.timedelta_range(
                "00:00:00",
                row["last_appear"].floor("1h"),
                freq="1h",
            )
            + pd.Timedelta("1 day")
        )
    else:
        out.extend(
            pd.timedelta_range(
                row["first_appear"].floor("1h"),
                row["last_appear"].floor("1h"),
                freq="1h",
            )
        )

    return out


df["first_appear"] = pd.to_timedelta(df["first_appear"])
df["last_appear"] = pd.to_timedelta(df["last_appear"])

# One row per (interval, slot) pair, then count the rows per slot.
df = (
    df.assign(hours=df.apply(get_hours, axis=1))
    .explode("hours")
    .groupby("hours")["hours"]
    .count()
)
# Fill in the slots that no interval touches.
df = df.reindex(pd.timedelta_range("00:00:00", df.index.max(), freq="1h"), fill_value=0)

print(df)

Prints:

0 days 00:00:00    0
0 days 01:00:00    0
0 days 02:00:00    0
0 days 03:00:00    0
0 days 04:00:00    0
0 days 05:00:00    0
0 days 06:00:00    0
0 days 07:00:00    0
0 days 08:00:00    0
0 days 09:00:00    0
0 days 10:00:00    0
0 days 11:00:00    0
0 days 12:00:00    2
0 days 13:00:00    1
0 days 14:00:00    0
0 days 15:00:00    1
0 days 16:00:00    1
0 days 17:00:00    1
0 days 18:00:00    1
0 days 19:00:00    0
0 days 20:00:00    1
0 days 21:00:00    1
0 days 22:00:00    1
0 days 23:00:00    2
1 days 00:00:00    1
1 days 01:00:00    1
1 days 02:00:00    1
1 days 03:00:00    1
1 days 04:00:00    1
1 days 05:00:00    1
1 days 06:00:00    1
Freq: H, Name: hours, dtype: int64
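If the day marker is not needed and the counts should sit on a plain 24-hour clock, the next-day slots can be folded back onto their hour of day. The following is only a sketch and not part of the tested answer: it recomputes the per-slot counts from the question's sample data with NumPy broadcasting instead of `apply`, and the names `extended`, `hour_of_day` and `folded` are illustrative.

```python
import pandas as pd

# The sample data from the question.
first = pd.to_timedelta(
    pd.Series(["12:10:00", "12:33:49", "15:30:20", "20:12:20", "23:34:20"])
)
last = pd.to_timedelta(
    pd.Series(["12:31:00", "13:29:12", "18:40:30", "23:10:20", "6:11:00"])
)

# Assumption: an end before its start means the interval ran past midnight.
last = last.where(last >= first, last + pd.Timedelta("1 day"))

# Hourly slots from midnight up to the latest (possibly next-day) end.
slots = pd.timedelta_range("00:00:00", last.max().floor("1h"), freq="1h")

# Column vectors of shape (rows, 1) so they broadcast against the slots.
first_slot = first.dt.floor("1h").to_numpy()[:, None]
last_slot = last.dt.floor("1h").to_numpy()[:, None]

# covered[i, j] is True when interval i touches slot j.
covered = (first_slot <= slots.to_numpy()) & (slots.to_numpy() <= last_slot)
extended = pd.Series(covered.sum(axis=0), index=slots)

# Fold "1 days HH:00:00" back onto hour HH of a single day.
hour_of_day = (slots.seconds // 3600).to_numpy()
folded = extended.groupby(hour_of_day).sum()
print(folded)
```

Here `extended` keeps the `1 days ...` entries, while `folded` has one row per hour from 0 to 23. Which of the two is appropriate depends on whether the next-day marker matters for the report.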

Answered By - Andrej Kesely

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
