Issue
I have a dataframe covering a month of texts with timestamps, something like this:
timestamp text
2023-01-01 00:00:00 ABC
2023-01-01 00:00:01 DEF
2023-01-01 00:00:01 GHI
...
I would like to count the number of texts for each hour of each day of the week, so that I end up with 168 (24*7) numbers.
For example, if on 2023-01-01, which is a Sunday, there are 10 texts between 10am and 11am, and on the next Sunday (2023-01-08) there are 15 texts in that same hour, and so on, then the total number of texts for all Sundays between 10am and 11am is: 10+15+...
I want to do this for each hour and for each day of the week.
If the original dataframe is df, I started to group by the hours:
hours_df = df.groupby(pd.Grouper(key="timestamp", freq="h")).size().reset_index(name="count_hours")
then I added the day_of_week:
hours_df["day_of_week"] = hours_df["timestamp"].dt.dayofweek
but if I now group by the day_of_week in this way:
day_df = hours_df.groupby("day_of_week").size().reset_index(name="count_days")
I lose the information about the hours, and the result is a dataframe with 7 entries, i.e. the days.
How can I combine the grouping by hours with the grouping by days?
Solution
You could directly group by both dayofweek and hour:
df.groupby([df['timestamp'].dt.dayofweek.rename('dow'),
            df['timestamp'].dt.hour.rename('hour')
           ]).size()
Or using concat and value_counts:
pd.concat([df['timestamp'].dt.dayofweek.rename('dow'),
           df['timestamp'].dt.hour.rename('hour')], axis=1
          ).value_counts()
Output:
dow  hour
6    0       3
dtype: int64
NB. For a long enough input you should get all combinations; if not, you can always reindex.
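As a rough sketch of the reindex idea mentioned above: one way to guarantee all 168 combinations is to build the full (dow, hour) index with MultiIndex.from_product and reindex against it. The sample data and the names counts, full_index and full_counts are assumptions made for illustration only, they are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data mirroring the three rows shown in the question
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2023-01-01 00:00:00",
        "2023-01-01 00:00:01",
        "2023-01-01 00:00:01",
    ]),
    "text": ["ABC", "DEF", "GHI"],
})

# Count rows per (day of week, hour); only observed combinations appear
counts = df.groupby([
    df["timestamp"].dt.dayofweek.rename("dow"),
    df["timestamp"].dt.hour.rename("hour"),
]).size()

# Every possible combination: dayofweek runs from 0 (Monday) to 6 (Sunday)
full_index = pd.MultiIndex.from_product(
    [range(7), range(24)], names=["dow", "hour"]
)

# Missing combinations are filled with 0 instead of being absent
full_counts = counts.reindex(full_index, fill_value=0)
```

With a full month of data most combinations will already be present, so the reindex mainly acts as a safety net for quiet hours.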
Alternatively, for a rectangular output, use crosstab:
pd.crosstab(df['timestamp'].dt.dayofweek.rename('dow'),
            df['timestamp'].dt.hour.rename('hour'))
# or, to include every day and hour (dayofweek runs from 0 = Monday to 6 = Sunday):
out = (pd.crosstab(df['timestamp'].dt.dayofweek.rename('dow'),
                   df['timestamp'].dt.hour.rename('hour'))
         .reindex(index=range(7), columns=range(24), fill_value=0)
      )
Output:
hour  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23
dow
0     0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
1     0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
2     0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
3     0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
4     0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
5     0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
6     3  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
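If the numeric day codes are hard to read, a possible follow-up is to relabel the rows of the crosstab with day names, keeping in mind that dayofweek numbers Monday as 0 and Sunday as 6. This is only a sketch under that assumption; the sample data and the names table, day_names, labeled and flat are hypothetical and not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data mirroring the three rows shown in the question
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2023-01-01 00:00:00",
        "2023-01-01 00:00:01",
        "2023-01-01 00:00:01",
    ]),
    "text": ["ABC", "DEF", "GHI"],
})

# 7 x 24 table of counts, with absent days and hours filled with 0
table = (
    pd.crosstab(df["timestamp"].dt.dayofweek.rename("dow"),
                df["timestamp"].dt.hour.rename("hour"))
    .reindex(index=range(7), columns=range(24), fill_value=0)
)

# Map 0..6 to day names (Monday=0 ... Sunday=6)
day_names = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
labeled = table.rename(index=dict(enumerate(day_names)))

# Flatten row by row to get the 168 numbers asked for in the question
flat = table.to_numpy().ravel()
```

Here labeled.loc["Sunday", 10] would give the total number of texts on Sundays between 10am and 11am.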
Answered By - mozway