Issue
Say I have this DataFrame:
 | user | sub_date | unsub_date | group |
---|---|---|---|---|
0 | alice | 2021-01-01 00:00:00 | 2021-02-09 00:00:00 | A |
1 | bob | 2021-02-03 00:00:00 | 2021-04-05 00:00:00 | B |
2 | charlie | 2021-02-03 00:00:00 | NaT | A |
3 | dave | 2021-01-29 00:00:00 | 2021-09-01 00:00:00 | B |
What is the most efficient way to count the subbed users per date and per group? In other words, to get this DataFrame:
date | group | subbed |
---|---|---|
2021-01-01 | A | 1 |
2021-01-01 | B | 0 |
2021-01-02 | A | 1 |
2021-01-02 | B | 0 |
... | ... | ... |
2021-02-03 | A | 2 |
2021-02-03 | B | 2 |
... | ... | ... |
2021-02-10 | A | 1 |
2021-02-10 | B | 2 |
... | ... | ... |
Here's a snippet to init the example df:
import pandas as pd
import datetime as dt
users = pd.DataFrame(
[
["alice", "2021-01-01", "2021-02-09", "A"],
["bob", "2021-02-03", "2021-04-05", "B"],
["charlie", "2021-02-03", None, "A"],
["dave", "2021-01-29", "2021-09-01", "B"],
],
columns=["user", "sub_date", "unsub_date", "group"],
)
users[["sub_date", "unsub_date"]] = users[["sub_date", "unsub_date"]].apply(
pd.to_datetime
)
Solution
The data can be described by step functions, and the staircase package is designed for this kind of application:
import staircase as sc
stepfunctions = users.groupby("group").apply(sc.Stairs, "sub_date", "unsub_date")
stepfunctions will be a pandas.Series, indexed by group, whose values are Stairs objects representing step functions:
group
A <staircase.Stairs, id=2516834869320>
B <staircase.Stairs, id=2516112096072>
dtype: object
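To make the step-function idea concrete, here is a minimal plain-Python sketch of what each group's function represents: the count steps up by 1 on a sub_date and down by 1 on an unsub_date. This is not how staircase is implemented; the helper names build_steps and value_at are hypothetical, and the sketch assumes a user no longer counts on their unsub_date itself (a half-open interval from sub_date up to, but not including, unsub_date).

```python
from collections import defaultdict
from datetime import date

# Same example data as the question; None means the user is still subscribed.
rows = [
    ("alice", date(2021, 1, 1), date(2021, 2, 9), "A"),
    ("bob", date(2021, 2, 3), date(2021, 4, 5), "B"),
    ("charlie", date(2021, 2, 3), None, "A"),
    ("dave", date(2021, 1, 29), date(2021, 9, 1), "B"),
]


def build_steps(rows):
    """Map each group to a date-sorted list of (date, change) steps (hypothetical helper)."""
    changes = defaultdict(lambda: defaultdict(int))
    for _, start, end, group in rows:
        changes[group][start] += 1  # step up when a user subscribes
        if end is not None:
            changes[group][end] -= 1  # step down when a user unsubscribes
    return {group: sorted(steps.items()) for group, steps in changes.items()}


def value_at(steps, day):
    """Evaluate a step function: the sum of all changes dated on or before `day`."""
    return sum(change for when, change in steps if when <= day)


steps = build_steps(rows)
for group in ("A", "B"):
    print(group, value_at(steps[group], date(2021, 2, 3)))
```

Sampling a Stairs object at a date is conceptually the same lookup as value_at here.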
You could plot the step function for A if you wanted, like so:
stepfunctions["A"].plot()
The next step is to sample the step functions at whatever dates you want, e.g. for every day from 2021-01-01 to 2021-02-01:
sc.sample(stepfunctions, pd.date_range("2021-01-01", "2021-02-01")).melt(ignore_index=False).reset_index()
The result is shown below, where variable holds the sampled date and value holds the number of subscribed users:
group variable value
0 A 2021-01-01 1
1 B 2021-01-01 0
2 A 2021-01-02 1
3 B 2021-01-02 0
4 A 2021-01-03 1
.. ... ... ...
59 B 2021-01-30 1
60 A 2021-01-31 1
61 B 2021-01-31 1
62 A 2021-02-01 1
63 B 2021-02-01 1
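If you want to cross-check these numbers without staircase, or get the exact date / group / subbed columns from the question, the same counts can be sketched in plain pandas. This is a different technique from the one above (a cumulative sum of +1/-1 events, forward-filled onto a daily range), and it assumes a user stops counting on their unsub_date. The variable names events, counts and result are just illustrative.

```python
import pandas as pd

users = pd.DataFrame(
    [
        ["alice", "2021-01-01", "2021-02-09", "A"],
        ["bob", "2021-02-03", "2021-04-05", "B"],
        ["charlie", "2021-02-03", None, "A"],
        ["dave", "2021-01-29", "2021-09-01", "B"],
    ],
    columns=["user", "sub_date", "unsub_date", "group"],
)
users[["sub_date", "unsub_date"]] = users[["sub_date", "unsub_date"]].apply(
    pd.to_datetime
)

# +1 on every subscription date, -1 on every known unsubscription date.
ups = users[["group", "sub_date"]].rename(columns={"sub_date": "date"}).assign(change=1)
downs = (
    users[["group", "unsub_date"]]
    .dropna()
    .rename(columns={"unsub_date": "date"})
    .assign(change=-1)
)
events = pd.concat([ups, downs])

# Running total per group at each event date, carried forward to every day.
dates = pd.date_range("2021-01-01", "2021-02-10", name="date")
counts = (
    events.pivot_table(
        index="date", columns="group", values="change", aggfunc="sum", fill_value=0
    )
    .sort_index()
    .cumsum()
    .reindex(dates, method="ffill")
    .fillna(0)  # days before a group's first event have no subscribers
    .astype(int)
)

# Long format with the column names used in the question.
result = (
    counts.reset_index()
    .melt(id_vars="date", var_name="group", value_name="subbed")
    .sort_values(["date", "group"], ignore_index=True)
)
print(result.head())
```

For large inputs or irregular sampling dates the staircase approach is likely more convenient; this sketch is mainly useful as a sanity check.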
Note: I am the creator of staircase. Please feel free to reach out with feedback or questions.
Answered By - Riley