Issue
I have a pandas dataframe with information about the deliveries made by a delivery person. It has four columns: DateTime, SortieNumber, CustomerName and ProductCode.
I want to study this pandas dataframe and find chains within it. I want to find out if this delivery person delivers to the same customers in the same order in each sortie. I don’t care about the ordered products. The first rows of the data frame are something like this:
DateTime SortieNumber CustomerName ProductCode
01/01/2023 09:00:00 1 Josh 001
01/01/2023 09:10:00 1 Alice 002
01/01/2023 09:15:00 1 Robert 002
01/01/2023 12:00:00 2 Anna 001
01/01/2023 12:00:10 2 Anna 003
01/01/2023 12:15:00 2 Robert 003
01/01/2023 15:00:00 3 Josh 004
01/01/2023 15:05:10 3 Alice 003
01/01/2023 15:15:00 3 Robert 001
01/01/2023 15:30:10 3 Robert 002
01/01/2023 15:35:15 3 Robert 003
From this data, I want to say that the chain Josh-Alice-Robert happens in 2 of the 3 sorties, Anna-Robert happens in 1 of the 3 sorties, and so on for the remaining rows.
Can this be done?
Solution
You could ensure the rows are sorted by SortieNumber and DateTime, then remove consecutive rows that repeat the same SortieNumber/CustomerName pair, join the customers of each sortie into a string with groupby.aggregate, and count the resulting chains with value_counts:
(df.sort_values(by=['SortieNumber', 'DateTime'])
   .loc[lambda d: d[['SortieNumber', 'CustomerName']]
                  .ne(d[['SortieNumber', 'CustomerName']].shift())
                  .any(axis=1)]
   .groupby('SortieNumber')['CustomerName'].agg('-'.join)
   .value_counts()
)
NB. if you are sure that, within one SortieNumber, deliveries to the same customer are never interrupted by a delivery to another customer, you can simplify the .loc[…] into .drop_duplicates(['SortieNumber', 'CustomerName']).
Output:
CustomerName
Josh-Alice-Robert 2
Anna-Robert 1
Name: count, dtype: int64
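To check the result above, here is a minimal, self-contained sketch that rebuilds the sample rows from the question and runs the same steps one at a time. The intermediate names (rows, ordered, keys, is_new_stop, chains, counts) are only illustrative, and the day/month/year date format is an assumption, since the sample data alone does not show whether the day or the month comes first.

```python
import pandas as pd

# Sample rows copied from the question; ProductCode is kept as text
# so that the leading zeros are preserved.
rows = [
    ("01/01/2023 09:00:00", 1, "Josh", "001"),
    ("01/01/2023 09:10:00", 1, "Alice", "002"),
    ("01/01/2023 09:15:00", 1, "Robert", "002"),
    ("01/01/2023 12:00:00", 2, "Anna", "001"),
    ("01/01/2023 12:00:10", 2, "Anna", "003"),
    ("01/01/2023 12:15:00", 2, "Robert", "003"),
    ("01/01/2023 15:00:00", 3, "Josh", "004"),
    ("01/01/2023 15:05:10", 3, "Alice", "003"),
    ("01/01/2023 15:15:00", 3, "Robert", "001"),
    ("01/01/2023 15:30:10", 3, "Robert", "002"),
    ("01/01/2023 15:35:15", 3, "Robert", "003"),
]
df = pd.DataFrame(
    rows, columns=["DateTime", "SortieNumber", "CustomerName", "ProductCode"]
)

# Assumption: the timestamps are day/month/year. Parsing them as real
# datetimes makes the sort chronological rather than alphabetical.
df["DateTime"] = pd.to_datetime(df["DateTime"], format="%d/%m/%Y %H:%M:%S")

# Step 1: order the deliveries within each sortie.
ordered = df.sort_values(by=["SortieNumber", "DateTime"])

# Step 2: flag rows whose (sortie, customer) pair differs from the previous row.
keys = ordered[["SortieNumber", "CustomerName"]]
is_new_stop = keys.ne(keys.shift()).any(axis=1)

# Step 3: join the remaining customers of each sortie into one chain string.
chains = ordered[is_new_stop].groupby("SortieNumber")["CustomerName"].agg("-".join)

# Step 4: count how many sorties share each chain.
counts = chains.value_counts()
print(counts)
```

Splitting the pipeline like this makes it easier to inspect chains (one string per sortie) before counting, for example to see which sorties follow a given route.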
If you want a proportion, pass normalize=True to value_counts:
CustomerName
Josh-Alice-Robert 0.666667
Anna-Robert 0.333333
Name: proportion, dtype: float64
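Regarding the note on drop_duplicates, the two ways of removing repeats only differ when a customer is visited again later in the same sortie. The sketch below uses a made-up sortie (not part of the question's data) in which Josh is delivered twice in a row, then Alice, then Josh again. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical sortie: Josh, Josh, Alice, then Josh again.
demo = pd.DataFrame({
    "SortieNumber": [1, 1, 1, 1],
    "CustomerName": ["Josh", "Josh", "Alice", "Josh"],
})

keys = demo[["SortieNumber", "CustomerName"]]

# Shift-based filter: drops a row only if it repeats the row just before it,
# so the return visit to Josh is kept.
consecutive = demo[keys.ne(keys.shift()).any(axis=1)]
chain_consecutive = consecutive.groupby("SortieNumber")["CustomerName"].agg("-".join)

# drop_duplicates: keeps only the first occurrence of each pair,
# so the return visit to Josh is lost.
first_only = demo.drop_duplicates(["SortieNumber", "CustomerName"])
chain_first_only = first_only.groupby("SortieNumber")["CustomerName"].agg("-".join)

print(chain_consecutive.loc[1])  # Josh-Alice-Josh
print(chain_first_only.loc[1])   # Josh-Alice
```

If return visits within a sortie matter for the analysis, the shift-based filter is the safer choice; if they cannot occur, both give the same chains.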
Answered By - mozway