Friday, March 18, 2022

[FIXED] Prioritization data from data frame with dictionary using pandas

March 18, 2022 database, dataframe, dictionary, pandas, python No comments

Issue

I have a problem with my db, here is a sample frame that I am working on which describes number of toys (Qty) that I need to relocate to specific factory (To) and number of them which already is in other factories (A,B,C,D,E,F,G):

And I have created a dictionary which prioritize the destinations between this factories in format

'to': ['first choice', 'second choice'...]:

d = {'A': ['F', 'E', 'D', 'B'],
     'B': ['E', 'F'],
     'C': ['A', 'B'],
     'D': ['F', 'C', 'E', 'B', 'A'],
     'E': ['F', 'C', 'D', 'B', 'A'],
     'F': ['F', 'C', 'D', 'B', 'A'],
     'G': ['F', 'C', 'D', 'B', 'A']}

Now I wanted to pick toys from factory basing on my priority dict and create a new column which will show name of the factory.\

E.g. if I look at the toy with id 3 I want 20 of them to the factory A and as we can see there is a lack of them in factory F so I want to take 2 of them from factory F and rest (18) from the next one from my dictionary so from factory E.

Simple output:

Here is this tables in text to copy it to your excel/txt files.

ID  To  Qty A   B   C   D   E   F   G
1   A   6   0   0   0   0   0   302 0
2   A   4   0   625 0   0   357 0   0
3   A   20  0   0   0   318 154 2   0
4   B   12  0   0   0   0   0   473 0
5   C   13  0   0   0   0   0   6713 0
6   A   3   0   0   0   0   0   37  0
7   A   2   0   0   0   0   0   105 0
8   D   51  0   0   0   0   0   6782 0
9   B   2   0   0   0   0   0   946 0

Solution

(1) The idea is to flat your dataframe, remove all factories with no stock and sort rows by priority. (2) For each ID, decrease the quantity until it reaches 0:

pri = {(k, v): i for k, l in d.items() for i, v in enumerate(l)}

out = (
    df.melt(['ID', 'To', 'Qty'], var_name='From', value_name='Stock')
      .set_index(['To', 'From']).query('Stock != 0')
      .assign(Priority=lambda x: x.index.map(pri))
      .sort_values(['ID', 'Priority'])
)

At this point your dataframe looks like:

>>> out
         ID  Qty  Stock  Priority
To From                          
A  F      1    6    302       0.0
   E      2    4    357       1.0
   B      2    4    625       3.0
   F      3   20      2       0.0  # take 2, remain 18
   E      3   20    154       1.0  # take 18, remain 0
   D      3   20    318       2.0  # don't use it
B  F      4   12    473       1.0
C  F      5   13   6713       NaN
A  F      6    3     37       0.0
   F      7    2    105       0.0
D  F      8   51   6782       0.0
B  F      9    2    946       1.0

Update quantity for each stock by priority:

def update_qty(df):
    qty = df['Qty'] - df['Stock'].cumsum()
    return pd.Series(np.where(qty > 0, df['Stock'], df['Stock'] + qty), 
                     index=df.index, name='Qty')

out = (
    out.assign(Stock=out['Stock'].mask(out['Priority'].isna()))
       .groupby('ID').apply(update_qty).reset_index().query('Qty.gt(0) | Qty.isna()')
)

Final output:

>>> out
    ID To From   Qty
0    1  A    F   6.0
1    2  A    E   4.0
3    3  A    F   2.0  # taken 2
4    3  A    E  18.0  # taken 18
6    4  B    F  12.0
7    5  C    F   NaN
8    6  A    F   3.0
9    7  A    F   2.0
10   8  D    F  51.0
11   9  B    F   2.0

Answered By - Corralien

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Friday, March 18, 2022

[FIXED] Prioritization data from data frame with dictionary using pandas

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels