Friday, October 28, 2022

[FIXED] Initial value of multiple variables dataframe for time dilation


Issue

Dataframe:

product1	product2	product3	product4	product5
straws	orange	melon	chair	bread
melon	milk	book	coffee	cake
bread	melon	coffe	chair	book

CountProduct1	CountProduct2	CountProduct3	CountProduct4	CountProduct5
1	1	1	1	1
2	1	1	1	1
2	3	2	2	2

RatioProduct1	RatioProduct2	RatioProduct3	RatioProduct4	RatioProduct5
0.28	0.54	0.33	0.35	0.11
0.67	0.25	0.13	0.11	0.59
2.5	1.69	1.9	2.5	1.52

I want to create five other columns that hold the initial ratio of each item, that is, the ratio from the row where the item first appears in the dataframe.

Output:

InitialRatio1	InitialRatio2	InitialRatio3	InitialRatio4	InitialRatio5
0.28	0.54	0.33	0.35	0.11
0.33	0.25	0.13	0.31	0.59
0.11	0.33	0.31	0.35	0.13

Solution

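Check your data first: product3 in the last row is spelled "coffe", while product4 in the second row is "coffee". Assuming these are the same product, I corrected "coffe" to "coffee". With that fix, the 0.31 in your expected output should not appear; coffee keeps its first ratio, 0.11.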

import pandas as pd
pd.set_option('display.max_rows', None)  # show all rows when printing
pd.set_option('display.max_columns', None)  # show all columns when printing

df = pd.DataFrame(
{
    'product1':['straws', 'melon', 'bread'],
    'product2':['orange', 'milk', 'melon'],
    'product3':['melon', 'book', 'coffee'],
    'product4':['chair', 'coffee', 'chair'],
    'product5':['bread', 'cake', 'book'],
    'time':[1,2,3],
    'Count1':[1,2,2],
    'Count2':[1,1,3],
    'Count3':[1,1,2],
    'Count4':[1,1,2],
    'Count5':[1,1,2],
    'ratio1':[0.28, 0.67, 2.5],
    'ratio2':[0.54, 0.25, 1.69],
    'ratio3':[0.33, 0.13, 1.9],
    'ratio4':[0.35, 0.11, 2.5],
    'ratio5':[0.11, 0.59, 1.52],

})

print(df)

product = df[['product1', 'product2', 'product3', 'product4', 'product5']].stack().reset_index()
count = df[['Count1',  'Count2',  'Count3', 'Count4',  'Count5']].stack().reset_index()
ratio = df[['ratio1',  'ratio2',  'ratio3',  'ratio4',  'ratio5']].stack().reset_index()
print(ratio)


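# unique product names, in order of first appearance
arr = pd.unique(product[0])
# positions in arr of the products that occur more than once
aaa = [i for i in range(len(arr)) if product[product[0] == arr[i]].count()[0] > 1]
for i in aaa:
    # row labels of every occurrence of this product
    prod_ind = product[product[0] == arr[i]].index
    # ratio at the first occurrence
    val_ratio = ratio.loc[prod_ind[0], 0]
    # overwrite all occurrences with that first ratio
    ratio.loc[prod_ind, 0] = val_ratio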

print(ratio.pivot_table(index='level_0', columns='level_1', values=[0]))

Output:

level_1 ratio1 ratio2 ratio3 ratio4 ratio5
level_0                                   
0         0.28   0.54   0.33   0.35   0.11
1         0.33   0.25   0.13   0.11   0.59
2         0.11   0.33   0.11   0.35   0.13

To process the data, each block of columns is first flattened into a single column with stack().reset_index(). arr is the list of unique products, and aaa holds the positions in arr of the products that occur more than once.
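As a rough illustration of what stack().reset_index() produces, here is a minimal sketch on a tiny made-up frame (the name small and its contents are hypothetical, not the data from the question):

```python
import pandas as pd

# Hypothetical two-row frame, used only to show the reshaping.
small = pd.DataFrame({'product1': ['melon', 'bread'],
                      'product2': ['milk', 'melon']})

# stack() moves the column labels into an inner index level;
# reset_index() then turns both index levels into ordinary columns.
long = small.stack().reset_index()
print(long)

# The result has three columns: 'level_0' (original row label),
# 'level_1' (original column name) and 0 (the stacked values),
# listed row by row.
print(pd.unique(long[0]))  # unique products in order of first appearance
```

Because the values are listed row by row, the first occurrence of a product in the stacked column is also its first occurrence in the original table.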

prod_ind = product[product[0] == arr[i]].index

Inside the loop, this gives the row indices of every occurrence of a product that appears more than once.

val_ratio = ratio.loc[prod_ind[0], 0]

This takes the ratio at the product's first occurrence.

ratio.loc[prod_ind, 0] = val_ratio

This assigns that value to every occurrence of the product. The values are accessed with explicit loc indexing: the row labels go first inside the square brackets and the column label second. See the pandas documentation on loc for details.
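The loc read and write can be tried in isolation. The sketch below assumes a small stacked frame shaped like ratio above; the values and the index positions in prod_ind are invented for the example:

```python
import pandas as pd

# Hypothetical stacked ratios: same column layout as `ratio` in the answer.
ratio = pd.DataFrame({'level_0': [0, 0, 1],
                      'level_1': ['ratio1', 'ratio2', 'ratio1'],
                      0: [0.28, 0.54, 0.67]})

# Assumed row labels at which one product occurs (first at 0, again at 2).
prod_ind = pd.Index([0, 2])

# Read: a single row label and the column label 0 give a scalar.
val_ratio = ratio.loc[prod_ind[0], 0]

# Write: several row labels and one column label; the scalar is broadcast.
ratio.loc[prod_ind, 0] = val_ratio
print(ratio)
```

Row 2 now carries the value from row 0, while row 1 is left unchanged.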

pivot_table then rebuilds the wide table. To write the processed data back into the original dataframe, use the following:

table = ratio.pivot_table(index='level_0', columns='level_1', values=[0])
df[['ratio1',  'ratio2',  'ratio3',  'ratio4',  'ratio5']] = table
print(df)
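For comparison, the same result can probably be reached without an explicit loop. This is not the approach used in the answer above; it is a sketch that assumes the corrected "coffee" spelling, assumes the product and ratio columns line up one to one, and writes to new InitialRatio columns (names taken from the question) instead of overwriting the ratios:

```python
import pandas as pd

df = pd.DataFrame({
    'product1': ['straws', 'melon', 'bread'],
    'product2': ['orange', 'milk', 'melon'],
    'product3': ['melon', 'book', 'coffee'],
    'product4': ['chair', 'coffee', 'chair'],
    'product5': ['bread', 'cake', 'book'],
    'ratio1': [0.28, 0.67, 2.5],
    'ratio2': [0.54, 0.25, 1.69],
    'ratio3': [0.33, 0.13, 1.9],
    'ratio4': [0.35, 0.11, 2.5],
    'ratio5': [0.11, 0.59, 1.52],
})

product_cols = [f'product{i}' for i in range(1, 6)]
ratio_cols = [f'ratio{i}' for i in range(1, 6)]
initial_cols = [f'InitialRatio{i}' for i in range(1, 6)]

# Flatten both blocks row by row, so position k refers to the same cell in each.
products = pd.Series(df[product_cols].to_numpy().ravel())
ratios = pd.Series(df[ratio_cols].to_numpy().ravel())

# For each product, broadcast the ratio found at its first occurrence.
initial = ratios.groupby(products).transform('first')

# Reshape back to the wide layout and attach it as new columns.
initial_df = pd.DataFrame(initial.to_numpy().reshape(len(df), len(ratio_cols)),
                          index=df.index, columns=initial_cols)
result = df.join(initial_df)
print(result[initial_cols])
```

Keeping the original ratio columns untouched makes it easy to compare the two side by side; if overwriting is preferred, the reshaped values can be assigned to the ratio columns instead.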

Answered By - inquirer

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
