Tuesday, November 21, 2023

[FIXED] how to print same rows values of different columns in a excel file using python

November 21, 2023 dataframe, excel, exception, pandas, python No comments

Issue

i have an excel file that after each FOR loop adds one "SN" column to excel find.
what i want is to find the same rows in different "SN" columns. THIS IS THE EXCEL THAT I WANT TO DO OPERATION ON :

print (df)
   SN1   R1   SN2    R2
0  SN1   38  SN20  1000
1  SN2   38  SN21   181
2  SN3  532  SN23     5
3  SN4  521  SN25    38
4  SN6  115  SN26  8718
5  SN7    3  SN27  3452
6  SN8  136  SN28   229
7  SN9  168  SN30   136

AND THIS IS THE RESULT THAT I WANT

THE CODES :

v=int(input("how many periods do you have:\n"))
new_excel_by_user=str(input("write the name of your new EXCEL 
file:"))
user_excel = pd.ExcelWriter(new_excel_by_user+'.xlsx', 
engine='xlsxwriter')
sys_excel=pd.ExcelWriter('sys_excel132.xlsx', engine='xlsxwriter')
final_excel=pd.ExcelWriter('final132.xlsx' , engine='xlsxwriter')


merged_df_H = pd.DataFrame()
merged_df_L = pd.DataFrame()
final_merged=pd.DataFrame()


for i in range(v):
print("\npriod %d :"%(i+1))

excelpathH=''+input(''+r"Insert your Excel path(HIGH):")
excelpathL=''+input(''+r"Insert your Excel path(LOW):")
excelpathH=excelpathH.replace('"', '')
excelpathL=excelpathL.replace('"', '')
dfH=pd.read_excel(os.path.join(excelpathH), engine='openpyxl' )
dfL=pd.read_excel(os.path.join(excelpathL), engine='openpyxl' )

Rnh=dfH['KPIVALUE']
Rnl=dfL['KPIVALUE']
R=Rnh-Rnl
merged_df_H['R%s'%(i+1)]=R
R=merged_df_H['R%s'%(i+1)]
#the below code returns true false only
RESULT=merged_df_H['R%s'%(i+1)].ge(130)
merged_df_H = 
pd.concat([merged_df_H,dfH.loc[mask,'SN'].rename('SN%s'% 
(i+1)).reset_index(drop=True),
R[mask].reset_index(drop=True)], axis=1)


merged_df_H = pd.concat([ merged_df_H,dfH,dfL,R,RESULT], axis=1 )

final_merged=pd.concat([final_merged,dfH['SN'], R,RESULT],axis=1)

BAD RESULT:

newstrong text

Solution

You can use DataFrame.filter for get SN and R columns, reshape by DataFrame.stack and filter same values by Series.duplicated:

R = df.filter(like='R').stack().reset_index(drop=True)
SN = df.filter(like='SN').stack().reset_index(drop=True)
mask = R.duplicated(keep=False)

print (pd.concat([SN[mask], R[mask]], axis=1, keys=('SN-RESULT','R-RESULT')))
   SN-RESULT  R-RESULT
0        SN1        38
2        SN2        38
7       SN25        38
12       SN8       136
15      SN30       136

out = df.join(pd.concat([SN[mask], R[mask]], 
                        axis=1, 
                        keys=('SN-RESULT','R-RESULT')).reset_index(drop=True))

If columns SN-RESULT, R-RESULT are already in input DataFrame add rsuffix to DataFrame.join:

out = df.join(pd.concat([SN[mask], R[mask]], 
                        axis=1, 
                        keys=('SN-RESULT','R-RESULT')).reset_index(drop=True), 
              rsuffix='_new')

Another idea with DataFrame.melt`:

R = df.filter(like='R').melt()['value']
SN = df.filter(like='SN').melt()['value']
mask = R.duplicated(keep=False)
out = df.join(pd.concat([SN[mask], R[mask]], 
                       axis=1, 
                       keys=('SN-RESULT','R-RESULT')).reset_index(drop=True))

Answered By - jezrael

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Tuesday, November 21, 2023

[FIXED] how to print same rows values of different columns in a excel file using python

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels