Tuesday, January 2, 2024

[FIXED] Split dataframe according to column value and export to different Excel worksheets

January 02, 2024 excel, pandas, python, python-3.x No comments

Issue

Source used before asked:

Pandas: Iterate through a list of DataFrames and export each to excel sheets

Splitting dataframe into multiple dataframes

I have managed to do all of this:

# sort the dataframe
df.sort(columns=['name'], inplace=True)
# set the index to be this and don't drop
df.set_index(keys=['name'], drop=False,inplace=True)
# get a list of names
names=df['name'].unique().tolist()
# now we can perform a lookup on a 'view' of the dataframe
joe = df.loc[df.name=='joe']
# now you can query all 'joes'

I have managed to make this work - joe = df.loc[df.name=='joe'] and it gave the exact result what I was looking for.

As solution to make it work for big amount of data I found this potential solution.

writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
List = [Data , ByBrand]
for i in List:
        i.to_excel(writer, sheet_name= i)
writer.save()

Currently I have:

teacher_names = ['Teacher A', 'Teacher B', 'Teacher C']

df =

              ID   Teacher_name      Student_name
Teacher_name                                                         
Teacher A     1.0  Teacher A         Student 1 
Teacher A     NaN  Teacher A         Student 2  
Teacher B     0.0  Teacher B         Student 3 
Teacher C     2.0  Teacher C         Student 4

If I use - test = df.loc[df.Teacher_name=='Teacher A'] - Will receive exact result.

Question: How to optimise that it will automatically save the "test" result to (for each teacher separate) excel file ( .to_excel(writer, sheet_name=Teacher_name ) with teacher name, and will do it for all the existing in the database teacher?

Solution

This should work for you. You were nearly there, you just need to iterate the names list and filter your dataframe each time.

names = df['name'].unique().tolist()

writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')

for myname in names:
    mydf = df.loc[df.name==myname]
    mydf.to_excel(writer, sheet_name=myname)
    
writer.close()

EDIT: After 1.5 version of Pandas you just need to use close() instead of save().

Answered By - jpp

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Tuesday, January 2, 2024

[FIXED] Split dataframe according to column value and export to different Excel worksheets

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels