Issue
I need to convert a large CSV file (about 5 million rows) to XLSX. Each output XLSX file should hold 100,000 rows and be saved separately.
import pandas as pd
data = pd.read_csv("k.csv")
data.to_excel("new_file.xlsx", index=None, header=True)
How should I add the row count parameter?
Solution
The following approach will split your k.csv into chunks of n rows each. Each chunk is written to its own numbered file, e.g. new_file001.xlsx
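import pandas as pd
n = 100000  # number of rows per chunk
df = pd.read_csv("k.csv")
# Number the output files 001, 002, ... instead of using the row offset
for chunk_no, start in enumerate(range(0, df.shape[0], n), start=1):
    df[start:start+n].to_excel(f"new_file{chunk_no:03}.xlsx", index=False, header=True)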
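If loading all 5 million rows at once is a concern, a possible variation is to let pandas read the CSV in pieces through the chunksize argument of read_csv, so only one chunk is in memory at a time. The sketch below is not part of the original answer; the helper names numbered_chunks and split_csv_to_xlsx and the prefix parameter are made up for illustration, and writing .xlsx files assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def numbered_chunks(csv_path, rows_per_file, prefix="new_file"):
    """Yield (output_name, chunk) pairs, reading the CSV lazily.

    Hypothetical helper: with chunksize set, read_csv returns an
    iterator of DataFrames instead of one large DataFrame.
    """
    reader = pd.read_csv(csv_path, chunksize=rows_per_file)
    for chunk_no, chunk in enumerate(reader, start=1):
        yield f"{prefix}{chunk_no:03}.xlsx", chunk


def split_csv_to_xlsx(csv_path, rows_per_file=100000, prefix="new_file"):
    """Write each chunk to its own numbered .xlsx file (hypothetical helper)."""
    for name, chunk in numbered_chunks(csv_path, rows_per_file, prefix):
        chunk.to_excel(name, index=False, header=True)
```

Calling split_csv_to_xlsx("k.csv") should then produce new_file001.xlsx, new_file002.xlsx and so on, with the last file holding whatever rows remain.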
Answered By - Martin Evans