Issue
My code is like this:
df = tabula.read_pdf('test.pdf', pages = ['all'])[0]
df.head()
df.to_excel('test.xlsx')
When I run it, I get only the first page in my Excel file.
Solution
You read the whole PDF with all pages, but then you take only the first element of the list that read_pdf returns.
df = tabula.read_pdf('test.pdf', pages = ['all'])[0]
                                                 ^^^
I think you have to remove the [0] and concatenate the returned list of DataFrames to get all pages into Excel. Something like this:
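import pandas as pd
import tabula
dfs = tabula.read_pdf('test.pdf', pages='all')  # list of DataFrames, one per detected table
df = pd.concat(dfs)  # stack all tables into a single DataFrame
df.to_excel("filename.xlsx")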
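If you want a continuous row index in the Excel file and a clear error when nothing is extracted, a slightly more defensive version might look like the sketch below. The helper names combine_tables and pdf_to_excel are made up for illustration; the sketch assumes a tabula-py version where read_pdf returns a list of DataFrames, and that an Excel writer such as openpyxl is installed for to_excel.

```python
import pandas as pd


def combine_tables(tables):
    """Stack a list of DataFrames into one and renumber the rows from 0."""
    if not tables:
        raise ValueError("no tables were extracted from the PDF")
    # ignore_index=True avoids repeated 0, 1, 2, ... index values per table
    return pd.concat(tables, ignore_index=True)


def pdf_to_excel(pdf_path, xlsx_path):
    """Read every table on every page of a PDF and write them to one sheet."""
    # Imported here so combine_tables stays usable without tabula/Java.
    import tabula

    tables = tabula.read_pdf(pdf_path, pages="all")  # list of DataFrames
    combine_tables(tables).to_excel(xlsx_path, index=False)
```

You would then call pdf_to_excel('test.pdf', 'test.xlsx'). Note that concatenation only lines up cleanly if the tables on each page share the same columns; otherwise pandas fills the gaps with NaN.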
Here is a good article on how to handle PDFs.
Answered By - René Höhle