Issue
I have an HTML file that contains a table, and I load that table into a pandas DataFrame like this:
from bs4 import BeautifulSoup
import pandas as pd
table = BeautifulSoup(open('/home/lenovo/Downloads/F4311.html','r').read()).find('table')
# You are passing a <class 'bs4.element.Tag'> element into pandas read_html. You need to convert it to a string.
df = pd.read_html(str(table))
It worked, and I could print df too. Then I tried to list its column names:
cols_df=df.columns.tolist()
It threw an error:
AttributeError: 'list' object has no attribute 'columns'
Then I tried to export it to a CSV file:
df.to_csv("data.csv")
It threw another error:
AttributeError: 'list' object has no attribute 'to_csv'
Please help me fix these errors.
Solution
If you have a look at the documentation for pd.read_html, you will find that it returns not a DataFrame, but "[a] list of DataFrames". This explains the error:
AttributeError: 'list' object has no attribute 'columns'
That is, your actual pd.DataFrame is the first item in the list that you have called df, so you access it with df[0]. The same applies to the to_csv error: a list has no to_csv method, but df[0].to_csv("data.csv") works.
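As a minimal sketch of the fix, the snippet below parses a small inline table instead of the original F4311.html. The HTML string and the names tables and csv_text are made up for illustration, and it assumes an HTML parser that pandas can use (lxml, or bs4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical inline table standing in for the contents of F4311.html.
html = """
<table>
  <thead>
    <tr><th>name</th><th>score</th></tr>
  </thead>
  <tbody>
    <tr><td>alice</td><td>1</td></tr>
    <tr><td>bob</td><td>2</td></tr>
  </tbody>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds.
tables = pd.read_html(StringIO(html))

# Take the first (here, the only) table to get an actual DataFrame.
df = tables[0]

# These calls now work, because df is a DataFrame and not a list.
cols_df = df.columns.tolist()
csv_text = df.to_csv(index=False)  # no path given, so the CSV comes back as a string
```

With the real file, the same pattern would be df = pd.read_html(str(table))[0], followed by df.to_csv("data.csv") as in the question.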
Answered By - ouroboros1