Issue
I have the following HTML, fetched from a URL:
<h4>
\r\n \r\n\r\n
<a href="/l">
\r\n <!-- mp_trans_rt_start id="1" args="as" 1 -->\r\n <span class="brandWrapTitle">\r\n <span class="productdescriptionbrand">Mxxx</span>\r\n </span>\r\n <span class="nameWrapTitle">\r\n <span class="productdescriptionname">Axxxname</span>\r\n </span>\r\n <!-- mp_trans_rt_end 1 -->\r\n
</a>
\r\n\r\n
</h4>
And I'm trying to use Python to find an element by its class name:
import urllib.request
from bs4 import BeautifulSoup
url = "https://link"
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
urlwithagent = urllib.request.Request(url,headers={'User-Agent': user_agent})
response = urllib.request.urlopen(urlwithagent)
soup = response.read()
product = soup.find("h4", attrs ={"class=": "productdescriptionname"})
print (product)
Everything works perfectly until this line:
product = soup.find("h4", attrs ={"class=": "productdescriptionname"})
I'm getting an error like:
find() takes no keyword arguments
I have no idea how to fix it. There is a lot of information around, but nothing I tried works.
Solution
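You need to convert the response into a BeautifulSoup object before calling find. Otherwise soup is just the raw bytes returned by response.read(), so soup.find resolves to the built-in bytes.find (the bytes counterpart of str.find), which takes no keyword arguments. Note also that the attribute key should be "class", not "class=".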
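To see where the error comes from, here is a minimal sketch that reproduces it without any network access. The raw value is a made-up stand-in for what response.read() returns; the exact wording of the message can differ slightly between Python versions.

```python
# Stand-in for response.read(): urlopen(...).read() returns bytes, not a parsed document.
raw = b'<span class="productdescriptionname">Axxxname</span>'

message = None
try:
    # Same call shape as in the question, but made on bytes instead of a BeautifulSoup object.
    raw.find("h4", attrs={"class": "productdescriptionname"})
except TypeError as exc:
    # bytes.find only accepts positional arguments, so the attrs keyword is rejected.
    message = str(exc)

print(message)
```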
Example:
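# Parse the downloaded bytes into a BeautifulSoup object first.
soup = BeautifulSoup(response.read(), "html.parser")
# In the HTML above the class sits on a <span> inside the <h4>, so search for the span.
product = soup.find("span", attrs={"class": "productdescriptionname"})
print(product)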
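In the sample HTML from the question, the productdescriptionname class belongs to a span nested inside the h4, not to the h4 itself, so a lookup for an h4 with that class finds nothing. Below is a minimal offline sketch of one way to reach the span. The html string is a cleaned-up copy of the sample markup used as a stand-in for the downloaded page, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Cleaned-up copy of the sample markup from the question (assumed to match the real page).
html = """
<h4>
<a href="/l">
<span class="brandWrapTitle">
<span class="productdescriptionbrand">Mxxx</span>
</span>
<span class="nameWrapTitle">
<span class="productdescriptionname">Axxxname</span>
</span>
</a>
</h4>
"""

soup = BeautifulSoup(html, "html.parser")

# The <h4> has no class attribute, so this lookup returns None.
missing = soup.find("h4", attrs={"class": "productdescriptionname"})

# Find the heading first, then search inside it for the spans that carry the classes.
heading = soup.find("h4")
brand = heading.find("span", class_="productdescriptionbrand")
name = heading.find("span", class_="productdescriptionname")

print(missing)
print(brand.get_text(strip=True), name.get_text(strip=True))
```

On a real page, find can return None when the element is absent, so it is worth checking the result before calling get_text on it.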
Answered By - Rakesh