Issue
I would like to scrape the table with the id 'tbQuote' from the site below. However, soup.find returns None, and I cannot find the table id anywhere in the soup. I don't know which part is wrong. Thank you for your help.
from bs4 import BeautifulSoup
from urllib.request import urlopen, Request
site = "http://www.aastocks.com/en/stocks/quote/detail-quote.aspx?symbol=00002"
hdr = {'User-Agent': 'Mozilla/5.0'}
req = Request(site, headers=hdr)
res = urlopen(req)
rawpage = res.read().decode("utf-8")
soup = BeautifulSoup(rawpage, "html.parser")
table = soup.find("table", id="tbQuote")
print(table)
Solution
The page does not include the table unless a cookie is set. The code below requests the page with a Cookie header. The cookie values shown come from one browser session, so you may need to replace them with your own.
import requests
from bs4 import BeautifulSoup
site = "http://www.aastocks.com/en/stocks/quote/detail-quote.aspx?symbol=00002"
hdr = {'User-Agent': 'Mozilla/5.0',
'Cookie': 'aa_cookie=183.83.143.202_58960_1575053871; __asc=8f1faa5516eb8732c569a020f49; __auc=8f1faa5516eb8732c569a020f49; __utma=177965731.2012567291.1575052586.1575052586.1575052586.1; __utmc=177965731; __utmz=177965731.1575052586.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt_a3=1; __utma=81143559.1525929694.1575052586.1575052586.1575052586.1; __utmc=81143559; __utmz=81143559.1575052586.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt_a2=1; __utmt_b=1; CookiePolicyCheck=0; __utmb=177965731.3.10.1575052586; __utmb=81143559.6.10.1575052586; AALTP=1'}
response = requests.request('GET', url=site, headers=hdr)
soup = BeautifulSoup(response.text, features='lxml')
table = soup.find("table", id="tbQuote")
print(table)
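The long Cookie string above is hard to read and edit. As a minimal sketch, it can be split into a dict first. The helper name parse_cookie_header is hypothetical (it is not part of requests or the original answer), and the sample string is a shortened subset of the cookies shown above.

```python
def parse_cookie_header(raw):
    """Split a raw 'name=value; name=value' Cookie string into a dict.

    Hypothetical helper for illustration; not part of requests or bs4.
    """
    cookies = {}
    for part in raw.split(";"):
        part = part.strip()
        if "=" not in part:
            continue  # skip empty or malformed fragments
        # Split on the first '=' only, because values may contain '='.
        name, value = part.split("=", 1)
        cookies[name] = value  # a repeated name keeps its last value
    return cookies


# Shortened subset of the Cookie header used in the answer.
raw = ("CookiePolicyCheck=0; AALTP=1; "
       "__utmz=177965731.1575052586.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)")
cookies = parse_cookie_header(raw)
print(cookies)
```

The resulting dict can be passed to requests through its cookies parameter, for example requests.get(site, headers={'User-Agent': 'Mozilla/5.0'}, cookies=cookies). Note that the original header repeats some names (such as __utma), and a dict keeps only the last value for each name, so keep the raw header string if the duplicates turn out to matter.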
Answered By - Harish Vutukuri