Issue
I am trying to scrape a website for some data using Scrapy. I found the table using CSS, but it's returning only the thead data.
I tried XPath too, but that didn't help either. Actually, the page source Scrapy receives has no tbody tag, so the function returns null.
I am trying to scrape this website
def parse(self, response):
    table = response.css('div.iw_component div.mobile-collapse div.fund-component div#exposureTabs div.component-tabs-panel div.table-chart-container div.fund-component table#tabsSectorDataTable')
    print(table.extract())
I want to access the data inside the tbody tag of the selected table.
Solution
The data you're looking for is loaded dynamically with JavaScript, which is why Scrapy can't find it in the HTML it downloads. You can either render the page with Scrapy-Splash or parse the data yourself from the inline script that defines it:
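import json

def parse(self, response):
    # The rows are assigned to a JavaScript variable inside an inline <script>;
    # capture the array literal that follows "var tabsSectorDataTable =".
    table_json = response.xpath('//script[contains(., "var tabsSectorDataTable =")]/text()').re_first(r'var tabsSectorDataTable =(.+?\]);')
    # Parse the captured array literal as JSON into a Python list.
    table = json.loads(table_json)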
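To try the extraction step without running a spider, here is a minimal sketch using only the standard library. It assumes the script assigns a JSON-compatible array literal to tabsSectorDataTable. The sample script text, its field names and values, and the extract_table helper are all made up for illustration and are not taken from the real page.

```python
import json
import re

# Hypothetical stand-in for the text of the inline <script>; the real page's
# contents and field names may differ.
SAMPLE_SCRIPT = '''
var otherData = {"x": 1};
var tabsSectorDataTable =[{"name": "Technology", "value": "25.10"}, {"name": "Financials", "value": "14.32"}];
var somethingElse = [];
'''

# Same pattern as in the answer. re.DOTALL is an addition here so the array
# literal may span several lines.
PATTERN = re.compile(r'var tabsSectorDataTable =(.+?\]);', re.DOTALL)


def extract_table(script_text):
    """Return the parsed rows, or an empty list if the variable is missing."""
    match = PATTERN.search(script_text)
    if match is None:
        return []
    # group(1) is the array literal between "=" and the first "];"
    return json.loads(match.group(1))


rows = extract_table(SAMPLE_SCRIPT)
for row in rows:
    print(row["name"], row["value"])
```

Returning an empty list when the variable is missing avoids passing None to json.loads, which would raise a TypeError. If the real array is not valid JSON (for example, single-quoted strings or trailing commas), json.loads will raise an error and a more tolerant parser would be needed.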
Answered By - gangabass