Saturday, December 11, 2021

[FIXED] Fetch data using scrapy.FormRequest in aspx

python-3.x, scrapy, web-crawler, web-scraping

Issue

I am new to Python and Scrapy. I have a project to fetch data from ASPX web pages and store the values. There are 1000 pages like this whose values I need to fetch every day. It seems easy, but I couldn't get it to work.

The page I need to scrape is http://www.tsetmc.com/Loader.aspx?ParTree=151311&i=35366681030756042 and I looked at its source in Chrome.

The red box is updated every day. This is the page source:

<div class="aspNetHidden">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="" />
</div>
    
<div id="tabs" class="InsInfoTab"></div>
<div id="MainBox" class="box1 zFull silver">
<div id="MainContent" class="tabcontent content">

<div id="TopBox"></div>

All the data that I need is inside TopBox.

The Network tab in Chrome shows the data being transferred, but I could not find the information I want to fetch there.

I prefer to use Scrapy since I have many pages to fetch every day. I also tried scrapy.FormRequest, but I didn't get the contents of <div id="TopBox" ...> that I want; the data I get back is something else. For example, one of the values that I need is 28688000000, but the code below returns something different.

import scrapy
class SpidyQuotes(scrapy.Spider):
    name = 'spidyquotes-viewstate'
    start_urls = ['http://tsetmc.com/Loader.aspx?ParTree=151311&i=35366681030756042']
    download_delay = 1.5
    
    def parse(self, response):
        self.log('A response from %s just arrived!' % response.url)
        return scrapy.FormRequest(
                'http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=35366681030756042&c=23%20',               
            )

Solution

I didn't set up a scrapy project for this, but could you try it anyhow?

import requests

headers = {
    "referer": "http://www.tsetmc.com/Loader.aspx?ParTree=151311&i=35366681030756042",
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"
}
print(requests.get("http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=35366681030756042&c=23%20", headers=headers).text)

Output:

12:30:00,A ,22860,22500,22600,23040,23500,21890,29025,107770816,2424554373560,1,20201111,123000;;44@121810@22850@22850@254506@60,2@2400@22840@22860@4652@6,3@36529@22810@22880@629@1,;95523,610991,381092;;;;0;

Which kinda looks like your data.

EDIT:

I'm pretty sure this is the data you're after. Some parts of the table might be static, but I'm not 100% sure. However, the response depends on the time you make it. Give it a try a few times and you'll get different results.

For example:

12:29:59,A ,23620,23540,23100,22500,23620,23100,6812,32872444,773767679280,0,20201114,122959;99/8/24 13:31:11,F,1247591.37,<div class='pn'>26516.85</div> 2.17%,47421383985982830,5039085007,61918112976496,730494,P,2229923171,42233712134879,761621,P,2247810,86754663785,3726,;1980@12346584@23620@23660@434@1,11@7181@23610@23700@4000@2,28@61104@23600@23790@2000@1,;95579,611671,381092;31872444,1000000,0,20603084,12269360,2242,1,0,3575,8;;;0;
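Judging only from the two samples, the response looks like semicolon-separated sections of comma-separated fields, with some entries further split by "@". Here is a small sketch that breaks the first sample apart along those lines. The split_response helper is a made-up name, and I'm not assuming anything about what each field means. Note that the second sample has a bit of HTML inside one field, so real responses may need extra cleanup.

```python
def split_response(raw):
    """Split a raw response into sections (by ';') of non-empty fields (by ',')."""
    sections = []
    for section in raw.strip().split(";"):
        # Trailing commas produce empty strings, so those are dropped.
        fields = [field.strip() for field in section.split(",") if field.strip()]
        sections.append(fields)
    return sections


# First sample output from the answer, copied verbatim.
sample = (
    "12:30:00,A ,22860,22500,22600,23040,23500,21890,29025,107770816,"
    "2424554373560,1,20201111,123000;;44@121810@22850@22850@254506@60,"
    "2@2400@22840@22860@4652@6,3@36529@22810@22880@629@1,;"
    "95523,610991,381092;;;;0;"
)

sections = split_response(sample)

# Entries in the third section are themselves split by '@'.
at_rows = [entry.split("@") for entry in sections[2]]

print(sections[0])
print(at_rows)
```

Empty sections are kept as empty lists so that section positions stay the same between responses that have more or fewer sections filled in.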

Also, you can get the entire .csv file with historical data.

Here's how:

import requests

headers = {
    "referer": "http://www.tsetmc.com/Loader.aspx?ParTree=151311&i=35366681030756042",
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"
}

export_url = "http://tsetmc.com/tsev2/data/Export-txt.aspx?t=i&a=1&b=0&i=35366681030756042"
export = requests.get(export_url, headers=headers).text

with open("oil_data.csv", "w") as f:
    f.write(export)

And this is what you get back:
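The question mentions about 1000 pages, so the export request could be wrapped in a small helper and run over a list of instrument ids. This is only a sketch: build_export_request and download_exports are made-up names, the 1.5 second pause is borrowed from the download_delay in the question, and it uses the standard library's urllib instead of requests so it runs without extra packages. It also assumes every instrument uses the same URL pattern as the one above, which I haven't checked.

```python
import time
import urllib.request
from pathlib import Path

# URL patterns taken from the answer, with the instrument id as a placeholder.
EXPORT_URL = "http://tsetmc.com/tsev2/data/Export-txt.aspx?t=i&a=1&b=0&i={inst_id}"
PAGE_URL = "http://www.tsetmc.com/Loader.aspx?ParTree=151311&i={inst_id}"
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36"
)


def build_export_request(inst_id):
    """Build the export request for one instrument id, with the answer's headers."""
    headers = {
        "Referer": PAGE_URL.format(inst_id=inst_id),
        "User-Agent": USER_AGENT,
        "X-Requested-With": "XMLHttpRequest",
    }
    return urllib.request.Request(EXPORT_URL.format(inst_id=inst_id), headers=headers)


def download_exports(inst_ids, out_dir="exports", delay=1.5):
    """Save one file per instrument id, pausing between requests."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for inst_id in inst_ids:
        request = build_export_request(inst_id)
        with urllib.request.urlopen(request, timeout=30) as response:
            # Write the bytes unchanged so the file keeps whatever encoding the server sent.
            (out_path / f"{inst_id}.csv").write_bytes(response.read())
        time.sleep(delay)
```

Calling download_exports(["35366681030756042"]) should then write exports/35366681030756042.csv, assuming the server responds as it did for the single request.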

Answered By - baduker

This answer was collected from Stack Overflow and tested by PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
