Issue
I am trying to scrape this page: https://www.nseindia.com/live_market/dynaContent/live_watch/option_chain/optionKeys.jsp?segmentLink=17&instrument=OPTIDX&symbol=BANKNIFTY
The following code does it:
import requests
headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'DNT': '1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36',
    'Sec-Fetch-User': '?1',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-Mode': 'navigate',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9,hi;q=0.8',
}
response = requests.get('https://www1.nseindia.com/live_market/dynaContent/live_watch/option_chain/optionKeys.jsp?segmentLink=17&instrument=OPTIDX&symbol=NIFTY', headers=headers, verify=True, timeout=(5, 14))
print(response.content)
It works perfectly on my laptop, but it does not work on Google Colab, Heroku, DigitalOcean, or InMotion Hosting.
What is the catch here?
Solution
After testing in various environments, I found that NSE blocks Python requests, as well as normal access to this website, from most well-known cloud servers.
However, curl works from the DigitalOcean Bangalore and Amazon AWS Mumbai servers. If you convert that curl command into a Python requests call, it is blocked again.
So here is a Python solution that works on those cloud servers. It looks crude, but I use it as-is without having dug deeper:
import os
import subprocess

# Run from the script's directory so maxpain.txt is written next to it.
os.chdir(os.path.dirname(os.path.abspath(__file__)))
# subprocess.run waits for curl to exit, so the file is complete before it is read
# (Popen returns immediately and the read could race with the download).
subprocess.run('curl "https://www.nseindia.com/api/quote-derivative?symbol=BANKNIFTY" -H "authority: beta.nseindia.com" -H "cache-control: max-age=0" -H "dnt: 1" -H "upgrade-insecure-requests: 1" -H "user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36" -H "sec-fetch-user: ?1" -H "accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" -H "sec-fetch-site: none" -H "sec-fetch-mode: navigate" -H "accept-encoding: gzip, deflate, br" -H "accept-language: en-US,en;q=0.9,hi;q=0.8" --compressed -o maxpain.txt', shell=True, check=True)
with open("maxpain.txt", "r") as f:
    var = f.read()
print(var)
It runs the curl command, writes the output to a file, and reads the file back. That's it.
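If the intermediate file is not needed, the same idea can probably be done by capturing curl's output directly. The following is only a minimal sketch, not something tested against NSE: the helper names build_curl_command and fetch_with_curl are made up for illustration, the 20-second --max-time is an arbitrary choice, and it assumes curl is installed and on the PATH. Passing the arguments as a list also avoids shell=True and its quoting problems.

```python
import subprocess


def build_curl_command(url, headers):
    """Return the curl argument list for `url` with one -H flag per header."""
    # --compressed asks curl to decode gzip/deflate/br responses;
    # --max-time is an assumed overall timeout in seconds.
    cmd = ["curl", "--silent", "--show-error", "--compressed",
           "--max-time", "20", url]
    for name, value in headers.items():
        cmd += ["-H", f"{name}: {value}"]
    return cmd


def fetch_with_curl(url, headers):
    """Run curl, wait for it to finish, and return the response body as text."""
    result = subprocess.run(
        build_curl_command(url, headers),
        capture_output=True,  # collect stdout/stderr instead of writing a file
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError if curl exits non-zero
    )
    return result.stdout
```

Calling fetch_with_curl("https://www.nseindia.com/api/quote-derivative?symbol=BANKNIFTY", headers) with the same header values as in the curl command above should return the same text that ended up in maxpain.txt, assuming the server treats the request the same way.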
Answered By - Amit Ghosh