Issue
I want to download the ipranges.json file (which is updated weekly) from https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519
I have this Python code, which keeps running forever.
import wget
URL = "https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519"
response = wget.download(URL, "ips.json")
print(response)
How can I download the JSON file in Python?
Solution
The URL https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519
is a confirmation page that uses JavaScript to start the download automatically. wget does not run JavaScript, so you only download the page, not the file.
If you open the downloaded file, you will see the HTML source of that page instead of JSON.
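To confirm that the saved file is an HTML page and not JSON, one option is to check whether its bytes parse as JSON. This is a minimal sketch using only the standard library; `looks_like_json` is a hypothetical helper name, and the two sample payloads are made up for illustration.

```python
import json


def looks_like_json(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 and parse as JSON."""
    try:
        # "utf-8-sig" also accepts a leading byte order mark, if present
        json.loads(data.decode("utf-8-sig"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False
    return True


# Made-up samples: an HTML page fails the check, a JSON document passes
html_sample = b"<!DOCTYPE html><html><body>Download</body></html>"
json_sample = b'{"name": "example", "values": []}'
print(looks_like_json(html_sample))
print(looks_like_json(json_sample))
```

In practice you could read the saved file in binary mode, for example `open("ips.json", "rb").read()`, and pass the result to the helper.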
The file changes over time, so we have to scrape its link in a generic way instead of hard-coding it.
For convenience, I will not use wget. The two libraries used here are requests,
to request the page and download the file, and beautifulsoup,
to parse the HTML.
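# pip install requests
# pip install bs4
import requests
from bs4 import BeautifulSoup

# Request the confirmation page (this returns HTML, not the JSON file)
URL = "https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519"
page = requests.get(URL, timeout=30)
page.raise_for_status()

# Parse the HTML to get the real download link
soup = BeautifulSoup(page.content, "html.parser")
anchor = soup.find("a", {"data-bi-containername": "download retry"})
if anchor is None:
    raise RuntimeError("Download link not found; the page layout may have changed")
link = anchor["href"]

# Download the JSON file
file_download = requests.get(link, timeout=30)
file_download.raise_for_status()

# Save it as azure_ips.json
with open("azure_ips.json", "wb") as f:
    f.write(file_download.content)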
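If installing bs4 is not an option, the link lookup step above can also be sketched with the standard library's `html.parser`. This is an alternative technique, not the method used in the answer: instead of matching the `data-bi-containername` attribute, it assumes the direct link is the first `<a>` whose `href` ends with `.json`. The names `DownloadLinkParser` and `find_json_link` are hypothetical, and the sample markup and example.com URL are made up; the real page may look different.

```python
from html.parser import HTMLParser
from typing import Optional


class DownloadLinkParser(HTMLParser):
    """Remember the href of the first <a> tag whose href ends with '.json'."""

    def __init__(self) -> None:
        super().__init__()
        self.link: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Ignore non-anchor tags, and stop looking once a link is found
        if tag != "a" or self.link is not None:
            return
        href = dict(attrs).get("href")
        # Assumption: the direct download link ends with ".json"
        if href and href.lower().endswith(".json"):
            self.link = href


def find_json_link(html: str) -> Optional[str]:
    """Return the first .json link in the HTML, or None if there is none."""
    parser = DownloadLinkParser()
    parser.feed(html)
    return parser.link


# Made-up sample markup for illustration only
sample = (
    "<html><body>"
    '<a href="/en-us/download">Home</a>'
    '<a data-bi-containername="download retry" '
    'href="https://example.com/files/ips.json">click here</a>'
    "</body></html>"
)
print(find_json_link(sample))
```

With the real page, you would pass `page.text` from the requests call to `find_json_link` and handle a `None` result before downloading.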
Answered By - Tấn Nguyên