Issue
I'm trying to get the latitude and longitude of an address using a GET request, but I'm having a hard time with it.
The address is formed by the street name, a number and a sector of the city. In this case, the street name is Hernando de Magallanes, the number is 958 and the sector is Las Condes.
My code looks as follows:
import requests
from bs4 import BeautifulSoup as soup
url = "https://www.google.cl/maps/place/Hernando+de+Magallanes+958,+Las+Condes"
resp=requests.request(method="GET",url=url)
soup_parser = soup(resp.text, "html.parser")
The part I'm looking for is inside a script tag, which is under the html tag.
html_content = soup_parser.html.contents[1]
_script = html_content.find_all("script")[7]
Looking into _script, there is a huge load of text, but the part that I'm looking for is the https URL that is here:
.... uPfjZVD6AEAACAAAAADAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\",null,null,[[[1,0]\n]\n,1,null,0,0]\n]\n,null,\"Hernando de Magallanes 958, Las Condes, RegiĆ³n Metropolitana\",null,null,\"https://www.google.cl/maps/preview/place/Hernando+de+Magallanes+958,+Las+Condes,+Regi%C3%B3n+Metropolitana/@-33.4164174,-70.5598746,3330a,13.1y/data\\u003d!4m2!3m1!1s0x9662ceef18131219:0x3bab969f4e95bd4e\",1,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,1,null,null,null,null,null,null,null,0,[[[\"W2vSX-_FKMv39QOpipUw\",\"0ahUKEwiv3531h8TtAhXLe30KHSlFBQYQwlUIGCgAMAA\",[\"Hernando d ....
In particular, I'm looking for the two numbers that are next to the @: -33.4164174,-70.5598746.
How can I get these coordinates? Also, I'm thinking of doing this for a bunch of other addresses. Is there any daily quota for these requests?
Solution
Use Python's regular expression module, re, to search the script text with a pattern that matches negative decimal numbers with seven decimal places.
import requests
from bs4 import BeautifulSoup as soup
import re
url = "https://www.google.cl/maps/place/Hernando+de+Magallanes+958,+Las+Condes"
resp=requests.request(method="GET",url=url)
soup_parser = soup(resp.text, "html.parser")
html_content = soup_parser.html.contents[1]
_script = html_content.find_all("script")[7]
# Find negative decimals with exactly seven fractional digits (here, the latitude and longitude)
matches=re.findall(r"(-\d+\.\d{7})",_script.text)
print(matches[0],matches[1])
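Note that the pattern above only finds negative numbers with exactly seven decimals, and it depends on the data sitting in the script at index 7. As a rough sketch of a more tolerant variant, assuming the page text still contains a URL with the coordinates right after the @, you could anchor the pattern on the @ itself. The names AT_COORDS and extract_lat_lng are made up for this example, and the sample string is a shortened form of the URL from the question.

```python
import re
from typing import Optional, Tuple

# Matches "@<lat>,<lng>" where each number may or may not have a minus sign,
# e.g. "@-33.4164174,-70.5598746". (Hypothetical helper pattern.)
AT_COORDS = re.compile(r"@(-?\d+\.\d+),(-?\d+\.\d+)")


def extract_lat_lng(text: str) -> Optional[Tuple[float, float]]:
    """Return (lat, lng) from the first '@lat,lng' found in text, or None."""
    match = AT_COORDS.search(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Shortened version of the URL shown in the question.
sample = (
    "https://www.google.cl/maps/preview/place/"
    "Hernando+de+Magallanes+958,+Las+Condes/"
    "@-33.4164174,-70.5598746,3330a,13.1y/data"
)
print(extract_lat_lng(sample))
```

With the code from the answer, you would pass resp.text (or _script.text) instead of sample. The function returns None when nothing matches, so you can detect pages where the layout differs instead of hitting an IndexError.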
Answered By - KunduK