Issue
Here is a link to the website: https://www.ohorse.com/stables/
I want to extract the address from each listing div on the page.
I wrote the following code:
from requests import get
from bs4 import BeautifulSoup
from bs4 import NavigableString, Tag

url = 'https://www.ohorse.com/stables/'
resp = get(url)
soup = BeautifulSoup(resp.text, 'lxml')

all_divs = soup.findAll('div', class_ = 'contentright')
for div in all_divs:
    # print(div.find('a', class_ = 'listing').get('href'))
    sub_divs = div.findAll('div', class_ = 'listing_content')
    for s_div in sub_divs:
        add = list(s_div.children)[0]
        add2 = list(s_div.children)[2]
        print(add)
        print(add2)
The output is not what I want. On the very first line I get an image tag, because in the first div a Facebook link is given instead of the address, so the address is not returned there. I also get other tags mixed in with the text. How can I apply a condition so that, if an item in the list is a tag, I can skip it? I just want a way to extract the address of every listing in a standard form.
Solution
import requests
from bs4 import BeautifulSoup

def main(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
    # Select every listing block directly instead of walking the parent divs.
    target = soup.findAll('div', class_='listing_content')
    for tar in target:
        # .strings yields only the text nodes (no tags), including text inside
        # nested elements such as the Facebook link; keep the first four of them.
        tar = list(tar.strings)[:4]
        # dict.fromkeys drops duplicate lines while preserving their order.
        print(list(dict.fromkeys(tar)))

main("https://www.ohorse.com/stables/")
Output:
["Visit Houghton College Equestrian Center's Facebook Page", '9823 School Farm Rd', 'Houghton, NY 14744', '(585) 567-8142']
['HC80 Box 16', 'Burwell, NE 68823', '(308) 346-5530']
['1500 Kings Gap Road', 'Pine Mountain, GA 31811', '(229) 886-1709']
['6280 Taylor Ranch Loop', 'Kaufman, TX 75142', '(972) 467-4053']
['28424 Hegar Rd', 'Hockley, TX 77447', '(281) 702-2048']
['28424 Hegar Rd', 'Hockley, TX 77447', '(936) 931-1188']
['1409 US Hwy 59', 'Garvin, MN 56132', '(507) 629-4401']
['1911 De La Vina St', 'Santa Barbara, CA 93101', '(805) 448-4896']
['Shawnee, KS 66216', '(913) 963-8212', '[email protected]']
['11127 Orcas Ave', 'Lake View Terrace, CA 91342', '(818) 899-9221']
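The question also asks how to skip children that are tags (such as the Facebook link) when walking div.children. A minimal sketch of that check, using bs4's Tag class with isinstance, is below; joining the surviving text nodes with commas is just one way to put the addresses into a standard form, and the exact lines kept depend on the page markup:

import requests
from bs4 import BeautifulSoup, Tag

def addresses(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
    for listing in soup.findAll('div', class_='listing_content'):
        parts = []
        for child in listing.children:
            # Skip nested tags (e.g. the <a>/<img> Facebook link); keep text nodes only.
            if isinstance(child, Tag):
                continue
            text = child.strip()
            if text:
                parts.append(text)
        print(', '.join(parts))

addresses('https://www.ohorse.com/stables/')

Because the check looks only at direct children, any text that lives inside nested tags (like the Facebook link text) is dropped automatically.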
Answered By - αԋɱҽԃ αмєяιcαη