Issue
I've run into similar problems before, but never with Google. When you search for something on Google, all of the result links and titles are placed in h3 tags. However, when I try to parse the results with Beautiful Soup, none of the h3 tags appear, and many other tags seem to be missing as well. I don't think this is a JavaScript issue. Is there anything I'm missing?
link = "http://google.com/search?q=" + input
soup = BeautifulSoup(link, "lxml")
for item in soup.find_all("h3"):
    print(item)
Solution
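Use requests to download the page. In your code, BeautifulSoup receives the URL string itself rather than the page's HTML, so there are no h3 tags for it to find. Then narrow down your search to print only the titles: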
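from bs4 import BeautifulSoup
import requests
# Build the search URL from user input
link = "http://google.com/search?q=" + input("Enter search")
# Send a browser-like User-Agent header with the request
headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/32.0'}
r = requests.get(link, headers=headers)
# Parse the downloaded HTML (r.text), not the URL string
soup = BeautifulSoup(r.text, 'html5lib')
# Result titles were h3 tags with these classes when this was written; the class names may change
headings = soup.find_all('h3', class_='LC20lb DKV0Md')
for heading in headings:
    print(heading.text)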
Output:
Enter search>? beautifulsoup
Beautiful Soup Documentation — Beautiful Soup 4.9.0 ...
Beautiful Soup: We called him Tortoise because he taught us.
Beautiful Soup documentation - Crummy
beautifulsoup4 · PyPI
Intro to Beautiful Soup | Programming Historian
Beautiful Soup (HTML parser) - Wikipedia
Implementing Web Scraping in Python with BeautifulSoup ...
Tutorial: Web Scraping with Python Using Beautiful Soup
Beautiful Soup - Quick Guide - Tutorialspoint
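To see why the original snippet found nothing, it may help to compare parsing a URL string with parsing actual markup. The sketch below runs offline. The sample_html string is made up for illustration and is not Google's real markup, and the built-in html.parser is used here only to avoid an extra dependency. Recent versions of Beautiful Soup may also print a warning that the first input looks like a URL rather than markup.

```python
from bs4 import BeautifulSoup

# The URL is just text: BeautifulSoup parses the string itself,
# not the page that the URL points to.
url = "http://google.com/search?q=beautifulsoup"
soup_from_url = BeautifulSoup(url, "html.parser")
url_headings = soup_from_url.find_all("h3")

# Hypothetical markup standing in for a downloaded results page.
sample_html = """
<div><a href="https://example.com/a"><h3 class="title">First result</h3></a></div>
<div><a href="https://example.com/b"><h3 class="title">Second result</h3></a></div>
"""
soup_from_html = BeautifulSoup(sample_html, "html.parser")
html_headings = [h3.get_text(strip=True) for h3 in soup_from_html.find_all("h3")]

print(url_headings)   # no h3 tags in a bare URL string
print(html_headings)  # titles extracted from real markup
```

The first list is empty because the URL string contains no tags at all, which matches the behaviour described in the question. Passing r.text from requests gives Beautiful Soup real HTML to work with.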
Answered By - Sushil