Issue
How can I extract the meta description of any webpage? I have used the script below to get the meta information of the webpage.
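import requests
from bs4 import BeautifulSoup

url = 'https://www.dataquest.io/'
response = requests.get(url)
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(response.text, 'html.parser')
# Collect every <meta> tag on the page
metas = soup.find_all('meta')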
The result of the script is:
[<meta charset="utf-8"/>,
<meta content="width=device-width, initial-scale=1" name="viewport"/>,
<meta content="Learn Python, R, and SQL skills. Follow career paths to become a job-qualified data scientist, analyst, or engineer with interactive data science courses!" name="description"/>,
<meta content="index, follow" name="robots"/>,
<meta content="index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1" name="googlebot"/>,
<meta content="index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1" name="bingbot"/>,
<meta content="en_US" property="og:locale"/>]
Now I want to pull the content of the meta tag where name="description",
i.e., the third element in this case.
Please suggest an approach.
Solution
You can use a Python list comprehension for this:
[m.get('content') for m in metas if m.get('name') == 'description']
This returns a list of the matching content strings, so it is empty if the page has no meta description.
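If only a single description is expected, looking the tag up directly may be simpler than filtering the whole list. The following is a minimal sketch, not part of the original answer: the helper name get_meta_description and the inline HTML sample are made up for illustration, and the sample stands in for response.text so the snippet runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; in the question this would be response.text.
html = """<html><head>
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
<meta content="Learn Python, R, and SQL skills." name="description"/>
</head><body></body></html>"""


def get_meta_description(html_text):
    """Return the content of <meta name="description">, or None if absent."""
    soup = BeautifulSoup(html_text, "html.parser")
    # attrs= is required here: find()'s own `name` argument means the tag name.
    tag = soup.find("meta", attrs={"name": "description"})
    return tag.get("content") if tag is not None else None


description = get_meta_description(html)
print(description)
```

Returning None when the tag is missing lets the caller tell "no description" apart from an empty one, which the list version signals with an empty list instead.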
Answered By - Catch22