Issue
I'm new to python and I'm wanting to get into web scraping on YouTube. I'm wanting to use this link to get the newest videos uploaded: 'https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D' and I want to scrape the new 5 videos. How can I do this? I've used this piece of code to test it (I only want the links) from this question
from bs4 import BeautifulSoup
import requests
url="https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D"
html = requests.get(url)
soup = BeautifulSoup(html.text, features="html.parser")
for entry in soup.find_all("entry"):
for link in entry.find_all("link"):
print(link["href"])
Edit: I don't get any response from the python terminal. It's not scraping anything. It only has the default ">>>".
Solution
You cannot scrape YouTube without using Googles YouTube API key which you can get by doing these steps. I can repost a legit answer to your question if you're still down to try.
In the mean time, try practicing your parsing with beautifulsoup on this website videvo.net
Here's some code to help you get started
def get_source(url):
return BeautifulSoup(requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, verify=False).text, 'html.parser')
soup = get_source('http://videvo.net')
for tags in soup.find_all('a'):
print(tags['href'])
EDIT I stand corrected (slightly). Youtube's main url cannot be parsed. You can try this code
def get_source(url):
return BeautifulSoup(requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, verify=False).text, 'html.parser')
soup = get_source('https://www.youtube.com/feeds/videos.xml?user=kinagrannis')
for entry in soup.find_all("entry"):
for title in entry.find_all("title"):
print(title.text)
for link in entry.find_all("link"):
print(link["href"])
for name in entry.find_all("name"):
print(name.text)
for pub in entry.find_all("published"):
print(pub.text)
note: you can put any user name instead of 'kinnagrannis', user=[username]
Answered By - Mainly Karma
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.