Issue
from bs4 import BeautifulSoup
import requests
req = requests.get("https://en.wikipedia.org/wiki/Harvard_University")
soup = BeautifulSoup(req.text, 'html.parser')
html_soup = soup.findAll('table', style="text-align:center; float:right; font-size:85%; margin-right:2em;")
classes = soup.findAll('tables')
How can I extract only the class names from all the tables?
Solution
You can use findAll() (spelled find_all() in current versions of Beautiful Soup) to select all the tables. Then loop through the tables and add each table's classes to a set, which avoids duplicates.
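from bs4 import BeautifulSoup
import requests

page = requests.get("https://en.wikipedia.org/wiki/Harvard_University")
soup = BeautifulSoup(page.text, 'html.parser')

# Select every <table> element on the page
tables = soup.findAll("table")

# A set keeps each class name only once
classes = set()
for t in tables:
    # Skip tables that have no class attribute
    if t.has_attr('class'):
        # t['class'] is a list, because class can hold several names
        classes.update(t['class'])

l = list(classes)
print(l)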
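As a variation that can be tried without a network request, here is a minimal sketch of the same idea run against a small inline HTML string. The SAMPLE_HTML markup and the table_classes helper are made up for illustration and are not part of the original answer. The sketch uses find_all() and Tag.get() with a default, so tables without a class attribute need no separate check.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<table class="infobox vcard"><tr><td>A</td></tr></table>
<table class="wikitable sortable"><tr><td>B</td></tr></table>
<table class="wikitable"><tr><td>C</td></tr></table>
<table><tr><td>no class</td></tr></table>
"""


def table_classes(html):
    """Return the sorted, de-duplicated class names used on <table> tags."""
    soup = BeautifulSoup(html, "html.parser")
    classes = set()
    for table in soup.find_all("table"):
        # class is a multi-valued attribute, so Beautiful Soup returns a list;
        # fall back to an empty list when the attribute is missing.
        classes.update(table.get("class", []))
    return sorted(classes)


print(table_classes(SAMPLE_HTML))
```

Sorting the result gives a stable order, since a plain set has no guaranteed ordering when converted to a list.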
Answered By - jignatius