Issue
I tried all the solutions mentioned here, but none of them works with my code. I only want to get the text from span tags that are children of h2 tags (and not h3 tags) on this Wikipedia page (https://fr.wikipedia.org/wiki/Manga). This is my code:
import numbers
import urllib.request
from bs4 import BeautifulSoup
quote_page ='https://fr.wikipedia.org/wiki/Manga#:~:text=Un%20manga%20(%E6%BC%AB%E7%94%BB)%20est%20une,quelle%20que%20soit%20son%20origine.'
page = urllib.request.urlopen(quote_page)
soup = BeautifulSoup(page, 'html.parser')
spans = soup.find_all('h2 > span.mw-heading')
# not working, results show all spans in h2 AND h3
for span in spans:
    print(span.text)
# div_span = soup.find_all('span', class_="mw-headline")
# for spans in div_span:
#     print(spans.text)  # or string?
If someone has a solution, I would be thankful ;) (The commented-out lines work, but they also return the span tags inside h3 tags :/)
Solution
You are close to your goal, but in my opinion you are mixing things up: find_all does not take CSS selectors, so you should use select when working with CSS selectors:
soup.select('h2 > span.mw-headline')
Another issue here is that the class is named mw-headline, not mw-heading.
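To see the difference without hitting the network, here is a minimal sketch on a small inline fragment. The HTML in SAMPLE_HTML and the two helper names are made up for illustration and only mimic the heading markup discussed above; the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment imitating the heading markup from the question.
SAMPLE_HTML = """
<h2><span class="mw-headline">Histoire</span></h2>
<h3><span class="mw-headline">Origines</span></h3>
<h2><span class="mw-headline">Diffusion</span></h2>
"""


def h2_headlines(html):
    """Return the text of span.mw-headline elements that are direct children of h2."""
    soup = BeautifulSoup(html, "html.parser")
    # select() understands CSS selectors, including the child combinator ">".
    return [span.get_text() for span in soup.select("h2 > span.mw-headline")]


def find_all_with_selector(html):
    """Show what happens when a CSS selector string is passed to find_all()."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all() treats a string argument as a tag name, so nothing matches here.
    return soup.find_all("h2 > span.mw-headline")


print(h2_headlines(SAMPLE_HTML))
print(find_all_with_selector(SAMPLE_HTML))
```

On this fragment the select call keeps only the two h2 headings and skips the h3 one, while the find_all call returns an empty list because no tag is literally named "h2 > span.mw-headline".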
Example
import urllib.request
from bs4 import BeautifulSoup
quote_page ='https://fr.wikipedia.org/wiki/Manga#:~:text=Un%20manga%20(%E6%BC%AB%E7%94%BB)%20est%20une,quelle%20que%20soit%20son%20origine.'
page = urllib.request.urlopen(quote_page)
soup = BeautifulSoup(page, 'html.parser')
for e in soup.select('h2 > span.mw-headline'):
    print(e.text)
Output
Étymologie
Genre et nombre du mot « manga » en français
Histoire des mangas
Caractéristiques du manga
Diffusion
Influence du manga
Produits dérivés
Notes et références
Voir aussi
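If you would rather stay with find_all, one possible alternative is to collect every span.mw-headline and keep only those whose parent tag is an h2. This is just a sketch on a made-up fragment (SAMPLE_HTML and the function name are hypothetical), not something tested against the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment imitating the heading markup from the question.
SAMPLE_HTML = """
<h2><span class="mw-headline">Histoire</span></h2>
<h3><span class="mw-headline">Origines</span></h3>
<h2><span class="mw-headline">Diffusion</span></h2>
"""


def h2_headlines_via_find_all(html):
    """Return headline texts whose direct parent is an h2, using find_all()."""
    soup = BeautifulSoup(html, "html.parser")
    spans = soup.find_all("span", class_="mw-headline")
    # Keep a span only when its immediate parent element is an h2 tag.
    return [
        span.get_text()
        for span in spans
        if span.parent is not None and span.parent.name == "h2"
    ]


print(h2_headlines_via_find_all(SAMPLE_HTML))
```

The select version is shorter, but this form can be handy when the filter needs logic that a CSS selector cannot express.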
Answered By - HedgeHog