Issue
I am trying to get the number of chapters in this manga using BeautifulSoup, but the way the markup is nested makes it confusing:
[The Website](https://www.anime-planet.com/manga/the-beginning-after-the-end)
[The Section](https://gyazo.com/c45fef82b0ce52dacd99d213538ab570)
I only want the chapter number and not the content of the other divs. Currently I have (not the full code):
chp = []
temp = soup.select('section.pure-g entryBar > div.pure-1 md-1-5')
for txt in temp:
    if "Ch" in txt.text:
        chp.append(txt.text)
How would I access the text within the first div?
Solution
Looking at the structure of the HTML, you can extract the text from the first <div> under class="entryBar":
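import requests
from bs4 import BeautifulSoup

url = "https://www.anime-planet.com/manga/the-beginning-after-the-end"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# First <div> directly under the element with class "entryBar":
# keep the last whitespace-separated token and strip any "+" around it.
ch = soup.select_one(".entryBar > div").text.split()[-1].strip("+")
print(ch)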
Prints:
159
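As a side note on why the selector in the question matched nothing: a space inside a CSS selector means "descendant", so `section.pure-g entryBar` looks for an `<entryBar>` tag rather than a second class. Classes on the same element are chained with dots. The sketch below illustrates this offline; the HTML is a hypothetical fragment built from the class names in the question (the text in each div is placeholder data), so the real page may well differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the class names in the question.
# The div contents are placeholders, not data taken from the real page.
SAMPLE_HTML = """
<section class="pure-g entryBar">
  <div class="pure-1 md-1-5">Vol: 9+; Ch: 159+</div>
  <div class="pure-1 md-1-5">placeholder A</div>
  <div class="pure-1 md-1-5">placeholder B</div>
</section>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Spaces are descendant combinators, so this searches for <entryBar> and
# <md-1-5> tags, which do not exist in the fragment.
broken = soup.select("section.pure-g entryBar > div.pure-1 md-1-5")

# Dot-chained classes match elements that carry every listed class.
fixed = soup.select("section.pure-g.entryBar > div.pure-1.md-1-5")

# Take the first div, keep the last whitespace-separated token, drop the "+".
chapters = fixed[0].get_text(strip=True).split()[-1].strip("+")

print(len(broken), len(fixed), chapters)  # prints: 0 3 159
```

With the classes chained, the loop from the question would also work, although `select_one` on the first div is shorter when only the chapter count is needed.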
Answered By - Andrej Kesely