Issue
I'm trying to scrape some data with beautifulsoup on python (url:http://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html
)
When the data is first occurrence, no problem like
titlebook = soup.find("h1")
titlebook = titlebook.text
but i want to scrape different values, further in page, like upc, price incl.tax, etc
Upc value is first and i have it running universal_product_code= soup.find("tr").find("td").text
I tried so many solutions to access the other ones (i've read beautifulsoup documentation and tried lot of things but it didn't really help me) So my question is, how to access specific values in a tree where tags are same? I joined a screenshot of the tree to help you understand what i'm talking about
Thank you for your help
Solution
For example, if you want to find price (excluding tax), you can use string=
parameter in .find
and then search for text in next <td>
:
import requests
from bs4 import BeautifulSoup
url = "http://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
# get "Price (excl. tax)" from the table:
key = "Price (excl. tax)"
print("{}: {}".format(key, soup.find("th", string=key).find_next("td").text))
Prints:
Price (excl. tax): £53.74
Or: Use CSS selector:
print(soup.select_one('th:-soup-contains("Price (excl. tax)") + td').text)
Answered By - Andrej Kesely
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.