Issue
I am parsing HTML with Beautiful Soup. I ran the command
soup.find_all("span", {"class": "budget-list__data__number budget-list__number show-for-medium"})
and obtained:
[<span class="budget-list__data__number budget-list__number show-for-medium">
4 000 €
<span class="project-votes display-inline-block">24 <span class="text-uppercase text-small">votes</span></span>
</span>, <span class="budget-list__data__number budget-list__number show-for-medium">
25 000 €
<span class="project-votes display-inline-block">24 <span class="text-uppercase text-small">votes</span></span>
</span>, <span class="budget-list__data__number budget-list__number show-for-medium">
14 000 €
<span class="project-votes display-inline-block">23 <span class="text-uppercase text-small">votes</span></span>
</span>, <span class="budget-list__data__number budget-list__number show-for-medium">
35 000 €
.
.
.
I am interested in keeping only the monetary amounts (e.g. 4 000 €) and ignoring the markup nested inside <span class="project-votes display-inline-block">. I thought about using span.clear(), but that does not do the trick. Do you have any suggestions?
Solution
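The amount is the first child of each outer span: a text node that comes before the nested votes span. Take span.contents[0] and strip the surrounding whitespace:
spans = soup.find_all(
    "span",
    {"class": "budget-list__data__number budget-list__number show-for-medium"},
)
for span in spans:
    # contents[0] is the text node holding the amount
    print(span.contents[0].strip())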
Prints:
4 000 €
25 000 €
14 000 €
35 000 €
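A variation on the same idea, in case the amount is not always the very first child: the sketch below collects only the text nodes that are direct children of each outer span, so anything inside the nested votes span is skipped. The html sample is a shortened copy of the output in the question, and direct_text is a hypothetical helper name introduced here for illustration.

```python
from bs4 import BeautifulSoup

# Shortened sample based on the markup shown in the question (assumption)
html = """
<span class="budget-list__data__number budget-list__number show-for-medium">
4 000 €
<span class="project-votes display-inline-block">24 <span class="text-uppercase text-small">votes</span></span>
</span>
<span class="budget-list__data__number budget-list__number show-for-medium">
25 000 €
<span class="project-votes display-inline-block">24 <span class="text-uppercase text-small">votes</span></span>
</span>
"""

soup = BeautifulSoup(html, "html.parser")


def direct_text(tag):
    """Hypothetical helper: join only the text nodes directly under `tag`."""
    # recursive=False limits the search to direct children, so text inside
    # the nested "project-votes" span is not included
    return "".join(tag.find_all(string=True, recursive=False)).strip()


# The CSS class selector matches only the outer spans, not the nested ones
amounts = [direct_text(s) for s in soup.select("span.budget-list__data__number")]
print(amounts)
```

Unlike span.clear(), which empties the tag entirely (amount included), this leaves the parsed tree untouched and only reads from it.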
Answered By - Andrej Kesely