Issue
I'm practicing my scraping and I came across something like this:
<div class="profileDetail">
<div class="profileLabel">Mobile : </div>
021 427 399
</div>
I need the number that sits outside the inner <div> tag.
My code is:
num = soup.find("div",{"class":"profileLabel"}).text
but the output of that is only "Mobile : ", which is the text inside the <div> tag, not the text outside of it.
So how do we extract the text outside of the <div> tag?
Solution
I would make a reusable function to get the value by label: find the label <div> by its text,
then take its next sibling, which is the text node that follows the closing </div>:
import re

def find_by_label(soup, label):
    # Find the <div> whose text matches the label, then return the node right after it.
    return soup.find("div", text=re.compile(label)).next_sibling
Usage:
find_by_label(soup, "Mobile").strip()  # returns "021 427 399"
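For anyone who wants to try this end to end, here is a minimal self-contained sketch built on the HTML from the question. A few things in it are assumptions added here rather than part of the answer above: it uses the built-in "html.parser", it passes string= (the newer name for the text= argument in recent BeautifulSoup 4 releases), it escapes the label with re.escape, and it returns None instead of raising when the label is missing or is not followed by plain text.

```python
import re

from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Sample markup copied from the question.
HTML = """
<div class="profileDetail">
<div class="profileLabel">Mobile : </div>
021 427 399
</div>
"""


def find_by_label(soup, label):
    """Return the stripped text that follows the label <div>, or None."""
    # Match a <div> whose only content is a string containing the label.
    # re.escape is an added safeguard so labels like "Tel. (home)" match literally.
    label_div = soup.find("div", string=re.compile(re.escape(label)))
    if label_div is None:
        return None  # assumption: a missing label yields None rather than an error

    sibling = label_div.next_sibling
    # The value is expected to be a bare text node right after the label <div>.
    if not isinstance(sibling, NavigableString):
        return None
    return str(sibling).strip()


soup = BeautifulSoup(HTML, "html.parser")
mobile = find_by_label(soup, "Mobile")
print(mobile)
```

The outer profileDetail <div> is not matched by the search because it has several children, so its .string is None; only the inner label <div> qualifies.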
Answered By - alecxe