Issue
html_text = driver.page_source
soup = BeautifulSoup(html_text, "html.parser")
get_details = soup.find_all('li', attrs={"class":"news"})
# get_details is the ResultSet returned by BeautifulSoup's find_all() method
One element of the result set looks like this:
<li class="news">blah blah blah what i want blah blah blah <a href="/graphic/graphicInfoData/000002230030421305">View details</a></li>
What I want is "blah blah blah what i want blah blah blah", the so-called NavigableString in BeautifulSoup. I cannot use the .string attribute on a list, but even print(get_details[0].string) prints None. Why?
By the way, for comparison, the code below works:
print(get_details[0].a.string)
>>> print(get_details[0].li.string)
Traceback (most recent call last):
  File "<pyshell#57>", line 1, in <module>
    print(get_details[0].li.string)
AttributeError: 'NoneType' object has no attribute 'string'
Any thoughts will be highly appreciated!
Solution
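Use .get_text() instead of .string. The .string attribute is None here because the li has more than one child (a text node and the a tag); it only returns a value when a tag has a single child: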
print(get_details[0].a.get_text())
Output: View details
print(get_details[0].get_text())
Output: blah blah blah what i want blah blah blah View details
Be aware that get_details[0].get_text() returns all the text inside the li, including the link text.
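Here is a minimal, self-contained sketch of the difference between .string and .get_text(). It assumes the sample li from the question as a literal string in place of driver.page_source, so it runs without Selenium; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; it stands in for driver.page_source.
html_text = (
    '<li class="news">blah blah blah what i want blah blah blah '
    '<a href="/graphic/graphicInfoData/000002230030421305">View details</a></li>'
)

soup = BeautifulSoup(html_text, "html.parser")
item = soup.find_all("li", attrs={"class": "news"})[0]

# The <li> has two children (a text node and an <a> tag), so .string is None.
li_string = item.string
# The <a> has exactly one child, a text node, so .string returns it.
a_string = item.a.string
# item is already the <li>; it contains no nested <li>, so .li is None.
nested_li = item.li
# get_text() concatenates every text node below the tag.
li_text = item.get_text()

print(li_string)
print(a_string)
print(nested_li)
print(li_text)
```

The None returned by item.li is also what causes the AttributeError in the question: calling .string on None fails.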
The following gets only the first part:
get_details[0].contents[0].strip()
Output: blah blah blah what i want blah blah blah
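Note that contents[0] relies on the wanted text being the first child of the li. If the order of children might vary, one possible alternative (a sketch, not part of the original answer) is to keep only the text nodes that are direct children of the tag. The sample markup is again assumed as a literal string.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup copied from the question; it stands in for driver.page_source.
html_text = (
    '<li class="news">blah blah blah what i want blah blah blah '
    '<a href="/graphic/graphicInfoData/000002230030421305">View details</a></li>'
)

item = BeautifulSoup(html_text, "html.parser").li

# Join the text nodes that are direct children of <li>, skipping nested
# tags such as <a>, then trim the surrounding whitespace.
direct_text = "".join(
    child for child in item.children if isinstance(child, NavigableString)
).strip()

print(direct_text)
```

HTML comments are also NavigableString subclasses in BeautifulSoup, so markup that contains comments directly inside the li would need an extra filter.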
Answered By - HedgeHog