Issue
Given the following HTML:
html = '''
<td>the keyword is present in the <a href='text' title='text'>text</a> </td>
<td>word key is not present</td>
<td>no keyword here</td>'''
I want to find the <td> elements whose text contains the word "keyword".
So in this example, I want to find
<td>the keyword is present in the <a href='text' title='text'>text</a> </td>
<td>no keyword here</td>
So I tried:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'lxml')
ans = soup.find_all('td', text=lambda l: l and 'keyword' in l)
print(ans)
# [<td>no keyword here</td>]
But this doesn't return the other line that has "keyword" in it. How do I go about it?
Solution
Try this:
from bs4 import BeautifulSoup
html = '''
<td>the keyword is present in the <a href='text' title='text'>text</a> </td>
<td>word key is not present</td>
<td>no keyword here</td>'''
soup = BeautifulSoup(html, 'html.parser')
print(*[td for td in soup.find_all("td") if 'keyword' in td.text], sep='\n')
Output:
<td>the keyword is present in the <a href="text" title="text">text</a> </td>
<td>no keyword here</td>
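A likely reason the attempt in the question misses the first cell: as far as I understand it, the text= filter is checked against each tag's .string, and .string is None when a tag has more than one child. The first <td> holds plain text plus an <a> tag, so the lambda receives None and the `l and ...` check rejects it. A minimal sketch to see this, reusing the same html (the variable name `strings` is only for illustration):

```python
from bs4 import BeautifulSoup

html = '''
<td>the keyword is present in the <a href='text' title='text'>text</a> </td>
<td>word key is not present</td>
<td>no keyword here</td>'''

soup = BeautifulSoup(html, 'html.parser')

# .string is None for a tag with several children (text + <a> + text),
# and the tag's single text node otherwise.
strings = [td.string for td in soup.find_all('td')]
print(strings)
```

The first entry is None, which is why filtering on td.text (the concatenated text of all descendants) works where the text= filter does not.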
To print only the text of each matching <td>, use td.text, like below:
print(*[td.text for td in soup.find_all("td") if 'keyword' in td.text], sep='\n')
Output:
the keyword is present in the text
no keyword here
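As an alternative to the list comprehension, find_all also accepts a function that takes a tag and returns True or False, so the check can be done inside the search itself. This is only a sketch of that variant, not part of the original answer; the helper name `td_with_keyword` is made up for the example:

```python
from bs4 import BeautifulSoup

html = '''
<td>the keyword is present in the <a href='text' title='text'>text</a> </td>
<td>word key is not present</td>
<td>no keyword here</td>'''

soup = BeautifulSoup(html, 'html.parser')

def td_with_keyword(tag):
    # Hypothetical helper: keep <td> tags whose full text,
    # including text inside nested tags, contains "keyword".
    return tag.name == 'td' and 'keyword' in tag.get_text()

matches = soup.find_all(td_with_keyword)
print(*matches, sep='\n')
```

This should give the same two <td> elements as the list comprehension above.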
Answered By - user1740577