Issue
Below is an HTML sample. I am using BeautifulSoup to extract the text from it.
txt = """[<dd class="qs" id="qsff"><br/>Pretty women wonder where my secret lies. <br/>I'm not cute or built to suit a fashion model's size<br/>But when I start to tell them,<br/>They think I'm telling lies.<br/><br/>I say,<br/>It's in the reach of my arms<br/>The span of my hips,<br/>The stride of my step,<br/>The curl of my lips.<br/><br/></dd>]"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(txt, "lxml")
for node in soup:
    print(node.text)
# [Pretty women wonder where my secret lies. I'm not cute or built to suit a fashion model's sizeBut when I start to tell them,They think I'm telling lies.I say,It's in the reach of my armsThe span of my hips,The stride of my step,The curl of my lips.]
This prints the whole text as one string, as shown above, but I want the output line by line, like:
Pretty women wonder where my secret lies.
I'm not cute or built to suit a fashion model's size
But when I start to tell them,
....
I tried the following, but it doesn't work:
for node in soup.find_all('br'):
    print(node.text)
What's the right way to output them line by line?
Solution
Iterate over strings, not nodes:
for node in soup.dd.strings:
    print(node)
#Pretty women wonder where my secret lies.
#I'm not cute or built to suit a fashion model's size
#But when I start to tell them,
#....
And why do you enclose your text in square brackets?
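As a related sketch (not part of the original answer): the br tags are empty elements, so their .text is an empty string, which is why looping over find_all('br') prints nothing useful. The text lives in the strings between them. If the fragments carry stray whitespace, BeautifulSoup's stripped_strings generator and get_text() with a separator are two other ways to get one line per fragment. The example below uses a shortened copy of the sample without the square brackets, and assumes the standard-library "html.parser" so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample HTML from the question, without the square brackets.
html = (
    '<dd class="qs" id="qsff"><br/>Pretty women wonder where my secret lies. '
    "<br/>I'm not cute or built to suit a fashion model's size"
    "<br/>But when I start to tell them,"
    "<br/>They think I'm telling lies.<br/><br/></dd>"
)

# Assumption: "html.parser" (standard library) is used here instead of "lxml".
soup = BeautifulSoup(html, "html.parser")

# stripped_strings yields each text fragment with surrounding whitespace removed
# and skips fragments that are empty after stripping.
lines = list(soup.dd.stripped_strings)
for line in lines:
    print(line)

# get_text() with a separator joins the same fragments into a single string.
text = soup.dd.get_text(separator="\n", strip=True)
print(text)
```

Both variants print the same four lines. stripped_strings is handy when each line needs further processing, while get_text() is shorter when a single newline-joined string is enough.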
Answered By - DYZ