Issue
I want to extract the text content from the HTML tag below, but the <sup> tag is preventing me from getting the desired text.
The text I want to extract is simply (4:6, 6:7). How can I extract this text while skipping the <sup> tag?
I tried "//p/text()", but I am only getting the part before the <sup> tag: (4:6, 6
My HTML:
<p class="result"><span class="bold">Final result </span><strong>0:2</strong> (4:6, 6<sup>5</sup>:7)</p>
Solution
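The text you want is the only text that sits directly inside p; everything else is inside a child tag (span, strong, sup). The <sup> splits that direct text into two text nodes, " (4:6, 6" and ":7)", so //p/text() matches both, and reading only the first match gives just the part before <sup>. Select all of them and join: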
scrapy shell file:///path/to/file.html
In [1]: ''.join(response.xpath('//p[@class="result"]/text()').getall())
Out[1]: ' (4:6, 6:7)'
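To see why the text comes back in two pieces, here is a minimal sketch that does not use Scrapy at all. It swaps in Python's standard-library xml.etree.ElementTree, which works here only because the fragment from the question happens to be well-formed XML. The variable names (html, pieces, direct_text) are illustrative. It is not a replacement for the XPath above, just a way to inspect the same structure.

```python
import xml.etree.ElementTree as ET

# The fragment from the question (assumed to be well-formed, as it is here).
html = ('<p class="result"><span class="bold">Final result </span>'
        '<strong>0:2</strong> (4:6, 6<sup>5</sup>:7)</p>')

p = ET.fromstring(html)

# Direct text of <p> is the text before its first child (p.text) plus the
# "tail" text that follows each child element. Text inside the children
# themselves (span, strong, sup) is deliberately left out.
pieces = [p.text or ''] + [child.tail or '' for child in p]
direct_text = ''.join(pieces)

print(repr(direct_text))    # ' (4:6, 6:7)'
print(direct_text.strip())  # (4:6, 6:7)
```

The leading space is part of the original markup, so call .strip() on the joined string if you do not want it. The same applies to the Scrapy result above.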
Answered By - SuperUser