Issue
I'm running into a problem while scraping a website with Scrapy.
I'm trying to get all the text inside a <p> tag, but in some cases the tag contains not just text but also an <a> tag, and my code stops collecting the text when it reaches that tag.
This is my XPath expression. It works properly when there are no tags nested inside the <p>:
description = descriptionpath.xpath("span[@itemprop='description']/p/text()").extract()
Solution
Posting Pawel Miech's comment as an answer, since it has helped many of us and contains the right answer:
Tack //text() on the end of the XPath to specify that text should be extracted recursively. A single /text() selects only the text nodes that are direct children of <p>, whereas //text() also selects the text nodes inside nested tags such as <a>.
So your XPath would look like this:
span[@itemprop='description']/p//text()
Answered By - Stunner