Issue
I'm trying to fetch some data from a page using Scrapy. Let's say this is the HTML:
<div class=example id=example>
<p>Some text</p>
<ul>
<li>A list 1</li>
<li>A list 2</li>
<li>A list 3</li>
</ul>
<p>text again</p>
</div>
I'm selecting this data by selecting the whole element by its id and then extracting the data by tag, like this:
response.xpath('//*[@id="example"]/p').getall()
The result is:
<p>Some text</p>
<p>text again</p>
But I can't get the list. I would like to get this:
<p>Some text</p>
<ul>
<li>A list 1</li>
<li>A list 2</li>
<li>A list 3</li>
</ul>
<p>text again</p>
Any suggestions on how I should get all the elements and data inside this div?
Solution
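Your XPath ends in /p, so it matches only the <p> children of the div and skips the <ul>. To match every child element regardless of tag, end the expression with /* instead. The code: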
lst = response.xpath('//div[@id="example"]/*').getall()
will return what you want:
lst = ['<p>Some text</p>', '<ul>\r\n<li>A list 1</li>\r\n<li>A list 2</li>\r\n<li>A list 3</li>\r\n</ul>', '<p>text again</p>']
Let's print the items in order:
for i in lst:
    print(i)
Output:
<p>Some text</p>
<ul>
<li>A list 1</li>
<li>A list 2</li>
<li>A list 3</li>
</ul>
<p>text again</p>
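To compare the two expressions without running a spider, here is a minimal sketch. It is not Scrapy code: it assumes the standard library's xml.etree.ElementTree as a stand-in for response.xpath, which supports only a subset of XPath and needs well-formed markup, so the attributes are quoted and the snippet is wrapped in a hypothetical <html><body>. The helper outer_html is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The sample markup from the question, with quoted attributes so that it
# parses as XML (an assumption; the original HTML leaves them unquoted).
html = (
    '<html><body>'
    '<div class="example" id="example">\n'
    '<p>Some text</p>\n'
    '<ul>\n'
    '<li>A list 1</li>\n'
    '<li>A list 2</li>\n'
    '<li>A list 3</li>\n'
    '</ul>\n'
    '<p>text again</p>\n'
    '</div>'
    '</body></html>'
)

root = ET.fromstring(html)


def outer_html(elements):
    """Serialize each element, dropping the whitespace that follows it."""
    return [ET.tostring(el, encoding="unicode").strip() for el in elements]


# Ending the path in /p keeps only the <p> children of the div.
only_paragraphs = outer_html(root.findall(".//*[@id='example']/p"))

# Ending the path in /* keeps every child element, including the <ul>.
all_children = outer_html(root.findall(".//div[@id='example']/*"))

for item in all_children:
    print(item)
```

In a real spider the same two XPath strings go straight into response.xpath(...).getall(), as shown in the answer above.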
Answered By - SuperUser