Issue
Task
Writing a crawler that writes Title, Artikelnummer (item number), Price, and Delivery Status to a .csv file for this page:
https://www.karton.eu/einwellig-ab-100-mm
Problem
It's really hard to figure out which HTML tag on that page contains the information I need.
For example: <small>Artikelnummer: 001</small>
How do I collect just the 001?
There are several more tags where I don't clearly understand how to get the information out.
Solution
First, select the node that contains the text you want:
response.xpath('//div[@class="delivery-status"]/small/text()')
Now, to capture only part of the returned text, you can use a regular expression. Fortunately, Scrapy selectors have built-in regex support, so you can use it like this:
response.xpath('//div[@class="delivery-status"]/small/text()').re_first(r'\d+')
or, for a list of all matches:
response.xpath('//div[@class="delivery-status"]/small/text()').re(r'\d+')
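To see what the pattern itself pulls out, independent of Scrapy, here is a minimal sketch using only Python's standard re module. The sample string is the text of the <small> tag from the question; on the real page it would come from the text node selected by the XPath above. The correspondence to .re_first() and .re() is approximate and only meant as an illustration, and the variable names are made up for this example.

```python
import re

# Sample text from the question; assumed to be what the XPath text node returns.
small_text = "Artikelnummer: 001"

# Roughly what .re_first(r'\d+') gives: the first match, or None if there is none.
match = re.search(r"\d+", small_text)
first = match.group(0) if match else None

# Roughly what .re(r'\d+') gives: a list of every match in the text.
all_matches = re.findall(r"\d+", small_text)

print(first)
print(all_matches)
```

Note that the result is a string, so the leading zeros of 001 are preserved, which is usually what you want when writing an item number to the .csv.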
Answered By - renatodvc