Issue
I want to extract the category and title from the following piece of HTML using Scrapy:
<div class="box-text box-text-products">
    <div class="title-wrapper">
        <p class="category uppercase is-smaller no-text-overflow product-cat op-7">Supplements </p>
        <p class="name product-title"><a href="https://martslu.com/product/explosive-energy-pre-workout-cherry-punch-300g/">Explosive Energy Pre Workout Cherry Punch – 300g</a></p>
    </div>
    <div class="price-wrapper">
    </div>
</div>
The following is the code I wrote:
def parse(self,response):
    for product in response.css('div.box-text.box-text-products::text'):
        yield{
            'category': product.css('div.title-wrapper.p::text').get(),
            'title': product.css('div.title-wrapper>p.name product-title::text').get()}
I am still unclear on how to target specific class names on p tags. Any help is appreciated.
Solution
def parse(self, response):
    for product in response.css('div.box-text.box-text-products'):
        yield {
            'category': product.css('div.title-wrapper > p.category::text').get(),
            'title': product.css('div.title-wrapper > p.product-title > a::text').get()
        }
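The selectors in the question misuse CSS selector syntax in three places. First, ::text on the loop selector makes the loop iterate over text nodes instead of elements, so the nested selectors have nothing to match. Second, div.title-wrapper.p matches a div that has the class p, not a p element inside the div. Third, the space in p.name product-title is a descendant combinator, so it looks for a product-title element inside p.name; multiple classes on one element are chained with dots, as in p.name.product-title. The title text also sits inside the a tag, so the selector has to descend to it. It is worth reading an introduction to CSS selectors to learn the syntax.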
Parsing in Scrapy depends on parsel, which has introduced 2 additional custom, non-standard pseudo-elements:
::text
::attr(name)
Besides these 2 custom pseudo-elements, most of the CSS selector syntax is supported.
Answered By - Simba