Issue
I'm very new to Scrapy and I want to extract the text of two paragraphs using the Scrapy shell: "Fintech, Cybersecurity" and "Serie C".
If I run
response.css('div.card-body p.card-text strong::text').get()
I get 'Secteur', but I'm looking for 'Fintech, Cybersecurity'.
For
response.css('div.card-body p.card-text::text').get()
I get '\n'.
I've noticed that if I use
response.css('div.card-body p.card-text:nth-child(3)').get()
I get <p class="card-text">\nRound : Série C\n</p>, and for
response.css('div.card-body p.card-text:nth-child(2)').get()
I get
<p class="card-text">\nSecteur : Fintech, Cybersecurity\n</p>
How do I get "Serie C" and "Fintech, Cybersecurity"?
Thank you
Solution
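Your selector 'div.card-body p.card-text::text' should work.
You just need to call getall() or extract() on it instead of get().
get() returns only the first matching text node, which here is the newline before the <strong> tag.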
Here is an example I ran in IPython (with parsel already imported):
In [3]: html = '''<div class="card-body">
...: <h3 class="card-title mb-1">L</h3>
...: <p class="card-text">
...: <strong>Secteur</strong>
...: " : Fintech, Cybersecurity "
...: </p>
...: <p class="card-text">
...: <strong>Round</strong>
...: " : Serie C "
...: </p>
...: <p class="card-text">
...: <small class="text-muted"> 2820 votes enregistres </small>
...: </p>
...: </div>'''
In [4]: response = parsel.Selector(html)
In [5]: for p in response.css('div.card-body p.card-text::text').getall():
...:     text = p.strip()  # each item is already a string
...:     if text:  # skip whitespace-only text nodes
...:         print(text)
...:
" : Fintech, Cybersecurity "
" : Serie C "
Answered By - Alexander