Issue
I'm having trouble getting the URL of an image on a website and was hoping for some help.
I want the image URL of the card shown on the page, but my XPath only returns the image URL of the website logo.
scrapy shell "https://db.ygoprodeck.com/card/?search=7%20Colored%20Fish"
In [2]: response.xpath('//img')
Out[2]: [<Selector xpath='//img' data='<img src="https://db.ygoprodeck.com/sear'>]
There should be another img element for the card picture, but it is not showing up.
Solution
There is some logic to how the images are served. Each card has an ID that appears in the page source, and that ID is the file name of the image. The ID is not displayed prominently, so it is easy to miss.
Much of this information is loaded through the meta tags at the top of the page. Sites often put this kind of data near the top of the document, either in a script block or in meta tags. This is particularly true of Shopify stores.
If you ever have trouble finding something, as with this image, take the image name and search the rest of the document for that keyword. You will often be able to track down the information, or at least figure out how it is loaded. This is also useful when a website requires a "token": the token is often supplied somewhere on the previous page.
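The keyword search described above can be sketched in a few lines of plain Python. This is only an illustration: find_references is a hypothetical helper, and sample_html is a made-up stand-in for the page source (in the Scrapy shell you would pass response.text instead).

```python
import re


def find_references(html: str, keyword: str, context: int = 60) -> list[str]:
    """Return a snippet of surrounding text for every occurrence of keyword."""
    snippets = []
    for match in re.finditer(re.escape(keyword), html):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(html))
        snippets.append(html[start:end])
    return snippets


# Hypothetical page source, invented for illustration only.
sample_html = (
    '<head><meta property="og:image" '
    'content="https://ygoprodeck.com/pics/23771716.jpg"></head>'
    '<body><img src="logo.png"></body>'
)

# Search for the image name (the card ID) and print where it is referenced.
hits = find_references(sample_html, "23771716")
for snippet in hits:
    print(snippet)
```

Each snippet shows the markup around the keyword, which is usually enough to tell whether the value lives in a meta tag, a script block, or somewhere else.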
# with css
In [6]: response.css('meta[property="og:image"]::attr(content)').extract_first()
Out[6]: 'https://ygoprodeck.com/pics/23771716.jpg'
# with xpath
In [8]: response.xpath('//meta[@property="og:image"]/@content').extract_first()
Out[8]: 'https://ygoprodeck.com/pics/23771716.jpg'
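If you want to check the same idea outside the Scrapy shell, here is a minimal sketch that uses only the standard library. OgImageParser is a hypothetical helper name, and sample_html is an invented stand-in for the real page source, so treat it as an illustration of reading the og:image meta tag rather than a replacement for the selectors above.

```python
from html.parser import HTMLParser


class OgImageParser(HTMLParser):
    """Remember the content of the first <meta property="og:image"> tag."""

    def __init__(self):
        super().__init__()
        self.og_image = None

    def handle_starttag(self, tag, attrs):
        # Only look at meta tags, and stop once a value has been found.
        if tag != "meta" or self.og_image is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:image":
            self.og_image = attributes.get("content")


# Hypothetical page source, invented for illustration only.
sample_html = (
    '<html><head>'
    '<meta property="og:title" content="7 Colored Fish">'
    '<meta property="og:image" content="https://ygoprodeck.com/pics/23771716.jpg">'
    '</head><body><img src="logo.png"></body></html>'
)

parser = OgImageParser()
parser.feed(sample_html)
print(parser.og_image)
```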
Answered By - ThePyGuy