Sunday, January 30, 2022

[FIXED] Not able to scrape images from the Flipkart.com website: the src attribute comes back empty

beautifulsoup, e-commerce, python-requests, web-scraping

Issue

I am able to scrape all the data from the Flipkart website except the images, using the code below:

jobs = soup.find_all('div',{"class":"IIdQZO _1R0K0g _1SSAGr"})

for job in jobs:
    product_name = job.find('a',{'class':'_2mylT6'})
    product_name = product_name.text if product_name else "N/A"

    product_offer_price = job.find('div',{'class':'_1vC4OE'})
    product_offer_price = product_offer_price.text if product_offer_price else "N/A"

    product_mrp = job.find('div',{'class':'_3auQ3N'})
    product_mrp = product_mrp.text if product_mrp else "N/A"

    product_link = job.find('a',{'class':'_3dqZjq'})
    # Only prepend the base URL when a link was actually found
    product_link = (url + product_link.get('href')) if product_link else "N/A"

    product_img = job.find('div',{'class':'_3ZJShS _31bMyl'})


    print('product name {}\nproduct offer price {}\nproduct mrp {}\nproduct link {}\nproduct image {}'.\
      format(product_name,product_offer_price,product_mrp,product_link,product_img))
    print('\n')

Example result:

product name UV Protection Wayfarer Sunglasses (54)
product offer price ₹8,000
product mrp ₹8,890
product link https://www.flipkart.com/search?q=rayban/ray-ban-wayfarer- 
product image <img alt="" class="_3togXc" src=""/>

When I inspect the page manually in the browser, the src attribute is populated, but in the scraped HTML it is empty, as shown above.
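One way to confirm this observation is to count how many img tags in the fetched HTML have an empty src. The following is a minimal sketch using only the standard library; the names ImgSrcCollector and count_empty_img_src are hypothetical helpers, not part of the original code, and the second tag in the sample is made up for contrast.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src value of every <img> tag in an HTML string."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img .../> are routed here as well
        if tag == "img":
            # A missing or valueless src is recorded as an empty string
            self.sources.append(dict(attrs).get("src") or "")


def count_empty_img_src(html):
    """Return (empty, total) counts of <img> src attributes in html."""
    parser = ImgSrcCollector()
    parser.feed(html)
    parser.close()
    empty = sum(1 for src in parser.sources if not src.strip())
    return empty, len(parser.sources)


# The first tag is the one from the output above; the second is a made-up example
sample = '<img alt="" class="_3togXc" src=""/><img src="https://example.com/a.jpg"/>'
print(count_empty_img_src(sample))
```

If most images in the downloaded HTML report an empty src while the browser shows them filled in, the values are probably being set after the initial page load.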

Solution

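As I mentioned in the comments, use Selenium. The empty src suggests that the image URLs are filled in by JavaScript after the page loads, so the page has to be rendered in a browser before it is parsed.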

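import time

from selenium import webdriver
from bs4 import BeautifulSoup

# Load the page in a real browser so its scripts can run
driver = webdriver.Chrome()
driver.get('https://www.flipkart.com/search?q=rayban/ray-ban-wayfarer')
time.sleep(3)  # crude fixed wait for the page to finish rendering
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()
url = 'https://www.flipkart.com'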
jobs = soup.find_all('div',{"class":"IIdQZO _1R0K0g _1SSAGr"})

for job in jobs:
    product_name = job.find('a',{'class':'_2mylT6'})
    product_name = product_name.text if product_name else "N/A"

    product_offer_price = job.find('div',{'class':'_1vC4OE'})
    product_offer_price = product_offer_price.text if product_offer_price else "N/A"

    product_mrp = job.find('div',{'class':'_3auQ3N'})
    product_mrp = product_mrp.text if product_mrp else "N/A"

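    product_link = job.find('a',{'class':'_3dqZjq'})
    # Only prepend the base URL when a link was actually found
    product_link = (url + product_link.get('href')) if product_link else "N/A"

    # The image sits inside the wrapper div; guard against missing elements
    product_img = job.find('div',{'class':'_3ZJShS _31bMyl'})
    product_img = product_img.find('img') if product_img else None
    product_img = product_img.get('src') if product_img else "N/A"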

    print('product name {}\nproduct offer price {}\nproduct mrp {}\nproduct link {}\nproduct image {}'.\
      format(product_name,product_offer_price,product_mrp,product_link,product_img))
    print('\n')
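
The link-building step above joins the base URL and the href by string concatenation. As an alternative, here is a minimal sketch that uses urllib.parse.urljoin from the standard library; the helper name build_product_link and the sample path are hypothetical, and the base URL is assumed to be the site root used above.

```python
from urllib.parse import urljoin

# Assumed base URL: the site root used in the code above
BASE_URL = "https://www.flipkart.com"


def build_product_link(href, base_url=BASE_URL):
    """Return an absolute URL for href, or "N/A" when no href was found."""
    if not href:
        return "N/A"
    # urljoin resolves relative paths and leaves absolute URLs unchanged
    return urljoin(base_url, href)


# Hypothetical relative path, for illustration only
print(build_product_link("/some-product/p/item123"))
print(build_product_link(None))
```

Inside the loop this could be called as build_product_link(tag.get('href') if tag else None), where tag is the result of the job.find call for the link.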

Answered By - KunduK

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
