Issue
I want to get the rank of all Amazon items on the page using the following Chrome Extension: DS Amazon Quick View
To do so, I use Selenium to open my Chrome profile (where the extension is already installed) and try to scrape the rank HTML. However, "find_all" returns an empty result:
from selenium import webdriver
from bs4 import BeautifulSoup
options = webdriver.ChromeOptions()
options.add_argument(r"--user-data-dir=C:\Users\Edo\AppData\Local\Google\Chrome\User Data") # use your own Chrome local directory
options.add_argument(r'--profile-directory=Default')
driver = webdriver.Chrome(executable_path=r"C:\Program Files (x86)\chromedriver.exe", options=options) #get your own exe directory
driver.get("https://www.amazon.com/Best-Sellers-Kindle-Store/zgbs/digital-text/ref=zg_bs_unv_digital-text_1_154606011_1")
soup = BeautifulSoup(driver.page_source.encode('utf-8').strip(), 'html.parser')
print(soup.find_all("div", {"class":"xtaqv-result"}))
>>> 0
Solution
There are four things to account for on this page:
- Items load as you scroll down (initially only 30 items are shown)
- Item rankings also load as you scroll
- There is pagination if we want to get items from other pages
- The correct locators (XPath, CSS, etc.) must be used
Therefore, if the code does not wait for the page and the rankings to load completely, it will not get any values.
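As an alternative to fixed pauses, the waiting step can be sketched as a small polling helper. This is a minimal sketch, not part of the original code: the name `wait_until_count`, its parameters, and the idea of counting rank elements are assumptions made for illustration.

```python
import time
from typing import Callable


def wait_until_count(get_count: Callable[[], int], expected: int,
                     timeout: float = 30.0, poll: float = 0.5) -> bool:
    """Poll get_count() until it reaches `expected` or `timeout` seconds pass.

    Hypothetical helper: returns True if the count was reached in time,
    False if the timeout expired first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_count() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

With Selenium it might be called as `wait_until_count(lambda: len(driver.find_elements(By.CLASS_NAME, "xtaqv-result")), 50)`, assuming 50 ranked items per page. Scrolling is still required, since the items only load as the page is scrolled.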
The code below returns the names and ranking details for all available pages (in this case only 2):
Imports:
from time import sleep
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
options = webdriver.ChromeOptions()
options.add_argument(r"--user-data-dir=C:\Users\username\AppData\Local\Google\Chrome\User Data")
options.add_argument(r'--profile-directory=Default')
PATH = r"path to your\chromedriver.exe"
# Note: Selenium 4.10+ removed executable_path; there, pass a Service(PATH) object via service= instead.
driver = webdriver.Chrome(executable_path=PATH, options=options)
driver.get(
"https://www.amazon.com/Best-Sellers-Kindle-Store/zgbs/digital-text/ref=zg_bs_unv_digital-text_1_154606011_1")
sleep(10)
def pagescroll():
    # Scroll down step by step so the lazily loaded items and rankings appear
    for x in range(9):
        driver.find_element(By.CSS_SELECTOR, 'body').send_keys(Keys.PAGE_DOWN)
        sleep(2)


def get_items():
    pagescroll()
    sleep(5)
    # Item names come from the page itself; ranks are injected by the extension
    allnames = driver.find_elements(By.XPATH, "//*[@id='gridItemRoot']//a//span//div")
    allRanks = driver.find_elements(By.XPATH, "//*[@class='xtaqv-result']")
    for index in range(len(allnames)):
        print(f"--------------------------Item : {index + 1}----------------------------------------")
        print(allnames[index].text)
        print("--------------------------------------------------------------------------------------")
        print(allRanks[index].text)


# Scrape the first page
get_items()
nextElement = driver.find_element(By.CSS_SELECTOR, "div.a-text-center > ul > li.a-last > a")
counter = 1
try:
    while nextElement.is_displayed():
        counter = counter + 1
        print("--------------------------------------------------------------------------------------")
        print(f"{counter} : <- page scrapping started")
        print("--------------------------------------------------------------------------------------")
        nextElement.click()
        sleep(5)
        get_items()
except Exception:
    # Any Selenium error here (for example the Next link going stale or missing on the last page) ends the loop
    print("--------------------------------------------------------------------------------------")
    print("There is no more page left with items")
    print("--------------------------------------------------------------------------------------")
driver.quit()
**Output:** Not all items are shown, because the full output exceeds the character limit.
--------------------------Item : 1----------------------------------------
Taste
--------------------------------------------------------------------------------------
#1 in Kindle Store (Top 100)
#1 in Contemporary Romance (Kindle Store)
#1 in Romantic Comedy (Kindle Store)
#1 in Romantic Comedy (Books)
--------------------------Item : 2----------------------------------------
Family Money
--------------------------------------------------------------------------------------
#2 in Kindle Store (Top 100)
#1 in Domestic Thrillers (Kindle Store)
#1 in Psychological Thrillers (Books)
#1 in Literature & Fiction (Kindle Store)
--------------------------Item : 3----------------------------------------
Run, Rose, Run: A Novel
--------------------------------------------------------------------------------------
#3 in Kindle Store (Top 100)
#1 in Southern United States Fiction
#2 in Literature & Fiction (Kindle Store)
#2 in Crime Thrillers (Kindle Store)
--------------------------Item : 4----------------------------------------
Reminders of Him: A Novel
--------------------------------------------------------------------------------------
#4 in Kindle Store (Top 100)
#1 in New Adult & College Romance (Books)
#1 in Mothers & Children Fiction
#1 in Romance (Kindle Store)
--------------------------Item : 5----------------------------------------
The Last Eligible Billionaire
--------------------------------------------------------------------------------------
#5 in Kindle Store (Top 100)
#1 in Billionaire Romance
#2 in Romantic Comedy (Kindle Store)
#2 in Women's Romance Fiction
--------------------------Item : 6----------------------------------------
The Washington Post Digital Access
--------------------------------------------------------------------------------------
#6 in Kindle Store (Top 100)
#1 in eNewspapers
#1 in U.S. Newspapers
--------------------------Item : 7----------------------------------------
Things We Never Got Over
--------------------------------------------------------------------------------------
#7 in Kindle Store (Top 100)
#1 in General Humorous Fiction
#1 in Men, Women & Relationships Humor
#1 in Small Town & Rural Fiction (Kindle Store)
-
-
-
--------------------------Item : 49----------------------------------------
Heir to Love
--------------------------------------------------------------------------------------
#49 in Kindle Store (Top 100)
#8 in New Adult & College Romance (Books)
#23 in Romance (Kindle Store)
--------------------------Item : 50----------------------------------------
America's Last Fortress: Puerto Rico's Sovereignty, China's Caribbean Belt and Road, and America's National Security
--------------------------------------------------------------------------------------
#50 in Kindle Store (Top 100)
#1 in History of the Caribbean & West Indies
#1 in History of Latin America
#1 in International Relations (Kindle Store)
--------------------------------------------------------------------------------------
2 : <- page scrapping started
--------------------------------------------------------------------------------------
--------------------------Item : 1----------------------------------------
Stepbrother Weekend: Filthy Dirty Desires
--------------------------------------------------------------------------------------
#51 in Kindle Store (Top 100)
#1 in Erotic Literature & Fiction
#1 in Erotica (Kindle Store)
--------------------------Item : 2----------------------------------------
Forget-Me-Not Bombshell
--------------------------------------------------------------------------------------
#52 in Kindle Store (Top 100)
#1 in Women's Action & Adventure Fiction
#1 in Organized Crime (Kindle Store)
#2 in Action & Adventure Romance (Kindle Store)
--------------------------Item : 3----------------------------------------
By a Thread: A Grumpy Boss Romantic Comedy
--------------------------------------------------------------------------------------
#53 in Kindle Store (Top 100)
#2 in General Humorous Fiction
#3 in Romance Literary Fiction
#7 in Romantic Comedy (Kindle Store)
--------------------------Item : 4----------------------------------------
How To Start A Conversation And Make Friends: Revised And Updated
--------------------------------------------------------------------------------------
#54 in Kindle Store (Top 100)
#1 in Motivational Self-Help (Kindle Store)
#1 in Running Meetings & Presentations (Kindle Store)
#1 in Healthy Relationships (Kindle Store)
--------------------------Item : 5----------------------------------------
What Lies Beyond the Veil (Of Flesh & Bone Series Book 1)
--------------------------------------------------------------------------------------
#55 in Kindle Store (Top 100)
#1 in Romantic Fantasy (Books)
#1 in Sword & Sorcery Fantasy (Books)
#1 in Greco-Roman Myth & Legend Fantasy eBooks
--------------------------Item : 6----------------------------------------
Verity
--------------------------------------------------------------------------------------
#56 in Kindle Store (Top 100)
#7 in Psychological Thrillers (Kindle Store)
#9 in Psychological Thrillers (Books)
#14 in Romance (Kindle Store)
--------------------------Item : 7----------------------------------------
Mr. Bloomsbury: A feel-good British Billionaire Romance
--------------------------------------------------------------------------------------
#57 in Kindle Store (Top 100)
#1 in Romance Anthologies (Books)
#1 in Romance Collections & Anthologies
#1 in Romance Anthologies (Kindle Store)
--------------------------Item : 8----------------------------------------
Put Me in Detention
--------------------------------------------------------------------------------------
#58 in Kindle Store (Top 100)
#9 in Romantic Comedy (Kindle Store)
#10 in Romantic Comedy (Books)
#15 in Romance (Kindle Store)
--------------------------Item : 9----------------------------------------
A Place Called Freedom
--------------------------------------------------------------------------------------
#59 in Kindle Store (Top 100)
#1 in Espionage Thrillers (Kindle Store)
#1 in Mystery Action Fiction (Kindle Store)
#1 in Historical Scottish Fiction
--------------------------Item : 10----------------------------------------
The Second Home: A Novel
--------------------------------------------------------------------------------------
#60 in Kindle Store (Top 100)
#2 in Sibling Fiction
#4 in Sisters Fiction
#4 in Coming of Age Fiction (Books)
--------------------------Item : 11----------------------------------------
The Last Green Valley: A Novel
--------------------------------------------------------------------------------------
#61 in Kindle Store (Top 100)
#1 in Historical Biographical Fiction
#1 in Biographical Fiction (Books)
#1 in Biographical Literary Fiction
--------------------------Item : 12----------------------------------------
Hidden: An Exciting Novel of Suspense (A Lost and Found Novel Book 1)
--------------------------------------------------------------------------------------
#62 in Kindle Store (Top 100)
#1 in Contemporary
#1 in Thrillers (Kindle Store)
#1 in Heist Thrillers
--------------------------Item : 49----------------------------------------
Sweet (Landry Family Series Book 6)
--------------------------------------------------------------------------------------
#99 in Kindle Store (Top 100)
#4 in Inspirational Romance
#5 in New Adult & College Romance (Kindle Store)
#6 in Women's New Adult & College Fiction
--------------------------Item : 50----------------------------------------
Sold on a Monday: A Novel
--------------------------------------------------------------------------------------
#100 in Kindle Store (Top 100)
#3 in Historical Fiction (Kindle Store)
#3 in Literary Fiction (Kindle Store)
#3 in U.S. Historical Fiction
--------------------------------------------------------------------------------------
There is no more page left with items
--------------------------------------------------------------------------------------
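If the printed rank text needs further processing, it could be parsed into structured values. The following is a minimal sketch that assumes every rank line has the form `#<number> in <category>`, as in the output above; `RANK_LINE` and `parse_ranks` are hypothetical names introduced here for illustration.

```python
import re
from typing import List, Tuple

# Assumed line format, based on the output above: "#49 in Kindle Store (Top 100)"
RANK_LINE = re.compile(r"^#(\d[\d,]*) in (.+)$")


def parse_ranks(text: str) -> List[Tuple[int, str]]:
    """Turn the text of one rank element into (rank, category) pairs."""
    ranks = []
    for line in text.splitlines():
        match = RANK_LINE.match(line.strip())
        if match:
            # Drop thousands separators before converting the rank to int
            ranks.append((int(match.group(1).replace(",", "")), match.group(2)))
    return ranks


sample = "#49 in Kindle Store (Top 100)\n#8 in New Adult & College Romance (Books)"
print(parse_ranks(sample))
```

In the scraping loop, `parse_ranks(allRanks[index].text)` would then give one list of pairs per item. Lines that do not match the assumed format are skipped.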
Answered By - QualityMatters