Issue
I have the code below for opening the "New" tab of a page containing the data I want to scrape (as in the screenshot). It works and actually clicks the link, but the soup I get is still the content under "Popular" (as in the screenshot).
What am I doing wrong?
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup
import time

driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get("https://www.homeworkmarket.com/fields/business-finance")
time.sleep(2)
doc = driver.find_elements_by_xpath('//*[@id="wrapper"]/div[2]/div[1]/div[1]/div[3]/div[1]/ul/li[1]/a')[0]
doc.click()
time.sleep(10)
page = driver.page_source
soup = BeautifulSoup(page, 'html.parser')
The rest of the code, which scrapes the href links:
question_links = soup.find_all(class_='css-e5w42e')
final_links = []
for link in question_links:
    if 'href' in link.attrs:
        link = 'https://www.homeworkmarket.com' + str(link.attrs['href'])
        print(link)
        final_links.append(link)
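As a side note, building the absolute URL by string concatenation works here, but `urllib.parse.urljoin` handles relative paths and stray slashes more robustly. A minimal, self-contained sketch over a made-up HTML fragment (the class name `css-e5w42e` comes from the question and may well change as the site updates):

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# A miniature HTML fragment standing in for the real page source
html = """
<section>
  <a class="css-e5w42e" href="/questions/one">One</a>
  <a class="css-e5w42e" href="/questions/two">Two</a>
  <a class="css-e5w42e">No href here</a>
</section>
"""

soup = BeautifulSoup(html, 'html.parser')
final_links = [
    urljoin('https://www.homeworkmarket.com', a['href'])
    for a in soup.find_all(class_='css-e5w42e')
    if a.has_attr('href')  # skip anchors that carry no href attribute
]
print(final_links)
```

`urljoin` resolves the path against the base URL, so it produces the same result whether the scraped `href` is relative or already absolute.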
Solution
You do not need to click on New, because the elements are already present in the HTML DOM:
from selenium import webdriver
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())
driver.maximize_window()
driver.implicitly_wait(30)
driver.get("https://www.homeworkmarket.com/fields/business-finance")
for link in driver.find_elements(By.XPATH, "(*//a[text()='New']/ancestor::div[contains(@class,'css')])[3]/following-sibling::div/section/descendant::a[contains(@class,'css')]"):
    print(link.get_attribute('href'))
The first 80 links are from the Popular tab; the rest should be from the New tab.
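The XPath above leans on the `ancestor`, `following-sibling`, and `descendant` axes. To illustrate how they combine, here is a runnable sketch using `lxml` over an invented fragment (the structure, class names, and hrefs are made up for the demo, and the positional `[3]` index from the real locator is dropped since the demo has only one tab block):

```python
from lxml import html

# Invented fragment roughly mimicking a tab header followed by its content
doc = html.fromstring("""
<div id="root">
  <div class="css-tab"><a>New</a></div>
  <div>
    <section>
      <a class="css-link" href="/questions/alpha">alpha</a>
      <a class="css-link" href="/questions/beta">beta</a>
    </section>
  </div>
</div>
""")

# From the 'New' anchor: climb to its css-* container div (ancestor),
# step over to the next div at the same level (following-sibling),
# then collect every css-* anchor anywhere inside its <section> (descendant)
links = doc.xpath(
    "//a[text()='New']/ancestor::div[contains(@class,'css')]"
    "/following-sibling::div/section/descendant::a[contains(@class,'css')]"
)
print([a.get('href') for a in links])
```

Prototyping the expression on a small static fragment like this is an easy way to debug each axis step before pointing it at a live, JavaScript-heavy page.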
Answered By - cruisepandey