Issue
I used Python to scrape listed companies' descriptions from a website.
I intended this code to fetch the information consistently, but it only works one time and then raises an AttributeError.
Here is my code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from openpyxl import load_workbook
import time
from bs4 import BeautifulSoup
import requests

wb = load_workbook("listedcorp.xlsx")
ws = wb.active
col_B = ws["B"]
# print(col_B)
# for cell in col_B:
#     print(cell.value)

browser = webdriver.Chrome()
# browser.maximize_window()

for cell in col_B:
    url = "https://finance.naver.com/item/main.nhn?code={}".format(cell.value)
    browser.get(url)
    soup = BeautifulSoup(browser.page_source, "lxml")
    ov = soup.find("div", attrs={"class":"summary_info"}).get_text()
    print(str.strip(ov) + '\n\n')
    time.sleep(5)
Here is the result: the first iteration prints the description, but the next one raises an AttributeError.
Please let me know what is causing this problem.
Solution
The content needs a moment to be generated, so you should wait until the presence of your element is located before creating your BeautifulSoup object from driver.page_source:

wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#summary_info')))
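The AttributeError itself comes from chaining .get_text() onto soup.find(): when the summary block has not rendered yet, find() returns None, and None has no .get_text() method. A minimal sketch of that failure mode (using a hand-written empty page and the built-in html.parser here, so no browser or lxml is assumed):

```python
from bs4 import BeautifulSoup

# Page source captured before the summary block renders: the div is absent.
empty_page = "<html><body><div id='content'></div></body></html>"
soup = BeautifulSoup(empty_page, "html.parser")

# find() returns None when there is no match ...
block = soup.find("div", attrs={"class": "summary_info"})
print(block)  # None

# ... so chaining .get_text() onto it raises AttributeError.
try:
    block.get_text()
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'get_text'
```

Waiting for the element (or checking find()'s result before calling .get_text(), as the example below does) avoids this.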
Example
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import time

l = ['102280', '002900']
url = 'https://finance.naver.com/item/main.naver?code='
# Selenium 4 expects the driver path to be wrapped in a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
wait = WebDriverWait(driver, 5)

for code in l:
    driver.get(url + code)
    wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '#summary_info')))
    soup = BeautifulSoup(driver.page_source, "lxml")
    if soup.find("div", attrs={"id":"summary_info"}):
        ov = soup.find("div", attrs={"id":"summary_info"}).get_text()
    else:
        ov = 'no text found'
    print(ov.strip() + '\n\n')
    time.sleep(5)
Answered By - HedgeHog