Issue
from bs4 import BeautifulSoup
from selenium import webdriver
import time
import sys
query_txt = input("Enter the search term to crawl: ")
path = "C:\Temp\chromedriver_240\chromedriver.exe"
driver = webdriver.Chrome(path)
driver.get("https://www.naver.com")
time.sleep(2)
driver.find_element_by_id("query").send_keys(query_txt)
driver.find_element_by_id("search_btn").click()
driver.find_element_by_link_text("블로그 더보기").click()  # link text means "More blogs"
full_html = driver.page_source
soup = BeautifulSoup(full_html, 'html.parser')
content_list = soup.find('ul', id='elThumbnailResultArea')
print(content_list)
content = content_list.find('a','sh_blog_title _sp_each_url _sp_each_title' ).get_text()
print(content)
for i in content_list:
    con = i.find('a', class_='sh_blog_title _sp_each_url _sp_each_title').get_text()
    print(con)
    print('\n')
I wrote this code while following an online course, but the loop always raises an error. The line con = i.find('a', class_='sh_blog_title _sp_each_url _sp_each_title').get_text() fails with 'find() takes no keyword arguments'.
Solution
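The error comes from iterating over content_list directly: looping over a tag yields all of its children, including the text nodes (whitespace) between its child elements. Those are strings, so i.find(...) ends up calling str.find(), which takes no keyword arguments. To get all the <a> tags, use .find_all(); .find() only returns one tag (if there is any):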
import requests
from bs4 import BeautifulSoup
url = 'https://search.naver.com/search.naver?query=tree&where=post&sm=tab_nmr&nso='
full_html = requests.get(url).content
soup = BeautifulSoup(full_html, 'html.parser')
content_list = soup.find_all('a', class_='sh_blog_title _sp_each_url _sp_each_title' )
for i in content_list:
    print(i.text)
    print('\n')
Prints:
[2017/공학설계 입문] Romantic Tree
장충동/Banyan Tree Club & Spa/Club Members Restaurant
2020-06-27 Joshua Tree National Park Camping(조슈아트리...
[결혼준비/D-102] 웨딩밴드 '누니주얼리 - like a tree'
Book Club - Magic Tree House # 1 : Dinosaur Before Dark...
비밀 정원, 조슈아 트리 국립공원(Joshua Tree National Park)
그뤼너씨 TEA TREE 티트리 라인 3종리뷰
Number of Nodes in the Sub-Tree With the Same Label
태국의 100년 넘은 Giant tree
[부산 기장 카페] 오션뷰 뷰맛집카페 : 씨앤트리 sea&tree
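To see where the original error comes from without hitting the live site, here is a minimal sketch that runs against a small inline HTML snippet. The snippet is a hypothetical stand-in for the Naver results page: only the id and class names come from the question, while the <li> wrappers and link texts are assumptions made for illustration. It shows that iterating over the <ul> tag yields whitespace text nodes alongside the element children, that calling .find() with a keyword argument on such a text node raises the TypeError, and that .find_all() avoids the problem.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical stand-in for the search results page; only the id and the
# class names are taken from the question.
html = """
<ul id="elThumbnailResultArea">
  <li><a class="sh_blog_title _sp_each_url _sp_each_title" href="#">First post</a></li>
  <li><a class="sh_blog_title _sp_each_url _sp_each_title" href="#">Second post</a></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
content_list = soup.find("ul", id="elThumbnailResultArea")

# Iterating over a Tag yields every direct child, including the whitespace
# text nodes that sit between the <li> elements.
children = list(content_list)
text_nodes = [c for c in children if isinstance(c, NavigableString)]
tags = [c for c in children if isinstance(c, Tag)]

# A text node is a str subclass, so .find() on it is str.find(), which
# rejects keyword arguments such as class_.
error_message = None
try:
    text_nodes[0].find("a", class_="sh_blog_title")
except TypeError as exc:
    error_message = str(exc)

# Fix: let find_all() collect the matching <a> tags directly.
titles = [
    a.get_text()
    for a in content_list.find_all(
        "a", class_="sh_blog_title _sp_each_url _sp_each_title"
    )
]

print(error_message)
print(titles)
```

If the per-item loop from the question is preferred, looping over content_list.find_all('li') (assuming the items are <li> elements) instead of content_list itself skips the text nodes, so i.find(...) is then always called on a tag.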
Answered By - Andrej Kesely