Issue
driver.get("https://tinhte.vn/thread/tong-hop-16-tin-don-moi-nhat-ve-iphone-16.3739136/")
userid = driver.find_elements(By.XPATH, "//*[contains(@href, '/profile/')]").get_attribute("href")
print(userid)
This is my code. My goal is to scrape every username and user ID from the page. Profile links have the format https://tinhte.vn/profile/username.123456/ in the href attribute.
I searched Google and learned that I can't call get_attribute on multiple elements at once. What is the alternative?
Solution
Root cause of the issue: the line below is incorrect. find_elements() returns a Python list of web elements, and a list has no .get_attribute() method, so the call raises an AttributeError.
userid = driver.find_elements(By.XPATH, "//*[contains(@href, '/profile/')]").get_attribute("href")
Solution: iterate over the list in a loop and call .get_attribute("href") on each web element to capture its href attribute.
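To make the root cause and the fix concrete without opening a browser, here is a minimal sketch. FakeElement is a hypothetical stand-in for a Selenium web element, not part of Selenium; it only mimics get_attribute so that the list behaviour can be shown in isolation.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium web element (illustration only)."""

    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        # Mimic only the one attribute this example needs
        return self._href if name == "href" else None


# find_elements() returns a plain Python list; this list imitates that result
elements = [
    FakeElement("https://tinhte.vn/profile/vnninja.43700/"),
    FakeElement("https://tinhte.vn/profile/xecatang.2612547/"),
]

# Calling the method on the list itself fails, as in the question
try:
    elements.get_attribute("href")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Calling it on each element works
hrefs = [element.get_attribute("href") for element in elements]

print(error_message)
print(hrefs)
```

The list comprehension is equivalent to the explicit for loop with append; either form should work with real web elements.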
Code: see the script below, with inline explanations:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Assumption: Chrome is used here; the original post does not say which browser
driver = webdriver.Chrome()
driver.get("https://tinhte.vn/thread/tong-hop-16-tin-don-moi-nhat-ve-iphone-16.3739136/")
driver.maximize_window()
wait = WebDriverWait(driver, 15)
# Click the Consent button
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@aria-label='Consent']"))).click()
# Wait until all profile links are visible and store the web elements in `userid`
userid = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//*[contains(@href, '/profile/')]")))
# Declare a list called `users`
users = []
# Iterate over the web elements in `userid` and store each href in `users`
for user in userid:
    users.append(user.get_attribute("href"))
# Print the list
print(users)
Console output:
['https://tinhte.vn/profile/vnninja.43700/', 'https://tinhte.vn/profile/vnninja.43700/', 'https://tinhte.vn/profile/xecatang.2612547/', 'https://tinhte.vn/profile/xecatang.2612547/', 'https://tinhte.vn/profile/cuteo.2549422/', 'https://tinhte.vn/profile/cuteo.2549422/', 'https://tinhte.vn/profile/xecatang.2612547/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/xecatang.2612547/', 'https://tinhte.vn/profile/bao-sai-gon.2928116/', 'https://tinhte.vn/profile/bao-sai-gon.2928116/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/bao-sai-gon.2928116/', 'https://tinhte.vn/profile/jinnie-ktl.2392977/', 'https://tinhte.vn/profile/jinnie-ktl.2392977/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/naturelovely9.206218/', 'https://tinhte.vn/profile/naturelovely9.206218/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/naturelovely9.206218/', 'https://tinhte.vn/profile/truong0977.2369659/', 'https://tinhte.vn/profile/truong0977.2369659/', 'https://tinhte.vn/profile/airwalker.2543950/', 'https://tinhte.vn/profile/airwalker.2543950/', 'https://tinhte.vn/profile/truong0977.2369659/', 
'https://tinhte.vn/profile/namphuong000.1462961/', 'https://tinhte.vn/profile/namphuong000.1462961/', 'https://tinhte.vn/profile/truong0977.2369659/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/%E2%80%9Cbenh-vien-tra-ve%E2%80%9D-da-noi-rang.2991590/', 'https://tinhte.vn/profile/%E2%80%9Cbenh-vien-tra-ve%E2%80%9D-da-noi-rang.2991590/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/benh-vien-tra-ve-da-noi-rang.2991590/', 'https://tinhte.vn/profile/mrbinhta.2970728/', 'https://tinhte.vn/profile/mrbinhta.2970728/', 'https://tinhte.vn/profile/kennyn73.185469/', 'https://tinhte.vn/profile/kennyn73.185469/', 'https://tinhte.vn/profile/thienduongld.496304/', 'https://tinhte.vn/profile/thienduongld.496304/', 'https://tinhte.vn/profile/kennyn73.185469/', 'https://tinhte.vn/profile/evilartist.2379222/', 'https://tinhte.vn/profile/evilartist.2379222/', 'https://tinhte.vn/profile/thienduongld.496304/', 'https://tinhte.vn/profile/kaitokid1908.2575569/', 'https://tinhte.vn/profile/kaitokid1908.2575569/', 'https://tinhte.vn/profile/nguyennguyen0127.2933297/', 'https://tinhte.vn/profile/nguyennguyen0127.2933297/', 'https://tinhte.vn/profile/grozar.274181/', 'https://tinhte.vn/profile/grozar.274181/', 'https://tinhte.vn/profile/huong-giang-trang.1777251/', 'https://tinhte.vn/profile/huong-giang-trang.1777251/', 'https://tinhte.vn/profile/hoanglong213.2692496/', 'https://tinhte.vn/profile/hoanglong213.2692496/', 'https://tinhte.vn/profile/tuananhcao.2460888/', 'https://tinhte.vn/profile/tuananhcao.2460888/', 'https://tinhte.vn/profile/dlcr.537546/', 'https://tinhte.vn/profile/dlcr.537546/', 'https://tinhte.vn/profile/iphone2g-lock.2442834/', 'https://tinhte.vn/profile/iphone2g-lock.2442834/', 'https://tinhte.vn/profile/tientran517.2706203/', 
'https://tinhte.vn/profile/tientran517.2706203/', 'https://tinhte.vn/profile/teslaspacex.1738728/', 'https://tinhte.vn/profile/teslaspacex.1738728/']
Process finished with exit code 0
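The output above contains each profile link several times, and the question asks for usernames and IDs rather than raw URLs. A possible follow-up step, sketched below, splits each href into a username and a numeric ID and removes duplicates. The function name parse_profiles and the regular expression are my own assumptions based on the link format shown in the question; they are not part of the original answer.

```python
import re
from urllib.parse import unquote

# Assumed link format: .../profile/<username>.<numeric id>/
PROFILE_RE = re.compile(r"/profile/(?P<name>[^/]+)\.(?P<uid>\d+)/?$")


def parse_profiles(hrefs):
    """Return {user_id: username}, keeping the first username seen per ID."""
    profiles = {}
    for href in hrefs:
        if not href:
            continue  # get_attribute can return None
        match = PROFILE_RE.search(href)
        if match:
            # unquote turns percent-encoded characters back into text
            profiles.setdefault(int(match["uid"]), unquote(match["name"]))
    return profiles


# A few hrefs copied from the console output above
sample = [
    "https://tinhte.vn/profile/vnninja.43700/",
    "https://tinhte.vn/profile/vnninja.43700/",
    "https://tinhte.vn/profile/xecatang.2612547/",
    "https://tinhte.vn/profile/%E2%80%9Cbenh-vien-tra-ve%E2%80%9D-da-noi-rang.2991590/",
    "https://tinhte.vn/profile/benh-vien-tra-ve-da-noi-rang.2991590/",
]

profiles = parse_profiles(sample)
print(profiles)
```

Keying on the numeric ID treats the two spellings of the same profile (with and without the percent-encoded quotation marks) as one user. In the full script, parse_profiles(users) would be called on the list collected from the page.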
Answered By - Shawn