Issue
I want to create a dictionary from the webpage I'm working on, using the text of one class as the keys and the text of another class as the values.
Here's what I have tried:
from bs4 import BeautifulSoup
import pandas as pd
from selenium import webdriver
DRIVER_PATH = '/usr/local/bin/chromedriver'
driver = webdriver.Chrome(executable_path=DRIVER_PATH)
all_data = []
for i in range(1, 5, 1):
    url = 'https://www.transfermarkt.co.uk/cristiano-ronaldo/profil/spieler/'+str(i)
    driver.get(url)
    html = driver.page_source
    soup = BeautifulSoup(html, 'html.parser')
    data = {}
    market_left = soup.find('div', {'class':'right-td'})
    market_right = soup.find('div', {'class':'left-td'})
    for m in market_left:
        for mr in market_right:
            print(data[m.text.strip()].append(mr.text.strip()))
However I get the following error:
AttributeError: 'NavigableString' object has no attribute 'text'
Also, when I increase the range, for example to range(1, 10, 1), it doesn't seem to iterate over all the pages; it only keeps the last one. Any idea how I can grab the information for each page within the loop?
Expected output:
{'Current market value:':[-, -, -,-,-]}
Solution
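You can use zip in Python to iterate over two lists simultaneously. Note that soup.find returns only the first matching tag, so use soup.find_all to get every matching div.
Try something like this: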
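import time
from bs4 import BeautifulSoup
from selenium import webdriver
# Same driver setup as in the question.
driver = webdriver.Chrome(executable_path='/usr/local/bin/chromedriver')
for i in range(1,10):
    url = "https://www.transfermarkt.co.uk/silvio-adzic/profil/spieler/{}".format(i)
    driver.get(url)
    time.sleep(5)  # give the page time to load
    soup = BeautifulSoup(driver.page_source, 'html5lib')  # needs the html5lib package
    # find_all returns every matching div, not just the first one
    market_left = soup.find_all('div',class_="left-td")
    market_right = soup.find_all('div',class_="right-td")
    print(f"In Page {i}")
    # Pair each label (left) with its value (right)
    for l,r in zip(market_left,market_right):
        l_value = l.text.replace('\n','').replace(' ','')
        r_value = r.text.replace('\n','').replace(' ','')
        print(f"{l_value} {r_value}")
        # Code to add the details to dictionary.
    print("-----------------------------------------------------------")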
A few of the pages didn't have the data you are looking for:
In Page 1
Currentmarketvalue: -
Lastupdate: Apr23,2009
Highestmarketvalue:Lastupdate: £225Th.Oct4,2004
-----------------------------------------------------------
In Page 2
-----------------------------------------------------------
In Page 3
-----------------------------------------------------------
In Page 4
Currentmarketvalue: -
Lastupdate: Feb13,2007
Highestmarketvalue:Lastupdate: £360Th.Oct4,2004
-----------------------------------------------------------
In Page 5
Currentmarketvalue: -
Lastupdate: Sep14,2010
Highestmarketvalue:Lastupdate: £1.26mOct4,2004
-----------------------------------------------------------
In Page 6
Currentmarketvalue: -
Lastupdate: Aug2,2010
Highestmarketvalue:Lastupdate: £765Th.Oct6,2005
-----------------------------------------------------------
In Page 7
Currentmarketvalue: -
Lastupdate: Jan30,2014
Highestmarketvalue:Lastupdate: £1.08mOct4,2004
-----------------------------------------------------------
In Page 8
Currentmarketvalue: -
Lastupdate: Jan2,2010
Highestmarketvalue:Lastupdate: £1.35mJun2,2006
-----------------------------------------------------------
In Page 9
-----------------------------------------------------------
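The loop above only prints the pairs. A minimal sketch of how they could be collected into one dictionary across pages follows. It assumes the dictionary is created once, before the page loop, since creating it inside the loop would discard the earlier pages. The helper names `clean` and `add_page` and the sample `pages` data are hypothetical stand-ins modelled on the output above, not part of the original answer.

```python
from collections import defaultdict

def clean(text):
    """Collapse runs of whitespace, keeping single spaces between words."""
    return " ".join(text.split())

def add_page(data, labels, values):
    """Pair each label with its value and append the value under that label."""
    for label, value in zip(labels, values):
        data[clean(label)].append(clean(value))

# Hypothetical stand-ins for the text of the left-td / right-td divs per page.
# In the scraper these would come from something like:
#   labels = [d.get_text() for d in soup.find_all('div', class_='left-td')]
#   values = [d.get_text() for d in soup.find_all('div', class_='right-td')]
pages = [
    (["Current market value:", "Last update:"], ["-", "Apr 23, 2009"]),
    ([], []),  # a page without the market-value box adds nothing
    (["Current market value:", "Last update:"], ["-", "Feb 13, 2007"]),
]

all_data = defaultdict(list)  # created once, outside the page loop
for labels, values in pages:
    add_page(all_data, labels, values)

print(dict(all_data))
```

Using `defaultdict(list)` avoids the KeyError that `data[key].append(...)` raises on a plain empty dict, and `" ".join(text.split())` keeps the spaces between words that `replace(' ','')` removes.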
Answered By - pmadhu