Issue
I'm trying to extract an instrument's price using Python:
import requests
from bs4 import BeautifulSoup


def main():
    instrument = "us100"
    get_price(instrument)


def get_price(instrument):
    url = "https://www.xtb.com/pl/oferta/dostepne-rynki/indeksy/{}".format(instrument)
    page_content = requests.get(url)
    parsed_page_content = BeautifulSoup(page_content.content, "html.parser")
    print(parsed_page_content.find("span", {"id": "bid"}))


if __name__ == "__main__":
    main()
But when I run main.py, the only output is: None
What am I doing wrong? :)
Solution
requests.get() only downloads the HTML that the server sends. It doesn't run JavaScript, so it won't see any content that is loaded into the page dynamically, such as the price info, which is constantly updating. That's why find() returns None, and BeautifulSoup on its own isn't useful here. You can use Selenium along with the webdriver for your browser of choice to launch a browser window, get the info, and automatically close the browser.
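A quick way to confirm this diagnosis before setting up a browser driver is to check whether the element exists in the raw HTML at all. The sketch below uses only the standard library. The helper names (IdFinder, has_element_id) and both HTML snippets are made up for illustration and are not the real XTB markup.

```python
from html.parser import HTMLParser


class IdFinder(HTMLParser):
    """Records whether any tag with the given id attribute appears in the markup."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        if dict(attrs).get("id") == self.element_id:
            self.found = True


def has_element_id(html, element_id):
    """Return True if the markup contains a tag whose id equals element_id."""
    finder = IdFinder(element_id)
    finder.feed(html)
    return finder.found


# Hypothetical markup: what a server might send before any script has run ...
static_html = '<div class="box-bid"></div><script src="prices.js"></script>'
# ... and what the browser DOM might look like after the script has filled it in.
rendered_html = '<div class="box-bid"><span id="bid">11662.<strong>02</strong></span></div>'

print(has_element_id(static_html, "bid"))    # False
print(has_element_id(rendered_html, "bid"))  # True
```

To run the same check against the real page, pass requests.get(url).text as the first argument. If it prints False, the element is being added by a script after the page loads.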
Here's a quick solution I hacked up. I used MS Edge because I didn't currently have the updated webdriver for the version of Chrome I have installed.
from selenium import webdriver
from selenium.webdriver.edge.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.wait import WebDriverWait

instrument = "us100"

# Raw string so the backslashes in the Windows path are not treated as escapes
edgeService = Service(r"C:\Drivers\webdrivers\msedgedriver.exe")
# webdriver.Edge starts the driver service itself and opens the browser
edgeDriver = webdriver.Edge(service=edgeService)
try:
    edgeDriver.get("https://www.xtb.com/pl/oferta/dostepne-rynki/indeksy/{}".format(instrument))
    # Wait up to 10 seconds for the price box to be present in the DOM
    WebDriverWait(edgeDriver, 10).until(
        expected_conditions.presence_of_element_located(
            (By.CSS_SELECTOR, ".box-bid")
        )
    )
    priceelement = edgeDriver.find_element(by=By.CSS_SELECTOR, value=".box-bid")
    print(priceelement)
    span = priceelement.find_element(by=By.CSS_SELECTOR, value="#bid")
    print(span)
    print(span.text)
    print(span.get_attribute("innerHTML"))
finally:
    # quit() closes the browser window and stops the driver service
    edgeDriver.quit()
Output:
<selenium.webdriver.remote.webelement.WebElement (session="9a6b6bec2ea3d781566cb321e929d2b2", element="eef454c1-f414-4bcb-815e-8ef36712d8c4")>
<selenium.webdriver.remote.webelement.WebElement (session="9a6b6bec2ea3d781566cb321e929d2b2", element="48d6c6d7-b3e2-42f0-a28b-0441998da99a")>
11662.02
11662.<strong>02</strong>
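If you need the price as a number rather than a string, span.text is usually enough. If you only have the innerHTML form from the last line of the output, the markup has to be stripped first. Here's a minimal sketch of that step. parse_price is a hypothetical helper name, and it assumes the price uses a dot as the decimal separator, as in the output above.

```python
import re
from decimal import Decimal


def parse_price(inner_html):
    """Strip HTML tags from a price fragment and return it as a Decimal."""
    # Remove anything that looks like a tag, e.g. <strong> and </strong>
    text = re.sub(r"<[^>]+>", "", inner_html)
    # Drop surrounding whitespace before converting
    return Decimal(text.strip())


print(parse_price("11662.<strong>02</strong>"))  # 11662.02
print(parse_price("11662.02"))                   # 11662.02
```

Decimal is used instead of float so the value keeps exactly the digits shown on the page.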
Answered By - nigh_anxiety