Issue
I am trying to extract the Bitcoin price from the Coinbase website in Python, but I am not very familiar with HTML. When I inspect the Coinbase page (https://www.coinbase.com/price/bitcoin), I see the following (a snippet of the full inspection), which I think contains the price I want to extract:
<div class="cds-flex-f1g67tkn cds-baseline-b1fcn5pa cds-column-ci8mx7v cds-0-_1t4ck38">
<div class="cds-typographyResets-t1xhpuq2 cds-display3-doujgnf cds-foreground-f1yzxzgu cds-transition-txjiwsi cds-start-s1muvu8a cds-tabularNumbers-t11sqpt cds-1-_qem5ui">
<span>£21,976.91</span>
</div>
...
I tried asking ChatGPT because, again, I don't know HTML, and it came up with this:
import requests
from bs4 import BeautifulSoup
# define the URL of the Coinbase webpage
url = 'https://www.coinbase.com/price/bitcoin'
# make a GET request to the webpage and get the HTML code
response = requests.get(url)
html_code = response.content
# create a BeautifulSoup object and parse the HTML code
soup = BeautifulSoup(html_code, 'html.parser')
# find the element that contains the price data
price_element = soup.find('div', {'class': 'cds-typographyResets-t1xhpuq2 cds-display3-doujgnf cds-foreground-f1yzxzgu cds-transition-txjiwsi cds-start-s1muvu8a cds-tabularNumbers-t11sqpt cds-1-_qem5ui'})
# extract the price data from the element
price_data = price_element.text.strip()
# print the price data
print(price_data)
It raises an AttributeError: 'NoneType' object has no attribute 'text'. I believe this is because soup.find() doesn't receive any input. After some research and trial and error, I still can't fix this. Could you help? Thanks in advance!
Solution
The data you see on the page is loaded by JavaScript from an external URL, so it is not in the HTML that requests downloads and BeautifulSoup never sees it. That is why soup.find() returns None.
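To make the failure concrete, here is a minimal offline sketch. The two HTML strings are hypothetical stand-ins (not the real Coinbase markup): one imitates the page shell that requests receives before any JavaScript runs, the other imitates what the browser inspector shows after rendering. The helper name find_price_text is also made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical page shell, as downloaded before any JavaScript has run:
# the price <div> simply is not there yet.
static_html = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'

# Hypothetical rendered markup, similar to what the browser inspector shows.
rendered_html = (
    '<html><body><div id="root">'
    '<div class="cds-display3-doujgnf cds-tabularNumbers-t11sqpt"><span>£21,976.91</span></div>'
    '</div></body></html>'
)


def find_price_text(html):
    """Return the price text, or None when the element is missing."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ matches an element that carries this class among its classes
    element = soup.find("div", class_="cds-tabularNumbers-t11sqpt")
    if element is None:
        # calling .text here would raise the AttributeError from the question
        return None
    return element.get_text(strip=True)


print(find_price_text(static_html))    # None
print(find_price_text(rendered_html))  # £21,976.91
```

Checking for None before touching .text turns the crash into a clear "element not found" result, which is usually the first hint that the content is rendered client-side.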
You can simulate the Ajax request with the following example:
import requests
# GraphQL endpoint that the page itself calls
api_url = "https://www.coinbase.com/graphql/query"
# query-string parameters copied from the request seen in the browser
params = {
    "operationName": "useGetPriceChartDataQuery",
    "extensions": "{\"persistedQuery\":{\"version\":1,\"sha256Hash\":\"3aa896a38f822d856792f18ca18f98f49580540f517693ad4e51508c529ddb6e\"}}",
    "variables": '{"skip":false,"slug":"bitcoin","currency":"EUR"}',
}
# fetch the JSON response and print the latest quoted price
data = requests.get(api_url, params=params).json()
print(data['data']['assetBySlug']['latestQuote']['price'])
Prints:
25210.325
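Since this is an undocumented endpoint, the response shape could change or the request could come back with an error body instead. As a hedged sketch, the lookup can be made defensive. The helper name extract_price and the sample payloads below are hypothetical. The payloads only mirror the key path used in the answer and are not captured from the real API.

```python
def extract_price(payload):
    """Walk data -> assetBySlug -> latestQuote -> price, returning a float or None."""
    node = payload
    for key in ("data", "assetBySlug", "latestQuote", "price"):
        # stop as soon as the expected structure is not there
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    try:
        # accept the price whether it arrives as a number or a numeric string
        return float(node)
    except (TypeError, ValueError):
        return None


# Hand-written sample payloads (assumptions, for illustration only)
ok_payload = {"data": {"assetBySlug": {"latestQuote": {"price": "25210.325"}}}}
error_payload = {"errors": [{"message": "PersistedQueryNotFound"}]}

print(extract_price(ok_payload))     # 25210.325
print(extract_price(error_payload))  # None
```

With the request code above, this would be called as extract_price(data), and a None result signals that the response no longer has the expected structure.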
Note: You can find the GraphQL URL in your browser: open Web Developer Tools -> Network tab and reload the page. The URL should be listed there along with the request parameters and the returned data.
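The extensions and variables parameters seen in the Network tab are JSON strings, so they can be built with json.dumps rather than escaped by hand. The sketch below assumes the operation name and hash from the example above are still current. The function name build_params is hypothetical, and whether the endpoint accepts other slugs or currencies is untested.

```python
import json

# Hash copied from the example request; it may change when the site is updated.
PERSISTED_QUERY_HASH = "3aa896a38f822d856792f18ca18f98f49580540f517693ad4e51508c529ddb6e"


def build_params(slug, currency, sha256_hash=PERSISTED_QUERY_HASH):
    """Build the query-string parameters for the price request."""
    compact = {"separators": (",", ":")}  # no spaces, like the browser sends
    return {
        "operationName": "useGetPriceChartDataQuery",
        "extensions": json.dumps(
            {"persistedQuery": {"version": 1, "sha256Hash": sha256_hash}}, **compact
        ),
        "variables": json.dumps(
            {"skip": False, "slug": slug, "currency": currency}, **compact
        ),
    }


params = build_params("bitcoin", "EUR")
print(params["variables"])  # {"skip":false,"slug":"bitcoin","currency":"EUR"}
```

The resulting dictionary can be passed to requests.get(api_url, params=params) exactly like the hand-written one.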
Answered By - Andrej Kesely