Issue
Good day. My script is a work in progress and I need help or ideas to make it work properly. I am able to grab some data, but it's not very readable or useful.
from bs4 import BeautifulSoup
import requests
headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:92.0) Gecko/20100101 Firefox/92.0"}
url = "https://bscscan.com/tx/0xb07b68f72f0b58e8cfb8c8e896736f49b13775ebda25301475d24554a601ff97#eventlog"
urlpage = requests.get(url, headers=headers, timeout=10, allow_redirects=False)
soup = BeautifulSoup(urlpage.content, 'html.parser')
price = soup.find('div', class_='d-none d-md-inline-block u-label u-label--price rounded mt-1 ml-n1 text-nowrap').get_text()#.strip()
print ("Price: ", price)
data1 = soup.find('div', class_='card-body').get_text()#.strip()
print (data1)
data2 = soup.find('span', class_='btn btn-icon btn-soft-success rounded-circle').get_text()#.strip()
print (data2)
Current Output:
Price:
BNB: $422.35 (-3.05%) | 5 Gwei
Transaction Hash:
0xb07b68f72f0b58e8cfb8c8e896736f49b13775ebda25301475d24554a601ff97
Status:Success
Squeezed text (173 lines).
206
Wanted Output:
Price:
BNB: $422.35 (-3.05%) | 5 Gwei
207 #-- latest data
Address: 0x81e0ef68e103ee65002d3cf766240ed1c070334d
Topics: 0 0x598cd56214a374d15f638dd04913e0288cd76c7833ee66b15cf55845d875a187
Data
0000000000000000000000000000000000000000000000000000000061b23bae
00000000000000000000000000000000000000000000000000000000979144b0
Solution
Here is an alternative that always picks up the latest entry, even if more are added later. Because requests does not run JavaScript, the content it returns is not what you see on the rendered webpage. You need to target the element with id myTabContent.
I've tried to use selector lists that should be more stable, and to avoid some of the potentially less robust classes.
import requests
from bs4 import BeautifulSoup as bs
r = requests.get('https://bscscan.com/tx/0xb07b68f72f0b58e8cfb8c8e896736f49b13775ebda25301475d24554a601ff97#eventlog', headers = {'User-Agent':'Mozilla/5.0'})
soup = bs(r.content, 'lxml')
#select price info
price = soup.select_one('#ethPrice').get_text(' ', strip = True)
# select latest event
last_transaction = soup.select_one('#myTabContent div.media:nth-last-child(2)')
latest_number = int(last_transaction.select_one('.btn-icon__inner').text)
address = last_transaction.select_one('a.text-break').text
topic = last_transaction.select_one('li > .text-break').text
print('Price:', price)
print('Latest number:', latest_number)
print('Address:', address)
print('Topics:', topic)
print('Data')
for data in last_transaction.select('[id^=chunk].text-break'):
    print(data.text)
Answered By - QHarr
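As a supplement to the answer above (not part of it), here is a minimal sketch of how the same selectors could be wrapped so that a missing element returns None instead of raising AttributeError. The SAMPLE_HTML markup and the helper names text_or_none and parse_latest_event are hypothetical, made up for illustration. The real page is larger and its structure may change, so treat this as a starting point and not a definitive implementation. It uses html.parser so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the event log tab.
# The real page is larger and may be structured differently.
SAMPLE_HTML = """
<div id="myTabContent">
  <div class="media">
    <span class="btn-icon__inner">206</span>
    <a class="text-break" href="#">0xaaa</a>
  </div>
  <div class="media">
    <span class="btn-icon__inner">207</span>
    <a class="text-break" href="#">0xbbb</a>
    <ul><li><span class="text-break">0xtopic</span></li></ul>
    <span id="chunk_1" class="text-break">61b23bae</span>
    <span id="chunk_2" class="text-break">979144b0</span>
  </div>
  <div class="footer"></div>
</div>
"""


def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


def parse_latest_event(html):
    """Extract the last event entry using the selectors from the answer."""
    soup = BeautifulSoup(html, "html.parser")
    # Same selector as the answer: the div.media that is the
    # second-to-last child of its parent inside #myTabContent.
    latest = soup.select_one("#myTabContent div.media:nth-last-child(2)")
    if latest is None:
        return None
    return {
        "number": text_or_none(latest, ".btn-icon__inner"),
        "address": text_or_none(latest, "a.text-break"),
        "topic": text_or_none(latest, "li > .text-break"),
        "data": [n.get_text(strip=True)
                 for n in latest.select("[id^=chunk].text-break")],
    }


event = parse_latest_event(SAMPLE_HTML)
print(event)
```

To use it against the live page, pass r.content from the request in the answer to parse_latest_event and check for None before reading the fields.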