Issue
from bs4 import BeautifulSoup
import requests

def get_stats():
    url = f'https://dsebd.org/dse_close_price_archive.php?startDate=2021-08-28&endDate=2023-08-28&inst=REPUBLIC&archive=data'
    content = requests.get(url).text
    soup = BeautifulSoup(content, 'html.parser')
    rate = soup.find("table", class_="table table-bordered background-white").get_text()
    with open("stats.csv", 'w') as file:
        file.write(rate)
    return rate

get_stats()
This is the code I have so far. This is the output I get: https://pastebin.pl/view/raw/98ddcd6f
I tried grabbing it with beautifulsoup4, as you can see in the code, but I just get a bunch of mumbo jumbo and have no clue how to organize it. Thank you to anyone who can help me!
Solution
To get the table into a pandas DataFrame, you can use the following example:
import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://dsebd.org/dse_close_price_archive.php?startDate=2021-08-28&endDate=2023-08-28&inst=REPUBLIC&archive=data"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Select only tables that contain no nested table; on this page the
# price table is the second-to-last of those.
table = soup.select("table:not(:has(table))")[-2]

all_data = []
# Column names come from the <th> cells.
header = [th.get_text(strip=True) for th in table.select("th")]
# Skip the first (header) row and collect the cell text of every data row.
for row in table.select("tr")[1:]:
    tds = [td.get_text(strip=True) for td in row.select("td")]
    all_data.append(tds)

df = pd.DataFrame(all_data, columns=header)
print(df.head())
Prints:
   #        DATE  TRADING CODE  CLOSEP*   YCP
0  1  2023-08-28      REPUBLIC       36  35.9
1  2  2023-08-27      REPUBLIC     35.9  35.3
2  3  2023-08-24      REPUBLIC     35.3  35.9
3  4  2023-08-23      REPUBLIC     35.9  36.7
4  5  2023-08-22      REPUBLIC     36.7  36.7
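
Since the question writes its result to stats.csv, here is a minimal sketch of saving the parsed rows as proper CSV with the standard-library csv module. It assumes header and row lists shaped like the ones built by the scraping code; the sample values are copied from the printed output, and the helper name rows_to_csv is made up for illustration.

```python
import csv
import io

# Sample values copied from the printed output; in real use these would be
# the `header` and `all_data` lists built by the scraping code.
header = ["#", "DATE", "TRADING CODE", "CLOSEP*", "YCP"]
all_data = [
    ["1", "2023-08-28", "REPUBLIC", "36", "35.9"],
    ["2", "2023-08-27", "REPUBLIC", "35.9", "35.3"],
    ["3", "2023-08-24", "REPUBLIC", "35.3", "35.9"],
]


def rows_to_csv(header, rows, file):
    """Write one header row followed by the data rows to an open text file.

    Hypothetical helper, not part of the original answer.
    """
    writer = csv.writer(file, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# An in-memory buffer keeps the sketch free of side effects. To create the
# file from the question, open it instead:
#     with open("stats.csv", "w", newline="") as f:
#         rows_to_csv(header, all_data, f)
buffer = io.StringIO()
rows_to_csv(header, all_data, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

If the DataFrame is already built, df.to_csv("stats.csv", index=False) should give an equivalent file without the extra helper.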
Answered By - Andrej Kesely