Issue
I want to scrape the table 'Summary statement holding of specified securities' from this website: https://www.bseindia.com/stock-share-price/infosys-ltd/infy/500209/shareholding-pattern/ I tried scraping the data with Selenium, but everything came back in a single column with no table structure, and the table has no unique identifier. How can I use pandas and Beautiful Soup, or any other method, to scrape the table in a structured format? This is the code I tried, but it didn't work:
import requests
import pandas as pd

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0"
}
params = {
    'id': 0,
    'txtscripcd': '',
    'pagecont': '',
    'subject': ''
}

def main(url):
    r = requests.get(url, params=params, headers=headers)
    df = pd.read_html(r.content)[-1].iloc[:, :-1]
    print(df)

main("")
Solution
To load the table into a DataFrame and save it as CSV, you can use this example:
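from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0'}
# The API returns JSON; its 'Data' field holds the HTML of the shareholding tables.
api_url = 'https://api.bseindia.com/BseIndiaAPI/api/shpSecSummery_New/w?qtrid=&scripcode=500209'
soup = BeautifulSoup(requests.get(api_url, headers=headers).json()['Data'], 'lxml')
# The table has no id, so find the bold caption and take the next <table> after it.
# ':-soup-contains' replaces ':contains', which newer soupsieve releases deprecate.
table = soup.select_one('b:-soup-contains("Summary statement holding of specified securities")').find_next('table')
# Wrap the HTML in StringIO (newer pandas warns on literal strings) and drop the first two rows.
df = pd.read_html(StringIO(str(table)))[0].iloc[2:, :]
df.to_csv('data.csv')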
This saves the table to data.csv.
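The core idea in the answer is to find the caption text first and then take the table that follows it. That idea can be tried offline without any third-party packages. The sketch below is not the answer's method: it uses only the standard library's html.parser, the HTML fragment is made up for illustration, and the class and function names (TableAfterLabel, rows_after_label) are hypothetical. It assumes the target table contains no nested tables, which may not hold for the real BSE markup.

```python
from html.parser import HTMLParser

# Made-up fragment: two tables without ids, each preceded by a <b> caption.
SAMPLE_HTML = """
<div><b>Declaration</b>
<table><tr><td>Particulars</td><td>Yes/No</td></tr></table>
<b>Summary statement holding of specified securities</b>
<table>
<tr><th>Category</th><th>Shareholders</th></tr>
<tr><td>Promoter</td><td>10</td></tr>
<tr><td>Public</td><td>200</td></tr>
</table></div>
"""


class TableAfterLabel(HTMLParser):
    """Collect the rows of the first <table> that follows a given text label."""

    def __init__(self, label):
        super().__init__()
        self.label = label
        self.state = "searching"  # searching -> armed -> in_table -> done
        self.rows = []
        self.cell = None  # text pieces of the current cell, None outside a cell

    def handle_starttag(self, tag, attrs):
        if self.state == "armed" and tag == "table":
            self.state = "in_table"
        elif self.state == "in_table":
            if tag == "tr":
                self.rows.append([])
            elif tag in ("td", "th"):
                if not self.rows:  # tolerate a cell that appears without a <tr>
                    self.rows.append([])
                self.cell = []

    def handle_endtag(self, tag):
        if self.state != "in_table":
            return
        if tag in ("td", "th") and self.cell is not None:
            self.rows[-1].append("".join(self.cell).strip())
            self.cell = None
        elif tag == "table":
            self.state = "done"  # stop after the first table following the label

    def handle_data(self, data):
        if self.state == "searching" and self.label in data:
            self.state = "armed"  # label seen; the next <table> is the target
        elif self.state == "in_table" and self.cell is not None:
            self.cell.append(data)


def rows_after_label(html, label):
    """Return the target table as a list of rows, each a list of cell strings."""
    parser = TableAfterLabel(label)
    parser.feed(html)
    parser.close()
    return parser.rows


rows = rows_after_label(SAMPLE_HTML, "Summary statement holding of specified securities")
for row in rows:
    print(row)
```

The first table is skipped because its caption does not match, which mirrors what select_one(...).find_next('table') does in the answer. The resulting list of rows can be passed to pd.DataFrame(rows[1:], columns=rows[0]) if a DataFrame is wanted.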
Answered By - Andrej Kesely