Issue
I am new to web scraping. Our college website shows results inside an iframe. The iframe contains many internal links, each of which loads the corresponding result into the iframe. How can I scrape the marks for my semester using BeautifulSoup?
Link: http://cvr.ac.in/home4/index.php/academics/results. In the iframe displayed there, I want to scrape the page behind a link such as "B.Tech IV YEAR II SEM Main Examinations (R15-B16) held in August-2020".
Earlier they had a separate results page without frames, and I could parse it with something like this:
import requests
from bs4 import BeautifulSoup as bs
result = requests.post("<old url>", data={'srno':'<number>'})
s = bs(result.content, 'lxml')
# skim through the results
But I am not sure how to do this when the results are inside a frame. Any help is appreciated. Thanks in advance.
Solution
To get the data from the AUGUST 2020 link, you can use this example:
import requests
from bs4 import BeautifulSoup
url = 'http://cvr.ac.in/home4/index.php/academics/results'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0'}
soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')
base_url = 'https://' + soup.iframe['src'].replace('https://', '').split('/')[0]
soup = BeautifulSoup(requests.get(soup.iframe['src'], headers=headers).content, 'html.parser')
# select the AUGUST 2020 link (soupsieve >= 2.1 spells this ':-soup-contains'; older versions use ':contains'):
link = base_url + soup.select_one('a:-soup-contains("AUGUST 2020")')['href']
data = {
    'srno': "1111",  # <-- Change to your desired ROLL number
    'type': "roll",
    'phase1': ""
}
soup = BeautifulSoup(requests.post(link, data=data, headers=headers).content, 'html.parser')
# parse your required data from soup
# ...
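The `base_url` line in the example assumes the iframe `src` starts with `https://` and that the link's `href` is root-relative. If either assumption does not hold for your page, one possible alternative is to resolve the `href` against the iframe URL with the standard library's `urljoin`. The sketch below is only an illustration: `resolve_link` is a hypothetical helper name, and the `example.org` URLs are made up, not taken from the college site.

```python
from urllib.parse import urljoin


def resolve_link(frame_url: str, href: str) -> str:
    """Resolve an href found inside the iframe against the iframe's own URL.

    urljoin handles root-relative, document-relative and absolute hrefs,
    and keeps whatever scheme (http or https) the iframe URL uses.
    """
    return urljoin(frame_url, href)


# Hypothetical values, for illustration only.
frame_url = "https://results.example.org/portal/index.php"

print(resolve_link(frame_url, "/aug2020/form.php"))           # root-relative href
print(resolve_link(frame_url, "aug2020/form.php"))            # document-relative href
print(resolve_link(frame_url, "http://other.example.org/x"))  # already absolute href
```

In the example above, this would mean calling `resolve_link(soup.iframe['src'], href)` with the iframe URL read from the outer page, instead of concatenating `base_url` and the `href`.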
Answered By - Andrej Kesely