Issue
I'm having trouble getting the HTML content of a web page.
On this site: https://tmofans.com/library/manga/5763/nisekoi, clicking the play icon for, e.g., "Capitulo 230.00" opens the link https://tmofans.com/goto/347231, which redirects to https://tmofans.com/viewer/5c187dcea0240/paginated
The problem is that opening https://tmofans.com/goto/347231 directly returns a 403 Forbidden page. The only way to be redirected to the final page is to click the play button on the first page.
I want to get the content of the final URL using only the tmofans.com/goto link.
I tried to get the HTML content using requests and BeautifulSoup:
import requests
from bs4 import BeautifulSoup
# Request the goto link directly, with no extra headers
response = requests.get("https://tmofans.com/goto/347231")
page = str(BeautifulSoup(response.content, "html.parser"))
print(page)
When I do this with https://tmofans.com/goto/347231, I only get the content of the 403 Forbidden page.
Solution
This website checks whether the request carries a Referer header pointing to their own site, and returns a 403 response otherwise. You can bypass this by setting the Referer header yourself.
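import requests
# Pretend the request originates from a page on the same site
ref = 'https://tmofans.com'
headers = {'Referer': ref}
r = requests.get('https://tmofans.com/goto/347231', headers=headers)
print(r.url)          # final URL after the redirect is followed
print(r.status_code)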
Output
https://tmofans.com/viewer/5c187dcea0240/paginated
200
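If several goto links need resolving, the same idea can be wrapped in a small helper. The following is only a sketch under assumptions: the function names `referer_for` and `resolve_goto` are made up for illustration, and it assumes the site accepts its own scheme-and-host origin as the Referer, as in the answer above. The site may change its checks at any time.

```python
from urllib.parse import urlsplit


def referer_for(url):
    """Return the scheme-and-host origin of url, for use as a Referer value."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def resolve_goto(url, timeout=10):
    """Follow a goto link with a same-site Referer and return the final URL.

    Hypothetical helper: assumes a same-site Referer is all the server checks.
    """
    import requests  # third-party; imported here so referer_for works without it

    with requests.Session() as session:
        # Session headers are sent on every request, including followed redirects
        session.headers["Referer"] = referer_for(url)
        response = session.get(url, timeout=timeout)
        response.raise_for_status()  # raises HTTPError if the site still answers 403
        return response.url


print(referer_for("https://tmofans.com/goto/347231"))  # https://tmofans.com
```

Calling `resolve_goto("https://tmofans.com/goto/347231")` would then perform the request from the answer and return the viewer URL, provided the site still behaves as described.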
Answered By - Bitto Bennichan