Thursday, December 2, 2021

[FIXED] Scrape an Ajax form with .submit() with Python and Selenium

December 02, 2021 beautifulsoup, python, scrapy, selenium-webdriver, web-scraping No comments

Issue

I am trying to get the link from a web page. The web page sends the request using javascript, then the server sends a response which goes directly to download a PDF. This new PDF is automatically downloaded into your browser. My first approach was to use selenium to get the information:

# Path chromedriver & get url
path = "/Users/my_user/Desktop/chromedriver"
browser = webdriver.Chrome(path)
browser.get("https://www.holzwickede.de/amtsblatt/index.php")

# Banner click
ban = WebDriverWait(browser,15).until(EC.element_to_be_clickable((By.XPATH,"//a[@id='cc_btn_accept_all']"))).click()

#Element to get
elem = browser.find_element_by_xpath("//div[@id='content']/div[7]/table//form[@name='gazette_52430']/a[@href='#gazette_52430']")
elem.click()
print (browser.current_url)

The result was the current URL which corresponds to the same webpage, while the request is directly to the server.

https://www.holzwickede.de/amtsblatt/index.php#gazette_52430

I tried after this unsuccessful result to grab it with requests.

 # Access requests via the `requests` attribute
 for request in browser.requests: #It captures all the requessin chronologica order
     if request.response.headers:
         print(
             request.path,
             request.response.status_code,
             request.response.headers,
            request.body,
            "/n"

        )

The result stills not the behind link from which the PDF is coming. Do you guys have an idea what can I do ? Thanks in advance.

Solution

I found the answer. The request sends a POST form. Therefore, we have to extract the header contents and their parameters. When you know the parameters the form sends, you can use the request to get back the link to your console.

response = requests.get(url, params={'key1': 'value1', 'key2': 'value2'})
print (response.url)

This question solves additionally this question: Capture AJAX response with selenium python

Cheers!

Answered By - Geomario

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Thursday, December 2, 2021

[FIXED] Scrape an Ajax form with .submit() with Python and Selenium

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels