Issue
I am able to read a CSV file with an associated URL directly into Python using the following:
import pandas as pd
URL = 'http://samplecsvs.s3.amazonaws.com/SalesJan2009.csv'
pd.read_csv(URL)
However, I would similarly like to read CSV files created by an HTML5 Export-Button directly into Python3 (as opposed to downloading the file locally and uploading it to Python).
For example, I would like the CSV file created by clicking the 'CSV' button on this webpage to be read into Python directly as a Pandas DataFrame: https://datatables.net/extensions/buttons/examples/html5/simple.html
I have tried doing this using a combination of Selenium, BeautifulSoup and Pandas in Python3 but haven't been successful.
Solution
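Clicking the button only runs JavaScript that generates the file in the browser, so there is no URL to pass to pd.read_csv. The data itself is already in the page as an HTML <table>, though, so you can load it with pandas.read_html, which returns a list of DataFrames, one per table it finds (it needs an HTML parser such as lxml installed).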
For example:
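from io import StringIO
import pandas as pd
import requests
url = 'https://datatables.net/extensions/buttons/examples/html5/simple.html'
response = requests.get(url)
response.raise_for_status()  # stop early on an HTTP error
# Wrap the HTML in StringIO: pandas 2.1+ deprecates passing a literal HTML string
df = pd.read_html(StringIO(response.text))[0]  # first <table> on the page
print(df)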
Prints:
                 Name                     Position         Office  Age  Start date    Salary
0         Tiger Nixon             System Architect      Edinburgh   61  2011/04/25  $320,800
1     Garrett Winters                   Accountant          Tokyo   63  2011/07/25  $170,750
2          Ashton Cox      Junior Technical Author  San Francisco   66  2009/01/12   $86,000
3        Cedric Kelly  Senior Javascript Developer      Edinburgh   22  2012/03/29  $433,060
4          Airi Satou                   Accountant          Tokyo   33  2008/11/28  $162,700
5  Brielle Williamson       Integration Specialist       New York   61  2012/12/02  $372,000
6     Herrod Chandler              Sales Assistant  San Francisco   59  2012/08/06  $137,500
7      Rhona Davidson       Integration Specialist          Tokyo   55  2010/10/14  $327,900
8       Colleen Hurst         Javascript Developer  San Francisco   39  2009/09/15  $205,500
9         Sonya Frost            Software Engineer      Edinburgh   23  2008/12/13  $103,600
... and so on.
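As the output shows, Salary and Start date come back as plain strings. If you need numbers and dates, something along these lines should work. This is only a sketch: the small DataFrame below is a hand-built stand-in for the scraped one (two rows copied from the output above), and it assumes every salary looks like "$320,800" and every date like "2011/04/25".

```python
import pandas as pd

# Hand-built stand-in for the scraped frame (an assumption for illustration);
# the two rows are copied from the printed output above.
df = pd.DataFrame(
    {
        "Name": ["Tiger Nixon", "Ashton Cox"],
        "Start date": ["2011/04/25", "2009/01/12"],
        "Salary": ["$320,800", "$86,000"],
    }
)

# Drop the currency symbol and thousands separators, then convert to integers.
df["Salary"] = df["Salary"].str.replace(r"[$,]", "", regex=True).astype(int)

# Parse the YYYY/MM/DD strings into real timestamps.
df["Start date"] = pd.to_datetime(df["Start date"], format="%Y/%m/%d")

print(df.dtypes)
```

The same two conversions can be applied unchanged to the df returned by pd.read_html, provided the columns keep those names and formats.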
Answered By - Andrej Kesely