Issue
I am trying to scrape the table data from this URL: https://covid19criticalcare.com/pharmacies/
On my previous scrape I used the following Python packages: bs4 (BeautifulSoup), requests, mysql.connector, pandas, and sqlalchemy (create_engine).
However, this URL's HTML doesn't contain the table data; instead, the page seems to load the data from an external database.
Could someone please point me in the right direction for scraping table data from a page with this sort of HTML setup using a Python script?
I tried a blind scrape, using the same method as on my previous scrape:
from bs4 import BeautifulSoup
import requests
import mysql.connector
import pandas as pd
from sqlalchemy import create_engine

url = "https://covid19criticalcare.com/pharmacies/"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
result = requests.get(url, headers=headers)
doc = BeautifulSoup(result.text, "html.parser")

# Collect the text of every cell in the first table column
name = doc.find_all("td", class_="column-1")
td_names = []
for td in name:
    names = td.text
    td_names.append(names)
print(td_names)
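A quick way to confirm that the rows really are missing from the raw response is to count the matching cells in the HTML text itself. The following is only a minimal sketch using the standard library; the names `CellCounter` and `count_cells` and the two sample HTML strings are made up for illustration and do not come from the site.

```python
from html.parser import HTMLParser


class CellCounter(HTMLParser):
    """Count <td> start tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "td":
            return
        # The class attribute may hold several space-separated class names
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_cells(html, css_class="column-1"):
    """Return how many <td class="..."> cells appear in an HTML string."""
    parser = CellCounter(css_class)
    parser.feed(html)
    return parser.count


# Hypothetical samples: a table rendered on the server vs. an empty shell
static_html = "<table><tr><td class='column-1'>Example Pharmacy</td></tr></table>"
empty_shell = "<table><tbody></tbody></table>"

print(count_cells(static_html))
print(count_cells(empty_shell))
```

Passing `result.text` from the request above to `count_cells()` and getting 0 would suggest that the rows are added by JavaScript after the page loads, in which case a parser working on the raw response has nothing to find.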
Solution
As an alternative to @Naphat Theerawat's answer: I noticed that you started with a selenium-based solution, and you can reach your goal much more easily by combining selenium with pandas.
- Load the website and extract the table from driver.page_source with pd.read_html().
- To avoid iterating over each page, just select Show All entries.
Example
from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

url = 'https://covid19criticalcare.com/pharmacies/'

# Selenium 4 takes the driver path wrapped in a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
driver.get(url)

wait = WebDriverWait(driver, 5)
# Switch the "Show ... entries" dropdown to All (value -1) so every row is on one page
select = Select(wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '[name="DataTables_Table_0_length"]'))))
select.select_by_value('-1')
# Wait until the "next" pagination button is disabled, i.e. there are no further pages
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'a.paginate_button.next.disabled')))

# Recent pandas versions deprecate passing a raw HTML string, so wrap it in StringIO
df = pd.read_html(StringIO(driver.page_source), displayed_only=False)[1]
driver.quit()
df
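A note on the `[1]` index used with pd.read_html(): the function returns a list with one DataFrame per table found in the HTML, so the position of the wanted table depends on the page. As a rough sketch of an alternative, the `match` parameter keeps only tables that contain a given text. The HTML below is a made-up stand-in for `driver.page_source`, and a parser backend such as lxml is assumed to be installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical page source with two tables; only the second holds the data
html = """
<table>
  <thead><tr><th>Layout</th></tr></thead>
  <tbody><tr><td>placeholder</td></tr></tbody>
</table>
<table>
  <thead><tr><th>Pharmacy Name</th><th>Website</th></tr></thead>
  <tbody><tr><td>Example Pharmacy</td><td>example.com</td></tr></tbody>
</table>
"""

# Every <table> becomes one DataFrame, in document order
tables = pd.read_html(StringIO(html))
print(len(tables))

# Keep only the tables whose text matches the given string or regex
matched = pd.read_html(StringIO(html), match="Pharmacy Name")
df_match = matched[0]
print(df_match)
```

Selecting by text is less fragile than a fixed index if the page gains or loses a table, though it still assumes the header text stays the same.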
Output
Pharmacy Name | Phone | Website | Requires prescription? | Pharmacy Address | Based in the United States? | Overnight shipping to the United States? | Overnight International shipping? | Ships to the following States/Provinces | |
---|---|---|---|---|---|---|---|---|---|
0 Covid Pharmacy | [email protected] | (785) 672 9222 | 0covidpharmacy.com | NO | 245 Krishna Market Channi RoadNagpur, Maharashtra 440001India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyomingGuamPuerto RicoVirgin IslandsArmed Forces AmericasArmed Forces EuropeArmed Forces Pacific |
1 Ivermectin Service | [email protected] | (888) 290 0964 (US), +91 22509 72606 (IN) | 1ivermectin.com | NO | 1/16, First Floor, Tardeo Air Conditioned Market Building, TardeoMumbai, Tardeo 400034India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyomingPuerto RicoVirgin Islands |
1 Life Pharmacy | [email protected] | (888) 560-0430 (US); +91 (807 ) 127-9990 (India) | 1lifepharmacy.net | NO | 302, Pride Plaza, Rajkot, 360002Rajkot, Gujarat 360002; 84118India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyoming |
1-2-3 RX Global Pharmacy | [email protected] | (516) 758-2630 | 123rx.net | NO | 2967 Dundas St. W.Toronto, Ontario M6P 1Z2Canada | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyoming |
12 Angel Pharmacy Store | [email protected] | (908) 866-4260 | 12angel.store | NO | 1050 Bharat Diamond BourseBandra Kurla ComplexMumbai, Maharashtra 400051India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyomingGuamPuerto RicoVirgin IslandsArmed Forces AmericasArmed Forces EuropeArmed Forces Pacific |
24 x 7 Pharma | [email protected] | (851) 127-5721 | 24x7pharma.com | NO | Mahek IconSumul Diary Road, KatargamSurat, Gujarat 395003India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyomingGuamPuerto RicoVirgin IslandsArmed Forces AmericasArmed Forces EuropeArmed Forces Pacific |
...
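In the output above, the last column arrives as one run-together string (AlabamaAlaskaArizona...), presumably because the line breaks inside each cell are lost during parsing. If separate names are needed, one possible sketch is to split the string against a list of known names. The helper name `split_names` and the shortened `KNOWN` list are assumptions for illustration; a real run would need the full list of states and territories.

```python
def split_names(joined, known_names):
    """Split a run-together string into the known names it is made of."""
    # Try longer names first so a long name wins over a shorter one
    # that starts the same way
    ordered = sorted(known_names, key=len, reverse=True)
    result = []
    pos = 0
    while pos < len(joined):
        for name in ordered:
            if joined.startswith(name, pos):
                result.append(name)
                pos += len(name)
                break
        else:
            # No known name starts here, so stop instead of guessing
            snippet = joined[pos:pos + 20]
            raise ValueError(f"unrecognized text at position {pos}: {snippet!r}")
    return result


# Shortened sample list; names such as Kansas/Arkansas and
# Virginia/West Virginia/Virgin Islands are the tricky cases
KNOWN = ["Alabama", "Alaska", "Arizona", "Arkansas", "Kansas",
         "Virginia", "West Virginia", "Virgin Islands"]

sample = "AlabamaAlaskaArkansasKansasVirginiaWest VirginiaVirgin Islands"
parts = split_names(sample, KNOWN)
print(parts)
```

This greedy left-to-right matching works for the names shown here, but it is not a general solution: a list in which one name is a prefix of two adjacent names joined together could be split incorrectly.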
Answered By - HedgeHog