Issue
I'm trying to scrape https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements to get all the values in the table. I'm using Selenium, but it only returns the first 6 values of each row; the rest seem to be hidden somehow.
Code:
!pip install selenium
import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
import requests
driver = webdriver.Firefox()
url = 'https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements'
hdr = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
"X-Requested-With": "XMLHttpRequest"}
driver.get(url)
a = driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text
print(a)
Output:
Revenue
$365.817
$274.515
$260.174
$265.595
$229.234
$215.639
Cost Of Goods Sold
$212.981
$169.559
$161.782
$163.756
$141.048
$131.376
Gross Profit
$152.836
$104.956
$98.392
$101.839
$88.186
$84.263
Research And Development Expenses
$21.914
$18.752
$16.217
$14.236
$11.581
$10.045
SG&A Expenses
$21.973
$19.916
$18.245
$16.705
$15.261
$14.194
Other Operating Income Or Expenses
-
-
-
-
-
-
Operating Expenses
$256.868
$208.227
$196.244
$194.697
$167.890
$155.615
Operating Income
$108.949
$66.288
$63.930
$70.898
$61.344
$60.024
Total Non-Operating Income/Expense
$258
$803
$1.807
$2.005
$2.745
$1.348
Pre-Tax Income
$109.207
$67.091
$65.737
$72.903
$64.089
$61.372
Income Taxes
$14.527
$9.680
$10.481
$13.372
$15.738
$15.685
Income After Taxes
$94.680
$57.411
$55.256
$59.531
$48.351
$45.687
Other Income
-
-
-
-
-
-
Income From Continuous Operations
$94.680
$57.411
$55.256
$59.531
$48.351
$45.687
Income From Discontinued Operations
-
-
-
-
-
-
Net Income
$94.680
$57.411
$55.256
$59.531
$48.351
$45.687
EBITDA
$120.233
$77.344
$76.477
$81.801
$71.501
$70.529
EBIT
$108.949
$66.288
$63.930
$70.898
$61.344
$60.024
Basic Shares Outstanding
16.701
17.352
18.471
19.822
20.869
21.883
Shares Outstanding
16.865
17.528
18.596
20.000
21.007
22.001
Basic EPS
$5.67
$3.31
$2.99
$3.00
$2.32
$2.09
EPS - Earnings Per Share
$5.61
$3.28
$2.97
$2.98
$2.30
$2.08
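The printed text is flat: each row label is followed by its values, one per line. Below is a minimal sketch of regrouping that text into rows. It assumes a value cell is either a lone dash or a number with an optional dollar sign, and that every other line is a row label. The name parse_grid_text is hypothetical and not part of Selenium.

```python
import re

# Assumption: a value cell is a lone "-" or an optionally signed,
# optionally dollar-prefixed number such as "$365.817" or "16.701".
VALUE_RE = re.compile(r"-|-?\$?-?\d[\d.,]*")


def parse_grid_text(text):
    """Group flat grid text into {row label: [values...]}."""
    rows = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if VALUE_RE.fullmatch(line):
            # A value belongs to the most recent label seen.
            if current is not None:
                rows[current].append(line)
        else:
            # Anything that is not a value starts a new row.
            current = line
            rows.setdefault(current, [])
    return rows


# Short excerpt of the output shown above.
sample = (
    "Revenue\n$365.817\n$274.515\n"
    "Other Operating Income Or Expenses\n-\n-\n"
    "EPS - Earnings Per Share\n$5.61\n$3.28"
)
rows = parse_grid_text(sample)
print(rows["Revenue"])
```

A label such as "EPS - Earnings Per Share" contains a dash but is not treated as a value, because the pattern has to match the whole line.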
When I try to fetch any of the missing values individually, I get the error "Message: Unable to locate element".
Example code for one of the missing values:
!pip install selenium
import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
import requests
driver = webdriver.Firefox()
url = 'https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements'
hdr = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
"X-Requested-With": "XMLHttpRequest"}
driver.get(url)
b = driver.find_element(By.XPATH, "/html/body/div[2]/div[3]/div[4]/div/div/div[4]/div[2]/div/div[1]/div[9]/div").text
print(b)
Error:
NoSuchElementException: Message: Unable to locate element: /html/body/div[2]/div[3]/div[4]/div/div/div[4]/div[2]/div/div[1]/div[9]/div
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:181:5
NoSuchElementError@chrome://remote/content/shared/webdriver/Errors.jsm:393:5
element.find/</<@chrome://remote/content/marionette/element.js:299:16
I get the same error whether I use XPath, CSS selectors, or anything else; it doesn't seem to matter.
Thanks in advance! It's my first question here, so sorry if I missed something.
Edit:
I managed a partial solution after the comment by @PApostol. The problem is that the data is not visible in the initial layout, so I enlarged the window and made the grid scroll to the right. After scrolling it misses the first values, so my temporary solution is to concatenate the text captured before and after scrolling. Here's my code now:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Firefox()
url = 'https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements'
driver.get(url)
driver.set_window_size(2000, 2000)
a = driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text
arrow = driver.find_element(By.CSS_SELECTOR, ".jqx-icon-arrow-right")
webdriver.ActionChains(driver).click_and_hold(arrow).perform()
time.sleep(4)
b = driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text
print(a, b)
Solution
I managed a solution after the comment by @PApostol. The problem is that the data is not visible in the initial layout, so I enlarged the window and made the grid scroll to the right. With this approach the first values are missing after scrolling, so it's necessary to concatenate the text captured before and after scrolling. Here's the code:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
url = 'https://www.macrotrends.net/stocks/charts/AAPL/apple/financial-statements'
driver.get(url)
# Enlarge the window so more columns are visible at once.
driver.set_window_size(2000, 2000)
# Text of the columns visible before scrolling.
a = driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text
# Hold down the grid's right-arrow button to scroll horizontally.
arrow = driver.find_element(By.CSS_SELECTOR, ".jqx-icon-arrow-right")
webdriver.ActionChains(driver).click_and_hold(arrow).perform()
# Give the grid time to scroll and re-render.
time.sleep(4)
# Text of the columns visible after scrolling.
b = driver.find_element(By.CSS_SELECTOR, "#contenttablejqxgrid").text
print(a, b)
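Concatenating the two captures can duplicate any columns that are visible both before and after scrolling. Below is a rough sketch of one way to merge them. It assumes each capture has already been turned into a dict mapping a row label to its list of values. The names overlap_length and merge_rows are hypothetical, and the sample values are illustrative only, not real post-scroll output. The overlap is measured once on a reference row with distinct values, because rows made only of "-" cannot be aligned by value.

```python
def overlap_length(left, right):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge_rows(before, after, reference_label):
    """Append the post-scroll values to each row, skipping shared columns."""
    # Measure the column overlap on one row whose values are distinct.
    k = overlap_length(before[reference_label], after[reference_label])
    return {
        label: values + after.get(label, [])[k:]
        for label, values in before.items()
    }


# Illustrative data: one column is visible in both captures.
before = {
    "Revenue": ["$365.817", "$274.515", "$260.174"],
    "Other Income": ["-", "-", "-"],
}
after = {
    "Revenue": ["$260.174", "$265.595", "$229.234"],
    "Other Income": ["-", "-", "-"],
}
merged = merge_rows(before, after, reference_label="Revenue")
print(merged["Revenue"])
```

If the two captures share no columns, the overlap is 0 and the rows are simply concatenated.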
Thanks!
Answered By - Capuccino