Issue
I'm trying to get this value from this web page from the "Economia" section:
I want to get all those titles. This is my current code:
html = client.get("http://larepublica.pe/")
soup = BeautifulSoup(html.text, 'html.parser')
# Obtener la noticia de portada principal
economyNews = ""
for div in soup.findAll('h2', attrs={'class':'ItemSection_itemSection__title__PleA9'}):
n = div.text
economyNews += n+"\\n"
print(economyNews )
I have tested many ways to get this, but seems that the webpage is locking this. Any idea to fix this problem guys I will appreciate it. Thanks so much.
Solution
You can try:
import requests
from bs4 import BeautifulSoup
url = "https://larepublica.pe/"
headers = {
"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0"
}
soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")
for h2 in soup.select("div:has(*:-soup-contains(Economía)) + div h2"):
print(h2.text)
Prints:
Banco Mundial: tasas de interés se mantendrán altas por más tiempo
Precio del dólar cierra al alza y se ubica en S/3,831 este miércoles 11 de octubre
Retiro AFP: ¿cuándo fue la última vez que se autorizó la liberación de fondos y cuánto se devolvió?
Debate sobre RETIRO AFP 2023: SBS y Congreso deliberaron sobre posible medida
Answered By - Andrej Kesely
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.