Saturday, December 4, 2021

[FIXED] BeautifulSoup Python NoneType object has no attribute 'text'

Labels: beautifulsoup, python, selenium

Issue

I'm trying to scrape a JavaScript-loaded website, https://e-consulta.sunat.gob.pe/cl-at-ittipcam/tcS01Alias, using Selenium and Beautiful Soup 4.

However, when I try to retrieve an element or sub-item (a sub-branch) from the tree, I get this error:

bloquefecha=bloque.find('div[@class="date"]').text

AttributeError: 'NoneType' object has no attribute 'text'

I'm attaching a snapshot of my code and the developer console for illustrative purposes.

Here is my code:

import time

from bs4 import BeautifulSoup
from selenium import webdriver


def beautifulseleniumsunat2():
    navegador = webdriver.Chrome()
    navegador.get("https://e-consulta.sunat.gob.pe/cl-at-ittipcam/tcS01Alias")
    time.sleep(7)  # wait 7 seconds for the page to load
    pagsunat = navegador.page_source
    soup = BeautifulSoup(pagsunat, "html.parser")
    print(soup.prettify())

    bloquesdias2 = soup.select('td[class*="table-bordered calendar-day current"]')
    listafecha = []
    listacompra = []
    listaventa = []
    for bloque in bloquesdias2:
        # For each find() below I ALSO tried findall and iterating over the
        # result with a for loop, but the error says it is not iterable.
        bloquefecha = bloque.find('div[@class="date"]')
        listafecha.append(bloquefecha.text)  # AttributeError is raised here
        bloquecompra = bloque.find('div[@class="event normal-all-day begin end"]')
        listacompra.append(bloquecompra.text)
        bloqueventa = bloque.find('div[@class="event pap-all-day begin end"]')
        listaventa.append(bloqueventa.text)

    listafinal = [listacompra, listaventa, listafecha]
    print(listafinal)

Solution

What happens?

As mentioned by Aziz Sonawalla, you have to pass the class as a separate argument to find(), but that won't fix all your issues: if an element is not available, the same error is raised again, e.g. when a day has no compra / venta entry.
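To illustrate the point about find(), here is a minimal sketch on a hand-written HTML fragment. The markup is an assumption made up for illustration, not the real SUNAT page. find() treats its first argument as a tag name, so an XPath-style string matches nothing and None comes back; passing the class separately with class_, or using a CSS selector with select_one(), does match.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment imitating one calendar cell; not the real page markup.
html = '<td class="calendar-day current"><div class="date">1</div></td>'
cell = BeautifulSoup(html, "html.parser").td

# The XPath-like string is interpreted as a tag name, so nothing matches.
wrong = cell.find('div[@class="date"]')
print(wrong)  # None

# Tag name and class passed as separate arguments.
by_find = cell.find("div", class_="date")
# The same lookup expressed as a CSS selector.
by_css = cell.select_one("div.date")
print(by_find.text, by_css.text)  # 1 1
```

Calling .text on the first result is what produces the AttributeError from the question.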

How to fix that?

You have to catch the error: the try block gives you the result if there is no error, and the except block sets the result to an empty string.

try:
    bloquecompra = day.select_one('div[class*="normal-all-day"]').get_text().split()[1]
except (AttributeError, IndexError):  # element missing or no second token
    bloquecompra = ''
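The same idea can also be written with an explicit None check instead of exception handling. The helper below, second_token, is a hypothetical name introduced for illustration; it assumes, as split()[1] in the answer does, that the rate is the second whitespace-separated token of the element's text. The HTML is invented sample markup, not the real page.

```python
from bs4 import BeautifulSoup


def second_token(parent, selector):
    """Return the second whitespace-separated token of the first match, or ''."""
    tag = parent.select_one(selector)
    if tag is None:  # element missing, e.g. a day without a rate
        return ""
    parts = tag.get_text().split()
    return parts[1] if len(parts) > 1 else ""


# Hypothetical markup for illustration only.
html = (
    "<table><tr>"
    '<td class="current">'
    '<div class="event normal-all-day begin end">Compra 3.618</div>'
    "</td>"
    '<td class="current"></td>'
    "</tr></table>"
)
soup = BeautifulSoup(html, "html.parser")
cells = soup.select('td[class*="current"]')
values = [second_token(cell, 'div[class*="normal-all-day"]') for cell in cells]
print(values)  # ['3.618', '']
```

Whether the check or the try/except reads better is a matter of taste; both leave an empty string for days without an entry.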

Example

You can replace all your code after print(soup.prettify()) with the following:

data = []

for day in soup.select('table.calendar-table.table.table-condensed > tbody td[class*="current"]'):
    bloquefecha = day.select_one('div.date').get_text()
    try:
        bloquecompra = day.select_one('div[class*="normal-all-day"]').get_text().split()[1]
    except (AttributeError, IndexError):  # no compra entry for this day
        bloquecompra = ''

    try:
        bloqueventa = day.select_one('div[class*="pap-all-day"]').get_text().split()[1]
    except (AttributeError, IndexError):  # no venta entry for this day
        bloqueventa = ''

    data.append(';'.join([bloquefecha, bloquecompra, bloqueventa]))
data

Output

['1;3.618;3.624',
 '2;;',
 '3;;',
 '4;;',
 '5;3.624;3.628',
 '6;3.627;3.631',
 '7;3.625;3.630',
 '8;3.620;3.623',
 '9;3.610;3.615',
 '10;;',
 '11;;',
 '12;3.615;3.618',
 '13;3.606;3.608',
 '14;3.610;3.615',
 '15;3.610;3.613',
 '16;3.610;3.614',
 '17;;',
 '18;;',
 '19;3.609;3.617',
 '20;3.611;3.615',
 '21;3.612;3.615',
 '22;3.618;3.622',
 '23;;',
 '24;;',
 '25;;',
 '26;;',
 '27;;',
 '28;;',
 '29;;',
 '30;;',
 '31;;']
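Because each entry is a plain 'day;compra;venta' string, it can be split back into fields for further processing. A minimal sketch, using three rows taken from the output above; the function name parse_rows and the dictionary keys are assumptions chosen for illustration.

```python
def parse_rows(rows):
    """Split 'day;compra;venta' strings, skipping days without rates."""
    parsed = []
    for row in rows:
        day, compra, venta = row.split(";")
        if compra and venta:  # both fields are empty on days without a rate
            parsed.append(
                {"day": int(day), "compra": float(compra), "venta": float(venta)}
            )
    return parsed


sample = ['1;3.618;3.624', '2;;', '5;3.624;3.628']
result = parse_rows(sample)
print(result)
# [{'day': 1, 'compra': 3.618, 'venta': 3.624}, {'day': 5, 'compra': 3.624, 'venta': 3.628}]
```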

Answered By - HedgeHog

This answer was collected from Stack Overflow and tested by PythonFixing community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
