Issue
The following is an HTML excerpt:
<div class="content cardlisting small">
<div class="collectionbox">
<div class="leftcontent">
<div>Collection Status</div>
<div>You must be logged in to track your collection</div>
</div>
<div class="checkcontrols">
<button class="checkall" title="Add all cards to your collection" onclick="collectionManager.toggleAllCheckboxes( '320', true );">Check All</button>
<button class="checknone" title="Remove all cards from your collection" onclick="collectionManager.toggleAllCheckboxes( '320', false );">Check None</button>
</div>
</div>
<div class="card ">
<span class="checkbox" id="checkbox39007" title="Toggle Card in Collection" data-cardid="39007" onclick="collectionManager.toggleCheckbox( this );"></span>
<span class="zoom" title="quick view card" onclick="siteOverlay.show('/ajax/views/card-overlay?cardid=39007');"></span>
<a href="" name="" title="">
<img class="card lazyloaded" data-src="" src="">
</a>
<div class="plaque">#1 - Weedle</div>
</div>
I am beginning to learn scraping with Python and have put together the following script:
import requests
from bs4 import BeautifulSoup
# Collect the page
page = requests.get('www.somesite.com') #not a real site
print(page)
soup = BeautifulSoup(page.text, 'html.parser')
cards = soup.find(class_="content cardlisting small")
cards_list = cards.find_all(class_='plaque')
print(len(cards_list))
for cards in cards_list:
    cards_name = cards.find(class_='plaque')
    print(cards_name)
The script finds all the plaque elements and stores them in the cards_list variable; it also prints the length of the list. The problem is in the loop. I tried appending .text to the cards.find(class_='plaque') line, but I get an error saying there is no such attribute. What I want is to extract the text inside the HTML element. In this case it should return #1 - Weedle. As the code stands, the value returned is None for each element in the list. What am I missing?
Solution
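Each item in cards_list is already a plaque div, so calling find(class_='plaque') on it searches only its descendants, finds nothing, and returns None (which is why .text then fails). Loop over the plaque elements directly instead. Try this: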
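from bs4 import BeautifulSoup

# Parse a local copy of the page saved as 1.html
with open("1.html", encoding="utf8") as f:
    soup = BeautifulSoup(f, "html.parser")

# Each match is already the plaque <div>, so no further find() is needed
for cls in soup.find_all("div", {"class": "plaque"}):
    print(cls)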
Output:
<div class="plaque">#1 - Weedle</div>
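To get just the text (#1 - Weedle) rather than the whole tag, one option is to call get_text() on each match. The following is only a minimal sketch: the inline html string is a trimmed stand-in for the page in the question, and plaque_texts is a hypothetical helper name, not something from the original code.

```python
from bs4 import BeautifulSoup

# Assumption: a trimmed, inline stand-in for the page in the question.
html = """
<div class="content cardlisting small">
  <div class="card ">
    <div class="plaque">#1 - Weedle</div>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def plaque_texts(soup):
    """Hypothetical helper: return the text of every element with class 'plaque'."""
    return [tag.get_text(strip=True) for tag in soup.find_all(class_="plaque")]


# Why the original loop printed None: a plaque div contains no nested
# element with class 'plaque', so find() on it has nothing to return.
first = soup.find(class_="plaque")
nested = first.find(class_="plaque")

names = plaque_texts(soup)
for name in names:
    print(name)  # prints: #1 - Weedle
```

Using get_text(strip=True) instead of .text also trims surrounding whitespace, which may help if the real page puts the label on its own line.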
Answered By - user1740577