Issue
I'm trying to use BeautifulSoup to extract the text for the items where the label has class="checkbox" and the input has data-filter-name="topics".
The end result should print out the 13 topics located here: https://www.pythondiscord.com/resources/
Topics should be:
- Algorithms and Data Structures
- Data Science
- Databases
- Discord Bots
- Game Development
- General
- Microcontrollers
- Software Design
- Testing
- Tooling
- User Interface
- Web Development
- Other
I've gotten this far, but I'm unsure how to combine the label and input in soup.find_all() so as to target only the 13 above. The input I'm targeting below has no text to output, so I'm pretty sure I need some conditional logic and/or a function to accomplish this.
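import requests
from bs4 import BeautifulSoup

URL = 'https://www.pythondiscord.com/resources/'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
topics = soup.find_all('input', {"data-filter-name": "topics"})
for i in range(0, len(topics)):
    print(topics[i].text)  # prints empty strings: an <input> tag has no text of its own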
And here is a screenshot of the HTML in question:
Solution
There are a couple of ways to do that, but you should probably try using CSS selectors:
targets = soup.select('label.checkbox input[data-filter-name="topics"]')
for target in targets:
    print(target['data-filter-item'])
Output:
algorithms-and-data-structures
data-science
databases
discord-bots
game-development
general
microcontrollers
software-design
testing
tooling
user-interface
web-development
other
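The selector above prints the data-filter-item slugs rather than the display names listed in the question. If the human-readable text is wanted, one option is to walk from each matched input up to its enclosing label and read the label's text. The sketch below is only an illustration: SAMPLE_HTML and the helper name topic_labels are made up for this example, and the markup is an assumption modeled on the selector (an input nested inside label.checkbox), so the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the selector used above.
# The real page's HTML may differ; adjust the lookup if it does.
SAMPLE_HTML = """
<div>
  <label class="checkbox">
    <input type="checkbox" data-filter-name="topics" data-filter-item="data-science">
    Data Science
  </label>
  <label class="checkbox">
    <input type="checkbox" data-filter-name="topics" data-filter-item="discord-bots">
    Discord Bots
  </label>
  <label class="checkbox">
    <input type="checkbox" data-filter-name="type" data-filter-item="book">
    Book
  </label>
</div>
"""


def topic_labels(html):
    """Return the label text for every 'topics' checkbox in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    inputs = soup.select('label.checkbox input[data-filter-name="topics"]')
    # The <input> itself has no text, so read it from the enclosing <label>.
    return [inp.find_parent("label").get_text(strip=True) for inp in inputs]


labels = topic_labels(SAMPLE_HTML)
for label in labels:
    print(label)
```

With the sample markup, only the two checkboxes whose data-filter-name is "topics" are matched; the "type" checkbox is skipped. Against the live page you would pass page.content instead of SAMPLE_HTML.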
Answered By - Jack Fleeting