Issue
I'm using Python 3 and trying to parse NWS weather alerts, which appear to contain JSON objects, with Beautiful Soup. This is as far as I got; BS outputs this (snippet from the top of the output):
>>> soup.body
<body><p>{
    "@context": [
        "https://geojson.org/geojson-ld/geojson-context.jsonld",
        {
            "@version": "1.1",
            "wx": "https://api.weather.gov/ontology#",
            "@vocab": "https://api.weather.gov/ontology#"
        }
    ],
    "type": "FeatureCollection",
    "features": [
        {
            "id": "https://api.weather.gov/alerts/urn:oid:2.49.0.1.840.0.957a95b11de1ec54b622b137ccf43a662d44061f.001.1",
            "type": "Feature",
            "geometry": null,
            "properties": ....(snip)
From what I understand the "@context" tag indicates that the subsequent lines within braces are JSON data; is that correct?
How do I get at the elements inside the square and curly braces?
BS apparently has a JSON parser, but I haven't found any good tutorials on how to use it for someone who's new to this.
Pointers would be most welcome.
Solution
The question would benefit from some additional detail, but as mentioned in the comments, the response does not look like plain HTML; it looks like JSON.
The HTML in your soup is just wrapping added by the 'lxml' parser. You do not need beautifulsoup for this task, and no, it is not a JSON parser. Instead, call .json() on your response (see the docs).
Example
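import requests

# Fetch the feed and decode the JSON body into Python dicts and lists
json_data = requests.get('YOUR URL').json()
# Each entry in 'features' is one alert; print its id
for feature in json_data['features']:
    print(feature['id'])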
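To get at the elements inside the square and curly braces: once the text is decoded, curly braces become Python dicts and square brackets become lists, so you index by key or by position. The sketch below runs offline on a small made-up sample shaped like the snippet in the question; the "id" value and the "properties" contents are assumptions for illustration, not real NWS data. As far as I know, "@context" is a JSON-LD keyword and simply one more key in the object; the whole body is JSON, not just the part after it.

```python
import json

# Hypothetical sample shaped like the snippet in the question; the "id"
# value and the "properties" contents are made up for illustration.
SAMPLE = """
{
  "@context": [
    "https://geojson.org/geojson-ld/geojson-context.jsonld",
    {"@version": "1.1", "wx": "https://api.weather.gov/ontology#"}
  ],
  "type": "FeatureCollection",
  "features": [
    {
      "id": "example-alert-1",
      "type": "Feature",
      "geometry": null,
      "properties": {"event": "Example Event"}
    }
  ]
}
"""

# json.loads turns JSON text into nested Python dicts and lists
data = json.loads(SAMPLE)

# Curly braces decode to dicts: index them by key
doc_type = data["type"]

# Square brackets decode to lists: index by position or loop over them
context_url = data["@context"][0]
context_version = data["@context"][1]["@version"]

# Combine both to reach nested values; JSON null becomes None
first = data["features"][0]
first_id = first["id"]
geometry = first["geometry"]
# .get() returns None instead of raising if the key is absent
event = first["properties"].get("event")

print(doc_type, context_url, context_version, first_id, geometry, event)
```

With a live response, the dict returned by .json() can be navigated the same way as data here.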
Answered By - HedgeHog