Issue
I don't have much experience with web scraping. I can do the basics, so this is a little over my head. The end result I would like is a list of farmers with the markets they sell at. There's a table where you select a farmer and the markets show up below. You can select other things, but I'm only interested in selecting each farmer and saving their markets.
https://www.grownyc.org/greenmarket/ourmarkets
I have tried BS4 and have looked at Selenium, but I don't know where to go next. I have inspected the HTML and know where everything is, but I don't know how to get the results I need. I can get all the option values for farmers and markets separately, but I can't generate anything that shows which values go together.
Solution
The page loads its results through an Ajax API, so you can send the same POST request yourself and pass the farmer's name in the form data. For example:
import json

import pandas as pd
import requests

api_url = "https://www.grownyc.org/greenmarket.php"

payload = {
    "nid": "",
    "address": "",
    "borough": "",
    "farmer": "Angel Family Farm",
    "daysopen": "",
    "ebt": "false",
    "textile": "false",
    "compost": "false",
    "battery": "false",
    "youthmarket": "false",
}

data = requests.post(api_url, data=payload).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

df = pd.DataFrame(data["sites"])
print(df[["name", "timesopen", "street"]])
Prints:
                              name          timesopen                street
0               Corona Greenmarket  8:00 am - 3:00 pm  103-28 Roosevelt Ave
1  4th Ave Sunset Park Greenmarket  8:00 am - 3:00 pm     4th Ave & 59th St
2  7th Ave Sunset Park Greenmarket          8am - 3pm
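To get the full farmer-to-markets list the question asks for, the same request can be repeated once per farmer name. The following is only a minimal sketch, not a tested scraper: it assumes you already have the farmer names (for example, the option values you collected from the page) and that each response has a "sites" list whose entries carry a "name" field, as in the output above. The helper names `build_payload`, `fetch_sites`, and `markets_by_farmer` are made up for illustration.

```python
API_URL = "https://www.grownyc.org/greenmarket.php"


def build_payload(farmer):
    """Return the same form fields as the request above, for one farmer."""
    return {
        "nid": "",
        "address": "",
        "borough": "",
        "farmer": farmer,
        "daysopen": "",
        "ebt": "false",
        "textile": "false",
        "compost": "false",
        "battery": "false",
        "youthmarket": "false",
    }


def fetch_sites(farmer):
    """POST one query and return its list of sites.

    Assumption: the JSON response has a "sites" key, as in the answer above.
    """
    # Imported here so the mapping logic below can be exercised
    # with a stand-in fetch function and no network access.
    import requests

    response = requests.post(API_URL, data=build_payload(farmer), timeout=30)
    response.raise_for_status()
    return response.json().get("sites", [])


def markets_by_farmer(farmers, fetch=fetch_sites):
    """Map each farmer name to the names of the markets returned for it."""
    return {
        farmer: [site["name"] for site in fetch(farmer)]
        for farmer in farmers
    }
```

Calling `markets_by_farmer(["Angel Family Farm"])` would issue one request and return a dict keyed by farmer name. Passing your own `fetch` function lets you add a delay between requests or try the logic against saved responses first.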
Answered By - Andrej Kesely