Issue
I am attempting to scrape a table of data from: https://www.pjm.com/planning/services-requests/interconnection-queues.aspx
I am looking to automate this data pull instead of having to manually download the xls file every day. I looked through their documentation and there is no easy way to automate other than to perform a scrape. Looking at the page source, it looks like this data is stored in an "nggrid" table. Specifically, the data is under .
I wrote a baseline query in Python with BeautifulSoup to see what the initial output would be.
Here is my initial code:
from bs4 import BeautifulSoup
import requests
page_link = 'https://www.pjm.com/planning/services-requests/interconnection-queues.aspx'
page_response = requests.get(page_link, timeout=5)
page_content = BeautifulSoup(page_response.content, "html.parser")
The data stored in page_content does not match what I see in the page source. Where I expect the grid element and its nested data, I instead get the following opening and closing tags with nothing in between:
<pjm-nggrid></pjm-nggrid>
Does anyone know how to access the data in an nggrid?
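One quick way to confirm that the element really is empty in the served HTML, rather than being dropped by the parser, is to collect any text found inside it. The sketch below is only illustrative: it uses the standard library's html.parser instead of BeautifulSoup, and the ElementTextCollector class and text_inside helper are hypothetical names, not part of any library.

```python
from html.parser import HTMLParser


class ElementTextCollector(HTMLParser):
    """Collect the text found inside every occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0    # > 0 while the parser is inside the target tag
        self.chunks = []  # non-blank text fragments seen inside the tag

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def text_inside(html, tag):
    """Return the list of text fragments nested inside `tag`."""
    parser = ElementTextCollector(tag)
    parser.feed(html)
    parser.close()
    return parser.chunks


# The markup returned by requests contains only the empty custom element.
static_html = '<div><pjm-nggrid></pjm-nggrid></div>'
print(text_inside(static_html, 'pjm-nggrid'))  # → []
```

An empty list here suggests the rows are filled in later by a script in the browser, so they have to be fetched from wherever that script gets them.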
Solution
The data is loaded asynchronously via JavaScript, so it is not present in the HTML that requests receives. You can call the underlying API directly instead. You will probably want to change the 'api-subscription-key' value; you can see the key in the Chrome/Firefox developer tools.
NOTE: This downloads the whole dataset (~10 MB). You can change 'rowCount' and 'startRow' to load only part of the data.
import json
import requests
url = 'https://services.pjm.com/PJMPlanningApi/api//Queue/GetFilteredQueues?'
payload = {
    'filters': [],
    'rowCount': 0,
    'startRow': 1,
}
headers = {
    'Origin': 'https://www.pjm.com',
    'api-subscription-key': 'E29477D0-70E0-4825-89B0-43F460BF9AB4',
}
json_data = requests.post(url, headers=headers, json=payload).json()
print(json.dumps(json_data, indent=4))
Prints:
{
    "items": [
        {
            "requestType": "GI",
            "queueNumber": "A01",
            "projectName": null,
            "commercialName": "Ironwood",
            "stateProvinceName": "PA",
            "countyName": "Lebanon",
            "projectStatus": "In Service",
            "transmissionOwner": "ME",
            "mw": 720.0,
            "mwe": 720.0,
            "mwc": 673.0,
            "mweInservice": 673.0,
...and so on.
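As a follow-up sketch (not part of the original answer), the 'rowCount' and 'startRow' parameters could be used to pull the data in pages and save it as CSV for a daily job. Several things here are assumptions: that 'startRow' is 1-based, that a positive 'rowCount' limits the page size, and that a page shorter than the requested size means the data is exhausted. The function names and the page size of 500 are made up for illustration; the URL, headers and "items" key come from the code and output above.

```python
import csv

URL = 'https://services.pjm.com/PJMPlanningApi/api//Queue/GetFilteredQueues?'
HEADERS = {
    'Origin': 'https://www.pjm.com',
    # Key copied from the answer above; replace it with the one shown in
    # your browser's developer tools if requests start failing.
    'api-subscription-key': 'E29477D0-70E0-4825-89B0-43F460BF9AB4',
}


def post_payload(payload):
    """Send one request to the endpoint and return the parsed JSON."""
    import requests  # imported lazily so the helpers below also work offline
    response = requests.post(URL, headers=HEADERS, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()


def iter_items(fetch, page_size=500):
    """Yield queue records page by page.

    `fetch` is any callable that takes a payload dict and returns the
    parsed JSON response, e.g. post_payload above.
    Assumption: 'startRow' is 1-based and a short page ends the data.
    """
    start_row = 1
    while True:
        payload = {'filters': [], 'rowCount': page_size, 'startRow': start_row}
        items = fetch(payload).get('items', [])
        yield from items
        if len(items) < page_size:
            break
        start_row += page_size


def items_to_csv(items, out):
    """Write flat dict records to a file-like object as CSV."""
    items = list(items)
    if not items:
        return
    # The keys of the first record define the column order; keys that only
    # appear in later records are ignored.
    writer = csv.DictWriter(out, fieldnames=list(items[0]),
                            extrasaction='ignore')
    writer.writeheader()
    writer.writerows(items)
```

To run it against the live endpoint, open a file with open('queues.csv', 'w', newline='') and pass it to items_to_csv(iter_items(post_payload), f). If the endpoint turns out to ignore 'startRow', the loop would keep receiving full pages, so it is worth testing with a small page size first.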
Answered By - Andrej Kesely