Issue
I currently have a GET request to a URL that returns three things: a .zip file, a .zipsig file, and a .txt file.
I'm only interested in the .zip file, which contains dozens of .json files. I would like to extract all of these .json files, preferably directly into a single pandas DataFrame, but extracting them into a folder also works.
Code so far, mostly stolen:
import io
import zipfile
import requests

license = requests.get(url, headers={'Authorization': "Api-Token " + 'blah'})
z = zipfile.ZipFile(io.BytesIO(license.content))
# First entry of the downloaded archive
billingRecord = z.namelist()[0]
z.extract(billingRecord, path="C:\\Users\\Me\\Downloads\\Json license")
This extracts the entire .zip file to the path as a single file. I would like to extract the individual .json files inside that .zip file to the path instead.
Solution
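import io
import zipfile
import pandas as pd
import json

# `license` is the requests response from the question
dfs = []
with zipfile.ZipFile(io.BytesIO(license.content)) as zfile:
    for info in zfile.infolist():
        # The outer archive holds the .zip, .zipsig and .txt files; only the .zip matters
        if info.filename.endswith('.zip'):
            # Open the inner .zip in memory without writing it to disk
            zfiledata = io.BytesIO(zfile.read(info.filename))
            with zipfile.ZipFile(zfiledata) as json_zips:
                for json_info in json_zips.infolist():
                    if json_info.filename.endswith('.json'):
                        # Flatten each JSON document into a DataFrame
                        json_data = pd.json_normalize(json.loads(json_zips.read(json_info.filename)))
                        dfs.append(json_data)

# Combine the per-file frames into a single DataFrame
df = pd.concat(dfs, sort=False)
print(df)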
Answered By - Tanuj
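The question also mentions that extracting the .json files into a folder would be acceptable. The following is a minimal sketch of that variant, not part of the original answer. It assumes the same layout as above (an outer archive containing one or more inner .zip files that hold the .json files). The function name `extract_nested_json` and its parameters are hypothetical, introduced only for illustration.

```python
import io
import zipfile
from pathlib import Path


def extract_nested_json(outer_bytes, dest_dir):
    """Write every .json file found inside inner .zip members to dest_dir.

    Hypothetical helper: outer_bytes is the raw content of the outer
    archive, dest_dir is the target folder. Returns the written paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        for outer_name in outer.namelist():
            if not outer_name.endswith(".zip"):
                continue  # skips the .zipsig and .txt members
            # Open the inner archive in memory
            with zipfile.ZipFile(io.BytesIO(outer.read(outer_name))) as inner:
                for inner_name in inner.namelist():
                    if inner_name.endswith(".json"):
                        # Keep only the base name so files land flat in dest;
                        # files with the same base name overwrite each other
                        target = dest / Path(inner_name).name
                        target.write_bytes(inner.read(inner_name))
                        written.append(target)
    return written
```

With the response from the question, this could be called as extract_nested_json(license.content, "C:\\Users\\Me\\Downloads\\Json license"). Using only the base name of each member drops any folder structure inside the inner archive, which also means entries with unusual paths cannot be written outside the target folder.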