Issue
The problem here is different than those defined in the questions here, here, and here. Specifically, the transformation and form of the output I want is different than any specified in those questions, and I also want a DateTime index. These differences cause the answers on those pages fail.
I have data formatted as a json like this:
{
"Meta Data": {
"1. Information": "Daily Prices (open, high, low, close) and Volumes",
"2. Symbol": "ABC",
},
"Time Series (Daily)": {
"2001-06-31": {
"1. open": "113.2000",
"4. close": "113.8000",
},
"2001-07-01": {
"1. open": "114.2000",
"4. close": "114.2000",
}
}
}
I want the output to look be a pandas dataframe like this:
"Time Series (Daily)" | "1. open" | "4. close"
"2001-06-31" | 113.2000 | 113.8000
"2001-07-01" | 114.2000 | 114.2000
I wrote a function that works, but the for
loop leaves performance wanting and I find it hard to read.
def convert_json_to_dataframe(all_json_data):
json_data = all_json_data["Time Series (Daily)"]
dates = []
open = []
close = []
for key in json_data.keys():
dates.append(key)
open.append(json_data[key]["1. open"])
close.append(json_data[key]["4. close"])
df = pd.DataFrame(
list(zip(open, close)),
columns=["1. open", "4. close"],
index=dates,
)
df = df.apply(pd.to_numeric, errors="ignore")
return df
There's got to be a simpler, easier to read, higher-performing way to do this, maybe with json_normalize
in pandas, but I can't figure it out.
UPDATE AFTER ANSWERS & RESOLUTION. All I had to do was:
df = pd.DataFrame(json_data["Time Series (Daily)"]).T
Pandas discovered the index and column names automatically, so I didn't need the reset_index
portion of the answers.
The orient
approach also worked:
df = pd.DataFrame.from_dict(json_data["Time Series (Daily)", orient="index")
To get all the numbers as floats instead of strings, I did need the apply
line:
df = df.apply(pd.to_numeric, errors="coerce")
Thank you everyone.
Solution
Why don't you just do this?
pd.DataFrame(data["Time Series (Daily)"]).T.reset_index().rename(columns = {"index":"Time Series (Daily)"})
Output -
Time Series (Daily) | 1. open | 4. close | |
---|---|---|---|
0 | 2001-06-31 | 113.2000 | 113.8000 |
1 | 2001-07-01 | 114.2000 | 114.2000 |
Answered By - Zero
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.