Issue
I am doing the following, and I suspect there is a faster way than calling pd.concat and df.append on every loop iteration:
final_df = pd.DataFrame()
for (key, data, date) in data_tuples:
    df = pd.DataFrame(data, columns=['Price', 'Quantity'])
    timestamp = datetime.strptime(date, '%a, %d %b %Y %H:%M:%S GMT')
    df = pd.concat([df], axis=0, keys=[timestamp])
    df = pd.concat([df], axis=0, keys=[key])
    final_df = final_df.append(df)
final_df.index = final_df.index.rename(['symbol', 'time', 'row'])
final_df['Price'] = final_df['Price'].apply(float)
final_df['Quantity'] = final_df['Quantity'].apply(float)
Solution
To avoid calling append and concat on every iteration, you could:
- create an iterator over your data tuples
- apply a function to each item that parses it into a dataframe in the required format
- call pd.concat once on the resulting list of dataframes.
Of course, you will need to adapt the logic to produce your desired output, but I hope this gives you a grasp of the approach.
import pandas as pd
from datetime import datetime

# Sample input: (key, data, date) tuples
data_tuples = (
    ("1", {"Price": [1, 2], "Quantity": [1, 2]}, "20:20:20"),
    ("1", {"Price": [3, 4], "Quantity": [3, 4]}, "20:20:30"),
)

def parse_values(data, date):
    # Build one dataframe per tuple and tag its rows with the date
    df = pd.DataFrame(data, columns=['Price', 'Quantity'])
    df["date"] = date
    return df

# Concatenate once, after all the dataframes have been built
df = pd.concat([parse_values(data, date) for _, data, date in data_tuples])
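If you want the (symbol, time, row) MultiIndex from the question rather than a date column, the same single-concat idea can be adapted along these lines. This is only a sketch, not part of the original answer: the sample `data_tuples` values and the `DATE_FORMAT` name are made up for illustration, and the rows are assumed to arrive as strings (which would explain the float conversion in the question). It also avoids `DataFrame.append`, which was removed in pandas 2.0.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample input in the (key, data, date) shape from the question.
data_tuples = [
    ("AAA", [["1.0", "10"], ["2.0", "20"]], "Mon, 01 Jan 2024 20:20:20 GMT"),
    ("BBB", [["3.0", "30"]], "Mon, 01 Jan 2024 20:20:30 GMT"),
]

# Same format string as in the question.
DATE_FORMAT = "%a, %d %b %Y %H:%M:%S GMT"

frames = []  # one dataframe per tuple
keys = []    # one (symbol, timestamp) pair per dataframe

for key, data, date in data_tuples:
    frames.append(pd.DataFrame(data, columns=["Price", "Quantity"]))
    keys.append((key, datetime.strptime(date, DATE_FORMAT)))

# A single concat; the tuple keys become the two outer index levels,
# and each frame's own row index becomes the innermost level.
final_df = pd.concat(frames, keys=keys)
final_df.index.names = ["symbol", "time", "row"]

# Convert both columns in one vectorized call instead of apply(float).
final_df = final_df.astype({"Price": float, "Quantity": float})
```

Collecting the frames in a list and concatenating once means the data is copied a single time, whereas appending inside the loop copies the growing dataframe on every iteration.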
Answered By - Dariusz Krynicki