Issue
I have a CSV file with a column whose values are dictionaries (column 'b' in the example below). After reading the file, the values in b are strings, even though I pass dict as that column's dtype. I couldn't find a solution to this. Any suggestions?
import pandas as pd

a = pd.DataFrame({'a': [1, 2, 3], 'b': [{'a': 3, 'haha': 4}, {'c': 3}, {'d': 4}]})
a.to_csv('tmp.csv', index=False)
df = pd.read_csv('tmp.csv', dtype={'b': dict})  # values in 'b' come back as str
Solution
I wonder if your CSV column is actually meant to be a Python dict column, or rather JSON. If it's JSON, you can read the column as dtype=str, parse each value with json.loads(), then use pd.json_normalize() on the parsed values to explode them into multiple columns. This works as long as every value in the column is valid JSON.
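A minimal sketch of both cases, offered as an illustration rather than the answerer's exact code. It assumes the data from the question and uses an in-memory buffer instead of tmp.csv. One thing to keep in mind: to_csv writes a dict as its Python repr (single quotes), which is not valid JSON, so for the file in the question ast.literal_eval is the parser that applies. The variable names below are made up for the example.

```python
import ast
import io
import json

import pandas as pd

# Recreate the question's frame and write it to CSV text in memory.
original = pd.DataFrame({'a': [1, 2, 3], 'b': [{'a': 3, 'haha': 4}, {'c': 3}, {'d': 4}]})
csv_text = original.to_csv(index=False)

# Case 1: cells hold Python dict reprs, which is what to_csv produces.
# The converter is called on each raw cell string of column 'b';
# ast.literal_eval only evaluates literals, so it is safer than eval.
df = pd.read_csv(io.StringIO(csv_text), converters={'b': ast.literal_eval})

# Case 2: cells hold valid JSON (built here with json.dumps for the demo).
json_frame = pd.DataFrame({'a': original['a'], 'b': original['b'].map(json.dumps)})
json_csv = json_frame.to_csv(index=False)

# Read as str, parse each cell, then expand the keys into separate columns.
raw = pd.read_csv(io.StringIO(json_csv), dtype={'b': str})
expanded = pd.json_normalize(raw['b'].map(json.loads).tolist())
```

In the first case df['b'] holds real dict objects again. In the second, expanded has one column per key, with NaN wherever a row did not have that key.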
Answered By - John Zwinck