Issue
Having checked many Stack Overflow threads on this, I'm struggling to apply the answers to my particular use case, so I'm hoping someone can help with my specific problem.
I'm trying to explode data out of a dictionary into two separate columns while maintaining a multi-index.
Here is what I currently have:
| short_url | platform | css_problem_files |
|-----------|----------|-----------------------------------------------------------------------|
| /url_1/ | desktop | {css_file_1: css_value, css_file_2: css_value, css_file_3: css_value} |
| | mobile | {css_file_1: css_value, css_file_2: css_value, css_file_3: css_value} |
| /url_2/ | desktop | {css_file_1: css_value, css_file_2: css_value, css_file_3: css_value} |
| | mobile | {css_file_1: css_value, css_file_2: css_value, css_file_3: css_value} |
and here is what I would like to achieve:
| short_url | platform | css_file | css_value |
|-----------|----------|------------|-----------|
| /url_1/ | desktop | css_file_1 | css_value |
| | | css_file_2 | css_value |
| | | css_file_3 | css_value |
| | mobile | css_file_1 | css_value |
| | | css_file_2 | css_value |
| | | css_file_3 | css_value |
| /url_2/ | desktop | css_file_1 | css_value |
| | | css_file_2 | css_value |
| | | css_file_3 | css_value |
| | mobile | css_file_1 | css_value |
| | | css_file_2 | css_value |
| | | css_file_3 | css_value |
The closest I've come is the snippet below; however, it creates over 200K rows when I'd expect only a few thousand (and I haven't included platform yet):
m = pd.DataFrame([*df['css_problem_files']], df.index).stack()\
    .rename_axis([None, 'css_files']).reset_index(1, name='pct usage')
out = df[['short_url']].join(m)
Any assistance or a pointer in the right direction would be greatly appreciated.
Solution
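If you turn the dictionaries into lists of key-value pairs, you can explode them and then split the resulting pairs into two new columns with .apply(pd.Series) (and rename them to your liking). The existing index is repeated for every exploded row, so a short_url/platform multi-index is kept as long as those two are index levels. Like so: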
df = (df
    .css_problem_files.apply(dict.items)  # turn into key value list
    .explode()  # explode
    .apply(pd.Series)  # turn into columns
    .rename(columns={0: "css_file", 1: "css_value"})  # rename
)
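To try the chain above end to end, here is a minimal, self-contained sketch. The sample frame, its numeric values, and the name `out` are made up for illustration, and it assumes `short_url` and `platform` are already index levels as in the question's table. It wraps the items in `list(...)` so that `explode` receives a plain list, which is a defensive choice rather than a requirement stated in the answer.

```python
import pandas as pd

# Hypothetical sample data shaped like the table in the question:
# a (short_url, platform) MultiIndex and one column of dictionaries.
index = pd.MultiIndex.from_product(
    [["/url_1/", "/url_2/"], ["desktop", "mobile"]],
    names=["short_url", "platform"],
)
df = pd.DataFrame(
    {
        "css_problem_files": [
            {"css_file_1": 10, "css_file_2": 20, "css_file_3": 30}
            for _ in range(len(index))
        ]
    },
    index=index,
)

out = (
    df["css_problem_files"]
    .apply(lambda d: list(d.items()))  # dict -> list of (key, value) tuples
    .explode()                         # one row per tuple, index labels repeated
    .apply(pd.Series)                  # split each tuple into columns 0 and 1
    .rename(columns={0: "css_file", 1: "css_value"})
)

print(out)
```

Each of the four original rows holds three entries, so `out` has twelve rows and still carries both index levels. On large frames `.apply(pd.Series)` can be slow, so it may be worth timing on real data.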
Answered By - fsimonjetz