Issue
I don't have much experience with pandas data structures; can someone explain what I'm doing wrong?
I have this DataFrame:
| Model | Year | Brand | Number |
|---|---|---|---|
| Sorento | 2008 | Kia | 19UYA427X1A382858 |
| Corvette | 1968 | Chevrolet | 1C3CDFEB2FD382486 |
| Sienna | 1933 | Toyota | YV440MBK0F1112644 |
| Corvette | 1968 | Kia | 45UYA427X1A382858 |
and I need this output:
{
    "Kia": {
        "2008": [
            [
                "Sorento",
                "19UYA427X1A382858"
            ],
            [
                "Sorento",
                "45UYA427X1A382858"
            ]
        ]
    },
    "Chevrolet": {
        "1968": [
            [
                "Corvette",
                "1C3CDFEB2FD382486"
            ]
        ]
    }
}
So I need to group the rows that share the same values into sub-lists, but I can't figure out how to do it. The Brand and Year levels are grouping correctly, but I don't know how to fill the innermost list with the Model and Number values.
# this is my partial solution
d = {k: f.groupby('Year').apply(list).to_dict()
     for k, f in df_clean.groupby('Brand')}
pprint.pprint(d)
Output:
'Mitsubishi': {1994.0: ['Model', 'Year', 'Brand', 'Number'],
2005.0: ['Model', 'Year', 'Brand', 'Number'],
'NO_DATA': ['Model', 'Year', 'Brand', 'Number']},
'Nissan': {1993.0: ['Model', 'Year', 'Brand', 'Number'],
1996.0: ['Model', 'Year', 'Brand', 'Number'],
2009.0: ['Model', 'Year', 'Brand', 'Number'],
2011.0: ['Model', 'Year', 'Brand', 'Number'],
2012.0: ['Model', 'Year', 'Brand', 'Number']},
'Oldsmobile': {1999.0: ['Model', 'Year', 'Brand', 'Number']},
Solution
You could use a nested groupby.apply, once for the brands and again for the years:
out = (df.groupby('Brand')
         .apply(lambda x: x.groupby('Year')[['Model','Number']]
                           .apply(lambda y: y.to_numpy().tolist())
                           .to_dict())
         .to_dict())
Output:
{'Chevrolet': {1968: [['Corvette', '1C3CDFEB2FD382486']]},
'Kia': {1968: [['Corvette', '45UYA427X1A382858']],
2008: [['Sorento', '19UYA427X1A382858']]},
'Toyota': {1933: [['Sienna', 'YV440MBK0F1112644']]}}
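
For reference, the attempt in the question returns the column names because calling list on a DataFrame iterates over its column labels, not its rows. The sketch below is an alternative to the nested groupby.apply, not the answerer's method: it rebuilds the sample table from the question, groups on both columns at once, and fills a plain dict. It assumes the Year column holds integers and converts each year to a string so the keys match the requested JSON; the frame is built by hand here purely for illustration.

```python
import json

import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Model": ["Sorento", "Corvette", "Sienna", "Corvette"],
    "Year": [2008, 1968, 1933, 1968],
    "Brand": ["Kia", "Chevrolet", "Toyota", "Kia"],
    "Number": ["19UYA427X1A382858", "1C3CDFEB2FD382486",
               "YV440MBK0F1112644", "45UYA427X1A382858"],
})

# list() on a DataFrame yields its column labels, which is why
# apply(list) in the question produced ['Model', 'Year', ...].
column_labels = list(df)

# Group on both keys in one pass and build the nested dict by hand.
out = {}
for (brand, year), group in df.groupby(["Brand", "Year"]):
    rows = group[["Model", "Number"]].to_numpy().tolist()
    # str(year) is an assumption: it makes the keys JSON-style strings.
    out.setdefault(brand, {})[str(year)] = rows

print(json.dumps(out, indent=2))
```

If Year is stored as floats, as the 1994.0 keys in the question's output suggest, str(year) would give "1994.0", so the column would need to be cleaned or cast first.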
Answered By - enke