Saturday, March 26, 2022

[FIXED] How to create a nested dictionary from pandas DataFrame

Labels: dataframe, dictionary, list, pandas, python

Issue

I don't have much experience with pandas data structures. Can someone explain what I'm doing wrong?

I have this DataFrame:

Model	Year	Brand	Number
Sorento	2008	Kia	19UYA427X1A382858
Corvette	1968	Chevrolet	1C3CDFEB2FD382486
Sienna	1933	Toyota	YV440MBK0F1112644
Corvette	1968	Kia	45UYA427X1A382858

and I need this output:

{
    "Kia": {
        "2008": [
            [
                "Sorento",
                "19UYA427X1A382858"
            ],
            [
                "Sorento",
                "45UYA427X1A382858"
            ]
        ]
    },
    "Chevrolet": {
        "1968": [
            [
                "Corvette",
                "1C3CDFEB2FD382486"
            ]
        ]
    }
}

So I need to group the rows that share the same Brand and Year into a sub-list, but I can't figure out how to do it. The Brand and Year levels group correctly, but I don't know how to fill the innermost list with the Model and Number values.

import pprint

# This is my partial solution
d = {k: f.groupby('Year').apply(list).to_dict()
     for k, f in df_clean.groupby('Brand')}

pprint.pprint(d)

Output (excerpt):
 'Mitsubishi': {1994.0: ['Model', 'Year', 'Brand', 'Number'],
                2005.0: ['Model', 'Year', 'Brand', 'Number'],
                'NO_DATA': ['Model', 'Year', 'Brand', 'Number']},
 'Nissan': {1993.0: ['Model', 'Year', 'Brand', 'Number'],
            1996.0: ['Model', 'Year', 'Brand', 'Number'],
            2009.0: ['Model', 'Year', 'Brand', 'Number'],
            2011.0: ['Model', 'Year', 'Brand', 'Number'],
            2012.0: ['Model', 'Year', 'Brand', 'Number']},
 'Oldsmobile': {1999.0: ['Model', 'Year', 'Brand', 'Number']},
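
A likely reason every entry above is the list of column names: apply(list) calls list() on each per-year sub-DataFrame, and iterating over a DataFrame yields its column labels rather than its rows. The sketch below illustrates this on a small stand-in frame built from the four sample rows (df_clean itself is not shown in the question, so the frame here is an assumption):

```python
import pandas as pd

# Stand-in for the question's data; the real df_clean is not shown.
df = pd.DataFrame({
    "Model": ["Sorento", "Corvette", "Sienna", "Corvette"],
    "Year": [2008, 1968, 1933, 1968],
    "Brand": ["Kia", "Chevrolet", "Toyota", "Kia"],
    "Number": ["19UYA427X1A382858", "1C3CDFEB2FD382486",
               "YV440MBK0F1112644", "45UYA427X1A382858"],
})

# Iterating over a DataFrame yields its column labels, not its rows.
labels = list(df)
print(labels)  # ['Model', 'Year', 'Brand', 'Number']

# Row values have to be requested explicitly, for example via NumPy.
rows = df[["Model", "Number"]].to_numpy().tolist()
print(rows[0])  # ['Sorento', '19UYA427X1A382858']
```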

Solution

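You could use a nested groupby.apply: once for the brands and again for the years. In the inner step, selecting the Model and Number columns and converting them with to_numpy().tolist() produces the list of [Model, Number] pairs: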

out = (df.groupby('Brand')
       .apply(lambda x: x.groupby('Year')[['Model','Number']]
              .apply(lambda y: y.to_numpy().tolist())
              .to_dict())
       .to_dict())

Output:

{'Chevrolet': {1968: [['Corvette', '1C3CDFEB2FD382486']]},
 'Kia': {1968: [['Corvette', '45UYA427X1A382858']],
  2008: [['Sorento', '19UYA427X1A382858']]},
 'Toyota': {1933: [['Sienna', 'YV440MBK0F1112644']]}}
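
The requested output uses string year keys, while the result above keeps them as integers. As a hedged alternative (not the answerer's method), the sketch below groups on both columns at once with a plain loop and converts each year with str() so the result can be serialized to JSON. The DataFrame is rebuilt from the four sample rows, and the name nested is chosen here only for illustration. It assumes Year holds integers; float years such as 1994.0 would become the key "1994.0".

```python
import json

import pandas as pd

# Sample rows from the question, rebuilt so the example is self-contained.
df = pd.DataFrame({
    "Model": ["Sorento", "Corvette", "Sienna", "Corvette"],
    "Year": [2008, 1968, 1933, 1968],
    "Brand": ["Kia", "Chevrolet", "Toyota", "Kia"],
    "Number": ["19UYA427X1A382858", "1C3CDFEB2FD382486",
               "YV440MBK0F1112644", "45UYA427X1A382858"],
})

nested = {}
# Grouping on two columns yields one (brand, year) key per sub-frame.
for (brand, year), group in df.groupby(["Brand", "Year"]):
    # Each row becomes a [Model, Number] pair.
    pairs = group[["Model", "Number"]].to_numpy().tolist()
    # str(year) gives JSON-friendly keys (assumes integer years).
    nested.setdefault(brand, {})[str(year)] = pairs

print(json.dumps(nested, indent=4))
```

A loop like this avoids calling apply with a function that returns dictionaries, at the cost of being a little more verbose.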

Answered By - enke

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
