Issue
There's something wrong with pandas, and I would like your opinion.
I have this DataFrame where I need to get the max values; the code is just below:
import pandas as pd

df_stack = pd.DataFrame([[1.0, 2016.0, 'NonResidential', 'Hotel', 98101.0, 'DOWNTOWN',
47.6122, -122.33799, 1927.0, 57.85220900338872,
59.91269863912585],
[1.0, 2016.0, 'NonResidential', 'Hotel', 98101.0, 'DOWNTOWN',
47.61317, -122.33393, 1996.0, 55.82342114189166,
56.86951201265458],
[3.0, 2016.0, 'NonResidential', 'Hotel', 98101.0, 'DOWNTOWN',
47.61393, -122.3381, 1969.0, 76.68191235628086,
77.37931271575705],
[5.0, 2016.0, 'NonResidential', 'Hotel', 98101.0, 'DOWNTOWN',
47.61412, -122.33664, 1926.0, 68.53505428597694,
71.00764283155655],
[8.0, 2016.0, 'NonResidential', 'Hotel', 98121.0, 'DOWNTOWN',
47.61375, -122.34047, 1980.0, 67.01346098859122,
68.34485815906346]], columns=['OSEBuildingID', 'DataYear', 'BuildingType', 'PrimaryPropertyType',
'ZipCode', 'Neighborhood', 'Latitude', 'Longitude', 'YearBuilt',
'SourceEUI(KWm2)', 'SourceEUIWN(KWm2)' ])
When I run the code below:
df_stack[['OSEBuildingID',
'DataYear',
'BuildingType',
'PrimaryPropertyType',
'ZipCode', 'Neighborhood', 'Latitude', 'Longitude',
'YearBuilt', 'SourceEUI(KWm2)', 'SourceEUIWN(KWm2)']].groupby('OSEBuildingID').max()
I get an error, "AssertionError: ", the same one you will probably get if you try this. But when I comment out these two columns and run the code again:
df_stack[['OSEBuildingID',
'DataYear',
#'BuildingType',
#'PrimaryPropertyType',
'ZipCode', 'Neighborhood', 'Latitude', 'Longitude',
'YearBuilt', 'SourceEUI(KWm2)', 'SourceEUIWN(KWm2)']].groupby('OSEBuildingID').max()
I get the results:
DataYear ZipCode Neighborhood Latitude Longitude YearBuilt SourceEUI(KWm2) SourceEUIWN(KWm2)
OSEBuildingID
1.0 2016.0 98101.0 DOWNTOWN 47.61317 -122.33393 1996.0 57.852209 59.912699
3.0 2016.0 98101.0 DOWNTOWN 47.61393 -122.33810 1969.0 76.681912 77.379313
5.0 2016.0 98101.0 DOWNTOWN 47.61412 -122.33664 1926.0 68.535054 71.007643
8.0 2016.0 98121.0 DOWNTOWN 47.61375 -122.34047 1980.0 67.013461 68.344858
If I replace max() with mean(), I can uncomment those two lines and run the code with no problem. This behaviour only happens with max() and min() (I have only tested max, mean and min), but I need to get the max.
Thank you for any help.
Solution
This was a regression in pandas 1.0.0 that was fixed in 1.0.1, so I suggest you upgrade your version.
Fixed regression in .groupby().agg() raising an AssertionError for some reductions like min on object-dtype columns (GH31522)
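To confirm the fix after upgrading, a minimal check along these lines may help. It is only a sketch: the small DataFrame is a trimmed stand-in for the question's data (an assumption, not the original dataset), and it assumes pandas 1.0.1 or later is installed, for example via `pip install --upgrade pandas`.

```python
import pandas as pd

# Trimmed stand-in for the question's data (illustrative only):
# a numeric grouping key, two object-dtype (string) columns, one numeric column.
df = pd.DataFrame({
    "OSEBuildingID": [1.0, 1.0, 3.0],
    "BuildingType": ["NonResidential", "NonResidential", "NonResidential"],
    "PrimaryPropertyType": ["Hotel", "Hotel", "Hotel"],
    "YearBuilt": [1927.0, 1996.0, 1969.0],
})

# Show which pandas version is in use; the regression affected 1.0.0.
print(pd.__version__)

# On 1.0.0 this raised AssertionError because of the string columns;
# on a fixed version it returns the per-group maximum of every column.
result = df.groupby("OSEBuildingID").max()
print(result)
```

If the printed version is still 1.0.0, the upgrade did not take effect in the environment the script runs in.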
Answered By - ALollz