Tuesday, March 15, 2022

[FIXED] How to fix this classification report warning?

March 15, 2022 machine-learning, python, scikit-learn No comments

Issue

I created a model for multiclass classification. Everything went good, got a validation accuracy of 84% but when I printed the classification report I got this warning:

 UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))

classification report:

              precision    recall  f1-score   support

           0       0.84      1.00      0.91     51890
           1       0.67      0.04      0.08      8706
           2       0.00      0.00      0.00      1605

    accuracy                           0.84     62201
   macro avg       0.50      0.35      0.33     62201
weighted avg       0.79      0.84      0.77     62201

Source Code -

import pandas as pd

df=pd.read_csv('Crop_Agriculture_Data_2.csv')
df=df.drop('ID',axis=1)

dummies=pd.get_dummies(df[['Crop_Type', 'Soil_Type', 'Pesticide_Use_Category', 'Season']],drop_first=True)
df=df.drop(['Crop_Type', 'Soil_Type', 'Pesticide_Use_Category', 'Season'],axis=1)
df=pd.concat([df,dummies],axis=1)

df['Crop_Damage']=df['Crop_Damage'].map({'Minimal Damage':0,'Partial Damage':1,'Significant Damage':2})

x=df.drop('Crop_Damage',axis=1).values
y=df.Crop_Damage.values
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,train_size=0.3,random_state=101)

from sklearn.preprocessing import MinMaxScaler
mms=MinMaxScaler()
x_train=mms.fit_transform(x_train)
x_test=mms.transform(x_test)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Dropout,Flatten

model=Sequential()
model.add(Flatten())
model.add(Dense(10,activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(6,activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(3,activation='softmax'))

model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
model.fit(x_train,y_train,validation_data=(x_test,y_test),epochs=13)

import numpy as np
pred=np.argmax(model.predict(x_test),axis=-1)

from sklearn.metrics import classification_report
print(classification_report(y_test,pred))

I think it might be because most of the data is in one category but I'm not sure. Is there anything I can do to solve this ?

Solution

You don't want to get rid this warning as it says that your class 2 are not on the predictions as there were no samples in the training set

you got an imbalance classification problem and the class 2 has realy low number of samples, and it was present in the test data only

I suggest you 2 things

StratifiedKFold So when you split for training and test, it consider all classes

Oversampling you might need adjust your data by randomly resample the training dataset to duplicate examples from the minority class

Answered By - Yefet

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Tuesday, March 15, 2022

[FIXED] How to fix this classification report warning?

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels