Issue
I'm trying to find the best estimator using GridSearchCV, and I'm using refit=True, which is the default. The documentation states:
The refitted estimator is made available at the best_estimator_ attribute and permits using predict directly on this GridSearchCV instance
Should I call .fit on the training data afterwards, like this:
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GridSearchCV

classifier = GridSearchCV(estimator=model, param_grid=parameter_grid['param_grid'], scoring='balanced_accuracy', cv=5, verbose=3, n_jobs=4, return_train_score=True, refit=True)
classifier.fit(x_training, y_train_encoded_local)
predictions = classifier.predict(x_testing)
balanced_error = balanced_accuracy_score(y_true=y_test_encoded_local, y_pred=predictions)
Or should I do it like this instead:
classifier = GridSearchCV(estimator=model, param_grid=parameter_grid['param_grid'], scoring='balanced_accuracy', cv=5, verbose=3, n_jobs=4, return_train_score=True, refit=True)
predictions = classifier.predict(x_testing)
balanced_error = balanced_accuracy_score(y_true=y_test_encoded_local, y_pred=predictions)
Solution
You should do it like your first version. You always need to call classifier.fit; otherwise nothing is trained and predict has no fitted model to use. refit=True means that, once cross-validation is done, the best parameter combination is retrained on the entire training set.
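To make the point concrete, here is a minimal sketch that can be run on its own. The synthetic data from make_classification, the LogisticRegression model, and the small grid over C are assumptions chosen for illustration; they stand in for the model, parameter_grid, and x_training/x_testing from the question.

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the question's training and test data (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),  # hypothetical model choice
    param_grid={"C": [0.1, 1.0, 10.0]},           # hypothetical grid
    scoring="balanced_accuracy",
    cv=5,
    refit=True,
)

# Before fit(): no search has run and no model exists, so predict() fails.
try:
    search.predict(X_test)
    failed_before_fit = False
except NotFittedError:
    failed_before_fit = True

# fit() runs the cross-validated search; because refit=True, it then
# retrains the best parameter combination on all of X_train.
search.fit(X_train, y_train)

# predict() on the search object delegates to search.best_estimator_.
predictions = search.predict(X_test)
score = balanced_accuracy_score(y_true=y_test, y_pred=predictions)
```

After fit, search.best_params_ holds the winning combination and search.best_estimator_ is the refitted model, so there is no need to build and fit a separate estimator by hand.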
Answered By - bpfrd