Issue
I was trying to tune the parameters of my random forest classifier, but the output I'm getting does not look like the output I expected after looking at examples from other people.
The code I'm using:
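from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# df (features), avalanche (labels) and random_grid (parameter grid) are defined earlier
train_x, test_x, train_y, test_y = train_test_split(df, avalanche, shuffle=False)
# Create the random forest
rf = RandomForestClassifier()
rf_random = RandomizedSearchCV(estimator=rf, param_distributions=random_grid, n_iter=100, cv=3, verbose=2, random_state=42, n_jobs=-1)
# Train the model
rf_random.fit(train_x, train_y)
# Print the best parameter combination found by the search
print(rf_random.best_params_)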
The output I'm getting (only a few lines are shown here; the full output runs to several hundred lines):
Fitting 3 folds for each of 100 candidates, totalling 300 fits
[CV] END bootstrap=True, max_depth=30, max_features=sqrt, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 1.3s
[CV] END bootstrap=True, max_depth=30, max_features=sqrt, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 1.3s
[CV] END bootstrap=True, max_depth=30, max_features=sqrt, min_samples_leaf=1, min_samples_split=5, n_estimators=400; total time= 1.4s
[CV] END bootstrap=False, max_depth=10, max_features=sqrt, min_samples_leaf=2, min_samples_split=5, n_estimators=1200; total time= 3.8s
The output I was expecting:
{'bootstrap': True,
'max_depth': 70,
'max_features': 'auto',
'min_samples_leaf': 4,
'min_samples_split': 10,
'n_estimators': 400}
as shown in an example on this website.
Does anyone know what I'm doing wrong or what I should change so that the output becomes what I expected it to be?
Solution
You are getting that output because of verbose=2. The higher its value, the more text is printed. These messages are not the results; they only tell you which models the search is currently fitting to the data.
This is useful for tracking the progress of your search (sometimes it can take days, so it's nice to know where in the process the search currently is). If you do not want this text to appear, set verbose=0.
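To illustrate the effect of verbose, here is a minimal sketch that runs the same small search twice and captures whatever is printed to standard output. The synthetic dataset, the tiny parameter grid, and the helper name run_search are assumptions made for this example and are not part of the original question. n_jobs is left at its default so that the progress messages are printed by the current process and can be captured.

```python
import contextlib
import io

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (assumption: the real data is not available here)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Small illustrative grid, not the random_grid from the question
param_distributions = {"n_estimators": [10, 20, 30], "max_depth": [3, 5, None]}


def run_search(verbose):
    """Fit a small randomized search and return it with the captured stdout."""
    search = RandomizedSearchCV(
        estimator=RandomForestClassifier(random_state=0),
        param_distributions=param_distributions,
        n_iter=4,
        cv=3,
        verbose=verbose,
        random_state=42,
    )
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        search.fit(X, y)
    return search, buffer.getvalue()


noisy_search, noisy_log = run_search(verbose=2)
quiet_search, quiet_log = run_search(verbose=0)

print(len(noisy_log.splitlines()), "lines printed with verbose=2")
print(len(quiet_log.splitlines()), "lines printed with verbose=0")
# The verbosity level only changes the logging, not the selected parameters
print(noisy_search.best_params_ == quiet_search.best_params_)
```

Both searches use the same random_state values, so they evaluate the same candidates; only the amount of printed text differs.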
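You have not gotten the expected result yet because rf_random is still fitting models to the data. Once your search has finished, use rf_random.best_params_ to get the output you want. Your print(rf_random.best_params_) call only runs after fit returns, so the dictionary will appear after the last [CV] line.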
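As a small sketch of that last point: best_params_ does not exist on the search object until fit has completed, and afterwards it is a plain dictionary like the one you were expecting. The synthetic data and the small grid below are assumptions chosen so the example runs in a few seconds; they are not the data or random_grid from the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (assumption: replaces train_x / train_y)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [10, 20, 30], "max_depth": [3, 5, None]},
    n_iter=4,
    cv=3,
    verbose=0,  # no progress messages
    random_state=42,
)

# Before fitting, the attribute has not been set yet
available_before = hasattr(search, "best_params_")
print("best_params_ available before fit:", available_before)

search.fit(X, y)

# After fitting, the best combination and its mean cross-validated score are available
print(search.best_params_)
print(round(search.best_score_, 3))
```

The same pattern applies to rf_random: wait for fit to return, then read best_params_.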
Answered By - Arturo Sbr