Issue
I'm trying to build an ensemble of some models using VotingClassifier() from sklearn to see if it works better than the individual models. I'm trying it in two different ways:
- I'm trying to do it with individual Random Forest, Gradient Boosting, and XGBoost models.
- I'm trying to build it using an ensemble of many Random Forest models (using different parameters for n_estimators and max_depth).
In the first case, I'm doing this:
import xgboost as xgb
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier,
                              VotingClassifier)

estimator = []
estimator.append(('RF', RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
criterion='gini', max_depth=8, max_features='auto',
max_leaf_nodes=None, max_samples=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=900,
n_jobs=-1, oob_score=True, random_state=66, verbose=0,
warm_start=True)))
estimator.append(('GB', GradientBoostingClassifier(ccp_alpha=0.0, criterion='friedman_mse', init=None,
learning_rate=0.03, loss='deviance', max_depth=5,
max_features=None, max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=1000,
n_iter_no_change=None, presort='deprecated',
random_state=66, subsample=1.0, tol=0.0001,
validation_fraction=0.1, verbose=0,
warm_start=False)))
estimator.append(('XGB', xgb.XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, gamma=0,
learning_rate=0.1, max_delta_step=0, max_depth=9,
min_child_weight=1, n_estimators=1000, n_jobs=1,
nthread=None, objective='binary:logistic', random_state=0,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
silent=None, subsample=1, verbosity=1)))
And when I do
ensemble_model_churn = VotingClassifier(estimators=estimator, voting='soft')
and display ensemble_model_churn, I get all of the estimators in the output.
But in the second case, I'm doing this:
estimator = []
estimator.append(('RF_1',RandomForestClassifier(n_estimators=500,max_depth=5,warm_start=True)))
estimator.append(('RF_2',RandomForestClassifier(n_estimators=500,max_depth=6,warm_start=True)))
estimator.append(('RF_3',RandomForestClassifier(n_estimators=500,max_depth=7,warm_start=True)))
estimator.append(('RF_4',RandomForestClassifier(n_estimators=500,max_depth=8,warm_start=True)))
And so on; I have 30 different models like that (a loop such as the sketch below could generate the full list).
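For illustration, a loop over a parameter grid could build the same kind of list; the grid values in this sketch are assumptions, not the exact ones used:

import itertools
from sklearn.ensemble import RandomForestClassifier

# Illustrative grid: 3 values of n_estimators x 10 values of max_depth = 30 models.
param_grid = itertools.product([300, 500, 700], range(5, 15))

estimator = [
    (f'RF_{i}', RandomForestClassifier(n_estimators=n, max_depth=d, warm_start=True))
    for i, (n, d) in enumerate(param_grid, start=1)
]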
But this time, when I do
ensemble_model_churn = VotingClassifier(estimators=estimator, voting='soft')
and display it, I seem to get only the first one, and not the others.
print(ensemble_model_churn)
>>>VotingClassifier(estimators=[('RF_1',
RandomForestClassifier(bootstrap=True,
ccp_alpha=0.0,
class_weight=None,
criterion='gini',
max_depth=5,
max_features='auto',
max_leaf_nodes=None,
max_samples=None,
min_impurity_decrease=0.0,
min_impurity_split=None,
min_samples_leaf=1,
min_samples_split=2,
min_weight_fraction_leaf=0.0,
n_estimators=500,
n_jobs=None,
oob_score=...
criterion='gini',
max_depth=5,
max_features='auto',
max_leaf_nodes=None,
max_samples=None,
min_impurity_decrease=0.0,
min_impurity_split=None,
min_samples_leaf=1,
min_samples_split=2,
min_weight_fraction_leaf=0.0,
n_estimators=500,
n_jobs=None,
oob_score=False,
random_state=None,
verbose=0,
warm_start=True))],
flatten_transform=True, n_jobs=None, voting='soft',
weights=None)
Why is this happening? Is it not possible to run an ensemble of the same model?
Solution
You are seeing more than one of the estimators; it's just a little hard to tell. Notice the ellipsis (...) after the first oob_score parameter, and that after it some of the hyperparameters are repeated. scikit-learn's repr just doesn't want to print such a giant wall of text, so it trims out most of the middle. You can verify that everything is there by checking that len(ensemble_model_churn.estimators) > 1.
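A minimal sketch of that check, mirroring the second setup (the depth grid here is illustrative):

from sklearn.ensemble import RandomForestClassifier, VotingClassifier

# Rebuild a small version of the second setup.
estimator = [
    (f'RF_{d}', RandomForestClassifier(n_estimators=500, max_depth=d, warm_start=True))
    for d in range(5, 9)
]
ensemble_model_churn = VotingClassifier(estimators=estimator, voting='soft')

# All four estimators are really there; only the printed repr was trimmed.
print(len(ensemble_model_churn.estimators))                   # 4
print([name for name, _ in ensemble_model_churn.estimators])  # ['RF_5', 'RF_6', 'RF_7', 'RF_8']

In recent scikit-learn versions you can also call sklearn.set_config(print_changed_only=True), so the repr shows only non-default parameters and avoids the wall of text in the first place.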
Another note: scikit-learn is very much against doing any validation at model initialization, preferring to do such checking at fit time. (This is because of the way it clones estimators in grid searches and the like.) So it's very unlikely that anything will be changed from your explicit input until you call fit.
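A quick way to see that deferral in action (toy data assumed; the exact error message varies across scikit-learn versions):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# An invalid hyperparameter is accepted silently at construction time...
clf = RandomForestClassifier(n_estimators=-5)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

# ...and is only rejected when validation runs inside fit.
try:
    clf.fit(X, y)
except ValueError as err:
    print(err)  # e.g. a complaint that n_estimators must be positive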
Answered By - Ben Reiniger