Issue
I'm using a custom regressor with GridSearchCV, but it is behaving strangely: it runs the grid search with the default parameters instead of the given parameter grid, and then at the end runs once more with the parameter grid. I made a dummy example with Fashion-MNIST (I know, not regression, but it demonstrates the problem) — see the code and output below.
As you can see in the output, the first two models that are fitted use the default parameters (one layer, no dropout), even though the CV line ([CV 1/2]...) is printed with the correct parameters. And if I print self.drop_rate in the fit method, it prints the correct drop_rate while the model clearly doesn't use it...
Code:
import tensorflow as tf
print("tf version: ", tf.__version__)

from sklearn.model_selection import GridSearchCV
from sklearn.base import BaseEstimator, RegressorMixin

fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

def createNNModel(unit1, unit2, drop_rate, lr):
    if unit2 == 0:
        if drop_rate == 0:
            model = tf.keras.Sequential([
                tf.keras.layers.Flatten(input_shape=(28, 28)),
                tf.keras.layers.Dense(unit1, activation='relu'),
                tf.keras.layers.Dense(10)
            ])
        else:
            model = tf.keras.Sequential([
                tf.keras.layers.Flatten(input_shape=(28, 28)),
                tf.keras.layers.Dense(unit1, activation='relu'),
                tf.keras.layers.Dropout(drop_rate),
                tf.keras.layers.Dense(10)
            ])
    else:
        if drop_rate == 0:
            model = tf.keras.Sequential([
                tf.keras.layers.Flatten(input_shape=(28, 28)),
                tf.keras.layers.Dense(unit1, activation='relu'),
                tf.keras.layers.Dense(unit2, activation='relu'),
                tf.keras.layers.Dense(10)
            ])
        else:
            model = tf.keras.Sequential([
                tf.keras.layers.Flatten(input_shape=(28, 28)),
                tf.keras.layers.Dense(unit1, activation='relu'),
                tf.keras.layers.Dropout(drop_rate),
                tf.keras.layers.Dense(unit2, activation='relu'),
                tf.keras.layers.Dropout(drop_rate),
                tf.keras.layers.Dense(10)
            ])

    model.compile(loss=tf.losses.MeanSquaredError(),
                  optimizer=tf.optimizers.Adam(learning_rate=lr),
                  metrics=[tf.metrics.MeanAbsoluteError()])
    return model

class MyRegressor(BaseEstimator, RegressorMixin):
    def __init__(self, unit1=32, unit2=0, drop_rate=0, lr=0.001):
        """
        Called when initializing the regressor
        """
        self.unit1 = unit1
        self.unit2 = unit2
        self.drop_rate = drop_rate
        self.lr = lr
        print("INIT DR:", self.drop_rate)
        self.model_ = createNNModel(unit1, unit2, drop_rate, lr)

    def fit(self, X, y, max_epochs=100):
        """
        This should fit the regressor. All the "work" should be done here.
        Note: assert is not a good choice here; you should rather use a
        try/except block with exceptions. This is just for short syntax.
        """
        print("FIT DR: ", self.drop_rate)
        self.history_ = self.model_.fit(X, y, epochs=max_epochs, verbose=1)
        self.model_.summary()
        return self

    def predict(self, X, y=None):
        predictions = self.model_.predict(X)
        return predictions

    def score(self, X, y=None):
        performance = self.model_.evaluate(X)  # mae
        return 1 - performance[1]  # the bigger the better

## TUNING
units1 = [64]
units2 = [64]
drop_outs = [0.8]
lrs = [0.01]
param_grid = {'unit1': units1, 'unit2': units2, 'drop_rate': drop_outs, 'lr': lrs}
gs = GridSearchCV(MyRegressor(), param_grid, cv=2, verbose=3)
gs.fit(X=train_images, y=train_labels, max_epochs=2)
Output:
tf version: 2.9.0
INIT DR: 0
INIT DR: 0
Fitting 2 folds for each of 1 candidates, totalling 2 fits
INIT DR: 0
FIT DR: 0.8
Epoch 1/2
938/938 [==============================] - 1s 1ms/step - loss: 95.2783 - mean_absolute_error: 5.1844
Epoch 2/2
938/938 [==============================] - 1s 1ms/step - loss: 21.4664 - mean_absolute_error: 3.7982
Model: "sequential_46"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_8 (Flatten) (None, 784) 0
dense_55 (Dense) (None, 32) 25120
dense_56 (Dense) (None, 10) 330
=================================================================
Total params: 25,450
Trainable params: 25,450
Non-trainable params: 0
_________________________________________________________________
938/938 [==============================] - 1s 673us/step - loss: 0.0000e+00 - mean_absolute_error: 0.0000e+00
[CV 1/2] END drop_rate=0.8, lr=0.01, unit1=64, unit2=64;, score=1.000 total time= 3.1s
INIT DR: 0
FIT DR: 0.8
Epoch 1/2
938/938 [==============================] - 2s 1ms/step - loss: 60.8985 - mean_absolute_error: 4.7083
Epoch 2/2
938/938 [==============================] - 1s 1ms/step - loss: 20.7136 - mean_absolute_error: 3.7330
Model: "sequential_47"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_9 (Flatten) (None, 784) 0
dense_57 (Dense) (None, 32) 25120
dense_58 (Dense) (None, 10) 330
=================================================================
Total params: 25,450
Trainable params: 25,450
Non-trainable params: 0
_________________________________________________________________
938/938 [==============================] - 1s 679us/step - loss: 0.0000e+00 - mean_absolute_error: 0.0000e+00
[CV 2/2] END drop_rate=0.8, lr=0.01, unit1=64, unit2=64;, score=1.000 total time= 3.4s
INIT DR: 0
INIT DR: 0.8
FIT DR: 0.8
Epoch 1/2
1875/1875 [==============================] - 3s 2ms/step - loss: 731.5312 - mean_absolute_error: 3.8732
Epoch 2/2
1875/1875 [==============================] - 3s 2ms/step - loss: 8.3729 - mean_absolute_error: 2.5103
Model: "sequential_49"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_11 (Flatten) (None, 784) 0
dense_61 (Dense) (None, 64) 50240
dropout_8 (Dropout) (None, 64) 0
dense_62 (Dense) (None, 64) 4160
dropout_9 (Dropout) (None, 64) 0
dense_63 (Dense) (None, 10) 650
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
Solution
See this section of the sklearn developer's guide: you shouldn't set self.model_ in the __init__ method; moving that line into the fit method should do what you want.
The problem is that the grid search clones its estimator, and cloning works by creating a new instance of the same class (without the candidate parameters) and then applying those parameters with set_params. Because model_ is built in __init__, every clone builds it from the default parameters; set_params then updates the attributes of your custom class, but those new values never make it through to the already-constructed model_ object.
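A minimal sketch of the fix (using a hypothetical toy stand-in for the Keras model, so the clone/set_params interaction is visible without TensorFlow):

```python
from sklearn.base import BaseEstimator, RegressorMixin, clone

class ToyRegressor(BaseEstimator, RegressorMixin):
    """Stand-in for MyRegressor: the 'model' is just a dict that records
    the hyperparameters it was built from."""

    def __init__(self, drop_rate=0):
        # __init__ only stores hyperparameters -- no model is built here.
        self.drop_rate = drop_rate

    def fit(self, X, y=None):
        # Build the model here, after GridSearchCV has applied set_params.
        self.model_ = {"drop_rate": self.drop_rate}
        return self

# What the grid search does for each candidate: clone, set_params, then fit.
est = clone(ToyRegressor())
est.set_params(drop_rate=0.8)
est.fit(None)
print(est.model_)  # {'drop_rate': 0.8} -- the model sees the grid value
```

For the original class, the corresponding change is to delete the self.model_ = createNNModel(...) line from __init__ and instead call self.model_ = createNNModel(self.unit1, self.unit2, self.drop_rate, self.lr) at the top of fit, before self.model_.fit(...). If the model were still built in __init__, the clone would keep a model built from drop_rate=0 no matter what set_params sets afterwards.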
Answered By - Ben Reiniger