Issue
I'm working on a regression problem using TensorFlow, and have created two models with a slight difference in their first Dense layer.
The Models
# Create some regression data
X_regression = tf.range(0, 1000, 5)
y_regression = tf.range(100, 1100, 5)  # y = X + 100

# Split regression data into training and test sets
X_reg_train = X_regression[:150]
X_reg_test = X_regression[150:]
y_reg_train = y_regression[:150]
y_reg_test = y_regression[150:]
Model 1
# Set up random seed
tf.random.set_seed(42)

model_1_reg = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1)
])

model_1_reg.compile(loss=tf.keras.losses.mae,
                    optimizer=tf.keras.optimizers.Adam(),
                    metrics=['mae'])

model_1_reg.fit(tf.expand_dims(X_reg_train, axis=-1), y_reg_train, epochs=100)
Model 2
# Set up random seed
tf.random.set_seed(42)

model_2_reg = tf.keras.Sequential([
    tf.keras.layers.Dense(100, input_shape=(None, 1)),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1)
])

model_2_reg.compile(loss=tf.keras.losses.mae,
                    optimizer=tf.keras.optimizers.Adam(),
                    metrics=['mae'])

model_2_reg.fit(tf.expand_dims(X_reg_train, axis=-1), y_reg_train, epochs=100)
I'm confused about whether I should add the input_shape parameter or not. Model 1's input shape becomes (None, 1), while Model 2's becomes (None, None, 1). Both of them run, but they perform differently.

Model 2 makes sense since we're inputting an array, but if I think about it, does that mean I only have a single node in the input layer, since I'm giving it a whole ndarray instead of the individual instances? Model 1 makes sense too, since I want to feed each number into it individually.

So, which one makes more sense, and in what case should I use each model? Also, for Model 2's fit, why does passing tf.expand_dims(X_reg_train, axis=-1) as the X of model_2_reg.fit(tf.expand_dims(X_reg_train, axis=-1), y_reg_train, epochs=100) work? I thought we were supposed to pass the data in as a batch, i.e. an array of the instances, so it should be wrapped in an ndarray?
Solution
When you give the input shape with the input_shape parameter, you exclude the batch dimension. That's why you get (None, None, 1) for Model 2: TF inserts the leading None batch dimension in addition to the shape you provide. I'm actually a bit surprised that Model 2 runs with the additional None dimension. If you provide no input_shape, TensorFlow will infer it from the x argument of model.fit (here, your tf.expand_dims(X_reg_train, axis=-1)).
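A minimal sketch of that difference (assuming a recent tf.keras; the models and variable names here are illustrative, not from the question): passing input_shape=(1,) yields the intended (None, 1) input, while input_shape=(None, 1) adds the extra None axis.

```python
import tensorflow as tf

# input_shape excludes the batch dimension, which Keras prepends as None.
model_a = tf.keras.Sequential([tf.keras.layers.Dense(100, input_shape=(1,))])
model_b = tf.keras.Sequential([tf.keras.layers.Dense(100, input_shape=(None, 1))])

print(model_a.input_shape)  # (None, 1)  -- batch dim + 1 feature
print(model_b.input_shape)  # (None, None, 1) -- an extra unspecified axis
```

So for this data, input_shape=(1,) (one feature per sample) would be the shape to pass if you want to specify it explicitly.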
As for your second question, why does tf.expand_dims(X_reg_train, axis=-1) work? It is actually required. For a dense network, the input is expected to have shape (samples, features), even with only one feature. tf.expand_dims provides that shape, taking X_reg_train from shape (150,) to (150, 1). You could do the same with np.expand_dims, as TF accepts NumPy arrays as input. Under the hood, TF converts them to tensors anyway, so it makes no real difference whether you provide NumPy arrays or TensorFlow tensors.
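Reproducing the shapes from the question's data (a sketch; only the shapes matter here):

```python
import tensorflow as tf

X_regression = tf.range(0, 1000, 5)  # 200 values: 0, 5, ..., 995 -> shape (200,)
X_reg_train = X_regression[:150]     # shape (150,)

# Dense layers expect (samples, features); expand_dims adds the
# trailing feature axis, turning (150,) into (150, 1).
X_expanded = tf.expand_dims(X_reg_train, axis=-1)

print(X_reg_train.shape)  # (150,)
print(X_expanded.shape)   # (150, 1)
```

Each of the 150 samples then has exactly one feature, which matches the single input node you'd expect for this one-variable regression.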
Answered By - mhenning