Issue
I created the following CNN model:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 4, strides=4, input_shape=(12, 12, 1)),
    tf.keras.layers.SeparableConv2D(1, 3, depth_multiplier=1),
    tf.keras.layers.Conv2D(8, 1, activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2)
])
print(model.summary())
My intention was that the SeparableConv2D layer would create a single 3x3 kernel that operates separately on each of the 16 3x3 input feature maps and produces 16 single numbers. However, it instead learned a 3x3x16 kernel and produced a single number. After reading the explanation of SeparableConv2D here, I understood that it trains a different 3x3 kernel for each channel (which I could live with) but then merges those 16 numbers into 1. My questions are:
- Is there a way (using SeparableConv2D or any other layer) to avoid the last step of going from 1x1x16 to 1x1x1?
- Is there a way to train a single 3x3 kernel that works on all 16 channels separately and creates a 1x1x16 output?
Solution
1. You can use a DepthwiseConv2D layer. It is the first part of a SeparableConv2D layer: it convolves each channel separately, each with its own kernel.
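A minimal sketch of this option, reusing the first Conv2D from the question's model for context: DepthwiseConv2D keeps the channels separate, so the (3, 3, 16) feature map becomes (1, 1, 16) instead of being merged to a single number.

```python
import tensorflow as tf

# DepthwiseConv2D learns one 3x3 kernel per input channel and does NOT
# merge the channels afterwards, so a (3, 3, 16) input becomes (1, 1, 16).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 4, strides=4, input_shape=(12, 12, 1)),  # -> (3, 3, 16)
    tf.keras.layers.DepthwiseConv2D(3, depth_multiplier=1),             # -> (1, 1, 16)
])
print(model.output_shape)  # (None, 1, 1, 16)
```

Note that this answers question 1 but not question 2: the depthwise layer still trains 16 different 3x3 kernels (16*3*3 weights plus 16 biases), one per channel, rather than a single shared kernel.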
2. Yes. You can rearrange the data into a single channel so that the 3x3 blocks from the 16 channels sit next to each other, perform a convolution with one 3x3 kernel and a stride of 3, and then reshape back to 16 channels. That way every channel is convolved with the same kernel, and the stride of 3 ensures that two channels are never mixed inside one convolution window. One caveat: a plain Reshape would interleave values from different channels, so you first need to transpose (Permute) the tensor so that each channel's 3x3 block stays contiguous.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 4, strides=4, input_shape=(12, 12, 1)),  # -> (3, 3, 16)
    tf.keras.layers.Permute((3, 1, 2)),       # -> (16, 3, 3): channels first, so each channel stays contiguous
    tf.keras.layers.Reshape((16 * 3, 3, 1)),  # -> (48, 3, 1): sixteen 3x3 blocks stacked vertically
    tf.keras.layers.Conv2D(1, 3, strides=3),  # -> (16, 1, 1): the same kernel applied to every block
    tf.keras.layers.Reshape((1, 1, 16)),      # -> (1, 1, 16)
    ...
In this case, there is no difference between a DepthwiseConv2D and a Conv2D layer, since there is just one channel.
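This equivalence is easy to check: on a one-channel input, both layers hold exactly one 3x3 kernel plus one bias (a small sanity check, not part of the original answer):

```python
import tensorflow as tf

# On a single-channel input, Conv2D with one filter and DepthwiseConv2D
# both learn one 3x3 kernel and one bias: 10 parameters each.
conv = tf.keras.layers.Conv2D(1, 3)
depthwise = tf.keras.layers.DepthwiseConv2D(3)
conv.build((None, 48, 3, 1))
depthwise.build((None, 48, 3, 1))
print(conv.count_params(), depthwise.count_params())  # 10 10
```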
Alternatively, using the functional API, you could create one convolution layer, slice out each channel, apply that same layer to every channel (so all channels share its weights), and then concatenate the results back together. I think this should work:
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, Concatenate
from tensorflow.keras.models import Model

inputs = Input(shape=(12, 12, 1))
x = Conv2D(16, 4, strides=4)(inputs)  # -> (3, 3, 16)
conv_layer = Conv2D(1, 3)             # one shared 3x3 kernel
channels = []
for i in range(16):
    channel = tf.expand_dims(x[:, :, :, i], -1)  # -> (3, 3, 1)
    channels.append(conv_layer(channel))         # same weights for every channel
x = Concatenate(axis=-1)(channels)               # -> (1, 1, 16)
...
model = Model(inputs=inputs, outputs=x)
Answered By - AndrzejO