Issue
After reviewing the LeNet-5 architecture description: when the pooling layer S2 (with 6 feature maps) is connected to the Conv2D layer C3 (16 feature maps), a special filter mapping is required, in the following way:
- Six C3 maps take inputs from every contiguous subset of 3 feature maps
- Six take inputs from every contiguous subset of 4 feature maps
- Three take inputs from discontinuous subsets of 4 feature maps
- One takes inputs from all six feature maps
[Image: a slightly annotated figure from the LeNet paper, courtesy of TowardsAI]
Note that this mapping is quite specialized to the 6 <-> 16 case, whereas TensorFlow/PyTorch layers are quite flexible. How exactly do TensorFlow/PyTorch handle it?
Solution
I don't know about the actual implementation of this network in those frameworks. However, here's one way you can implement such an operation in PyTorch.
You can look at this operation as a change of basis: going from feature maps in the 'S2' space to feature maps in the 'C3' space using a transform matrix M. The whole objective is to construct that matrix. It is composed of ones and zeros, where the ones are positioned such that you construct vectors in C3 space using components of vectors in S2 space.
For instance, let's look at the discontinuous subsets of 4 in the table: column #12 requires maps n°0, 1, 3, and 4. The corresponding column of M for vector #12 will therefore be [1, 1, 0, 1, 1, 0]. Essentially, the 1s here correspond to the crosses shown in the figure. For this particular portion of the transition, you would define M as:
>>> M = torch.tensor([[1., 0., 1.],
...                   [1., 1., 0.],
...                   [0., 1., 1.],
...                   [1., 0., 1.],
...                   [1., 1., 0.],
...                   [0., 1., 1.]])
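As a quick check, the nonzero rows of each column of M recover the subsets described above. For instance, for the first column (map #12):

>>> M[:, 0].nonzero().flatten()  # S2 maps feeding C3 map #12
tensor([0, 1, 3, 4])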
To actually perform the matrix multiplication, you can use torch.einsum:
torch.einsum('bchw,cd->bdhw', x, M)
Here's an example: starting from a 6-channel 2x2 map and transitioning to a 3-channel 2x2 map (defined by columns #12, #13, and #14 of Table I):
>>> x = torch.rand(1,6,2,2)
>>> x
tensor([[[[0.3134, 0.2468],
          [0.2759, 0.4971]],

         [[0.4150, 0.8735],
          [0.6726, 0.0463]],

         [[0.9547, 0.5338],
          [0.0654, 0.7458]],

         [[0.4099, 0.1984],
          [0.0930, 0.8054]],

         [[0.1695, 0.1586],
          [0.7961, 0.3894]],

         [[0.5535, 0.0678],
          [0.1484, 0.7735]]]])
>>> torch.einsum('bchw,cd->bdhw', x, M)
tensor([[[[1.3077, 1.4773],
          [1.8377, 1.7382]],

         [[2.0926, 1.6338],
          [1.6825, 1.9550]],

         [[2.2315, 1.0467],
          [0.5828, 2.8219]]]])
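Since this contraction only mixes channels pointwise, the very same mapping can also be expressed as a 1x1 convolution whose fixed binary weight is the transpose of M reshaped to (out_channels, in_channels, 1, 1). This is just a sketch of the equivalence, not necessarily how either framework implements such layers:

>>> import torch.nn.functional as F
>>> weight = M.t().reshape(3, 6, 1, 1)  # fixed binary 1x1 kernel
>>> torch.allclose(F.conv2d(x, weight), torch.einsum('bchw,cd->bdhw', x, M))
True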
You can of course expand this operation to the whole of Table I; this would result in a matrix M of size 6x16.
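For reference, here is one way to build that full 6x16 matrix from the connection scheme of Table I. The subsets below are transcribed from the paper's table; treat this as a sketch to adapt rather than a canonical implementation:

import torch

# Input subsets for each of the 16 C3 maps, read off Table I of the
# LeNet paper: columns 0-5 are contiguous triples, columns 6-11 are
# contiguous quadruples (wrapping around), columns 12-14 are the
# discontinuous quadruples, and column 15 takes all six S2 maps.
subsets = [
    {0, 1, 2}, {1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {0, 4, 5}, {0, 1, 5},
    {0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}, {0, 3, 4, 5}, {0, 1, 4, 5},
    {0, 1, 2, 5}, {0, 1, 3, 4}, {1, 2, 4, 5}, {0, 2, 3, 5},
    {0, 1, 2, 3, 4, 5},
]

M = torch.zeros(6, 16)
for d, maps in enumerate(subsets):
    for c in maps:
        M[c, d] = 1.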
Answered By - Ivan