Issue
I am trying to build a PyTorch module that operates on inputs with dynamic dimension sizes.
import torch
from random import randint
from torch.nn import Linear, BatchNorm1d, ReLU, Dropout, Sequential
batch_size = 2
embed_size = 128
fc_1 = Sequential(
    Sequential(
        Linear(1, 64),
        BatchNorm1d(64),
        ReLU(),
        Dropout(0.1),
    ),
    Linear(64, embed_size),
    Linear(embed_size, embed_size),
)
# Secondary Features
p = torch.randn(batch_size).unsqueeze(1) # torch.Size([2, 1])
q = torch.randn(batch_size).unsqueeze(1) # torch.Size([2, 1])
r = torch.randn(batch_size).unsqueeze(1) # torch.Size([2, 1])
s = torch.randn(batch_size).unsqueeze(1) # torch.Size([2, 1])
secondary = torch.cat([ # torch.Size([8, 1])
p, q, r, s
], dim=0)
# Random Dimension Size
x = randint(2, 400) # 239
# Primary Features
a = torch.rand(batch_size, embed_size) # torch.Size([2, 128])
b = fc_1(secondary) # torch.Size([8, 128])
c = torch.rand(x, embed_size) # torch.Size([239, 128])
How do I collapse all the information from a, b, and c into a variable y so that its size is (batch_size, embed_size)?
I am trying to do regression analysis, so it would be important not to lose any information in the process of collapsing them. Obviously, torch.cat alone is not possible. Any method that uses learnable layers to collapse them is fine.
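For concreteness, a quick check (my own illustration, not part of the original question) of why plain concatenation cannot give a (batch_size, embed_size) result:
print(torch.cat([a, b, c], dim=0).shape)  # torch.Size([2 + 8 + x, 128]), e.g. [249, 128] -- batch structure lost
# torch.cat([a, b, c], dim=1) raises a RuntimeError, since the row counts
# (2, 8 and x) do not match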
Solution
Given the differing dimensions, especially the dynamic size of c, simple concatenation won't work here. A common approach in such scenarios is to use an attention mechanism, which can handle varying input sizes and aggregate information in a learnable manner. Attention can weigh different parts of the input differently, allowing the model to focus on the more informative parts.
The following code implements a simple attention mechanism. The idea is to compute an attention score for each row of a, b, and c, and then use these scores to weight and sum the rows.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionAggregator(nn.Module):
    def __init__(self, embed_size):
        super().__init__()
        self.embed_size = embed_size
        # learnable query vector and key projection
        self.query = nn.Parameter(torch.randn(embed_size))
        self.key = nn.Linear(embed_size, embed_size)

    def forward(self, a, b, c):
        # concatenate all inputs along dim 0 for computing attention
        combined = torch.cat([a, b, c], dim=0)  # (N, embed_size), N = 2 + 8 + x
        # compute keys
        keys = self.key(combined)
        # compute scaled dot-product attention scores against the query
        attention_scores = torch.matmul(keys, self.query) / (self.embed_size ** 0.5)
        attention_weights = F.softmax(attention_scores, dim=0).unsqueeze(-1)
        # apply attention weights
        weighted = combined * attention_weights
        # aggregate information into a single vector of shape (embed_size,)
        aggregated = weighted.sum(dim=0)
        return aggregated

embed_size = 128  # your example

# instantiate the attention aggregator
attention_aggregator = AttentionAggregator(embed_size)

# aggregate the information (a, b, c already defined/computed in your example)
y = attention_aggregator(a, b, c)  # torch.Size([128])
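Note that AttentionAggregator returns a single vector of shape (embed_size,): the weighted sum runs over all 2 + 8 + x rows at once. If you need the (batch_size, embed_size) output asked for in the question, one option is cross-attention, where each row of a acts as a query over all rows of a, b, and c. The sketch below is a minimal illustration of that idea (the class name CrossAttentionAggregator and the choice of a as the query side are my own assumptions, not part of the original answer):

class CrossAttentionAggregator(nn.Module):
    def __init__(self, embed_size):
        super().__init__()
        self.embed_size = embed_size
        # learnable projections for queries, keys and values
        self.query = nn.Linear(embed_size, embed_size)
        self.key = nn.Linear(embed_size, embed_size)
        self.value = nn.Linear(embed_size, embed_size)

    def forward(self, a, b, c):
        combined = torch.cat([a, b, c], dim=0)       # (N, embed_size), N = 2 + 8 + x
        q = self.query(a)                            # (batch_size, embed_size)
        k = self.key(combined)                       # (N, embed_size)
        v = self.value(combined)                     # (N, embed_size)
        # scaled dot-product attention: one score per (query row, input row) pair
        scores = q @ k.T / (self.embed_size ** 0.5)  # (batch_size, N)
        weights = F.softmax(scores, dim=-1)
        return weights @ v                           # (batch_size, embed_size)

cross_aggregator = CrossAttentionAggregator(embed_size)
y = cross_aggregator(a, b, c)  # torch.Size([2, 128])

In either variant the aggregator contains learnable parameters, so it has to be trained jointly with the rest of the regression model rather than used as a fixed pooling step.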
Answered By - inverted_index