Wednesday, January 5, 2022

[FIXED] Extracting Autoencoder features from the hidden layer


Issue

I have written some code that applies an autoencoder to my dataset in order to extract hidden features from it. The dataset consists of 84 variables, all of which have been normalised.

import torch
from torch import nn
from torch.utils.data import random_split

epochs = 10
batch_size = 128
lr = 0.008

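# Convert Input and Output data to Tensors and create a TensorDataset
# (input and output are assumed to be pandas DataFrames; float32 matches the nn.Linear weights)
input = torch.tensor(input.to_numpy(), dtype=torch.float32)
output = torch.tensor(output.to_numpy())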
data = torch.utils.data.TensorDataset(input, output)

# Split to Train, Validate and Test sets using random_split   
number_rows = len(input)    # The size of our dataset or the number of rows in excel table.  

test_split = int(number_rows*0.3)  
train_split = number_rows - test_split
train_set, test_set = random_split(data, [train_split, test_split])   

# Create Dataloader to read the data within batch sizes and put into memory. 
train_loader = torch.utils.data.DataLoader(train_set, batch_size=batch_size, shuffle = True) 
test_loader = torch.utils.data.DataLoader(test_set, batch_size=batch_size)

The model structure:

# Model structure
class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()

        # Encoder
        self.encoder = nn.Sequential(
            nn.Linear(84, 128),
            nn.Tanh(),
            nn.Linear(128, 64),
            nn.Tanh(),
            nn.Linear(64, 16),
            nn.Tanh(),
            nn.Linear(16, 2),
        )

        # Decoder
        self.decoder = nn.Sequential(
            nn.Linear(2, 16),
            nn.Tanh(),
            nn.Linear(16, 64),
            nn.Tanh(),
            nn.Linear(64, 128),
            nn.Tanh(),
            nn.Linear(128, 84),
            nn.Sigmoid()
        )

    def forward(self, inputs):
        codes = self.encoder(inputs)
        decoded = self.decoder(codes)

        return codes, decoded

The optimiser and loss function:

# Optimizer and loss function
model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
loss_function = nn.MSELoss()

The training steps:

# Train
for epoch in range(epochs):
    for data, labels in train_loader:
        inputs = data.view(-1, 84)

        # Forward
        codes, decoded = model(inputs)

        # Backward
        optimizer.zero_grad()
        loss = loss_function(decoded, inputs)
        loss.backward()
        optimizer.step()

    # Show progress
    print('[{}/{}] Loss:'.format(epoch+1, epochs), loss.item())

The Autoencoder model is saved as:

# Save
torch.save(model,'autoencoder.pth')

At this point, I would like some help understanding how to extract the features from the hidden layer. These features will then be used as input to another classification algorithm.

Solution

You need to place a hook on your model. A hook lets you capture the output of any layer. It is easier, however, if you don't use nn.Sequential, because it groups its layers into a single module, so a hook registered on it captures the output of the block as a whole. I ran your code using the class below.

The class performs the feature extraction: it takes the model as input and places a hook on each top-level module whose index is listed in output_layers.

from collections import OrderedDict

class FE(nn.Module):
  def __init__(self,model_instance, output_layers, *args):
    super().__init__(*args)
    self.output_layers = output_layers
  
    self.selected_out = OrderedDict()
 
    self.pretrained = model_instance
  
    self.fhooks = []
    print("model_instance._modules.keys():",model_instance._modules.keys())

    for i,l in enumerate(list(self.pretrained._modules.keys())):
        print("index:",i, ", keys:",l )
        if i in self.output_layers:
            print("------------------------ > Hook is placed output of :" , l )
            # Register a forward hook on the selected top-level module and keep its handle
            self.fhooks.append(getattr(self.pretrained,l).register_forward_hook(self.forward_hook(l)))

  def forward_hook(self,layer_name):
    def hook(module, input, output):
        self.selected_out[layer_name] = output
    return hook

  def forward(self, x):
    # AutoEncoder.forward takes a single argument, so pass only x
    out = self.pretrained(x)
    return out, self.selected_out

To use it, wrap the trained model and pass the index of the module to hook:

model_hooked=FE(model ,output_layers = [0])

model_instance._modules.keys(): odict_keys(['encoder', 'decoder'])

index: 0 , keys: encoder

------------------------ > Hook is placed output of : encoder

index: 1 , keys: decoder

After placing the hook, you can pass data through the hooked model, which returns two values. The first is the model's original output (here, the (codes, decoded) tuple returned by forward). The second is a dictionary holding the output of each hooked layer, keyed by layer name.

out, layerout = model_hooked(data_sample)
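The hook mechanism itself can be tried in isolation. The following is a minimal sketch, not the answer's own code: TinyAutoEncoder is a hypothetical, smaller stand-in for the question's AutoEncoder, and the names captured and save_output are made up for illustration. It registers one hook on the whole encoder and one on a single layer inside the nn.Sequential (reached by index), then removes both hooks through the handles that register_forward_hook returns.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for the question's AutoEncoder (fewer layers, same layout)
class TinyAutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(84, 16), nn.Tanh(), nn.Linear(16, 2))
        self.decoder = nn.Sequential(
            nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 84), nn.Sigmoid()
        )

    def forward(self, inputs):
        codes = self.encoder(inputs)
        decoded = self.decoder(codes)
        return codes, decoded

model = TinyAutoEncoder()
model.eval()

captured = {}  # layer name -> last captured output

def save_output(name):
    # Build a forward hook that stores the module's output under `name`
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# Hook the whole encoder block (its output is the 2-dimensional code)
h_encoder = model.encoder.register_forward_hook(save_output("encoder"))
# Hook a single layer inside the nn.Sequential by indexing into it
h_first = model.encoder[0].register_forward_hook(save_output("encoder.0"))

batch = torch.rand(5, 84)  # synthetic data in place of a real batch
with torch.no_grad():
    codes, decoded = model(batch)

print(captured["encoder"].shape)    # output of the whole encoder
print(captured["encoder.0"].shape)  # output of the first Linear layer

# Remove the hooks once they are no longer needed
h_encoder.remove()
h_first.remove()
```

Because forward in the question already returns codes, the tensor captured from the encoder hook is the same as the first value the model returns. A hook becomes more useful when the layer of interest is an intermediate one, such as encoder[0] in this sketch.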

If you want to extract features from the data loaders, you can use this function:

def extract_features(hooked_model, layer_name, train_loader, test_loader):
  extracted_features = []
  lbls = []

  extracted_features_test = []
  lbls_test = []

  # No gradients are needed when only reading out activations
  with torch.no_grad():
    for data, target in train_loader:
      out, layerout = hooked_model(data)
      a = layerout[layer_name]
      # extend() iterates over the batch, adding one row per sample
      extracted_features.extend(a)
      lbls.extend(target)

    for data, target in test_loader:
      out, layerout = hooked_model(data)
      a = layerout[layer_name]
      extracted_features_test.extend(a)
      lbls_test.extend(target)

  # Stack the per-sample rows back into single tensors
  extracted_features = torch.stack(extracted_features)
  extracted_features_test = torch.stack(extracted_features_test)
  lbls = torch.stack(lbls)
  lbls_test = torch.stack(lbls_test)

  return extracted_features, lbls, extracted_features_test, lbls_test

It is used like this:

Features_TRAINLOADER, lbls, Features_TESTLOADER, lbls_test = extract_features(model_hooked, "encoder", train_loader, test_loader)
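Since the question mentions feeding the features to another classification algorithm, here is a hedged sketch of one way to prepare them. It is an alternative to the hook-based function, not part of the original answer: it calls the encoder directly, collects the batches with torch.cat, and converts the result to NumPy arrays, which is the input format many classifiers (scikit-learn estimators, for example) accept. The encoder, the synthetic data, and the helper name encode_loader are all hypothetical stand-ins.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in for the trained encoder (84 inputs -> 2 features)
encoder = nn.Sequential(nn.Linear(84, 16), nn.Tanh(), nn.Linear(16, 2))
encoder.eval()

# Synthetic data in place of the real dataset: 300 rows, binary labels
features_in = torch.rand(300, 84)
labels_in = torch.randint(0, 2, (300,))
loader = DataLoader(TensorDataset(features_in, labels_in), batch_size=128)

def encode_loader(encoder, loader):
    """Run every batch through the encoder and concatenate the results."""
    feats, lbls = [], []
    with torch.no_grad():  # activations only, no gradients required
        for data, target in loader:
            feats.append(encoder(data))
            lbls.append(target)
    return torch.cat(feats), torch.cat(lbls)

codes, labels = encode_loader(encoder, loader)

# NumPy views of the features and labels, ready for a downstream classifier
X = codes.numpy()
y = labels.numpy()
print(X.shape, y.shape)
```

With the question's model, the same helper could be called as encode_loader(model.encoder, train_loader), assuming the loaders yield (data, label) pairs as they do above.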

Answered By - Enes Kuz

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
