Issue
- How to delete some layers from a pretrained network (for example, remove a single ReLU activation layer)?
- How to replace some layers by type in a pretrained network (for example, replace MaxPool2d with AvgPool2d)?
Solution
Assuming you know the structure of your model, you can:
>>> import torchvision
>>> import torch.nn as nn
>>> model = torchvision.models.vgg16(pretrained=True)
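If you are unsure of the structure, printing the model displays its full module hierarchy (the same kind of output shown at the end of this answer):
>>> print(model)
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    ...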
Select a submodule and interact with it as you would with any other nn.Module. This will depend on your model's implementation. For example, submodules are often accessible via attributes (e.g. model.features); however, this is not always the case: nn.Sequential, for instance, uses indices, so model.features[18] selects one of the ReLU activations. Also note that not all layers are registered inside the nn.Module: non-parametric functions, such as most activation functions, can be applied via the functional API directly in the forward of the module.
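With the VGG16 loaded above, both access patterns look like this (the index 18 comes from the module listing at the end of this answer):
>>> type(model.features)          # attribute access on the parent module
<class 'torch.nn.modules.container.Sequential'>
>>> model.features[18]            # index access inside the nn.Sequential
ReLU(inplace=True)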
For a given nn.Module m, you can extract its layer type name with type(m).__name__. A canonical approach is to filter model.named_modules() and keep only the max pool layers, then replace those with average pool layers:
>>> maxpools = [k for k, m in model.named_modules()
...             if type(m).__name__ == 'MaxPool2d']
>>> maxpools
['features.4', 'features.9', 'features.16', 'features.23', 'features.30']
We can extract the parent module name for each of those layers:
>>> maxpools = [k.split('.') for k, m in model.named_modules()
...             if type(m).__name__ == 'MaxPool2d']
>>> maxpools
[['features', '4'], ['features', '9'], ['features', '16'], ['features', '23'], ['features', '30']]
Here they all come from the same parent module, model.features. Finally, we can fetch a reference to each layer and overwrite it:
>>> for *parent, k in maxpools:
...     model.get_submodule('.'.join(parent))[int(k)] = nn.AvgPool2d(2, 2)
Resulting in:
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): AvgPool2d(kernel_size=2, stride=2, padding=0)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): AvgPool2d(kernel_size=2, stride=2, padding=0)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): AvgPool2d(kernel_size=2, stride=2, padding=0)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): AvgPool2d(kernel_size=2, stride=2, padding=0)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): AvgPool2d(kernel_size=2, stride=2, padding=0)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
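The first question, removing a single layer such as a ReLU, can be handled the same way. You cannot simply delete an entry from an nn.Sequential without shifting the remaining indices, but you can overwrite it with nn.Identity, which passes its input through unchanged. A minimal sketch (the choice of features[1] is just an illustrative example):
>>> model.features[1] = nn.Identity()   # the former ReLU is now a no-op
For activations that are applied functionally inside forward (e.g. F.relu), there is no registered layer to remove; in that case you would need to edit or override forward itself.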
Answered By - Ivan