Pretrained Networks
Pretrained Network for Object Recognition
Use pretrained networks from TorchVision
- contains a few of the best-performing neural network architectures for computer vision, such as
- AlexNet (http://mng.bz/lo6z)
- ResNet (https://arxiv.org/pdf/1512.03385.pdf)
- Inception v3 (https://arxiv.org/pdf/1512.00567.pdf)
- has easy access to datasets like ImageNet and other utilities for getting up to speed with computer vision applications in PyTorch.
The predefined models can be found in torchvision.models:
from torchvision import models
dir(models)
['AlexNet',
'DenseNet',
'GoogLeNet',
'GoogLeNetOutputs',
'Inception3',
'InceptionOutputs',
'MNASNet',
'MobileNetV2',
'ResNet',
'ShuffleNetV2',
'SqueezeNet',
'VGG',
'_GoogLeNetOutputs',
'_InceptionOutputs',
'__builtins__',
'__cached__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__path__',
'__spec__',
'_utils',
'alexnet',
'densenet',
'densenet121',
'densenet161',
'densenet169',
'densenet201',
'detection',
'googlenet',
'inception',
'inception_v3',
'mnasnet',
'mnasnet0_5',
'mnasnet0_75',
'mnasnet1_0',
'mnasnet1_3',
'mobilenet',
'mobilenet_v2',
'quantization',
'resnet',
'resnet101',
'resnet152',
'resnet18',
'resnet34',
'resnet50',
'resnext101_32x8d',
'resnext50_32x4d',
'segmentation',
'shufflenet_v2_x0_5',
'shufflenet_v2_x1_0',
'shufflenet_v2_x1_5',
'shufflenet_v2_x2_0',
'shufflenetv2',
'squeezenet',
'squeezenet1_0',
'squeezenet1_1',
'utils',
'vgg',
'vgg11',
'vgg11_bn',
'vgg13',
'vgg13_bn',
'vgg16',
'vgg16_bn',
'vgg19',
'vgg19_bn',
'video',
'wide_resnet101_2',
'wide_resnet50_2']
The capitalized names (e.g. ResNet) refer to Python classes that implement a number of popular models. They differ in their architecture—that is, in the arrangement of the operations occurring between the input and the output.
E.g., create an instance of the AlexNet class:
# create an instance of the AlexNet class
alexnet = models.AlexNet()
But wait! If we did that, we would be feeding data through the whole network to produce … garbage!!! 😢
That’s because the network is uninitialized: its weights, the numbers by which inputs are added and multiplied, have not been trained on anything—the network itself is a blank (or rather, random) slate. We’d need to either train it from scratch or load weights from prior training.
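To see this concretely, here is a minimal sketch (the dummy input tensor is an assumption; AlexNet expects 3 × 224 × 224 images): the forward pass runs, but the 1,000 output scores are meaningless because the weights are random.
import torch
from torchvision import models

alexnet = models.AlexNet()           # weights are randomly initialized
x = torch.randn(1, 3, 224, 224)      # a fake one-image batch of the shape AlexNet expects
with torch.no_grad():
    out = alexnet(x)
print(out.shape)                     # torch.Size([1, 1000]): one score per class, all garbage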
To use a model with a predefined number of layers and units, and optionally download pretrained weights into it, we use the lowercase names in the models module. The lowercase names are convenience functions that return models instantiated from those classes, sometimes with different parameter sets.
For instance, resnet101 returns an instance of ResNet with 101 layers, resnet18 has 18 layers, and so on. Create an instance of the network, passing an argument that instructs the function to download the weights of resnet101 trained on the ImageNet dataset, with 1.2 million images and 1,000 categories:
resnet = models.resnet101(pretrained=True)
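(Note that in torchvision 0.13 and later the pretrained flag is deprecated in favor of an explicit weights argument; the equivalent call would look something like the line below.)
resnet = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)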
Load and show an image from the local filesystem
Use Pillow (https://pillow.readthedocs.io/en/stable), an image-manipulation module for Python:
from PIL import Image
# assume that the variable IMG_PATH holds the path of the image
img = Image.open(IMG_PATH)
img # show the image inline
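Before the image can go through the network, it must be preprocessed into the input the model expects: resized, center-cropped, converted to a tensor, and normalized. A minimal sketch with torchvision.transforms (the mean/std values are the standard ImageNet channel statistics):
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),              # scale the shorter side to 256 pixels
    transforms.CenterCrop(224),          # crop a 224 x 224 square from the center
    transforms.ToTensor(),               # convert to a float tensor with values in [0, 1]
    transforms.Normalize(                # normalize with the ImageNet per-channel mean and std
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
])
batch = preprocess(img).unsqueeze(0)     # add a batch dimension: shape (1, 3, 224, 224)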
Set eval mode before inference
In order to do inference, we need to put the network in eval mode:
resnet.eval()
(If we forget to do that, some pretrained models will not produce meaningful answers, because layers such as batch normalization and dropout behave differently during training and inference.)
Retrieve the image label
- Load a text file listing the labels in the same order they were presented to the network during training.
- Pick out the label at the index that produced the highest score from the network, as shown in the sketch below.
(Almost all models meant for image recognition produce output in a similar form.)
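Putting the steps together, a sketch of the full inference (the labels file name imagenet_classes.txt is an assumption; any text file with the 1,000 ImageNet labels in training order works):
import torch

with torch.no_grad():
    out = resnet(batch)                  # shape (1, 1000): one score per ImageNet class

with open('imagenet_classes.txt') as f:  # hypothetical labels file, one label per line
    labels = [line.strip() for line in f]

_, index = torch.max(out, 1)             # index of the highest-scoring class
print(labels[index[0]])                  # the predicted label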
Torch Hub
Torch Hub is a mechanism through which authors can publish a model on GitHub, with or without pretrained weights, and expose it through an interface that PyTorch understands. This makes loading a pretrained model from a third party as easy as loading a TorchVision model.
All it takes is to place a file named hubconf.py in the root directory of the GitHub repository. TorchVision itself is an example: its repository contains a hubconf.py at the root.
Torch Hub is quite new, and there are only a few models published this way. We can get at them by Googling “github.com hubconf.py.”
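For instance, TorchVision's own entry points can be loaded through Torch Hub like this (the branch is left implicit here; pinning a specific release tag such as 'pytorch/vision:v0.10.0' is safer, and the exact tag is an assumption):
import torch

# downloads pytorch/vision's default branch and calls its resnet18 entry point
resnet18 = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)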