Deep learning models have achieved impressive performance on various tasks such as image classification, object detection, and natural language processing. However, these models require a large amount of data and computational resources to train, making it challenging for individuals and organizations with limited resources to build these models from scratch. Transfer learning offers a solution to this problem by leveraging pre-trained models that have been trained on large datasets and have learned features that are relevant to a wide range of tasks.
In this post, we will explore transfer learning with PyTorch, one of the most popular deep learning frameworks, and show how to use pre-trained models, modify them, and fine-tune them for your specific tasks.

Introduction to Transfer Learning

Transfer learning is a technique that allows us to use pre-trained models as a starting point for our specific task. The idea behind transfer learning is that the knowledge a model learns on one task can be transferred to another task, as long as the two tasks share some commonalities. In practice, we usually freeze the pre-trained model's weights and add new layers on top of it to adapt it to our specific task. The new layers are typically randomly initialized and trained on the new task while the pre-trained layers are kept fixed.

Transfer learning has several advantages:

  • It reduces the amount of data required to train a model, as pre-trained models have already learned relevant features from large datasets.
  • It reduces the time required to train a model, as pre-trained models have already learned the low-level features that can take many epochs to learn from scratch.
  • It improves the generalization of the model, as pre-trained models have learned features that are relevant to a wide range of tasks.

Using Pre-trained Models in PyTorch

PyTorch provides several pre-trained models that can be easily used for transfer learning. These models have been trained on large datasets such as ImageNet and have achieved state-of-the-art performance on various tasks. To use a pre-trained model in PyTorch, we need to download the pre-trained weights and load them into the model.

Let's take the example of the ResNet-18 model, one of the most popular pre-trained models for image classification. We can load the pre-trained weights using the following code:

import torch
import torchvision.models as models

resnet18 = models.resnet18(pretrained=True)

Here, we import the models module from torchvision and create an instance of the ResNet-18 model with pretrained=True. This automatically downloads the pre-trained weights and loads them into the model.
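
Note that in recent torchvision releases (0.13 and later) the pretrained argument is deprecated in favor of a weights argument; the equivalent call is:

resnet18 = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

We can use this pre-trained model to classify images: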

# Assuming we have an image tensor `img` of size (1, 3, 224, 224) (a batch of one image)
resnet18.eval()  # switch to inference mode (fixes batch norm and dropout behavior)
with torch.no_grad():
    outputs = resnet18(img)
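
The output is a tensor of shape (1, 1000) with one logit per ImageNet class; the index of the largest logit is the predicted class:

pred_class = outputs.argmax(dim=1).item()  # index of the highest-scoring ImageNet class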

Modifying Pre-trained Models

One of the benefits of transfer learning is that it allows us to reuse the knowledge learned by pre-trained models on a new dataset. However, sometimes the pre-trained models may not fit our specific task perfectly, and we may need to modify them to better suit our needs. In this section, we will explore how to modify pre-trained models in PyTorch.

Modifying the Last Layer:
The most common modification to pre-trained models is to replace the last layer(s) of the model with a new layer that fits the new task. This is because the last layer(s) of a neural network are typically task-specific and have learned to extract features that are useful for a specific problem.

For example, if we have a pre-trained model for image classification with 1,000 output classes, but our new task only has 10 output classes, we can replace the last layer with a new layer that has 10 output units. To do this in PyTorch, we can simply access the last layer of the pre-trained model and replace it with a new layer:

import torch.nn as nn
import torchvision.models as models

# Load the pre-trained model
model = models.resnet18(pretrained=True)

# Modify the last layer for a new task
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 10)  # replace the last layer

Here, we first load the pre-trained ResNet-18 model, and then we replace its last layer with a new linear layer that has 10 output units.
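
Different architectures expose their final layer under different attribute names: ResNet models use fc, while VGG models keep their fully-connected layers in a classifier module. If you are unsure which attribute to replace, printing the model shows its structure:

print(model)  # lists every layer; the last entry is the classification head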

Fine-tuning a Pre-trained Model

Another way to adapt a pre-trained model is to fine-tune it for the new task. Fine-tuning means training the model on the new dataset while keeping the pre-trained weights fixed for some layers and updating the weights of the others. This allows us to reuse the knowledge learned by the pre-trained model while adapting it to the new task.

To fine-tune a pre-trained model, we need to do the following:
1. Freeze the weights of some layers in the pre-trained model
2. Replace the last layer(s) of the model with a new layer for the new task
3. Train the model on the new dataset


Here is an example of fine-tuning a pre-trained model in PyTorch:

import torch.optim as optim
from torch.optim import lr_scheduler

# Freeze the weights of the pre-trained layers
for param in model.parameters():
    param.requires_grad = False

# Replace the last layer for a new task
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 10)

# Define the loss function, optimizer, and learning rate scheduler
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

# Train the model on the new dataset
model = train_model(model, dataloaders, criterion, optimizer, scheduler, num_epochs=10)

In this example, we first freeze the weights of all layers in the pre-trained ResNet-18 model by setting requires_grad to False. Then we replace the last layer with a new linear layer that has 10 output units. Finally, we define a loss function, an optimizer, and a learning rate scheduler, and train the model on the new dataset with a train_model helper function.
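
The train_model helper is not defined above; a minimal sketch of what such a training loop might look like, assuming dataloaders is a dict holding 'train' and 'val' DataLoader objects:

def train_model(model, dataloaders, criterion, optimizer, scheduler, num_epochs=10):
    for epoch in range(num_epochs):
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # enable training behavior (dropout, batch norm updates)
            else:
                model.eval()   # inference behavior for validation

            running_loss = 0.0
            for inputs, labels in dataloaders[phase]:
                optimizer.zero_grad()
                # Only track gradients during the training phase
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    loss = criterion(outputs, labels)
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                running_loss += loss.item() * inputs.size(0)

            if phase == 'train':
                scheduler.step()  # decay the learning rate on schedule
            print('Epoch {}/{} {} loss: {:.4f}'.format(
                epoch + 1, num_epochs, phase, running_loss / len(dataloaders[phase].dataset)))
    return model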

Full Example: Fine-tuning VGG16

import os

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, models, transforms

# Load the pre-trained VGG16 model
model = models.vgg16(pretrained=True)

# Freeze all the layers in the model
for param in model.parameters():
    param.requires_grad = False

# Replace the last fully-connected layer with a new one
num_features = model.classifier[-1].in_features
model.classifier[-1] = nn.Linear(num_features, 2)

# Define the loss function and an optimizer over the new layer's parameters
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.classifier[-1].parameters(), lr=0.001, momentum=0.9)

# Define the data transformations (normalization uses the ImageNet mean and std)
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
}

# Load the dataset
data_dir = 'data/dogs_vs_cats'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4, shuffle=True, num_workers=4)
               for x in ['train', 'val']}

# Train the model
num_epochs = 10
for epoch in range(num_epochs):
    for phase in ['train', 'val']:
        if phase == 'train':
            model.train()
        else:
            model.eval()

        running_loss = 0.0
        running_corrects = 0

        for inputs, labels in dataloaders[phase]:
            optimizer.zero_grad()

            with torch.set_grad_enabled(phase == 'train'):
                outputs = model(inputs)
                _, preds = torch.max(outputs, 1)
                loss = criterion(outputs, labels)

                if phase == 'train':
                    loss.backward()
                    optimizer.step()

            running_loss += loss.item() * inputs.size(0)
            running_corrects += torch.sum(preds == labels.data)

        epoch_loss = running_loss / len(image_datasets[phase])
        epoch_acc = running_corrects.double() / len(image_datasets[phase])

        print('Epoch {}/{} {} Loss: {:.4f} Acc: {:.4f}'.format(epoch + 1, num_epochs, phase, epoch_loss, epoch_acc))

This code loads the pre-trained VGG16 model, freezes all of its layers, and replaces the last fully-connected layer with a new one. It then defines the data transformations, loads the dataset, and trains the model. By fine-tuning only the last layer of the pre-trained model, we can achieve good performance on a new task with relatively little training data.
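
Both examples above freeze every pre-trained layer and train only the new head. As described earlier, fine-tuning can also update some of the pre-trained layers. A minimal sketch, assuming the ResNet-18 model from the fine-tuning section, that unfreezes the last residual block and trains it with a smaller learning rate than the new layer:

# Unfreeze the last residual block (layer4) of ResNet-18
for param in model.layer4.parameters():
    param.requires_grad = True

# Give the pre-trained block a smaller learning rate than the new head
optimizer = optim.SGD([
    {'params': model.layer4.parameters(), 'lr': 0.0001},
    {'params': model.fc.parameters(), 'lr': 0.001},
], momentum=0.9)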


Related posts:

2023.02.22 - PyTorch : #1 Tensors

2023.02.23 - PyTorch : #2 Autograd

2023.02.25 - PyTorch : #3 DataSet and DataLoader

2023.02.25 - PyTorch : #4 Building Neural Network

 

