Neural networks are a powerful machine learning tool used in a variety of applications such as image recognition, natural language processing, and robotics. PyTorch is a popular deep learning library that provides a flexible and efficient platform for building and training neural networks.
In this tutorial, we'll go over the basics of creating a neural network in PyTorch, including defining the architecture, performing the forward and backward pass, and optimizing the model using gradient descent.
Introduction to Neural Networks
Neural networks are a class of machine learning models inspired by the structure and function of the human brain. They consist of layers of interconnected nodes, or neurons, which receive input, apply a set of weights to that input, and pass the result through an activation function. The output of one layer becomes the input to the next layer, and the process is repeated until the final output is generated.
Neural networks are capable of learning complex non-linear relationships between inputs and outputs, making them a powerful tool for a variety of applications. PyTorch makes it easy to create and train neural networks using its built-in functionality for automatic differentiation and GPU acceleration.
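To make this concrete, here is a minimal sketch of what a single neuron computes: a weighted sum of its inputs plus a bias, passed through an activation function (the input values and weights below are arbitrary, chosen just for illustration):

import torch

# A single neuron: weighted sum of inputs plus a bias,
# passed through a sigmoid activation
x = torch.tensor([0.5, -1.0, 2.0])   # inputs
w = torch.tensor([0.1, 0.4, -0.3])   # weights
b = torch.tensor(0.2)                # bias
output = torch.sigmoid(w @ x + b)    # activation(weighted sum + bias)
print(output)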
Creating a Neural Network in PyTorch
To create a neural network in PyTorch, we first need to define the architecture of the network. This involves specifying the number of layers, the number of neurons in each layer, and the activation functions to be used.
In PyTorch, we can define a neural network by creating a subclass of the nn.Module class. The nn.Module class provides a convenient way to organize the parameters of the network and define the forward pass through the network.
Here's an example of a simple neural network with two hidden layers:
import torch
import torch.nn as nn

# Define the neural network architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(2, 3)
        self.fc2 = nn.Linear(3, 1)

    def forward(self, x):
        x = torch.sigmoid(self.fc1(x))
        x = torch.sigmoid(self.fc2(x))
        return x

# Create an instance of the network
net = Net()
In this example, we define a neural network with two input nodes, three hidden nodes, and one output node. The network is defined as a subclass of the nn.Module class, which is a base class for all neural network modules in PyTorch.
The __init__ method of the Net class defines the architecture of the network by creating two fully connected layers with the nn.Linear class. The first layer has two input nodes and three output nodes, and the second layer has three input nodes and one output node.
The forward method of the Net class defines the forward pass of the network, applying a sigmoid activation to the output of each fully connected layer. The sigmoid function, σ(x) = 1 / (1 + e^(-x)), squashes its input into the range (0, 1) and is commonly used to introduce nonlinearity into the model.
Finally, we create an instance of the Net class and store it in the net variable.
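With the network instantiated, we can already pass data through it. As a quick sketch (the input is random and serves only to show the expected shapes):

# Pass a batch of 4 samples with 2 features each through the network
x = torch.randn(4, 2)
output = net(x)
print(output.shape)  # torch.Size([4, 1])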
Defining the Architecture
In PyTorch, we define the architecture of a neural network by creating a class that inherits from the nn.Module class. The nn.Module class provides a set of pre-defined functions and methods that can be used to define the architecture of a neural network.
Each layer of a neural network is defined as a separate object in PyTorch. There are many different types of layers that can be used, including fully connected layers, convolutional layers, and recurrent layers. These layers are defined using the classes provided by PyTorch.
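As a sketch of what these classes look like, here are a few commonly used layer constructors (the sizes are arbitrary examples):

import torch.nn as nn

fc = nn.Linear(10, 5)                         # fully connected: 10 inputs, 5 outputs
conv = nn.Conv2d(3, 16, kernel_size=3)        # convolutional: 3 input channels, 16 filters
rnn = nn.LSTM(input_size=8, hidden_size=16)   # recurrent (LSTM) layer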
Here's an example of defining a simple feedforward neural network using PyTorch:
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 5)
        self.fc2 = nn.Linear(5, 1)

    def forward(self, x):
        x = nn.functional.relu(self.fc1(x))
        x = self.fc2(x)
        return x
In this example, we define a class called Net that inherits from nn.Module. The constructor method (__init__) initializes two fully connected layers (nn.Linear), with 10 input neurons and 5 output neurons in the first layer, and 5 input neurons and 1 output neuron in the second layer. The forward method defines the forward pass of the network, which consists of a ReLU activation function applied to the output of the first layer, followed by the second layer.
Note that we use the nn.functional module to apply the ReLU activation function to the output of the first layer. The nn.functional module provides stateless functions that mirror the layers in the nn module but have no parameters to learn, which makes them convenient for operations like activations inside the forward pass.
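For comparison, the module form of the same activation is nn.ReLU; the two are equivalent in the forward pass, as this minimal sketch shows:

import torch
import torch.nn as nn

x = torch.randn(2, 5)

# Functional form: a stateless function call
out_functional = nn.functional.relu(x)

# Module form: an object that can be stored as a layer of the network
relu = nn.ReLU()
out_module = relu(x)

print(torch.equal(out_functional, out_module))  # True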
Defining the Loss Function
A loss function measures how well the neural network is performing. It computes the difference between the predicted output and the actual output. The goal is to minimize this difference during training so that the model can accurately predict the output for new input data.
PyTorch provides various built-in loss functions. Some of the commonly used ones are:
- nn.MSELoss(): Computes the mean squared error between the predicted and actual output. This loss function is commonly used in regression problems.
- nn.CrossEntropyLoss(): Combines a log-softmax with the negative log likelihood loss between the predicted logits and the target classes. This loss function is commonly used in multi-class classification problems.
- nn.BCELoss(): Computes the binary cross entropy loss between the predicted and actual output. This loss function is commonly used in binary classification problems.
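The example below walks through nn.CrossEntropyLoss(); for the other two, here is a quick sketch (the tensors hold arbitrary illustrative values):

import torch
import torch.nn as nn

# Mean squared error for regression
mse = nn.MSELoss()
pred = torch.tensor([2.5, 0.0, 2.0])
target = torch.tensor([3.0, -0.5, 2.0])
print(mse(pred, target))

# Binary cross entropy: expects probabilities in [0, 1],
# so the model output is typically passed through a sigmoid first
bce = nn.BCELoss()
probs = torch.sigmoid(torch.tensor([0.8, -1.2]))
labels = torch.tensor([1.0, 0.0])
print(bce(probs, labels))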
Here's an example of how to use the nn.CrossEntropyLoss() function:
import torch
import torch.nn as nn

# Define model architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 5)
        self.fc2 = nn.Linear(5, 2)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Define loss function
criterion = nn.CrossEntropyLoss()

# Define input and target data
input_data = torch.randn(3, 10)
target_data = torch.tensor([0, 1, 0])

# Create model instance
model = Net()

# Compute output
output = model(input_data)

# Compute loss
loss = criterion(output, target_data)
print(loss)
In this example, we first define the neural network architecture as a subclass of nn.Module, then define the loss function with nn.CrossEntropyLoss(). We create some input and target data, instantiate the model, compute the model's output on the input data, and compute the loss from that output and the target data. Note that nn.CrossEntropyLoss expects raw, unnormalized scores (logits) as input and class indices as targets; it applies the softmax internally, which is why the forward method does not apply one. Finally, we print the value of the loss.
Defining the Optimizer
Optimizers update the parameters of the neural network during training, adjusting the weights and biases to minimize the loss function and find the values that give the best predictions. PyTorch provides various built-in optimizers. Some of the commonly used ones are:
- torch.optim.SGD: Stochastic gradient descent optimizer.
- torch.optim.Adam: Adaptive moment estimation optimizer.
To use an optimizer, we pass it the model's parameters along with its hyperparameters. The hyperparameters can differ from optimizer to optimizer; the most commonly used ones are the learning rate, weight decay, and momentum.
Here is an example of defining an optimizer in PyTorch:
import torch.optim as optim
# define the optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)
In this example, we are using the Adam optimizer with a learning rate of 0.001. The model.parameters() method returns an iterator over all the parameters of the model that need to be updated during training, and it is passed as an argument to the optimizer.
Another commonly used optimizer is the stochastic gradient descent (SGD) optimizer. Here is an example of using the SGD optimizer:
import torch.optim as optim
# define the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
In this example, we are using the SGD optimizer with a learning rate of 0.001 and momentum of 0.9. The momentum hyperparameter helps the optimizer to keep moving in the same direction and avoid getting stuck in local minima.
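Weight decay, mentioned above, is passed the same way. It adds an L2 penalty on the parameters to discourage overfitting; a minimal sketch (the value 1e-4 is an arbitrary choice):

import torch.optim as optim

# SGD with momentum and weight decay (L2 regularization)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=1e-4)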
Once the optimizer is defined, we can use it to update the parameters of the model during training. Here is an example of the training loop with the optimizer:
for epoch in range(num_epochs):
    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        inputs, labels = data

        # zero the gradients accumulated from the previous step
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    # print statistics
    print('[%d] loss: %.3f' %
          (epoch + 1, running_loss / len(train_loader)))
Training the Model
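Putting the pieces together, the script below defines a small binary classifier, generates random sample data, wraps it in a data loader, and trains the model for 10 epochs: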
# import libraries
import torch
import torch.nn as nn
import torch.optim as optim

# define model architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(8, 10)
        self.fc2 = nn.Linear(10, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.fc1(x)
        x = self.sigmoid(x)
        x = self.fc2(x)
        x = self.sigmoid(x)
        return x

# instantiate the model
model = Net()

# define loss function
criterion = nn.BCELoss()

# define optimizer
optimizer = optim.SGD(model.parameters(), lr=0.1)

# define training function
def train(model, optimizer, criterion, data_loader, num_epochs):
    for epoch in range(num_epochs):
        running_loss = 0.0
        for i, data in enumerate(data_loader, 0):
            inputs, labels = data
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        print(f"Epoch {epoch+1}, loss: {running_loss/len(data_loader)}")

# generate sample data
X = torch.randn(100, 8)
y = torch.randint(0, 2, (100, 1)).float()

# create data loader
dataset = torch.utils.data.TensorDataset(X, y)
data_loader = torch.utils.data.DataLoader(dataset, batch_size=4, shuffle=True)

# train the model
train(model, optimizer, criterion, data_loader, 10)
In this code, we define the model architecture, loss function, optimizer, and training function as in the previous sections. We generate sample data with torch.randn() and torch.randint(), and create a data loader using torch.utils.data.TensorDataset and torch.utils.data.DataLoader. Finally, we call the train() function with the model, optimizer, loss function, data loader, and number of epochs as arguments; it iterates through the epochs, looping over the data loader and updating the model parameters using backpropagation.
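Once training has finished, the model can be used to make predictions on new data. As a minimal sketch (thresholding the sigmoid output at 0.5 is the usual convention for binary labels):

# Make predictions on new data without tracking gradients
model.eval()
with torch.no_grad():
    new_data = torch.randn(5, 8)
    probs = model(new_data)          # sigmoid outputs in (0, 1)
    preds = (probs > 0.5).float()    # threshold at 0.5 for class labels
    print(preds)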
Related posts:
2023.02.22 - PyTorch : #1 Tensors
2023.02.23 - PyTorch : #2 Autograd
2023.02.25 - PyTorch : #3 DataSet and DataLoader