Autograd is a key feature of PyTorch that enables automatic differentiation of tensor operations. It is a define-by-run framework, which means that the computation graph is built on the fly as the code runs. This is in contrast to define-and-run frameworks such as TensorFlow 1.x, which define the computation graph beforehand and then execute it.
Automatic Differentiation
In machine learning, we often need to compute the gradients of a function with respect to its inputs. These gradients are used to optimize the parameters of the function with methods like gradient descent. For a small function, we could derive the gradients by hand using calculus. However, deep learning models are functions with millions of parameters, which makes deriving gradients by hand impractical. This is where automatic differentiation comes in.
Automatic differentiation is a technique used to automatically compute the gradients of a function with respect to its inputs. It does this by decomposing the function into a series of elementary functions for which the gradients are known, and then applying the chain rule to compute the gradients of the composite function. PyTorch's Autograd package provides this functionality for tensors.
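To make this concrete, here is a small sketch comparing the gradient Autograd computes with the one derived by hand via the chain rule (the function sin(x**2) is just an illustrative choice, not something from a particular model):
import torch
# f(x) = sin(x**2); the chain rule gives df/dx = cos(x**2) * 2x
x = torch.tensor(1.5, requires_grad=True)
f = torch.sin(x ** 2)
f.backward()
# Compare Autograd's result with the hand-derived gradient
manual = torch.cos(x.detach() ** 2) * 2 * x.detach()
print(x.grad)   # gradient computed by Autograd
print(manual)   # same value, computed by hand via the chain rule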
How Autograd Works
When you create a tensor in PyTorch and perform operations on it, Autograd keeps track of the operations and the inputs to those operations. It then uses this information to compute the gradients of the output with respect to the inputs.
To enable automatic differentiation, PyTorch builds a computational graph as your code executes. The nodes in the graph represent the operations performed, and the edges represent the tensors flowing between them. The graph is created on the fly and grows dynamically as new operations are performed.
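Because the graph is built as the code runs, ordinary Python control flow can change its shape from one run to the next. A minimal sketch (the threshold and exponents are arbitrary illustrations):
import torch
x = torch.tensor(2.0, requires_grad=True)
# The branch taken at runtime determines which operations enter the graph
if x.item() > 1.0:
    y = x ** 3
else:
    y = x ** 2
y.backward()
print(x.grad)   # tensor(12.) because the x**3 branch ran: 3 * x**2 = 12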
When you call the .backward() method on a tensor, Autograd traverses the computational graph backwards, computing the gradients of that output with respect to each input. The gradients are accumulated in the .grad attribute of every leaf tensor that has requires_grad=True.
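Note that "accumulated" is meant literally: a second backward pass adds to whatever is already stored in .grad, which is why training loops zero the gradients between steps. A small sketch with a scalar tensor:
import torch
x = torch.tensor(3.0, requires_grad=True)
y = x * x
y.backward()
print(x.grad)    # tensor(6.) -> dy/dx = 2x
y = x * x
y.backward()
print(x.grad)    # tensor(12.) -> the new gradient was added to the old one
x.grad.zero_()   # reset before the next backward pass
print(x.grad)    # tensor(0.)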
Computation Graph
Let's take a look at an example of a computation graph. Consider the following code:
import torch
# Define input tensor x and create a tensor y that depends on x
x = torch.tensor([[1., 2.], [3., 4.]], requires_grad=True)
y = x**2
# Compute gradients of y with respect to x
y.backward(torch.ones_like(x))
# Print out gradients of x
print(x.grad)
In this example, we define an input tensor x and create a tensor y that depends on it. We set the requires_grad attribute of x to True so that Autograd tracks the operations on x and we can compute gradients with respect to it later.
We then compute the gradients of y with respect to x using the backward method. Because y is not a scalar, backward requires a gradient argument, so we pass a tensor of ones with the same shape as y (which here matches the shape of x); this is equivalent to differentiating y.sum() with respect to x.
Finally, we print the gradients of x using its grad attribute. Since y = x**2, the gradient is 2*x, so the output is tensor([[2., 4.], [6., 8.]]).
This example demonstrates the power of Autograd in automatically computing gradients of arbitrary functions with respect to their inputs. It can greatly simplify the implementation of machine learning models, especially neural networks, by automatically computing the gradients needed for backpropagation during the training process.
Here is the computation graph for this example:
x -------------> y
                 (grad_fn)
In this graph, the edge represents the flow of data: the squaring operation takes x and produces y. Every tensor produced by an operation carries a grad_fn attribute that references the function which created it (here PowBackward0 for y), while a leaf tensor created by the user, such as x, has grad_fn set to None. Autograd uses these references to walk the graph backwards and compute the gradients.
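You can inspect these references directly. With the same x and y as above:
import torch
x = torch.tensor([[1., 2.], [3., 4.]], requires_grad=True)
y = x ** 2
print(x.grad_fn)   # None, because x is a leaf tensor created by the user
print(y.grad_fn)   # <PowBackward0 ...>, the operation that produced y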
Backpropagation
Backpropagation is the process of computing the gradients of the loss function with respect to the parameters of a model. This is done using the chain rule of calculus to propagate the gradients backwards through the computation graph. In PyTorch, this is done automatically using the .backward() method.
Here is an example of backpropagation:
import torch
x = torch.tensor([2.0], requires_grad=True)
y = torch.tensor([5.0], requires_grad=True)
z = x * y
z.backward()
print(x.grad) # tensor([5.])
print(y.grad) # tensor([2.])
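In practice, backward() is usually called on a scalar loss, and the resulting gradients drive a parameter update. The following is a minimal sketch of that pattern with a single made-up data point and one learnable weight; it illustrates the idea rather than reproducing a real training script:
import torch
# Hypothetical example: fit y = w * x to one data point by gradient descent
w = torch.tensor(0.5, requires_grad=True)
x_data, y_data = torch.tensor(2.0), torch.tensor(4.0)
lr = 0.1
for step in range(3):
    loss = (w * x_data - y_data) ** 2   # squared error
    loss.backward()                     # backpropagation fills w.grad
    with torch.no_grad():
        w -= lr * w.grad                # gradient-descent update
    w.grad.zero_()                      # clear the accumulated gradient
    print(step, loss.item(), w.item())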