
[Pytorch 기본] Neural Network

gyubinc 2023. 1. 21. 02:19
This post is a write-up for my own review, based on the PyTorch tutorial, with explanations added and the code modified.

Neural Networks


A typical training procedure for a neural network is as follows:

1. Define the learnable parameters (or weights)
2. Iterate over a dataset of inputs, repeating steps 3-6 below
3. Process the input through the network
4. Compute the loss
5. Propagate gradients back into the network's parameters
6. Update the weights (weight = weight - learning_rate * gradient)

Define the network


The parent class nn.Module encapsulates the network's parameters and provides helpers such as moving them to the GPU (a short device example follows the network definition below).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 5*5 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify with a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
print(net)
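
As a quick illustration of the nn.Module helpers mentioned above, the following sketch (an addition for illustration; it falls back to the CPU if no CUDA device is available) moves the whole network, and therefore all of its parameters, to a device with a single call:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net.to(device)  # moves every parameter registered on the nn.Module

# inputs must live on the same device as the parameters
dummy = torch.randn(1, 1, 32, 32, device=device)
print(net(dummy).size())  # torch.Size([1, 10])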
 
You only have to define the forward function; the backward function (where gradients are computed) is automatically defined for you by autograd.
params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's .weight
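
To inspect every learnable parameter rather than just the first one, you can iterate over net.named_parameters(); a minimal sketch:

# print the name and shape of every learnable tensor registered on the module
for name, p in net.named_parameters():
    print(name, tuple(p.size()))
# conv1.weight (6, 1, 5, 5), conv1.bias (6,), ..., fc3.bias (10,)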
 
Create a random 32x32 input and pass it through the network (32x32 is the input size this network expects).
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

net.zero_grad()
out.backward(torch.randn(1, 10))
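
Note that torch.nn only supports mini-batches: nn.Conv2d expects a 4D tensor of shape (nSamples, nChannels, Height, Width), which is why the dummy input above has a leading batch dimension of 1. If you have a single sample of shape (1, 32, 32), you can add a fake batch dimension with unsqueeze:

single = torch.randn(1, 32, 32)   # one sample: (channels, height, width)
batched = single.unsqueeze(0)     # add a fake batch dimension -> (1, 1, 32, 32)
print(net(batched).size())        # torch.Size([1, 10])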

 

Loss Function


A loss function takes an (output, target) pair and computes a value that estimates how far the output is from the target.

There are many different loss functions in the nn package; one of the simplest is MSE (mean-squared error), nn.MSELoss.
output = net(input)
target = torch.randn(10)  # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as output
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)
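
As a sanity check on what nn.MSELoss computes (with its default mean reduction), the same value can be reproduced by hand as the mean of the squared element-wise differences:

# mean-squared error computed manually; matches criterion(output, target)
manual = ((output - target) ** 2).mean()
print(torch.allclose(loss, manual))  # True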

 

The network we have built performs the following sequence of operations:

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d -> flatten -> linear -> relu -> linear -> relu -> linear
      -> MSELoss
      -> loss

Following loss.grad_fn backwards walks a few steps of this graph:
print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU
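
If you want to traverse the whole graph rather than just the first few nodes, a small recursive helper (a sketch added here for illustration, not part of the tutorial) can follow next_functions until it reaches the leaves:

def walk(fn, depth=0):
    # recursively print the autograd graph starting from a grad_fn node
    if fn is None:
        return
    print('  ' * depth + type(fn).__name__)
    for next_fn, _ in fn.next_functions:
        walk(next_fn, depth + 1)

walk(loss.grad_fn)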

 

Backprop

Before running backpropagation you must zero the existing gradient buffers; otherwise the newly computed gradients are accumulated on top of the ones already stored (demonstrated right after the code below).
net.zero_grad()     # zeroes the gradient buffers of all parameters

print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)

loss.backward()

print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)
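
To see the accumulation behaviour mentioned above, run two backward passes without zeroing in between; since the weights and the input are unchanged, the stored gradient simply doubles (a small sketch):

net.zero_grad()
criterion(net(input), target).backward()
g1 = net.conv1.bias.grad.clone()

# second backward pass WITHOUT zeroing: new gradients are added to the buffers
criterion(net(input), target).backward()
print(torch.allclose(net.conv1.bias.grad, 2 * g1))  # True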

 

Update the weights

 
There are many rules for updating the weights; the simplest one used in practice is Stochastic Gradient Descent (SGD):

SGD: weight = weight - learning_rate * gradient
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
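
The .data idiom above comes from the original tutorial; an equivalent and nowadays more commonly recommended way to perform the same manual update is to wrap it in torch.no_grad() so autograd does not track the update:

# same SGD update, performed outside of autograd tracking
with torch.no_grad():
    for f in net.parameters():
        f -= f.grad * learning_rate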

 

Beyond plain SGD, many other update rules exist (Nesterov SGD, Adam, RMSProp, and so on), and they are all available through the torch.optim package.
import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()    # Does the update
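
For completeness, a minimal sketch (reusing the net, criterion, input, and target defined above) that repeats this optimizer step a few times, closing the loop described at the top of the post:

optimizer = optim.SGD(net.parameters(), lr=0.01)

for step in range(5):                 # iterate over the (dummy) data
    optimizer.zero_grad()             # zero the gradient buffers
    output = net(input)               # forward pass
    loss = criterion(output, target)  # compute the loss
    loss.backward()                   # backpropagate gradients
    optimizer.step()                  # update the weights
    print(step, loss.item())          # the loss should shrink on this fixed pair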
