
[Pytorch 기본] Neural Network

gyubinc 2023. 1. 21. 02:19
This post is a write-up for my own review, based on the PyTorch tutorial, with explanations added and the code modified.

Neural Networks


A typical training procedure for a neural network is as follows:

1. Define the learnable parameters (or weights)
2. Iterate over a dataset of inputs, repeating steps 3-6 below
3. Process the input through the network
4. Compute the loss
5. Propagate gradients back into the network's parameters
6. Update the weights (weight = weight - learning_rate * gradient)

Define the network


The parent class nn.Module encapsulates the network's parameters and provides helpers such as moving them to the GPU (a short device example follows the network definition below).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 5*5 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify with a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
print(net)
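
As a quick illustration of the nn.Module helpers mentioned above, the following sketch (an addition for illustration; it falls back to the CPU if no CUDA device is available) moves the whole network, and therefore all of its parameters, to a device with a single call:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net.to(device)  # moves every parameter registered on the nn.Module

# inputs must live on the same device as the parameters
dummy = torch.randn(1, 1, 32, 32, device=device)
print(net(dummy).size())  # torch.Size([1, 10])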
 
You only have to define the forward function; the backward function (where gradients are computed) is automatically defined for you by autograd.
params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's .weight
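
To inspect every learnable parameter rather than just the first one, you can iterate over net.named_parameters(); a minimal sketch:

# print the name and shape of every learnable tensor registered on the module
for name, p in net.named_parameters():
    print(name, tuple(p.size()))
# conv1.weight (6, 1, 5, 5), conv1.bias (6,), ..., fc3.bias (10,)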
 
Create a random 32x32 input and pass it through the network (32x32 is the input size this network expects).
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

net.zero_grad()
out.backward(torch.randn(1, 10))
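
Note that torch.nn only supports mini-batches: nn.Conv2d expects a 4D tensor of shape (nSamples, nChannels, Height, Width), which is why the dummy input above has a leading batch dimension of 1. If you have a single sample of shape (1, 32, 32), you can add a fake batch dimension with unsqueeze:

single = torch.randn(1, 32, 32)   # one sample: (channels, height, width)
batched = single.unsqueeze(0)     # add a fake batch dimension -> (1, 1, 32, 32)
print(net(batched).size())        # torch.Size([1, 10])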

 

Loss Function


A loss function takes an (output, target) pair and computes a value that estimates how far the output is from the target.

There are many different loss functions in the nn package; one of the simplest is MSE (mean-squared error), nn.MSELoss.
output = net(input)
target = torch.randn(10)  # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as output
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)
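
As a sanity check on what nn.MSELoss computes (with its default mean reduction), the same value can be reproduced by hand as the mean of the squared element-wise differences:

# mean-squared error computed manually; matches criterion(output, target)
manual = ((output - target) ** 2).mean()
print(torch.allclose(loss, manual))  # True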

 

The network we have built performs the following sequence of operations:

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d -> flatten -> linear -> relu -> linear -> relu -> linear
      -> MSELoss
      -> loss

Following loss.grad_fn backwards walks a few steps of this graph:
print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU
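
If you want to traverse the whole graph rather than just the first few nodes, a small recursive helper (a sketch added here for illustration, not part of the tutorial) can follow next_functions until it reaches the leaves:

def walk(fn, depth=0):
    # recursively print the autograd graph starting from a grad_fn node
    if fn is None:
        return
    print('  ' * depth + type(fn).__name__)
    for next_fn, _ in fn.next_functions:
        walk(next_fn, depth + 1)

walk(loss.grad_fn)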

 

Backprop

Before running backpropagation you must zero the existing gradient buffers; otherwise the newly computed gradients are accumulated on top of the ones already stored (demonstrated right after the code below).
net.zero_grad()     # zeroes the gradient buffers of all parameters

print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)

loss.backward()

print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)
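
To see the accumulation behaviour mentioned above, run two backward passes without zeroing in between; since the weights and the input are unchanged, the stored gradient simply doubles (a small sketch):

net.zero_grad()
criterion(net(input), target).backward()
g1 = net.conv1.bias.grad.clone()

# second backward pass WITHOUT zeroing: new gradients are added to the buffers
criterion(net(input), target).backward()
print(torch.allclose(net.conv1.bias.grad, 2 * g1))  # True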

 

Update the weights

 
There are many rules for updating the weights; the simplest one used in practice is Stochastic Gradient Descent (SGD):

SGD: weight = weight - learning_rate * gradient
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
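
The .data idiom above comes from the original tutorial; an equivalent and nowadays more commonly recommended way to perform the same manual update is to wrap it in torch.no_grad() so autograd does not track the update:

# same SGD update, performed outside of autograd tracking
with torch.no_grad():
    for f in net.parameters():
        f -= f.grad * learning_rate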

 

Beyond plain SGD, many other update rules exist (Nesterov SGD, Adam, RMSProp, and so on), and they are all available through the torch.optim package.
import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()    # Does the update
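
For completeness, a minimal sketch (reusing the net, criterion, input, and target defined above) that repeats this optimizer step a few times, closing the loop described at the top of the post:

optimizer = optim.SGD(net.parameters(), lr=0.01)

for step in range(5):                 # iterate over the (dummy) data
    optimizer.zero_grad()             # zero the gradient buffers
    output = net(input)               # forward pass
    loss = criterion(output, target)  # compute the loss
    loss.backward()                   # backpropagate gradients
    optimizer.step()                  # update the weights
    print(step, loss.item())          # the loss should shrink on this fixed pair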
