Rnn 1 (신경망 복습)

2023-04-19 23 분 소요

신경망 복습

1. 수학과 파이썬 복습

1.1 벡터와 행렬

from IPython.display import Image
Image('./deep_learning_2_images/fig 1-1.png', width=250)

png

import numpy as np

x = np.array([1, 2, 3])

x.__class__

numpy.ndarray

x.shape

(3,)

x.ndim

W = np.array([[1, 2, 3], [4, 5, 6]])

W.shape

(2, 3)

W.ndim

1.2 행렬의 원소별 연산

W = np.array([[1, 2, 3], [4, 5, 6]])
X = np.array([[0, 1, 2], [3, 4, 5]])

W + X

array([[ 1,  3,  5],
       [ 7,  9, 11]])

W * X

array([[ 0,  2,  6],
       [12, 20, 30]])

1.3 브로드캐스트

Image('./deep_learning_2_images/fig 1-3.png', width=400)

png

A = np.array([[1, 2], [3, 4]])

A * 10

array([[10, 20],
       [30, 40]])

from IPython.display import Image
Image('./deep_learning_2_images/fig 1-4.png', width=400)

png

A = np.array([[1, 2], [3, 4]])
B = np.array([10, 20])

A * B

array([[10, 40],
       [30, 80]])

1.4 벡터의 내적과 행렬의 곱

from IPython.display import Image
Image('./deep_learning_2_images/e 1-1.png', width=300)

png

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

np.dot(a, b)

Image('./deep_learning_2_images/fig 1-5.png', width=400)

png

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

np.matmul(A, B)

array([[19, 22],
       [43, 50]])

1.5 행렬 형상 확인

Image('./deep_learning_2_images/fig 1-6.png', width=400)

png

2. 신경망의 추론

2.1 신경망 추론 전체 그림

Image('./deep_learning_2_images/fig 1-7.png', width=400)

png

Image('./deep_learning_2_images/e 1-2.png', width=200)

png

Image('./deep_learning_2_images/e 1-3.png', width=400)

png

Image('./deep_learning_2_images/fig 1-8.png', width=300)

png

Image('./deep_learning_2_images/fig 1-9.png', width=300)

png

N = 10
x = np.random.randn(N, 2) # 입력
W1 = np.random.randn(2, 4) # 가중치
b1 = np.random.randn(4) # 편향
h = np.matmul(x, W1) + b1 # 어파인연산, 파이토치에서 nn.Linear()

h.shape # (N, 4)

(10, 4)

시그모이드 함수

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

a = sigmoid(h) # 0~1 사이의 값

a.shape

(10, 4)

신경망 추론 전체 코드

N = 10
x = np.random.randn(N, 2) # 입력
W1 = np.random.randn(2, 4) # 가중치
b1 = np.random.randn(4) # 편향
W2 = np.random.randn(4, 3)
b2 = np.random.randn(3)

h = np.matmul(x, W1) + b1 # 어파인연산, 파이토치에서 nn.Linear()
a = sigmoid(h) 
s = np.matmul(a, W2) + b2 # 어파인연산

s.shape # (N, 3)

(10, 3)

2.2 계층으로 클래스화 및 순전파 구현

각 계층의 순전파 구현

class Sigmoid:
    def __init__(self):
        self.params = []

    def forward(self, x):
        return 1 / (1 + np.exp(-x))


class Affine:
    def __init__(self, W, b):
        self.params = [W, b]

    def forward(self, x):
        W, b = self.params
        out = np.dot(x, W) + b
        return out

Image('./deep_learning_2_images/fig 1-11.png', width=500)

png

TwoLayerNet 구현

class TwoLayerNet:
    def __init__(self, input_size, hidden_size, output_size):
        I, H, O = input_size, hidden_size, output_size
        
        # 가중치와 편향 초기화
        W1 = np.random.randn(I, H)
        b1 = np.random.randn(H)
        W2 = np.random.randn(H, O)
        b2 = np.random.randn(O)
        
        # 계층 생성
        self.layers = [
            Affine(W1, b1),
            Sigmoid(),
            Affine(W2, b2)
        ]
        
        # 가중치 관리
        self.params = []
        for layer in self.layers:
            self.params += layer.params # self.params <= [W1, b1, W2, b2]
                            
    def predict(self, x):      
        for layer in self.layers:
            x = layer.forward(x)            
        return x # (N, O)

TwoLayerNet 클래스를 이용한 신경망 추론

x = np.random.randn(10, 2)
model = TwoLayerNet(input_size=2, hidden_size=4, output_size=3)
s = model.predict(x) # s.shape : (10, 3)
s.shape

(10, 3)

3. 신경망의 학습

3.1 손실 함수

Image('./deep_learning_2_images/fig 1-12.png', width=500)

png

소프트맥스 함수

Image('./deep_learning_2_images/e 1-6.png', width=150)

png

교차 엔트로피

Image('./deep_learning_2_images/e 1-7.png', width=150)

png

미니 배치를 고려한 교차 엔트로피

Image('./deep_learning_2_images/e 1-8.png', width=200)

png

학습을 고려한 신경망의 계층 구성

Image('./deep_learning_2_images/fig 1-13.png', width=400)

png

3.2 미분과 기울기

Image('./deep_learning_2_images/fig 1-14.png', width=200)

png

벡터 x의 미분

Image('./deep_learning_2_images/e 1-9.png', width=180)

png

행렬 W의 미분

Image('./deep_learning_2_images/e 1-10.png', width=200)

png

3.3 연쇄 법칙

합성 함수의 미분

Image('./deep_learning_2_images/e 1-11.png', width=100)

png

3.4 계산 그래프

Image('./deep_learning_2_images/fig 1-15.png', width=250)

png

Image('./deep_learning_2_images/fig 1-16.png', width=450)

png

Image('./deep_learning_2_images/fig 1-17.png', width=450)

png

Image('./deep_learning_2_images/fig 1-18.png', width=400)

png

곱셈 노드

Image('./deep_learning_2_images/fig 1-19.png', width=450)

png

분기 노드

Image('./deep_learning_2_images/fig 1-20.png', width=450)

png

Repeat 노드

Image('./deep_learning_2_images/fig 1-21.png', width=450)

png

D, N = 8, 7
x = np.random.randn(1, D) # 입력
y = np.repeat(x, N, axis=0) # 순전파
dy = np.random.randn(N, D) # y와 동일한 형상의 무작위 기울기값
dx = np.sum(dy, axis=0, keepdims=True) # 역전파

(1, 8)

Sum 노드

Image('./deep_learning_2_images/fig 1-22.png', width=450)

png

D, N = 8, 7
x = np.random.randn(N, D)
y = np.sum(x, axis=0, keepdims=True)
dy = np.random.randn(1, D)
dx = np.repeat(dy, N, axis=0)

MatMul 노드

Image('./deep_learning_2_images/fig 1-23.png', width=300)

png

x의 i번째 원소에 대한 미분

Image('./deep_learning_2_images/e 1-12.png', width=130)

png

Image('./deep_learning_2_images/e 1-13.png', width=220)

png

벡터 x에 대한 미분

Image('./deep_learning_2_images/e 1-14.png', width=110)

png

Image('./deep_learning_2_images/fig 1-24.png', width=300)

png

Image('./deep_learning_2_images/fig 1-25.png', width=200)

png

Image('./deep_learning_2_images/fig 1-26.png', width=300)

png

MatMul 계층

class MatMul:
    def __init__(self, W):
        self.params = [W]
        self.grads = [np.zeros_like(W)]
        self.x = None

    def forward(self, x):
        W, = self.params
        out = np.dot(x, W)        
        self.x = x
        return out

    def backward(self, dout):
        W, = self.params
        dx = np.dot(dout, W.T)
        dW = np.dot(self.x.T, dout)     
        self.grads[0][...] = dW
        return dx

ellipsis (…) : 깊은 복사

3.5 기울기 도출과 역전파 구현

Sigmoid 계층

Image('./deep_learning_2_images/e 1-15.png', width=100)

png

Image('./deep_learning_2_images/fig 1-28.png', width=200)

png

Sigmoid 계층 구현

class Sigmoid:
    def __init__(self):
        self.params, self.grads = [], []
        self.out = None

    def forward(self, x):
        out = 1/(1 + np.exp(-x))
        self.out = out
        return out

    def backward(self, dout):
        dx = dout * (1.0 - self.out) * self.out
        return dx

Affine 계층

Image('./deep_learning_2_images/fig 1-29.png', width=300)

png

Affine 계층 구현

class Affine:
    def __init__(self, W, b):
        self.params = [W, b]
        self.grads = [np.zeros_like(W), np.zeros_like(b)]
        self.x = None

    def forward(self, x):
        W, b = self.params
        out = np.dot(x, W) + b
        self.x = x
        return out

    def backward(self, dout):
        W, b = self.params
        dx = np.dot(dout, W.T)
        dW = np.dot(self.x.T, dout)
        db = np.sum(dout, axis=0)

        self.grads[0][...] = dW
        self.grads[1][...] = db
        return dx

Softmax with Loss 계층

Image('./deep_learning_2_images/fig 1-30.png', width=350)

png

3.6 가중치 갱신

확률적 경사하강법

Image('./deep_learning_2_images/e 1-16.png', width=150)

png

확률적 경사하강법 구현

class SGD:
    '''
    확률적 경사하강법(Stochastic Gradient Descent)
    '''
    def __init__(self, lr=0.01):
        self.lr = lr
        
    def update(self, params, grads):
        for i in range(len(params)):
            params[i] -= self.lr * grads[i]

4. 신경망 학습

4.1 스파이럴 데이터셋

import sys
sys.path.append('..')  # 부모 디렉터리의 파일을 가져올 수 있도록 설정
from dataset import spiral
import matplotlib.pyplot as plt

x, t = spiral.load_data()
print('x', x.shape)  # (300, 2)
print('t', t.shape)  # (300, 3)

x (300, 2)
t (300, 3)

# 데이터점 플롯
N = 100
CLS_NUM = 3
markers = ['o', 'x', '^']
for i in range(CLS_NUM):
    plt.scatter(x[i*N:(i+1)*N, 0], x[i*N:(i+1)*N, 1], s=40, marker=markers[i])
plt.show()

png

4.2 신경망 구현

def softmax(x):
    if x.ndim == 2:
        x = x - x.max(axis=1, keepdims=True)
        x = np.exp(x)
        x /= x.sum(axis=1, keepdims=True)
    elif x.ndim == 1:
        x = x - np.max(x)
        x = np.exp(x) / np.sum(np.exp(x))

    return x

def cross_entropy_error(y, t):
    if y.ndim == 1:
        t = t.reshape(1, t.size)
        y = y.reshape(1, y.size)
        
    # 정답 데이터가 원핫 벡터일 경우 정답 레이블 인덱스로 변환
    if t.size == y.size:
        t = t.argmax(axis=1)
             
    batch_size = y.shape[0]

    return -np.sum(np.log(y[np.arange(batch_size), t] + 1e-7)) / batch_size

class SoftmaxWithLoss:
    def __init__(self):
        self.params, self.grads = [], []
        self.y = None  # softmax의 출력
        self.t = None  # 정답 레이블

    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)

        # 정답 레이블이 원핫 벡터일 경우 정답의 인덱스로 변환
        if self.t.size == self.y.size:
            self.t = self.t.argmax(axis=1)

        loss = cross_entropy_error(self.y, self.t)
        return loss

    def backward(self, dout=1):
        batch_size = self.t.shape[0]

        dx = self.y.copy()
        dx[np.arange(batch_size), self.t] -= 1
        dx *= dout
        dx = dx / batch_size

        return dx

class TwoLayerNet:
    def __init__(self, input_size, hidden_size, output_size):
        I, H, O = input_size, hidden_size, output_size

        # 가중치와 편향 초기화
        W1 = 0.01 * np.random.randn(I, H)
        b1 = np.zeros(H)
        W2 = 0.01 * np.random.randn(H, O)
        b2 = np.zeros(O)

        # 계층 생성
        self.layers = [
            Affine(W1, b1),
            Sigmoid(),
            Affine(W2, b2)
        ]
        self.loss_layer = SoftmaxWithLoss()

        # 모든 가중치와 기울기를 리스트에 모은다.
        self.params, self.grads = [], []
        for layer in self.layers:
            self.params += layer.params
            self.grads += layer.grads

    def predict(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

    def forward(self, x, t):
        score = self.predict(x)
        loss = self.loss_layer.forward(score, t)
        return loss

    def backward(self, dout=1):
        dout = self.loss_layer.backward(dout)
        for layer in reversed(self.layers):
            dout = layer.backward(dout)
        return dout

4.3 학습용 코드

# 1. 하이퍼파라미터 설정
max_epoch = 300
batch_size = 30
hidden_size = 10
learning_rate = 1.0

# 2. 데이터 읽기, 모델과 옵티마이저 생성
x, t = spiral.load_data()
model = TwoLayerNet(input_size=2, hidden_size=hidden_size, output_size=3)
optimizer = SGD(lr=learning_rate)

# 학습에 사용하는 변수
data_size = len(x)
max_iters = data_size // batch_size
total_loss = 0
loss_count = 0
loss_list = []

for epoch in range(max_epoch):
    # 3. 데이터 뒤섞기
    idx = np.random.permutation(data_size)
    x = x[idx]
    t = t[idx]

    for iters in range(max_iters):
        batch_x = x[iters*batch_size:(iters+1)*batch_size]
        batch_t = t[iters*batch_size:(iters+1)*batch_size]

        # 4. 기울기를 구해 매개변수 갱신
        loss = model.forward(batch_x, batch_t)        
        model.backward()
        optimizer.update(model.params, model.grads)

        total_loss += loss
        loss_count += 1

        # 5. 정기적으로 학습 경과 출력
        if (iters+1) % 10 == 0:
            avg_loss = total_loss / loss_count
            print('| 에폭 %d |  반복 %d / %d | 손실 %.2f'
                  % (epoch + 1, iters + 1, max_iters, avg_loss))
            loss_list.append(avg_loss)
            total_loss, loss_count = 0, 0

| 에폭 1 |  반복 10 / 10 | 손실 1.13
| 에폭 2 |  반복 10 / 10 | 손실 1.13
| 에폭 3 |  반복 10 / 10 | 손실 1.12
| 에폭 4 |  반복 10 / 10 | 손실 1.12
| 에폭 5 |  반복 10 / 10 | 손실 1.11
| 에폭 6 |  반복 10 / 10 | 손실 1.14
| 에폭 7 |  반복 10 / 10 | 손실 1.16
| 에폭 8 |  반복 10 / 10 | 손실 1.11
| 에폭 9 |  반복 10 / 10 | 손실 1.12
| 에폭 10 |  반복 10 / 10 | 손실 1.13
| 에폭 11 |  반복 10 / 10 | 손실 1.12
| 에폭 12 |  반복 10 / 10 | 손실 1.11
| 에폭 13 |  반복 10 / 10 | 손실 1.09
| 에폭 14 |  반복 10 / 10 | 손실 1.08
| 에폭 15 |  반복 10 / 10 | 손실 1.04
| 에폭 16 |  반복 10 / 10 | 손실 1.03
| 에폭 17 |  반복 10 / 10 | 손실 0.96
| 에폭 18 |  반복 10 / 10 | 손실 0.92
| 에폭 19 |  반복 10 / 10 | 손실 0.92
| 에폭 20 |  반복 10 / 10 | 손실 0.87
| 에폭 21 |  반복 10 / 10 | 손실 0.85
| 에폭 22 |  반복 10 / 10 | 손실 0.82
| 에폭 23 |  반복 10 / 10 | 손실 0.79
| 에폭 24 |  반복 10 / 10 | 손실 0.78
| 에폭 25 |  반복 10 / 10 | 손실 0.82
| 에폭 26 |  반복 10 / 10 | 손실 0.78
| 에폭 27 |  반복 10 / 10 | 손실 0.76
| 에폭 28 |  반복 10 / 10 | 손실 0.76
| 에폭 29 |  반복 10 / 10 | 손실 0.78
| 에폭 30 |  반복 10 / 10 | 손실 0.75
| 에폭 31 |  반복 10 / 10 | 손실 0.78
| 에폭 32 |  반복 10 / 10 | 손실 0.77
| 에폭 33 |  반복 10 / 10 | 손실 0.77
| 에폭 34 |  반복 10 / 10 | 손실 0.78
| 에폭 35 |  반복 10 / 10 | 손실 0.75
| 에폭 36 |  반복 10 / 10 | 손실 0.74
| 에폭 37 |  반복 10 / 10 | 손실 0.76
| 에폭 38 |  반복 10 / 10 | 손실 0.76
| 에폭 39 |  반복 10 / 10 | 손실 0.73
| 에폭 40 |  반복 10 / 10 | 손실 0.75
| 에폭 41 |  반복 10 / 10 | 손실 0.76
| 에폭 42 |  반복 10 / 10 | 손실 0.76
| 에폭 43 |  반복 10 / 10 | 손실 0.76
| 에폭 44 |  반복 10 / 10 | 손실 0.74
| 에폭 45 |  반복 10 / 10 | 손실 0.75
| 에폭 46 |  반복 10 / 10 | 손실 0.73
| 에폭 47 |  반복 10 / 10 | 손실 0.72
| 에폭 48 |  반복 10 / 10 | 손실 0.73
| 에폭 49 |  반복 10 / 10 | 손실 0.72
| 에폭 50 |  반복 10 / 10 | 손실 0.72
| 에폭 51 |  반복 10 / 10 | 손실 0.72
| 에폭 52 |  반복 10 / 10 | 손실 0.72
| 에폭 53 |  반복 10 / 10 | 손실 0.74
| 에폭 54 |  반복 10 / 10 | 손실 0.74
| 에폭 55 |  반복 10 / 10 | 손실 0.72
| 에폭 56 |  반복 10 / 10 | 손실 0.72
| 에폭 57 |  반복 10 / 10 | 손실 0.71
| 에폭 58 |  반복 10 / 10 | 손실 0.70
| 에폭 59 |  반복 10 / 10 | 손실 0.72
| 에폭 60 |  반복 10 / 10 | 손실 0.70
| 에폭 61 |  반복 10 / 10 | 손실 0.71
| 에폭 62 |  반복 10 / 10 | 손실 0.72
| 에폭 63 |  반복 10 / 10 | 손실 0.70
| 에폭 64 |  반복 10 / 10 | 손실 0.71
| 에폭 65 |  반복 10 / 10 | 손실 0.73
| 에폭 66 |  반복 10 / 10 | 손실 0.70
| 에폭 67 |  반복 10 / 10 | 손실 0.71
| 에폭 68 |  반복 10 / 10 | 손실 0.69
| 에폭 69 |  반복 10 / 10 | 손실 0.70
| 에폭 70 |  반복 10 / 10 | 손실 0.71
| 에폭 71 |  반복 10 / 10 | 손실 0.68
| 에폭 72 |  반복 10 / 10 | 손실 0.69
| 에폭 73 |  반복 10 / 10 | 손실 0.67
| 에폭 74 |  반복 10 / 10 | 손실 0.68
| 에폭 75 |  반복 10 / 10 | 손실 0.67
| 에폭 76 |  반복 10 / 10 | 손실 0.66
| 에폭 77 |  반복 10 / 10 | 손실 0.69
| 에폭 78 |  반복 10 / 10 | 손실 0.64
| 에폭 79 |  반복 10 / 10 | 손실 0.68
| 에폭 80 |  반복 10 / 10 | 손실 0.64
| 에폭 81 |  반복 10 / 10 | 손실 0.64
| 에폭 82 |  반복 10 / 10 | 손실 0.66
| 에폭 83 |  반복 10 / 10 | 손실 0.62
| 에폭 84 |  반복 10 / 10 | 손실 0.62
| 에폭 85 |  반복 10 / 10 | 손실 0.61
| 에폭 86 |  반복 10 / 10 | 손실 0.60
| 에폭 87 |  반복 10 / 10 | 손실 0.60
| 에폭 88 |  반복 10 / 10 | 손실 0.61
| 에폭 89 |  반복 10 / 10 | 손실 0.59
| 에폭 90 |  반복 10 / 10 | 손실 0.58
| 에폭 91 |  반복 10 / 10 | 손실 0.56
| 에폭 92 |  반복 10 / 10 | 손실 0.56
| 에폭 93 |  반복 10 / 10 | 손실 0.54
| 에폭 94 |  반복 10 / 10 | 손실 0.53
| 에폭 95 |  반복 10 / 10 | 손실 0.53
| 에폭 96 |  반복 10 / 10 | 손실 0.52
| 에폭 97 |  반복 10 / 10 | 손실 0.51
| 에폭 98 |  반복 10 / 10 | 손실 0.50
| 에폭 99 |  반복 10 / 10 | 손실 0.48
| 에폭 100 |  반복 10 / 10 | 손실 0.48
| 에폭 101 |  반복 10 / 10 | 손실 0.46
| 에폭 102 |  반복 10 / 10 | 손실 0.45
| 에폭 103 |  반복 10 / 10 | 손실 0.45
| 에폭 104 |  반복 10 / 10 | 손실 0.44
| 에폭 105 |  반복 10 / 10 | 손실 0.44
| 에폭 106 |  반복 10 / 10 | 손실 0.41
| 에폭 107 |  반복 10 / 10 | 손실 0.40
| 에폭 108 |  반복 10 / 10 | 손실 0.41
| 에폭 109 |  반복 10 / 10 | 손실 0.40
| 에폭 110 |  반복 10 / 10 | 손실 0.40
| 에폭 111 |  반복 10 / 10 | 손실 0.38
| 에폭 112 |  반복 10 / 10 | 손실 0.38
| 에폭 113 |  반복 10 / 10 | 손실 0.36
| 에폭 114 |  반복 10 / 10 | 손실 0.37
| 에폭 115 |  반복 10 / 10 | 손실 0.35
| 에폭 116 |  반복 10 / 10 | 손실 0.34
| 에폭 117 |  반복 10 / 10 | 손실 0.34
| 에폭 118 |  반복 10 / 10 | 손실 0.34
| 에폭 119 |  반복 10 / 10 | 손실 0.33
| 에폭 120 |  반복 10 / 10 | 손실 0.34
| 에폭 121 |  반복 10 / 10 | 손실 0.32
| 에폭 122 |  반복 10 / 10 | 손실 0.32
| 에폭 123 |  반복 10 / 10 | 손실 0.31
| 에폭 124 |  반복 10 / 10 | 손실 0.31
| 에폭 125 |  반복 10 / 10 | 손실 0.30
| 에폭 126 |  반복 10 / 10 | 손실 0.30
| 에폭 127 |  반복 10 / 10 | 손실 0.28
| 에폭 128 |  반복 10 / 10 | 손실 0.28
| 에폭 129 |  반복 10 / 10 | 손실 0.28
| 에폭 130 |  반복 10 / 10 | 손실 0.28
| 에폭 131 |  반복 10 / 10 | 손실 0.27
| 에폭 132 |  반복 10 / 10 | 손실 0.27
| 에폭 133 |  반복 10 / 10 | 손실 0.27
| 에폭 134 |  반복 10 / 10 | 손실 0.27
| 에폭 135 |  반복 10 / 10 | 손실 0.27
| 에폭 136 |  반복 10 / 10 | 손실 0.26
| 에폭 137 |  반복 10 / 10 | 손실 0.26
| 에폭 138 |  반복 10 / 10 | 손실 0.26
| 에폭 139 |  반복 10 / 10 | 손실 0.25
| 에폭 140 |  반복 10 / 10 | 손실 0.24
| 에폭 141 |  반복 10 / 10 | 손실 0.24
| 에폭 142 |  반복 10 / 10 | 손실 0.25
| 에폭 143 |  반복 10 / 10 | 손실 0.24
| 에폭 144 |  반복 10 / 10 | 손실 0.24
| 에폭 145 |  반복 10 / 10 | 손실 0.23
| 에폭 146 |  반복 10 / 10 | 손실 0.24
| 에폭 147 |  반복 10 / 10 | 손실 0.23
| 에폭 148 |  반복 10 / 10 | 손실 0.23
| 에폭 149 |  반복 10 / 10 | 손실 0.22
| 에폭 150 |  반복 10 / 10 | 손실 0.22
| 에폭 151 |  반복 10 / 10 | 손실 0.22
| 에폭 152 |  반복 10 / 10 | 손실 0.22
| 에폭 153 |  반복 10 / 10 | 손실 0.22
| 에폭 154 |  반복 10 / 10 | 손실 0.22
| 에폭 155 |  반복 10 / 10 | 손실 0.22
| 에폭 156 |  반복 10 / 10 | 손실 0.21
| 에폭 157 |  반복 10 / 10 | 손실 0.21
| 에폭 158 |  반복 10 / 10 | 손실 0.20
| 에폭 159 |  반복 10 / 10 | 손실 0.21
| 에폭 160 |  반복 10 / 10 | 손실 0.20
| 에폭 161 |  반복 10 / 10 | 손실 0.20
| 에폭 162 |  반복 10 / 10 | 손실 0.20
| 에폭 163 |  반복 10 / 10 | 손실 0.21
| 에폭 164 |  반복 10 / 10 | 손실 0.20
| 에폭 165 |  반복 10 / 10 | 손실 0.20
| 에폭 166 |  반복 10 / 10 | 손실 0.19
| 에폭 167 |  반복 10 / 10 | 손실 0.19
| 에폭 168 |  반복 10 / 10 | 손실 0.19
| 에폭 169 |  반복 10 / 10 | 손실 0.19
| 에폭 170 |  반복 10 / 10 | 손실 0.19
| 에폭 171 |  반복 10 / 10 | 손실 0.19
| 에폭 172 |  반복 10 / 10 | 손실 0.18
| 에폭 173 |  반복 10 / 10 | 손실 0.18
| 에폭 174 |  반복 10 / 10 | 손실 0.18
| 에폭 175 |  반복 10 / 10 | 손실 0.18
| 에폭 176 |  반복 10 / 10 | 손실 0.18
| 에폭 177 |  반복 10 / 10 | 손실 0.18
| 에폭 178 |  반복 10 / 10 | 손실 0.18
| 에폭 179 |  반복 10 / 10 | 손실 0.17
| 에폭 180 |  반복 10 / 10 | 손실 0.17
| 에폭 181 |  반복 10 / 10 | 손실 0.18
| 에폭 182 |  반복 10 / 10 | 손실 0.17
| 에폭 183 |  반복 10 / 10 | 손실 0.18
| 에폭 184 |  반복 10 / 10 | 손실 0.17
| 에폭 185 |  반복 10 / 10 | 손실 0.17
| 에폭 186 |  반복 10 / 10 | 손실 0.18
| 에폭 187 |  반복 10 / 10 | 손실 0.17
| 에폭 188 |  반복 10 / 10 | 손실 0.17
| 에폭 189 |  반복 10 / 10 | 손실 0.17
| 에폭 190 |  반복 10 / 10 | 손실 0.17
| 에폭 191 |  반복 10 / 10 | 손실 0.16
| 에폭 192 |  반복 10 / 10 | 손실 0.17
| 에폭 193 |  반복 10 / 10 | 손실 0.16
| 에폭 194 |  반복 10 / 10 | 손실 0.16
| 에폭 195 |  반복 10 / 10 | 손실 0.16
| 에폭 196 |  반복 10 / 10 | 손실 0.16
| 에폭 197 |  반복 10 / 10 | 손실 0.16
| 에폭 198 |  반복 10 / 10 | 손실 0.15
| 에폭 199 |  반복 10 / 10 | 손실 0.16
| 에폭 200 |  반복 10 / 10 | 손실 0.16
| 에폭 201 |  반복 10 / 10 | 손실 0.15
| 에폭 202 |  반복 10 / 10 | 손실 0.16
| 에폭 203 |  반복 10 / 10 | 손실 0.16
| 에폭 204 |  반복 10 / 10 | 손실 0.15
| 에폭 205 |  반복 10 / 10 | 손실 0.16
| 에폭 206 |  반복 10 / 10 | 손실 0.15
| 에폭 207 |  반복 10 / 10 | 손실 0.15
| 에폭 208 |  반복 10 / 10 | 손실 0.15
| 에폭 209 |  반복 10 / 10 | 손실 0.15
| 에폭 210 |  반복 10 / 10 | 손실 0.15
| 에폭 211 |  반복 10 / 10 | 손실 0.15
| 에폭 212 |  반복 10 / 10 | 손실 0.15
| 에폭 213 |  반복 10 / 10 | 손실 0.15
| 에폭 214 |  반복 10 / 10 | 손실 0.15
| 에폭 215 |  반복 10 / 10 | 손실 0.15
| 에폭 216 |  반복 10 / 10 | 손실 0.14
| 에폭 217 |  반복 10 / 10 | 손실 0.14
| 에폭 218 |  반복 10 / 10 | 손실 0.15
| 에폭 219 |  반복 10 / 10 | 손실 0.14
| 에폭 220 |  반복 10 / 10 | 손실 0.14
| 에폭 221 |  반복 10 / 10 | 손실 0.14
| 에폭 222 |  반복 10 / 10 | 손실 0.14
| 에폭 223 |  반복 10 / 10 | 손실 0.14
| 에폭 224 |  반복 10 / 10 | 손실 0.14
| 에폭 225 |  반복 10 / 10 | 손실 0.14
| 에폭 226 |  반복 10 / 10 | 손실 0.14
| 에폭 227 |  반복 10 / 10 | 손실 0.14
| 에폭 228 |  반복 10 / 10 | 손실 0.14
| 에폭 229 |  반복 10 / 10 | 손실 0.13
| 에폭 230 |  반복 10 / 10 | 손실 0.14
| 에폭 231 |  반복 10 / 10 | 손실 0.13
| 에폭 232 |  반복 10 / 10 | 손실 0.14
| 에폭 233 |  반복 10 / 10 | 손실 0.13
| 에폭 234 |  반복 10 / 10 | 손실 0.13
| 에폭 235 |  반복 10 / 10 | 손실 0.13
| 에폭 236 |  반복 10 / 10 | 손실 0.13
| 에폭 237 |  반복 10 / 10 | 손실 0.14
| 에폭 238 |  반복 10 / 10 | 손실 0.13
| 에폭 239 |  반복 10 / 10 | 손실 0.13
| 에폭 240 |  반복 10 / 10 | 손실 0.14
| 에폭 241 |  반복 10 / 10 | 손실 0.13
| 에폭 242 |  반복 10 / 10 | 손실 0.13
| 에폭 243 |  반복 10 / 10 | 손실 0.13
| 에폭 244 |  반복 10 / 10 | 손실 0.13
| 에폭 245 |  반복 10 / 10 | 손실 0.13
| 에폭 246 |  반복 10 / 10 | 손실 0.13
| 에폭 247 |  반복 10 / 10 | 손실 0.13
| 에폭 248 |  반복 10 / 10 | 손실 0.13
| 에폭 249 |  반복 10 / 10 | 손실 0.13
| 에폭 250 |  반복 10 / 10 | 손실 0.13
| 에폭 251 |  반복 10 / 10 | 손실 0.13
| 에폭 252 |  반복 10 / 10 | 손실 0.12
| 에폭 253 |  반복 10 / 10 | 손실 0.12
| 에폭 254 |  반복 10 / 10 | 손실 0.12
| 에폭 255 |  반복 10 / 10 | 손실 0.12
| 에폭 256 |  반복 10 / 10 | 손실 0.12
| 에폭 257 |  반복 10 / 10 | 손실 0.12
| 에폭 258 |  반복 10 / 10 | 손실 0.12
| 에폭 259 |  반복 10 / 10 | 손실 0.13
| 에폭 260 |  반복 10 / 10 | 손실 0.12
| 에폭 261 |  반복 10 / 10 | 손실 0.13
| 에폭 262 |  반복 10 / 10 | 손실 0.12
| 에폭 263 |  반복 10 / 10 | 손실 0.12
| 에폭 264 |  반복 10 / 10 | 손실 0.13
| 에폭 265 |  반복 10 / 10 | 손실 0.12
| 에폭 266 |  반복 10 / 10 | 손실 0.12
| 에폭 267 |  반복 10 / 10 | 손실 0.12
| 에폭 268 |  반복 10 / 10 | 손실 0.12
| 에폭 269 |  반복 10 / 10 | 손실 0.11
| 에폭 270 |  반복 10 / 10 | 손실 0.12
| 에폭 271 |  반복 10 / 10 | 손실 0.12
| 에폭 272 |  반복 10 / 10 | 손실 0.12
| 에폭 273 |  반복 10 / 10 | 손실 0.12
| 에폭 274 |  반복 10 / 10 | 손실 0.12
| 에폭 275 |  반복 10 / 10 | 손실 0.11
| 에폭 276 |  반복 10 / 10 | 손실 0.12
| 에폭 277 |  반복 10 / 10 | 손실 0.12
| 에폭 278 |  반복 10 / 10 | 손실 0.11
| 에폭 279 |  반복 10 / 10 | 손실 0.11
| 에폭 280 |  반복 10 / 10 | 손실 0.11
| 에폭 281 |  반복 10 / 10 | 손실 0.11
| 에폭 282 |  반복 10 / 10 | 손실 0.12
| 에폭 283 |  반복 10 / 10 | 손실 0.11
| 에폭 284 |  반복 10 / 10 | 손실 0.11
| 에폭 285 |  반복 10 / 10 | 손실 0.11
| 에폭 286 |  반복 10 / 10 | 손실 0.11
| 에폭 287 |  반복 10 / 10 | 손실 0.11
| 에폭 288 |  반복 10 / 10 | 손실 0.12
| 에폭 289 |  반복 10 / 10 | 손실 0.11
| 에폭 290 |  반복 10 / 10 | 손실 0.11
| 에폭 291 |  반복 10 / 10 | 손실 0.11
| 에폭 292 |  반복 10 / 10 | 손실 0.11
| 에폭 293 |  반복 10 / 10 | 손실 0.11
| 에폭 294 |  반복 10 / 10 | 손실 0.11
| 에폭 295 |  반복 10 / 10 | 손실 0.12
| 에폭 296 |  반복 10 / 10 | 손실 0.11
| 에폭 297 |  반복 10 / 10 | 손실 0.12
| 에폭 298 |  반복 10 / 10 | 손실 0.11
| 에폭 299 |  반복 10 / 10 | 손실 0.11
| 에폭 300 |  반복 10 / 10 | 손실 0.11

# 학습 결과 플롯
plt.rc("font", family="Malgun Gothic")

plt.plot(np.arange(len(loss_list)), loss_list, label='train')
plt.xlabel('반복 (x10)')
plt.ylabel('손실')
plt.show()

png

Image('./deep_learning_2_images/fig 1-33.png', width=400)

png

Reference

밑바닥부터 시작하는 딥러닝 2 (사이토 고키 저)

Twitter Facebook LinkedIn