Advanced Neural Net (2)

๊ณผ์ œ ๋‚ด์šฉ ์„ค๋ช…

  1. ์บ๊ธ€ Kannada MNIST๋ฅผ ์ด์šฉํ•œ ๋ฏธ๋‹ˆ๋Œ€ํšŒ

์šฐ์ˆ˜๊ณผ์ œ ์„ ์ • ์ด์œ 

์กฐ์ƒ์—ฐ๋‹˜์€ ์™„๋ฒฝํ•œ ๋…ธํŠธ๋ถ์ด์—ˆ์Šต๋‹ˆ๋‹ค. keras sklearn wrapper๋ฅผ ์ด์šฉํ•ด์„œ ๊ทธ๋ฆฌ๋“œ ์„œ์น˜ ํ•˜์ดํผ ํŒŒ๋ผ๋ฏธํ„ฐ ํŠœ๋‹์„ ์ง„ํ–‰ํ•˜์‹ ์ , hiplot์ด๋ผ๋Š” ๋ชจ๋“ˆ๋กœ ๊ฐ ๋ ˆ์ด์–ด๋ณ„ ์‹œ๊ฐํ™” ์ง„ํ–‰ํ•˜์‹œ๊ณ  ์ด๋ฅผ ํ†ตํ•ด ์„ฑ๋Šฅ์ด ๋†’์€ ๋ฐฐ์น˜์ •๊ทœํ™”, ๋“œ๋กญ์•„์›ƒ ๋“ฑ์„ ๋ถ„์„ํ•˜์‹ ์ , ๋˜ํ•œ SOPCNN์ด๋ผ๋Š” MNIST SOTA ๋ชจ๋ธ์„ ์ฐพ์•„ ๋ณด์‹ ์ , Autokeras ์‹คํ—˜๊นŒ์ง€ ๋ฐฐ์šธ๊ฒŒ ๋งŽ์€ ์ตœ๊ณ ์˜ ๋…ธํŠธ๋ถ์ด์—ˆ์Šต๋‹ˆ๋‹ค.

7์ฃผ์ฐจ: Deep Learning Framework

13๊ธฐ ์กฐ์ƒ์—ฐ

๊ณผ์ œ: Kannada MNIST (https://www.kaggle.com/c/tobigs13nn)

๋ฐ์ดํ„ฐ: Train: 42,000 rows, Test: 18,000 rows

๋ชฉ์ฐจ

  1. ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ

  2. ์—ฌ๋Ÿฌ ๋”ฅ๋Ÿฌ๋‹ ์‹ฌํ™” ๊ธฐ๋ฒ• ๋น„๊ต (with Hiplot)

    • Activation

      • ReLU / Leaky ReLU / PReLU

      • Softmax

    • Batch Norm

    • Weight Init

    • Optimizer

      • RMSprop, Adam, RAdam

    • Regularization

      • Dropout, Spatial Dropout

      • Early Stopping

      • Data Augmentation

  3. ํ•™์Šต ๋ฐ ํ•˜์ดํผ ํŒŒ๋ผ๋ฏธํ„ฐ ์กฐ์ •

0. Pre-requisite & Module Import
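The import cells did not survive the export. A minimal sketch of the modules the rest of this write-up relies on (the exact list is an assumption) might look like this:

```python
# Assumed environment: TF 2.x (tf.keras), scikit-learn, pandas, and hiplot installed.
import numpy as np
import pandas as pd
import tensorflow as tf

from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
# The built-in (now deprecated) scikit-learn wrapper used for the grid search below.
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

from sklearn.model_selection import GridSearchCV, train_test_split

import hiplot as hip  # Facebook Research's HiPlot, used for result visualization
```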


1. Data Load & Preprocessing
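The data-loading cells are missing as well. Here is a sketch under the assumption that the competition provides Kaggle-MNIST-style CSVs (a label column plus 784 pixel columns in train, an id column plus 784 pixel columns in test); file names and column layout are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed file names and column layout; adjust paths to the actual competition data.
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

y = train["label"].values                                    # integer class labels 0-9
X = train.drop(columns=["label"]).values.astype("float32")
X = (X / 255.0).reshape(-1, 28, 28, 1)                       # scale to [0, 1], add channel axis

X_test = test.drop(columns=["id"]).values.astype("float32")
X_test = (X_test / 255.0).reshape(-1, 28, 28, 1)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=42)
```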


2. ๋”ฅ๋Ÿฌ๋‹ ์‹ฌํ™” ๊ธฐ๋ฒ• Grid Search (with Hiplot)

๋‹ค์–‘ํ•œ ๊ธฐ๋ฒ•๊ณผ ํ•˜์ดํผ ํŒŒ๋ผ๋ฏธํ„ฐ ํŠœ๋‹์„ ํ†ตํ•ด ๊ฐ ๊ธฐ๋ฒ• ๋“ค์˜ ํšจ๊ณผ๋ฅผ ์ •๋ฆฌ

์ด๋ฒˆ์— ๋ฐฐ์šด ์—ฌ๋Ÿฌ๊ฐ€์ง€ ๊ธฐ๋ฒ•๋“ค์˜ ์œ /๋ฌด,ํฌ๊ธฐ ์กฐ์ ˆ ๋“ฑ์„ ํ†ตํ•ด ๊ทธ ํšจ๊ณผ๋ฅผ ๋ถ„์„ํ•˜๊ณ  ๊ฐ€์žฅ ์ตœ์ ์˜ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๊ตฌํ•ด๋ณธ๋‹ค. ํŒŒ๋ผ๋ฏธํ„ฐ๋ณ„๋กœ ์ตœ์†Œ 3๊ฐœ์—์„œ 5๊ฐœ๊นŒ์ง€ ์˜ต์…˜์„ ์ฃผ๊ณ  ์‹ถ์—ˆ์ง€๋งŒ Colab ๋ฉ”๋ชจ๋ฆฌ๊ฐ€ ๊ฐ๋‹นํ•˜์ง€ ๋ชปํ•˜๊ณ  ๊ณ„์† ํ„ฐ์ ธ์„œ ๊ฐ€๋Šฅํ•œ ์ˆ˜์ค€์œผ๋กœ ๋‚ฎ์ถ”์—ˆ๋‹ค.


2.2 ๊ฒฐ๊ณผ ์‹œ๊ฐํ™”

์‹œ๊ฐํ™”๋กœ Facebook Research์˜ Hiplot( https://github.com/facebookresearch/hiplot )์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค.

ํŠนํžˆ ๊ณ ์ฐจ์›์˜ ๋ฐ์ดํ„ฐ ๊ฐ„์˜ ํŒจํ„ด์„ ๋ณด๊ธฐ์— ์šฉ์ดํ•˜๋‹ค๊ณ  ํ•œ๋‹ค.


[Figure: HiPlot parallel-coordinates plot of all runs]

[Figure: HiPlot plot of the top models by mean_score and std_score]

์ธ์‚ฌ์ดํŠธ

  1. ํ™•์‹คํžˆ ์ตœ์ƒ์œ„๊ถŒ์˜ ๋ชจ๋ธ์—” BatchNorm์ด ๊ฑฐ์˜ ๋ชจ๋‘ ์ ์šฉ๋˜์–ด ์žˆ๋‹ค.

  2. Dropout์€ 0.2์—์„œ ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค.

  3. Lr์€ 0.01์ด ์ข‹๊ฒŒ ๋‚˜์™”์ง€๋งŒ, ์ด๋Š” Epoch์ด 20 ๋ฐ–์— ์•ˆ๋˜์–ด ์•„์ง ํ•™์Šต์ด ์ง„ํ–‰ ์ค‘์ผ ๊ฐ€๋Šฅ์„ฑ์ด ํฌ๋‹ค.

  4. Batch Size๋Š” ๋‚ฎ์„ ์ˆ˜๋ก ์ข‹์€ ์„ฑ๋Šฅ์ด ๋‚˜์™”๋‹ค. ์™„์ „ ๋น„๋ก€๋ผ ๋ณด๊ธด ํž˜๋“ค๊ฒ ์ง€๋งŒ ๋…ผ๋ฌธ์—์„œ๋„ 256์„ ์ ์šฉํ•œ ๊ฒƒ์„ ๋ณด๋ฉด ๋„ˆ๋ฌด ํฌ๋ฉด ๊ฐ Batch์˜ ํŠน์„ฑ์ด ๋ญ‰๊ฐœ์ง€๋Š”๊ฒŒ ์•„๋‹๊นŒ ์ถ”์ธก๋œ๋‹ค.

  5. Optimizer๋Š” Adam๊ณผ RMSprop ๋ชจ๋‘ ๊ดœ์ฐฎ์€ ์„ฑ๋Šฅ์ด ๋‚˜์™”๋‹ค.

3. Model Research & Selection

MNIST ๋ฐ์ดํ„ฐ์…‹์€ ๋Œ€ํ‘œ์ ์ธ ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ๋งŽ์€ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์ด SOTA๊ธ‰ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ๊ณ  ์žˆ๋‹ค.

๊ทธ์ค‘์—์„œ ํŠนํžˆ CNN ๋ชจ๋ธ์ด ๋น ๋ฅธ ํ•™์Šต ์†๋„์™€ ์›”๋“ฑํ•œ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ๊ณ  ์žˆ์œผ๋ฉฐ ์ด๋ฒˆ ๊ณผ์ œ์˜ ์ „์‹ ์ธ Kannada MNIST์—์„œ๋„ ๋งŽ์€ ์‚ฌ๋žŒ๋“ค์ด ํ•ด๋‹น ๋ชจ๋ธ์„ ํ†ตํ•ด ์ข‹์€ ์„ฑ์ ์„ ๊ฑฐ๋‘์—ˆ๋‹ค. ์ด๋ฅผ ์ฐธ๊ณ ํ•˜์—ฌ ๋ชจ๋ธ์„ ์„ค๊ณ„ํ•˜๊ณ , Papers with code ์—์„œ SOTA๊ธ‰ ๋…ผ๋ฌธ ๋“ค์„ ์ฐธ๊ณ ํ•˜์—ฌ ์—ฌ๋Ÿฌ ๊ธฐ๋ฒ• ๋“ค์˜ ์žฅ๋‹จ์ ์„ ์ฐธ๊ณ ํ•˜์—ฌ ์ ์šฉํ•ด๋ณธ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ์‹ค์ œ๋กœ ๊ฝค ์ข‹์€ ์ธ์‚ฌ์ดํŠธ๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ์—ˆ๋‹ค.

3.1 SOPCNN

Stochastic Optimization of Plain Convolutional Neural Networks with Simple methods

2020 MNIST SOTA

MNIST 2020๋…„ SOTA์ธ SOPCNN ๋…ผ๋ฌธ์„ ๋ณด๋ฉด, ์ตœ์ ํ™” ๊ธฐ๋ฒ•์— ์ƒ๋‹นํžˆ ๊ณต์„ ๋“ค์˜€์Œ์„ ์•Œ ์ˆ˜ ์žˆ๊ณ  ํŠนํžˆ ์ด๋ฒˆ ๋‚ด์šฉ๊ณผ ๊ฒน์น˜๋Š” ๋ถ€๋ถ„์ด ๊ฐ€์ ธ์™€ ์ ์šฉํ•ด๋ณด๋ฉด ์ข‹์„ ์ ์ด ๋งŽ์•˜๋‹ค. ์ด ๋…ผ๋ฌธ์ด ์ง€ํ–ฅํ•˜๋Š” ๋ฐ”๋Š” CNN๋ชจ๋ธ์ด ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ณด์ด์ง€๋งŒ epoch์ด ์ฆ๊ฐ€ํ•จ์— ๋”ฐ๋ฅธ overfitting ๋ฌธ์ œ๊ฐ€ ์‹ฌํ•˜์—ฌ ์ด๋ฅผ ์–ด๋–ป๊ฒŒ ์ž˜ ์ตœ์ ํ™”ํ• ์ง€๋กœ Data Augmentation๊ณผ ํŠนํžˆ Dropout์„ ์ฃผ๋กœ ๋‹ค๋ฃจ๊ณ  ์žˆ๋‹ค.

Architecture and Design

๊ธฐ๋ณธ์  ๋ชจ๋ธ ๊ตฌ์„ฑ์€ SimpleNet์˜ ๊ตฌ์„ฑ์„ ๋”ฐ๋ฅด๊ณ  ์žˆ๋‹ค๊ณ  ํ•œ๋‹ค. MNIST ๋ชจ๋ธ์„ ์˜ˆ๋ฅผ ๋“ค๋ฉด, ์ด 4๊ฐœ์˜ Conv2D layer ๊ฐ€ ์žˆ๊ณ  2๊ฐœ ๋งˆ๋‹ค Max Pooling Layer๊ฐ€ ๋ถ™๋Š”๋‹ค. ๋’ค์ด์–ด 2๊ฐœ์˜ Fully Connected Layer, ๋งˆ์ง€๋ง‰์—” Softmax Activation Layer๋ฅผ ๋ถ™์—ฌ ๋ชจ๋ธ์„ ์™„์„ฑ์‹œ์ผฐ๋‹ค. ์—ฌ๊ธฐ์„œ ํ•™์Šต๋ฅ ์€ 0.01๋กœ ์ฃผ์—ˆ๊ณ  Dropout ์œ„์น˜ ๋ฐ FC Layer์˜ ํฌ๊ธฐ๋Š” ํŒŒ๋ผ๋ฏธํ„ฐ ์กฐ์ •์„ ํ†ตํ•ด ๊ฒฐ์ •ํ•˜์˜€๋‹ค.

๊ฐ€์žฅ ์ธ์ƒ๊นŠ์—ˆ๋˜ ์ 

  1. Dropout์€ Softmax ์ง์ „์— ํ•˜๋‚˜๋งŒ ์žˆ๋Š” ๊ฒƒ์ด ๊ฐ€์žฅ ์ข‹๋‹ค :Maxpool ๋’ค์— ๋ฐฐ์น˜ํ•˜๊ธฐ๋„ ํ•˜๊ณ , Spatial Dropout๋„ ์ ์šฉํ•ด๋ณด์•˜์ง€๋งŒ ๊ทธ๋ƒฅ Regular Dropout์„ FC ๋‹ค์Œ์— ๋ฐฐ์น˜ํ•œ ๊ฒƒ์ด ๊ฐ€์žฅ ์„ฑ๋Šฅ์ด ์ข‹์•˜๋‹ค๊ณ  ํ•œ๋‹ค.

  2. FC Layer 2048, Drop rate 0.8์ด ๊ฐ€์žฅ ์ข‹๋‹ค: ์™œ ์ด ๋…ผ๋ฌธ ์ œ๋ชฉ์—์„œ Stochastic์ด๋ž€ ๋ง์„ ์ผ๋Š”์ง€ ์•Œ ์ˆ˜ ์žˆ๋Š” ๋Œ€๋ชฉ์ด๋‹ค. ๋ฏฟ๊ธฐ์ง€ ์•Š๋Š” ๋“œ๋กญ์œจ์ด๋ผ 5๋ฒˆ์˜ ๋ฐ˜๋ณต ์‹คํ—˜์„ ํ†ตํ•ด ํ‰๊ท  0.18%์˜ ์—๋Ÿฌ์œจ์ด ๋‚˜์˜จ๋‹ค๋Š” ๊ฒƒ์„ ์ž…์ฆํ•˜์˜€๋‹ค.

๊ทธ ์™ธ์—๋„ Data Augmentation ์…‹ํŒ… ๋“ฑ์„ ์†Œ๊ฐœํ•˜๊ณ  ์žˆ์–ด ํ•™์Šต์— ์ฐธ๊ณ ํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค.

3.1.1 ๊ตฌํ˜„


ํ…์„œํ”Œ๋กœ์šฐ๊ฐ€ ์นœ์ ˆํ•˜๊ฒŒ Drop rate๊ฐ€ 0.5๊ฐ€ ๋„˜์œผ๋‹ˆ ํ˜น์‹œ ๊ณผ๊ฑฐ์—์„œ ์˜ค์‹  ๋ถ„์ธ์ง€ ์—ฌ์ญ™๊ณ  ์žˆ์ง€๋งŒ ๋ฌด์‹œํ•˜๊ณ  ํ•™์Šต์„ ์ง„ํ–‰ํ•ด๋ณธ๋‹ค.


3.1.2 ํ•™์Šต ๋ฐ ๊ฒฐ๊ณผ

์‹ค์ œ Colab์—์„œ ๋Œ๋ ค๋ณธ ๊ฒฐ๊ณผ ํ•™์Šต์ด ๋„ˆ๋ฌด ์•ˆ๋˜์—ˆ๋‹ค. Val_Acc๊ฐ€ 0.98์— ์ ‘๊ทผ์กฐ์ฐจ ๋ชปํ•˜๊ณ  ๋ชจ๋ธ์ด ์ „ํ˜€ Simpleํ•˜์ง€ ์•Š์•„ ํ•™์Šต์—๋„ ์˜ค๋žœ์‹œ๊ฐ„์ด ๊ฑธ๋ ธ๋‹ค.

๊ทธ ์ดํ›„๋กœ FC์˜ ํ•˜์ดํผ ํŒŒ๋ผ๋ฏธํ„ฐ 2048๋ฅผ 1024๋กœ, Drop Rate๋ฅผ 0.6, 0.4๋กœ ๊ฐ๊ธฐ ์‹คํ—˜ํ•ด๋ณด์•˜์„ ๋•Œ Val Acc 99.6๋กœ ๋น„์Šทํ•˜๊ฒŒ ๋†’์€ ์„ฑ๋Šฅ์ด ๋‚˜์™”๋‹ค.

์—ฌ๊ธฐ์„œ ๋ฐ์ดํ„ฐ ํฌ๊ธฐ๋‚˜ ๋ถ„๋ฅ˜ํ•  ๊ฐฏ์ˆ˜๋กœ FC ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ข€ ๋” ์ž‘๊ฒŒ ์กฐ์ •ํ•  ํ•„์š”๊ฐ€ ์žˆ๋‹ค๋Š” ๊ฒƒ์„ ์•Œ ์ˆ˜ ์žˆ์—ˆ์œผ๋ฉฐ, ๋…ผ๋ฌธ์—์„  Epoch์„ 2000๊นŒ์ง€ ์ง„ํ–‰ํ•˜๋Š”๋ฐ ์ง€๊ธˆ Colab์—์„œ ๊ทธ๊ฑด ํž˜๋“ค๊ธฐ ๋•Œ๋ฌธ์— Drop rate์—์„œ ์–ด๋А์ •๋„ ํƒ€ํ˜‘์„ ๋ณด์•„ ๋น ๋ฅธ ํ•™์Šต์„ ์ง„ํ–‰ํ•ด์•ผ ๊ฒ ๋‹ค๋Š” ๋ฐฉํ–ฅ์„ฑ์„ ์„ธ์šธ ์ˆ˜ ์žˆ์—ˆ๋‹ค.

3.2 CNN (VGG + Data Augmentation)

https://www.kaggle.com/benanakca/kannada-mnist-cnn-tutorial-with-app-top-2

https://www.kaggle.com/c/Kannada-MNIST ์—์„œ ์ƒ์œ„ 2%์˜ ์„ฑ๋Šฅ์ด ๋‚˜์˜จ ๋ชจ๋ธ์„ ์ฐพ์„ ์ˆ˜ ์žˆ์—ˆ๊ณ  ์นœ์ ˆํ•˜๊ฒŒ ์—ฌ๋Ÿฌ ๊ธฐ๋ฒ•๋“ค์„ ์†Œ๊ฐœํ•˜๊ณ  ์žˆ์—ˆ๋‹ค.

๊ทธ ์ค‘์—์„œ ํŠนํžˆ ImageDataGenerator์™€ ReduceLROnPlateau์ด ์ธ์ƒ์ ์ด์˜€๋Š”๋ฐ ์ „์ž๋Š” ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ๋ฅผ ๋žœ๋ค์œผ๋กœ ๋ณ€ํ™”์‹œ์ผœ ๊ธฐ์กด ๋ฐ์ดํ„ฐ์— Overfitting๋˜๋Š” ๊ฒƒ์„ ๋ฐฉ์ง€ํ•˜๊ธฐ ์œ„ํ•œ Data Augmentationํˆด๋กœ tf.keras์—์„œ ๋ถˆ๋Ÿฌ์˜ฌ ์ˆ˜ ์žˆ๋‹ค. ํ›„์ž๋Š” ๊ทธ ๋œป๋Œ€๋กœ ์•ˆ์ •๋˜๋ฉด ํ•™์Šต๋ฅ ์„ ๋‚ฎ์ถฐ์ฃผ๋Š” ์ฝœ๋ฐฑํ•จ์ˆ˜๋กœ ํ•™์Šต ์ค‘์— ์ง€ํ‘œ๋ฅผ ๊ณ„์† ๋ชจ๋‹ˆํ„ฐ๋ง ํ•˜์—ฌ ์ผ์ • ์ˆ˜์ค€ ์ด์ƒ ์•ˆ์ •์ด ๋˜๋ฉด factor * lr๋กœ ํ˜„์žฌ์˜ ํ•™์Šต๋ฅ ์„ ์ˆœ์ฐจ์ ์œผ๋กœ ๋‚ฎ์ถ”์–ด min_lr์— ๊ทผ์ ‘ํ•˜๋„๋ก ํ•œ๋‹ค.

์ฃผ์˜ํ•  ์ ์€ ImageDataGenerator์˜ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ˆ™์ง€ํ•˜์—ฌ ํ˜น์‹œ ๋ชจ๋ฅผ ์‹ค์ˆ˜๋ฅผ ๋ฐฉ์ง€ํ•ด์•ผํ•˜๋Š” ๋ฐ, ํŠนํžˆ Mnist์˜ ๊ฒฝ์šฐ flip์ด ์ผ์–ด๋‚˜์„  ์•ˆ๋˜๋ฉฐ cutout๋„ ์ง€์–‘ํ•œ๋‹ค.

์•„๋ž˜ ํ‘œ๋Š” SOPCNN์—์„œ ์ง„ํ–‰ํ•œ Data Augmentation ์ด๋‹ค.

Technique              Used
rotation               Only used with MNIST
shearing               Yes
shifting up and down   Yes
zooming                Yes
rescale                Yes
cutout                 No
flipping               No
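A minimal sketch of the augmentation and learning-rate schedule described above, in line with the table (no flips, no cutout); the exact transformation ranges and callback settings are assumptions, and pixel rescaling is assumed to have been done during preprocessing.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Rotation, shearing, shifting, and zooming only; flipping a digit changes its identity.
datagen = ImageDataGenerator(
    rotation_range=10,
    shear_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=False,
    vertical_flip=False,
)

# Step the learning rate down by `factor` whenever validation accuracy plateaus,
# approaching min_lr, exactly as described above.
reduce_lr = ReduceLROnPlateau(monitor="val_accuracy", factor=0.5,
                              patience=3, min_lr=1e-5, verbose=1)

# model.fit(datagen.flow(X_train, y_train, batch_size=256),
#           validation_data=(X_val, y_val),
#           epochs=40, callbacks=[reduce_lr])
```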


3.2.1 ๋ชจ๋ธ๋ง

VGG์™€ ์œ ์‚ฌํ•œ ๋ชจํ˜•์„ ํ•˜๊ณ  ์žˆ์œผ๋ฉฐ Conv2D(512) Layer๋Š” ์ œ๊ฑฐํ•œ ํ›„ flattenํ›„ FC(256)๋งŒ ์ฃผ์—ˆ๋‹ค๋Š” ๊ฒŒ ํŠน์ง•์ด๋‹ค.

์€๋‹‰์ธต์€ ๋ชจ๋‘ LeakyReLU๋ฅผ ์‚ฌ์šฉํ•˜์˜€๋‹ค.


Colab์—์„œ Epoch 40๋งŒํผ ํ•™์Šต์‹œํ‚จ ๊ฒฐ๊ณผ Acc 99.74๋ž€ ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์™”๋‹ค. ๋งค์šฐ ์ข‹์€ ๊ฒฐ๊ณผ๋ผ ์ด ๋ชจ๋ธ๊ณผ ์—ฌ๋Ÿฌ ๊ธฐ๋ฒ•์„ ์ ์šฉํ•˜์—ฌ ํ•˜์ดํผ ํŒŒ๋ผ๋ฏธํ„ฐ ํŠœ๋‹์„ ์ง„ํ–‰ํ•˜์˜€๋‹ค.

4. ํ•™์Šต ๋ฐ ํ•˜์ดํผ ํŒŒ๋ผ๋ฏธํ„ฐ ์กฐ์ •

VGG ๋ชจ๋ธ์— ๋ฐ”ํƒ•์„ ๋‘๊ณ  ์—ฌ๋Ÿฌ ์ตœ์ ํ™” ๊ธฐ๋ฒ•์„ ์ ์šฉํ•˜์˜€๋‹ค. Conv๋ ˆ์ด์–ด ์ถ”๊ฐ€, Relu๋กœ ๋ฐ”๊พธ๊ธฐ, FC Layer ์กฐ์ •, Dropout ์กฐ์ • ๋“ฑ์„ ํ•ด๋ณด์•˜๋‹ค.

Pytorch์—์„  Transformer ๋ชจ๋ธ๋„ ์‹คํ—˜ํ•ด๋ณด์•˜์ง€๋งŒ ์„ฑ๋Šฅ์ด ๋ณ„๋กœ ์ข‹์ง€ ์•Š์•˜๋‹ค. (0.96)

๊ฒฐ๊ณผ์ ์œผ๋กœ SOPCNN์™€ VGG๋ฅผ ์ ์ ˆํžˆ ํ˜ผํ•ฉํ•œ ๋ชจ๋ธ์ด ๊ฐ€์žฅ ์„ฑ๋Šฅ์ด ์ข‹์•˜๋‹ค.

  1. ๋งˆ์ง€๋ง‰์—๋งŒ Dropout: ์ด๋•Œ Drop-rate 0.2 ~ 0.6 ๊นŒ์ง€ ๋‹ค์–‘ํ•˜๊ฒŒ ์ฃผ์—ˆ์„ ๋•Œ 0.25๊ฐ€ ๊ฐ€์žฅ ๊ดœ์ฐฎ์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค.

  2. Conv2D(512) ํ•˜๋‚˜ ์ถ”๊ฐ€: ๊ธฐ์กด VGG์—์„œ 4๊ฐœ๊ฐ€ ์Œ“์ด์ง€๋งŒ ๋ฐ์ดํ„ฐ ํฌ๊ธฐ์™€ ์ด๋ฏธ์ง€ ํฌ๊ธฐ๋ฅผ ๊ณ ๋ คํ–ˆ์„ ๋•Œ ํ•˜๋‚˜๋งŒ ์ถ”๊ฐ€ํ•˜๋Š” ๊ฒƒ์ด ๊ฐ€์žฅ ๋‚˜์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค.

  3. FC Layer 1024: 256 ~ 2048(sopcnn) ~ 4096(vgg) ๋ชจ๋‘ ํ•ด๋ณด์•˜์„ ๋•Œ 1024๊ฐ€ ๊ฐ€์žฅ ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค.

  4. Epoch์„ ์ถฉ๋ถ„ํžˆ ์ฃผ๊ณ  Early Stopping์œผ๋กœ ์ตœ์„ ์˜ ๋ชจ๋ธ์„ ์ฐพ๋Š” ๊ฒƒ์ด ์ข‹๋‹ค.
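A small sketch of point 4: give the model a generous epoch budget and let the callbacks keep the best weights. Names such as `model`, `datagen`, and `reduce_lr` refer to the earlier sketches, and the patience values are assumptions.

```python
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

# Stop once validation accuracy stops improving and roll back to the best weights.
early_stop = EarlyStopping(monitor="val_accuracy", patience=10,
                           restore_best_weights=True, verbose=1)
checkpoint = ModelCheckpoint("best_model.h5", monitor="val_accuracy",
                             save_best_only=True, verbose=1)

# history = model.fit(datagen.flow(X_train, y_train, batch_size=256),
#                     validation_data=(X_val, y_val),
#                     epochs=200, callbacks=[early_stop, checkpoint, reduce_lr])
```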


๊ฒฐ๊ณผ ๋ฐ ๋А๋‚€์ 

์•„์ง ๋ชจ๋ธ์ด ๊นŠ์ง€ ์•Š์•„ ๊ทธ๋Ÿฐ ๊ฒƒ ๊ฐ™์ง€๋งŒ ์ข‹์€ ๋ชจ๋ธ์€ ์ฒ˜์Œ๋ถ€ํ„ฐ loss ๋–จ์–ด์ง€๋Š” ๊ฒŒ ๋‹ค๋ฅด๋‹ค. ์ข‹์€ ๋ชจ๋ธ์ผ ์ˆ˜๋ก epoch 10 ์•ˆ์ชฝ์—์„œ ๋น ๋ฅด๊ฒŒ Val_Acc๊ฐ€ ์ข‹๊ฒŒ ๋‚˜์™€ ์ดํ›„๋ฅผ ๊ฐ€๋Š ํ•ด๋ณผ ์ˆ˜ ์žˆ์—ˆ๋‹ค. ๋ฌผ๋ก  epoch์„ 2000์ •๋„๋กœ ๋‘๊ณ  ๊นŠ๊ฒŒ ํ•™์Šต์‹œํ‚จ๋‹ค๋ฉด ์ข‹๊ฒ ์ง€๋งŒ ํ•œ์ •๋œ ์ž์›์—์„œ ๊ทธ๋‚˜๋งˆ ๋‚˜์€ ๋ชจ๋ธ์„ ๊ณจ๋ผ๋‚ด๊ธฐ ์œ„ํ•œ ์ตœ์„ ์ด ์•„๋‹๊นŒ ์‹ถ๋‹ค. ์ด ๋ชจ๋ธ์˜ ๊ฒฝ์šฐ epoch 8์—์„œ 0.99์˜ val_acc๋ฅผ ๋ณด์ด๊ณ  ์ค‘๊ฐ„ ์ค‘๊ฐ„ 0.998๋ฅผ ์ƒํšŒํ•˜๊ธฐ๋„ ํ•˜์˜€๋‹ค.

๊ทธ๋ฆฌ๊ณ  ์•Œ๊ฒŒ๋œ ๊ฒƒ์ด keras layer์˜ ๊ธฐ๋ณธ kernel weight initializer๊ฐ€ xavier๋ผ๋Š” ๊ฒƒ์„ ์•Œ๊ฒŒ๋˜์—ˆ๋‹ค. ์ข€ ๋” GPU ์ž์›์ด ํ—ˆ๋ฝํ–ˆ๋‹ค๋ฉด weight initializer๋„ ๋ฐ”๊ฟ”๋ณด๊ณ  batch_size๋„ ๋” ๋‹ค์–‘ํ™”์‹œํ‚ฌ ์ˆ˜ ์žˆ์ง€ ์•Š์„๊นŒ๋ž€ ์•„์‰ฌ์›€์ด ๋‚จ๋Š”๋‹ค.

๋ฒˆ์™ธ) Auto Keras

Automl๋กœ ์ž๋™์œผ๋กœ ๋ชจ๋ธ์„ ์งœ์ฃผ๋Š” ์‹œ๋Œ€์—์„œ ๊ณผ์—ฐ ๊ทธ ์„ฑ๋Šฅ์€ ์–ด๋–จ๊นŒ


Trial complete
Trial summary
|-Trial ID: 123c8d89e202f81d1fd46a1f9201f3fe
|-Score: 0.03119376050014196
|-Best step: 0
Hyperparameters:
|-classification_head_1/dropout_rate: 0.5
|-classification_head_1/spatial_reduction_1/reduction_type: flatten
|-dense_block_1/dropout_rate: 0
|-dense_block_1/num_layers: 1
|-dense_block_1/units_0: 128
|-dense_block_1/use_batchnorm: False
|-image_block_1/augment: False
|-image_block_1/block_type: vanilla
|-image_block_1/conv_block_1/dropout_rate: 0.25
|-image_block_1/conv_block_1/filters_0_0: 32
|-image_block_1/conv_block_1/filters_0_1: 64
|-image_block_1/conv_block_1/kernel_size: 3
|-image_block_1/conv_block_1/max_pooling: True
|-image_block_1/conv_block_1/num_blocks: 1
|-image_block_1/conv_block_1/num_layers: 2
|-image_block_1/conv_block_1/separable: False
|-image_block_1/normalize: True
|-optimizer: adam

Trial complete
Trial summary
|-Trial ID: 03db11dd05b1734a2cf3413c1ac7e197
|-Score: 0.04135792684210588
|-Best step: 0
Hyperparameters:
|-classification_head_1/dropout_rate: 0
|-dense_block_1/dropout_rate: 0
|-dense_block_1/num_layers: 2
|-dense_block_1/units_0: 32
|-dense_block_1/units_1: 32
|-dense_block_1/use_batchnorm: False
|-image_block_1/augment: True
|-image_block_1/block_type: resnet
|-image_block_1/normalize: True
|-image_block_1/res_net_block_1/conv3_depth: 4
|-image_block_1/res_net_block_1/conv4_depth: 6
|-image_block_1/res_net_block_1/pooling: avg
|-image_block_1/res_net_block_1/version: v2
|-optimizer: adam

Trial complete
Trial summary
|-Trial ID: 2ce4926fd3ec015466417c00c29b3ca4
|-Score: 0.029994319529753708
|-Best step: 0
Hyperparameters:
|-classification_head_1/dropout_rate: 0.5
|-classification_head_1/spatial_reduction_1/reduction_type: flatten
|-dense_block_1/dropout_rate: 0
|-dense_block_1/num_layers: 1
|-dense_block_1/units_0: 128
|-dense_block_1/use_batchnorm: False
|-image_block_1/augment: False
|-image_block_1/block_type: vanilla
|-image_block_1/conv_block_1/dropout_rate: 0.25
|-image_block_1/conv_block_1/filters_0_0: 32
|-image_block_1/conv_block_1/filters_0_1: 64
|-image_block_1/conv_block_1/kernel_size: 3
|-image_block_1/conv_block_1/max_pooling: True
|-image_block_1/conv_block_1/num_blocks: 1
|-image_block_1/conv_block_1/num_layers: 2
|-image_block_1/conv_block_1/separable: False
|-image_block_1/normalize: True
|-optimizer: adam

์ƒ์„ฑ๋œ ๋ชจ๋ธ Summary ๋ฐ Sumbit


์‹ ๊ธฐํ•˜๊ฒŒ๋„ ๊ฝค ๋น„์Šทํ•œ ๋ชจ๋ธ์„ ์งฐ๋‹ค. ์›”์”ฌ ๋‹จ์ˆœํ•˜๊ณ  params๊ฐ€ 11๋งŒ ๋ฐ–์— ์•ˆ๋˜์ง€๋งŒ ์„ฑ๋Šฅ์€ Public Dashboard ๊ธฐ์ค€ 0.9911์ด ๋‚˜์™”๋‹ค.In [ ]:


Validation ํ•™์Šต


Validation Set๊นŒ์ง€ ํ•™์Šต์‹œํ‚ค๋‹ˆ 0.993 ์œผ๋กœ ์„ฑ๋Šฅ์ด ํ–ฅ์ƒ๋˜์—ˆ๋‹ค.
