Paper Reading 📜/Deep Learning


What Are Zero-shot, One-shot, and Few-shot Learning?

Reading recent machine learning papers, you frequently run into the terms zero-shot, one-shot, and few-shot. In this post we look at what these terms mean and when each method is used. Overview: In machine learning there are various methods for dealing with insufficient data. N-shot learning refers to the setting where a deep learning model is trained on only a handful of images, typically five or fewer. The N-shot learning setting involves $n$ labeled samples for each of the $K$ classes. N-shot learning is divided into three approaches: zero-shot learning, one-shot learning, and few-shot learning. These three methods are traini..
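The "$n$ labeled samples per each of $K$ classes" setup above can be sketched as building a K-way, n-shot support set. A minimal illustration in plain Python, assuming a generic list of (example, label) pairs (all names here are hypothetical, not from the post):

```python
import random
from collections import defaultdict

def make_episode(samples, K=5, n=1):
    """Build a K-way, n-shot support set from (x, label) pairs.

    n = 0 corresponds to zero-shot (no labeled examples per class),
    n = 1 to one-shot, and a small n > 1 to few-shot.
    """
    by_class = defaultdict(list)
    for x, y in samples:
        by_class[y].append(x)
    classes = random.sample(sorted(by_class), K)  # choose K classes
    # draw n labeled examples for each of the K chosen classes
    return {c: random.sample(by_class[c], n) for c in classes}

# toy dataset: 10 examples in each of 6 classes
data = [(f"img{c}_{i}", c) for c in range(6) for i in range(10)]
support = make_episode(data, K=5, n=1)  # a 5-way, one-shot episode
```

The rest of the training data (the query set) is then classified against this tiny support set, which is what makes the regime "few-shot".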


What Is Prompt Engineering?

With the development of various LMs, people are encountering unprecedented new technologies. I believe I have mentioned this in several posts already, but ChatGPT continues to show boundless potential. A widely used method for improving the performance of these LMs is Prompt Engineering. Since Prompt Engineering came up in several papers reviewed on this blog, I felt a deeper understanding was needed, so I am writing this post. 🤓 Before diving into Prompt Engineering, let's first look at what a prompt is! 🔥 What is a prompt? A prompt is the input used to elicit a response from an LLM. The following figure shows an example of a prompt. To explain with an example, ..
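Since the prompt is just the input text handed to the LLM, it can be assembled programmatically. A minimal sketch of a few-shot prompt template (the template format and function names here are hypothetical illustrations, not from the post):

```python
def build_prompt(instruction, examples, query):
    """Assemble a simple few-shot prompt string for an LLM.

    Each example is an (input, output) pair shown before the real
    query, so the model can infer the task format from the demos.
    """
    lines = [instruction, ""]
    for x, y in examples:
        lines += [f"Input: {x}", f"Output: {y}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

prompt = build_prompt(
    "Classify the sentiment of each sentence as positive or negative.",
    [("I loved this movie.", "positive"),
     ("The food was awful.", "negative")],
    "The service was fantastic.",
)
```

Prompt Engineering is then the practice of iterating on pieces like `instruction` and the demonstrations until the model's responses improve.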


LSTM vs GRU, Which Is Better?: A Review of "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling"

The overview of this paper: The RNN, i.e., the recurrent neural network, has been in use for a long time. However, as the amount of sequential data to be processed grew and tasks became more complex, it became clear that RNNs suffer from the long-term dependency problem. (If you are unfamiliar with long-term dependencies, please see here!) To address this, new RNN models equipped with gating units, such as the LSTM and the GRU, were introduced. Both perform quite well, but opinions were divided on which of the two is the superior model. This paper therefore set out to settle the debate by comparing the performance of the two models more carefully. Table ..
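One concrete difference behind the LSTM-vs-GRU comparison is model size: an LSTM cell has four gate blocks, a GRU cell only three. A rough pure-Python sketch of the per-layer parameter counts, using the standard gate formulas (this is a back-of-the-envelope illustration, not the paper's exact experimental setup):

```python
def lstm_params(d, h):
    """Parameters of one LSTM layer: 4 gate blocks (input, forget,
    output, cell candidate), each with an input-to-hidden matrix
    (h x d), a hidden-to-hidden matrix (h x h), and a bias (h)."""
    return 4 * (h * d + h * h + h)

def gru_params(d, h):
    """Parameters of one GRU layer: 3 gate blocks
    (reset, update, candidate)."""
    return 3 * (h * d + h * h + h)

# with equal hidden size h, a GRU has 3/4 the parameters of an LSTM
d, h = 128, 256
print(lstm_params(d, h), gru_params(d, h))
```

This is why fair comparisons (as in the paper) match the models on total parameter count rather than on hidden size.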


Understanding LSTM Networks, Made Easy

The purpose of this post: Although the LSTM network is an old model, it has long been used across numerous fields. With so many better-performing models appearing these days, it no longer enjoys its former prominence, but the fact that the LSTM was a truly remarkable model remains unchanged. So!! In this post we take a close look at the LSTM network. This post was written with reference to colah's blog. Table of Contents 1. RNN (Recurrent Neural Networks) 2. The Long-Term Dependency Problem 3. LSTM Networks 3-1. The Core Idea of the LSTM 3-2. LSTM Step by Ste..


A Review of "Distilling the Knowledge in a Neural Network"

Why I read this paper: While studying DistilBERT, covered in a previous post, in more detail, I wanted a deeper understanding of Knowledge Distillation, the core of DistilBERT, and so I ended up reading this paper, which introduced Knowledge Distillation. Knowledge Distillation is a method that can mitigate the model-capacity problem caused by the rapidly growing parameter counts of today's neural networks, delivering dramatic savings in time at the cost of a small loss in performance. Unlike my previous posts, this one tries a somewhat more flexible structure. (I tried, but it may not have worked out~ ^^) The overvi..
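The core mechanism of the paper is training a small student to match the teacher's temperature-softened class probabilities. A minimal pure-Python sketch of that soft-target term (the function names are hypothetical; a real loss also mixes in the usual hard-label cross-entropy):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax: a higher T softens the distribution,
    exposing the teacher's relative confidences over wrong classes."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between teacher soft targets and student
    predictions, both computed at temperature T."""
    p = softmax(teacher_logits, T)   # teacher soft targets
    q = softmax(student_logits, T)   # student soft predictions
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

loss = distill_loss([1.0, 0.5, -0.2], [2.0, 1.0, -1.0], T=2.0)
```

The cross-entropy is minimized when the student's softened distribution matches the teacher's exactly, which is what drives the knowledge transfer.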


A History of CNN Networks

What is the purpose of this post? In this post we look at the history of CNN networks. There have been many CNN architectures, for example LeNet and AlexNet. This post surveys which CNN networks have appeared over the years. Table of Contents 1. LeNet 2. AlexNet 3. VGGNet 4. GoogLeNet 5. ResNet 6. ResNeXt 7. Xception 8. MobileNet 9. DenseNet 10. EfficientNet 11. ConvNeXt 1. LeNet (1998): Gradient-Based Learning Applied to Document Recognition. LeNet..

Cartinoe
List of posts in the 'Paper Reading 📜/Deep Learning' category