LLM

Insight 😎

What is the best way to evaluate an LM? 😎

This post is a little different from my previous ones: I will explain the material using PPT slides. As the title suggests, the topic is evaluation metrics for LMs! 😊 We will look at the existing evaluation metrics, examine what problems they have, and finally see what improvements have been proposed. If you have a question or spot something that looks like an error while reading the slides, leave a comment on the PPT or on this post and I will reply! Enjoy! 🤩 https://docs.google.com/presentation/d/1XL_B0nI-yp2dgLDVrEzTlLcg9DpUnALBklmpJ4iOZRw/e..

Insight 😎

How has scaling law developed in NLP? 🤔

Before Starting.. In 2017, the 'Transformer' was proposed, a groundbreaking model that overturned the landscape of deep learning, NLP included. This post is not about the Transformer itself, so I will not go into its details, but to follow the post you do need to know how large the model was. The Transformer had 465M parameters. Yet only three years later came GPT-3 (175B), a model large enough to make that size feel tiny, and even larger models are still being released today. Why have LMs kept growing like this? The reason is Kaplan et al. 2020..

Cartinoe
List of posts tagged 'LLM'