Paper Reading 📜/Alignment Problem of LLM

Aligning Large Language Models through Synthetic Feedback (paper review)

The overview of this paper: Aligning LLMs to human values has become important because it enables fine-grained steering of LLMs. However, alignment requires a substantial amount of human demonstrations and feedback. Recent open-source models have replicated the alignment learning process by distilling data from already-aligned LLMs such as InstructGPT and ChatGPT. This reduces human effort but depends heavily on the teacher model. This paper introduces a new framework that needs almost no human labor and does not rely on a pre-aligned LLM. The framework's process is..

ICIL: In-Context Instruction Learning (paper review)

The overview of this paper: Instruction learning has been approached as a fine-tuning problem, covering instruction tuning and RLHF, in which LLMs are fine-tuned on a variety of tasks together with instructions. In-Context Instruction Learning (ICIL) applies in-context learning to instruction learning. ICIL substantially improves the zero-shot task generalization of both pre-trained and instruction-fine-tuned models. One key advantage of ICIL is that, to evaluate every task, it uses a single fixed..

LIMA: Less Is More for Alignment (paper review)

The overview of this paper: LLMs are trained in two stages: (1) unsupervised pre-training on raw text to learn general-purpose representations, and (2) large-scale instruction tuning and RL to align with end tasks and user preferences. To measure the relative importance of these two stages, the paper trains LIMA, a LLaMA-65B model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any RL or human preference modeling. From a few training examples that include complex queries, LIMA..

Red Teaming Language Models with Language Models (paper review)

The overview of this paper: LMs can sometimes harm users in unexpected ways. Prior work relied on human annotators to characterize harmful behavior. However, human annotation is expensive, which limits the number and diversity of test cases. This paper proposes automatically finding cases where a target LM behaves in a harmful way by using another LM to generate "red teaming" test cases. It then evaluates the target LM's responses to the generated test questions with a classifier trained to detect offensive content, and finds tens of thousands of offensive replies in a 280B-parameter LM chatbot..
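
The loop described in this excerpt (one LM writes test questions, the target LM answers, a classifier flags offensive replies) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: `red_lm`, `target_lm`, and `offensive_score` are hypothetical stand-in callables, the toy functions exist purely so the sketch runs, and the 0.5 threshold is an assumed value.

```python
from typing import Callable, List, Tuple


def red_team(
    red_lm: Callable[[int], List[str]],
    target_lm: Callable[[str], str],
    offensive_score: Callable[[str], float],
    num_cases: int,
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Tuple[str, str, float]]:
    """Return (question, reply, score) triples the classifier flags as offensive."""
    failures = []
    for question in red_lm(num_cases):   # 1. another LM writes the test cases
        reply = target_lm(question)      # 2. the target LM answers each one
        score = offensive_score(reply)   # 3. a classifier scores the reply
        if score >= threshold:
            failures.append((question, reply, score))
    return failures


# Toy stand-ins (hypothetical) so the sketch runs end to end.
def toy_red_lm(n: int) -> List[str]:
    return [f"test question {i}" for i in range(n)]


def toy_target_lm(question: str) -> str:
    return "rude reply" if question.endswith("1") else "polite reply"


def toy_offensive_score(reply: str) -> float:
    return 0.9 if "rude" in reply else 0.1


flagged = red_team(toy_red_lm, toy_target_lm, toy_offensive_score, num_cases=3)
```

In practice the three callables would wrap real models; keeping them as plain functions makes it easy to swap in a different generator or classifier without touching the loop.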

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (paper review)

Unlike previous reviews, this paper review was written as a PowerPoint presentation. A brief overview of the paper follows; for the details, please see the attached PowerPoint file. Explanations are written in the PowerPoint comments and slide notes, so please refer to them. This post was written with reference to the following YouTube video. The overview of this paper: The paper applies preference modeling (PM) and reinforcement learning from human feedback (RLHF) to fine-tune LMs so that they act in a helpful and harmless way. The paper shows that this alignment training improves performance on most NLP evaluations and that, for Python coding, summarization, and similar..

Exploring the Benefits of Training Expert Language Models over Instruction Tuning (paper review)

The overview of this paper: Recently, LMs instruction-tuned on a variety of tasks, an approach known as multi-task prompted fine-tuning (MT), have shown the ability to generalize to unseen tasks. Prior work found that scaling the number of training tasks is a key factor in building a strong MT LM. This paper, however, shows that an expert LM trained on just a single task can outperform an MT LM trained on more than 300 different tasks. This finding calls into question the earlier belief that increasing the number of tasks makes the model stronger. Building on this finding, the paper trains a separate expert LM per training task instead of a single MT LM..

Scaling Instruction-Finetuned Language Models (paper review)

The overview of this paper: Fine-tuning LMs on a collection of datasets phrased as instructions has been shown to improve performance and generalization to unseen tasks. This paper examines instruction fine-tuning from three particular angles: (1) scaling the number of tasks, (2) scaling the model size, and (3) fine-tuning on chain-of-thought (CoT) data. Instruction fine-tuning along these lines substantially improves performance. Overall, instruction fine-tuning is a general method for improving the performance and usability of pre-trained LMs. Table of Contents 1. Introduction 2. Flan Finetuning..

Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-shot Learners (paper review)

The overview of this paper: Meta-training fine-tunes an LM on a variety of downstream tasks by maximizing the likelihood of the target label given the task instruction and the input instance. This training improves the model's zero-shot task generalization. However, even meta-trained LMs struggle to generalize to tasks containing novel labels that were not seen during meta-training. To address this, the paper proposes Flipped Learning. In contrast to conventional meta-training, this method trains the LM to generate the task instruction given the input instance and the label. Flipp..
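
The difference between the two training directions can be illustrated by how each one splits an example into conditioning text and target text. The sketch below is a hedged illustration only: the `Example` and `TrainingPair` types, the newline-joined formatting, and the sample sentiment task are assumptions made for this example, not the paper's actual data format.

```python
from typing import NamedTuple


class Example(NamedTuple):
    instruction: str
    input_text: str
    label: str


class TrainingPair(NamedTuple):
    context: str  # text the LM conditions on
    target: str   # text whose likelihood is maximized


def meta_training_pair(ex: Example) -> TrainingPair:
    # Conventional direction: predict the label from instruction + input.
    return TrainingPair(context=f"{ex.instruction}\n{ex.input_text}", target=ex.label)


def flipped_pair(ex: Example) -> TrainingPair:
    # Flipped direction: generate the instruction from input + label.
    return TrainingPair(context=f"{ex.input_text}\n{ex.label}", target=ex.instruction)


# Made-up example task, used only to show the two splits side by side.
example = Example(
    instruction="Is the review positive or negative?",
    input_text="The movie was wonderful.",
    label="positive",
)
direct = meta_training_pair(example)
flipped = flipped_pair(example)
```

Because the label moves from the target side to the conditioning side, the flipped direction never has to produce the label string itself, which is where the excerpt says conventional meta-training struggles when labels are new.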

Super-Natural Instructions: Generalization via Declarative Instructions on 1600+ NLP Tasks (paper review)

The overview of this paper: How well can NLP models generalize to a variety of unseen tasks when given task instructions? To address this question, the paper introduces Super-Natural Instructions, a benchmark of 1,616 diverse NLP tasks together with their expert-written instructions. This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions: models are trained to follow instructions on a subset of tasks and are evaluated on the remaining unseen tasks. In addition, to follow a variety of in-context instructions, the paper..

FLAN: Fine-tuned Language Models are Zero-shot Learners (paper review)

The overview of this paper: This paper proposes a simple method for improving the zero-shot learning ability of LMs. The method, instruction tuning, fine-tunes an LM on a collection of datasets described via instructions, and it substantially improves zero-shot performance on unseen tasks. The paper takes a 137B-parameter pre-trained LM and instruction-tunes it on 60 NLP datasets expressed through natural-language instruction templates. The resulting instruction-tuned model is called FLAN and is evaluated on unseen task types. FLAN substantially outperforms its unmodified counterpart..

Cartinoe
List of posts in the 'Paper Reading 📜/Alignment Problem of LLM' category