Paper Reading 📜/Natural Language Processing

Llama's new rival, Mistral LM! 😮

The preview of Llama3..? I recently came across a model while browsing HuggingFace: Mistral LM, the model that has heated up the LLM market! Appearing like a comet in the open-source LLM scene, Mistral 7B stirred up the open-source LLM world by its arrival alone. So what did Mistral 7B do, and how, to capture everyone's attention? The answer lies in what Mistral 7B achieved: it outperforms Llama2 13B on all benchmarks; it outperforms Llama1 34B on many benchmarks (the comparison is against Llama1 rather than Llama2 because the Llama2 34B model was not released); on code-related benchmarks, CodeLlam..

SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation Review

Introduction: SelFee. SelFee is a new instruction-following LM built by the LK Lab at KAIST; it generates self-feedback on its own response and then self-revises based on that feedback. The LLaMA models (7B & 13B) were fine-tuned on 178K training instances containing self-feedback and revision data generated by ChatGPT. (Figure: an example of SelFee in operation.) On the Vicuna Evaluation, both SelFee models (7B & 13B) outperformed LLaMA, Alpaca, Vicuna, and Guanaco and performed comparably to ChatGPT. SelFee, in particular, high-quality te..

Self-Refine: Iterative Refinement with Self-Feedback Paper Review

The overview of this paper: This paper introduces Self-Refine, a method for improving an LLM's initial output through iterative feedback and refinement. The main idea is to have an LLM generate an initial output, then have the same LLM provide feedback on that output and use the feedback to refine its own output repeatedly. In short, Self-Refine uses a single LLM as the generator, the refiner, and the feedback provider. Across all evaluated tasks, outputs generated with Self-Refine, relative to outputs generated directly by the same LLM, human..
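
The generate, feedback, refine cycle summarized in the Self-Refine excerpt can be sketched roughly as follows. This is only a minimal illustration, not the paper's implementation: the prompt wording, the `max_iters` limit, the `stop_marker` convention, and the `toy_llm` stand-in are all assumptions made for the example, and a real setup would pass in an actual model call instead.

```python
from typing import Callable


def self_refine(task: str, llm: Callable[[str], str],
                max_iters: int = 3, stop_marker: str = "LOOKS GOOD") -> str:
    """One model plays generator, feedback provider, and refiner.

    The prompts and the stop_marker convention are hypothetical.
    """
    # 1) Generator: the model drafts an initial output.
    output = llm(f"Task: {task}\nAnswer:")
    for _ in range(max_iters):
        # 2) Feedback provider: the same model critiques its own output.
        feedback = llm(f"Task: {task}\nAnswer: {output}\nGive feedback:")
        if stop_marker in feedback:
            break  # the model judges the output good enough
        # 3) Refiner: the same model rewrites the output using the feedback.
        output = llm(f"Task: {task}\nAnswer: {output}\n"
                     f"Feedback: {feedback}\nImproved answer:")
    return output


def toy_llm(prompt: str) -> str:
    """Deterministic stand-in for a real LLM call (assumption for the demo)."""
    if prompt.endswith("Give feedback:"):
        return "LOOKS GOOD" if "v2" in prompt else "Too short."
    if prompt.endswith("Improved answer:"):
        return "draft v2"
    return "draft v1"


print(self_refine("write a slogan", toy_llm))  # prints: draft v2
```

No weights change anywhere in this loop; the only thing that evolves between iterations is the text passed back into the same model.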

Reflexion: Language Agents with Verbal Reinforcement Learning Paper Review

The overview of this paper: This paper introduces Reflexion, a new framework that reinforces language agents through verbal feedback instead of weight updates. Specifically, a Reflexion agent verbally reflects on task feedback signals and then keeps its own reflective text in a memory buffer to induce better decision-making in subsequent trials. Reflexion is flexible enough to incorporate feedback signals of various types and sources, and it achieves significant improvements over a baseline agent across a variety of tasks. Table of Contents 1. Introduction 2. Reflexion: reinforceme..
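
As a rough illustration of the Reflexion excerpt, the trial loop with a memory buffer of reflective text might look like the sketch below. The function names (`act`, `evaluate`, `reflect`), the `memory_size` cap, and the toy callbacks are hypothetical choices for this example and are not taken from the paper's code.

```python
from typing import Callable, List, Tuple


def reflexion_loop(task: str,
                   act: Callable[[str, List[str]], str],
                   evaluate: Callable[[str], Tuple[bool, str]],
                   reflect: Callable[[str, str], str],
                   max_trials: int = 3,
                   memory_size: int = 3) -> Tuple[str, List[str]]:
    """Retry a task, carrying verbal reflections instead of updating weights."""
    memory: List[str] = []  # buffer of reflective text from failed trials
    attempt = ""
    for _ in range(max_trials):
        attempt = act(task, memory)          # act, conditioned on past reflections
        success, signal = evaluate(attempt)  # task feedback signal
        if success:
            break
        # Turn the feedback signal into text and keep only the latest entries.
        memory.append(reflect(attempt, signal))
        memory = memory[-memory_size:]
    return attempt, memory


# Toy callbacks standing in for LLM calls and an environment (assumptions).
def toy_act(task: str, memory: List[str]) -> str:
    return "answer with units" if memory else "answer"


def toy_evaluate(attempt: str) -> Tuple[bool, str]:
    return ("units" in attempt, "missing units")


def toy_reflect(attempt: str, signal: str) -> str:
    return f"Last attempt '{attempt}' failed: {signal}."


final_attempt, final_memory = reflexion_loop("report the speed", toy_act,
                                             toy_evaluate, toy_reflect)
print(final_attempt)  # prints: answer with units
```

The second trial succeeds only because the first trial's reflection is present in the memory passed to `toy_act`, which is the point of keeping reflective text between attempts.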

It makes API calls that even GPT-4 struggles with?!? - Gorilla🦍: Large Language Model Connected with Massive APIs Paper Review

The overview of this paper: LLMs have advanced enormously of late, but their potential for effective tool use through API calls remains unfulfilled. This paper introduces Gorilla🦍, a fine-tuned LLaMA-based model that surpasses GPT-4 at writing API calls. When used together with a document retriever, Gorilla shows a strong ability to adapt to test-time document changes, enabling flexible user updates or version changes. It also substantially mitigates the hallucination problem commonly encountered when prompting LLMs directly. The paper also, to evaluate Gorilla's capability, ..

The effect of open-domain instructions 🪄 - WizardLM: Empowering Large Language Models to Follow Complex Instructions Paper Review

The overview of this paper: Training LLMs with open-domain instructions has brought considerable success. This paper shows a way to generate large amounts of instruction data with varying levels of complexity using an LLM instead of humans. Starting from an initial instruction set, Evol-Instruct is used to rewrite the instructions step by step into more complex ones. All of the generated instruction data is then mixed together to fine-tune LLaMA. The resulting model is WizardLM. Human Evaluation & Vicuna Evaluatio..

All you need is textbook-quality data!! 📖 - phi-1: Textbooks Are All You Need Paper Review

The overview of this paper: The paper introduces phi-1, an LLM for code that is much smaller than other models. phi-1 is a 1.3B Transformer model trained on a selection of textbook-quality data from the web together with textbooks synthetically generated with GPT-3.5. Despite its small scale, phi-1 achieves high pass@1 accuracy. Table of Contents 1. Introduction 2. Training details and the importance of high-quality data 3. Spikes of model capability after finetuning on CodeExercises 4. Evaluati..

What if LMs get to use tools? 🔬: Large Language Models as Tool Makers Paper Review

The overview of this paper: Recent research has shown the potential for improving the problem-solving ability of LLMs. However, prior work relies heavily on the availability of existing tools. To remove this dependency, this paper proposes LLMs As Tool Makers (LATM), a closed-loop framework in which the LLM creates its own reusable tools for problem-solving. LATM consists of two main phases: tool making & tool using. Tool making lets the LLM continually generate tools that can be applied to different requests, so that future requests can call the corresponding API whenever it is judged beneficial for solving the task. In this way, this ..
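
The two phases named in the LATM excerpt, tool making and tool using, can be pictured with the sketch below. It is a simplified illustration under several assumptions: the `ToolBox` class, the convention that generated source defines a function named `tool`, and the `toy_tool_maker` stub are invented for this example, and in the paper the tool user is itself an LLM rather than a direct function call. Running generated code with `exec` is only acceptable here because the source is a trusted stub.

```python
from typing import Any, Callable, Dict, List


class ToolBox:
    """Caches one reusable tool per task type (hypothetical helper)."""

    def __init__(self, tool_maker: Callable[[str], str]) -> None:
        self.tool_maker = tool_maker  # returns Python source defining `tool`
        self.tools: Dict[str, Callable[..., Any]] = {}

    def solve(self, task_type: str, *args: Any) -> Any:
        # Tool making: only on the first request of a given task type.
        if task_type not in self.tools:
            source = self.tool_maker(task_type)
            namespace: Dict[str, Any] = {}
            exec(source, namespace)  # trusted stub source only in this sketch
            self.tools[task_type] = namespace["tool"]
        # Tool using: later requests reuse the cached function.
        return self.tools[task_type](*args)


maker_calls: List[str] = []


def toy_tool_maker(task_type: str) -> str:
    """Stand-in for the tool-making LLM (assumption for the demo)."""
    maker_calls.append(task_type)
    return "def tool(xs):\n    return sorted(xs)\n"


box = ToolBox(toy_tool_maker)
first = box.solve("sort", [3, 1, 2])
second = box.solve("sort", [9, 7, 8])
print(first, second, maker_calls)
```

The tool maker runs once for the "sort" task type and both requests are served by the same cached function, which mirrors the idea of paying the tool-making cost once and reusing the tool afterwards.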

🐬Orca: Progressive Learning from Complex Explanation Traces of GPT-4 Paper Review

The overview of this paper: Recent studies have tried to improve the capabilities of smaller models through imitation learning on outputs generated by large foundation models (LFMs). However, this approach has several problems. Orca is introduced to address them. Orca is a 13B model that learns to imitate the reasoning process of LFMs. It learns from rich signals from GPT-4, including explanation traces (step-by-step processes), and from other complex instructions, guided by teacher assistance from ChatGPT. This progress..

Cartinoe
List of posts in the 'Paper Reading 📜/Natural Language Processing' category