Research & Project 🔬


How can quantization be performed effectively? 🤔

Which quantization method is efficient & effective? 🧠 As LLMs keep growing larger by the day, how can we use them easily, efficiently, and effectively? These days the trend is to rely on quantization more than any other method. Thanks to quantization, LLMs that used to be hard to run even on GPUs with large amounts of memory can now be used far more efficiently! 🤗 For quantization that delivers optimal efficiency with minimal performance loss, HuggingFace provides two quantization methods: BitsAndBytes and GPTQ. Based on these, the two q..
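To illustrate the core idea behind these methods (this is a conceptual sketch of symmetric absmax 8-bit quantization, not the HuggingFace or bitsandbytes API itself; the helper names are hypothetical):

```python
import numpy as np

def absmax_quantize(x: np.ndarray):
    """Symmetric 8-bit absmax quantization: map max|x| to 127."""
    scale = 127.0 / np.max(np.abs(x))
    q = np.round(x * scale).astype(np.int8)  # store weights as int8
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 values."""
    return q.astype(np.float32) / scale

# Toy weight vector: int8 storage cuts memory 4x vs float32,
# at the cost of a small rounding error per element.
w = np.array([0.5, -1.2, 0.03, 2.4], dtype=np.float32)
q, s = absmax_quantize(w)
w_hat = dequantize(q, s)
err = np.max(np.abs(w - w_hat))  # bounded by half a quantization step
```

Real 4-bit schemes (such as the NF4 data type used by QLoRA, or GPTQ's error-compensated rounding) are more sophisticated, but the trade-off is the same: lower-precision storage in exchange for a small, bounded reconstruction error.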


AlpaGasus2-QLoRA 🦙🦄🤝

AlpaGasus2-QLoRA!! 🦄 This post describes my recent project, 'AlpaGasus2-QLoRA'. Before going into the project itself, I would first like to thank Lichang Chen and his ten co-authors, whose proposal of AlpaGasus made this research possible. https://arxiv.org/abs/2307.08701 AlpaGasus: Training A Better Alpaca with Fewer Data Large language models (LLMs) obtain instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data. However, wi..

Cartinoe
List of posts in the 'Research & Project 🔬' category