Paper Reading 📜/Natural Language Processing

Transformer-XL: 'Attentive Language Models Beyond a Fixed-Length Context' Paper Review

What is the purpose of this paper? The Transformer has the capacity to learn longer-term dependencies, but it is constrained by its fixed-length context. This paper therefore introduces Transformer-XL, a new neural network architecture that resolves the drawbacks of the fixed context without disrupting temporal coherence. Transformer-XL not only enables longer-term dependency but also solves the context fragmentation problem. Both points are examined in the body of the review. Table of Conten..
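The mechanism behind the extended context is segment-level recurrence: each segment's attention also ranges over the cached hidden states of the previous segment. Below is a minimal single-head, single-layer sketch of that idea; the names (`segment_attention`, `mem`) and the random weights are illustrative assumptions, not code from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def segment_attention(h, mem, Wq, Wk, Wv):
    """One attention step for segment `h`; keys/values are computed over
    the cached previous segment `mem` concatenated with `h`."""
    ctx = np.concatenate([mem, h], axis=0)   # extended context
    q, k, v = h @ Wq, ctx @ Wk, ctx @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
mem = np.zeros((0, d))                       # first segment: empty memory
for seg in rng.normal(size=(3, 4, d)):       # 3 segments of 4 tokens each
    out = segment_attention(seg, mem, Wq, Wk, Wv)
    mem = seg                                # cache for the next segment
print(out.shape)  # (4, 8)
```

In the real model the cached `mem` is excluded from backpropagation (stop-gradient), which is what keeps training cost bounded while the effective context grows across segments.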

Transformer: 'Attention Is All You Need' Paper Review

The importance of this paper: The Transformer underpins many of today's language models, and its invention drove enormous progress in NLP. Most notably, BERT and GPT were built on the Transformer's Encoder and Decoder, respectively. Before reading the language model paper reviews covered in this blog's earlier posts, I recommend reading this post first, because understanding those models requires a detailed understanding of the Transformer. With that, let's begin. (Added 2023.02.09) Transformer implementation code practice https://github.c..
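The paper's central operation, scaled dot-product attention, reduces to a few lines. A minimal NumPy sketch (single head, no masking and no learned projections, with made-up shapes):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # similarity of queries to keys
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)       # row-wise softmax
    return w @ V                             # weighted sum of values

rng = np.random.default_rng(42)
Q = rng.normal(size=(5, 16))                 # 5 query positions, d_k = 16
K = rng.normal(size=(5, 16))
V = rng.normal(size=(5, 16))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (5, 16)
```

The `sqrt(d_k)` scaling keeps the dot products from saturating the softmax as the key dimension grows; multi-head attention simply runs several such maps in parallel on projected inputs.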

Pre-trained Language Modeling paper reading(3) - GPT-1: Improving Language Understanding by Generative Pre-Training

Pre-trained Language Modeling paper reading: Pre-trained language modeling is a hot topic in NLP these days, so I have been reading and reviewing well-known papers on it. This series does not end with a single post; I plan to continue it over several posts. Following the previous post on BERT, this post reviews GPT-1. ELMo: 'Deep contextualized word representations' reading & review BERT: 'Pre-training of Deep Bidirectional Transformers for Language Understanding..

Pre-trained Language Modeling paper reading(2) - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Pre-trained Language Modeling paper reading: Pre-trained language modeling is a hot topic in NLP these days, so I have been reading and reviewing well-known papers on it. This series does not end with a single post; I plan to continue it over several posts. Following the previous post on ELMo, this post reviews BERT. ELMo: 'Deep contextualized word representations' reading & review BERT: 'Pre-training of Deep Bidirectional Transformers for Language Understanding' r..

Pre-trained Language Modeling paper reading(1) - ELMo: Deep contextualized word representations

Pre-trained Language Modeling paper reading: Pre-trained language modeling is a hot topic in NLP these days, so I have been reading and reviewing well-known papers on it. This series does not end with a single post; I plan to continue it over several posts, and this one opens the series. The plan for the upcoming posts is as follows. ELMo: 'Deep contextualized word representations' reading & review (this post) BERT: 'Pre-training of ..

Learning Embedding Matrices

Why I studied embedding matrices: I studied the embedding matrices that play an important role in NLP. With an embedding matrix we can compute how significant a given word or sentence is within text. (Update): ELMo, BERT, and GPT-1 will be covered in separate paper reviews. Table of Contents 1. What is Language Model? 2. Count based word representation 2-1. TF-IDF 3. Word Embedding 3-1. Word2Vec 3-2. GloVe 1. What is Language Model? A language model is a model that, in order to model language,..
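As a concrete example of the count-based representation in section 2-1, the plain TF-IDF weight tf(t, d) · log(N / df(t)) can be computed in a few lines. This is the textbook variant; the exact weighting used in the (truncated) post is not visible here, so treat it as an illustration:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Per-document TF-IDF scores: tf(t, d) / |d| * log(N / df(t))."""
    N = len(docs)
    df = Counter(t for d in docs for t in set(d))   # document frequency
    return [{t: tf / len(d) * math.log(N / df[t])
             for t, tf in Counter(d).items()}
            for d in docs]

docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
scores = tf_idf(docs)
# "the" occurs in every document, so its idf is log(3/3) = 0
print(scores[0]["the"])  # 0.0
print(scores[0]["cat"] > 0)  # True
```

Words common to every document are zeroed out, while rarer, more discriminative words get higher weight; libraries such as scikit-learn use smoothed variants of the same idea.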

Better Reasoning Behind Classification Predictions with BERT for Fake News Detection Paper Review

Why I read this paper? To understand the part where CAM and Grad-CAM are used to visualize which words or sentences in the input data the model treats as important during fake news detection. Grad-CAM is a technique developed for images, so I was curious how it is applied to text data. Table of Contents 1. Introduction 2. Methodology 3. Experiments & Results (selected parts only) 4. Conclusion 1. Introduction In this paper, the quality of the representation space for different fake and real news datasets is assessed via linear sepa..
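To give a rough idea of how Grad-CAM carries over from images to text: the channel weights are still the mean gradients of the class score, but the spatial axis becomes the token axis. A minimal sketch with random stand-in activations and gradients; the function name `grad_cam_1d` and the shapes are assumptions for illustration, not from the paper:

```python
import numpy as np

def grad_cam_1d(activations, gradients):
    """Grad-CAM over a 1-D token axis. Inputs are (tokens, channels)
    feature maps and the gradient of the class score w.r.t. them."""
    alphas = gradients.mean(axis=0)            # per-channel importance
    cam = (activations * alphas).sum(axis=1)   # gradient-weighted combination
    return np.maximum(cam, 0)                  # ReLU keeps positive evidence

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 32))       # 10 tokens, 32 feature channels
dY_dA = rng.normal(size=(10, 32))   # stand-in gradient of the class score
cam = grad_cam_1d(A, dY_dA)
print(cam.shape)  # (10,) — one importance score per token
```

The resulting per-token scores play the role of the heat map in the image setting and can be rendered as token highlighting over the input text.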

Cartinoe
List of posts in the 'Paper Reading 📜/Natural Language Processing' category (Page 7)