Cartinoe
LXMERT: Learning Cross-Modality Encoder Representations from Transformers (Paper Review)