Cartinoe
VinVL: Revisiting Visual Representations in Vision-Language Models Paper Review