Cartinoe
ALBEF: Vision and Language Representation Learning with Momentum Distillation ๋…ผ๋ฌธ