Cartinoe
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Paper Review