Cartinoe
ViT - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale