Paper: Multimodal Autoregressive Pre-training of Large Vision Encoders (arXiv:2411.14402)
timm-compatible AIM-v2 (https://huggingface.co/papers/2411.14402) image encoder weights, taken from https://huggingface.co/apple/aimv2-huge-patch14-224.
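
Since the weights are described as timm compatible, loading them could look roughly like the sketch below. The timm model name `aimv2_huge_patch14_224.apple_pt` is an assumption and is not stated on this page, so check the repository name before relying on it. The helper `patch_grid` is hypothetical and only illustrates the usual reading of `patch14-224` as 14-pixel patches on a 224-pixel input.

```python
def patch_grid(img_size: int = 224, patch_size: int = 14) -> tuple[int, int]:
    """Return the (rows, cols) patch grid for a square input image."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    side = img_size // patch_size
    return side, side


def load_encoder(model_name: str = "aimv2_huge_patch14_224.apple_pt",
                 pretrained: bool = True):
    """Create the encoder and its matching eval-time preprocessing.

    The default model_name is an assumed identifier, not taken from the card.
    """
    # Imported lazily so patch_grid stays usable without timm installed.
    import timm
    from timm.data import create_transform, resolve_model_data_config

    # num_classes=0 removes any classifier head so the model returns features.
    model = timm.create_model(model_name, pretrained=pretrained, num_classes=0)
    model.eval()

    # Build the preprocessing pipeline from the model's pretrained config.
    data_config = resolve_model_data_config(model)
    transform = create_transform(**data_config, is_training=False)
    return model, transform


def embed_image(model, transform, image):
    """Run one PIL image through the encoder and return its features."""
    import torch

    with torch.no_grad():
        batch = transform(image).unsqueeze(0)  # add a batch dimension
        return model(batch)
```

To use it, call `model, transform = load_encoder()` once, then `embed_image(model, transform, img)` for each PIL image. Setting `pretrained=False` builds the architecture without downloading weights, which can help when only the model structure is needed.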