When Vision Transformers Outperform ResNets Without Pre-training or Strong Data Augmentations
March 13, 2020
https://arxiv.org/pdf/2106.01548