We added the [SimMIM]( model. Its architecture is very similar to MAE, but its ViT encoder takes both masked and non-masked patches as input. In addition, its decoder is a single linear layer, and it uses an L1 loss instead of an L2 loss.
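
To make the differences from MAE concrete, here is a minimal, dependency-free sketch of the data flow. It is a toy illustration, not the library's implementation. The function names (`mask_patches`, `linear_decoder`, `l1_loss`) are hypothetical, the ViT encoder is replaced by an identity stand-in, and the mask token and decoder weights are fixed rather than learned. Computing the loss over masked patches only is an assumption taken from the SimMIM paper.

```python
import random


def mask_patches(patches, mask_token, mask_ratio, rng):
    """Replace a random subset of patches with a mask token.

    The masked positions stay in the sequence, so the encoder sees
    every position (masked and non-masked), unlike MAE.
    """
    num_masked = int(len(patches) * mask_ratio)
    masked_idx = set(rng.sample(range(len(patches)), num_masked))
    encoder_input = [
        list(mask_token) if i in masked_idx else list(patch)
        for i, patch in enumerate(patches)
    ]
    return encoder_input, sorted(masked_idx)


def linear_decoder(tokens, weight, bias):
    """Single linear layer: predicts one patch per encoded token."""
    return [
        [sum(w * x for w, x in zip(row, tok)) + b for row, b in zip(weight, bias)]
        for tok in tokens
    ]


def l1_loss(pred, target, masked_idx):
    """Mean absolute error, computed over the masked patches only."""
    errors = [
        abs(p - t)
        for i in masked_idx
        for p, t in zip(pred[i], target[i])
    ]
    return sum(errors) / len(errors)


rng = random.Random(0)
patches = [[rng.random() for _ in range(4)] for _ in range(8)]  # 8 patches, 4 values each
mask_token = [0.0] * 4  # learnable in a real model; fixed here for illustration

encoder_input, masked_idx = mask_patches(patches, mask_token, mask_ratio=0.5, rng=rng)

# Stand-in for the ViT encoder (identity); a real encoder would go here.
encoded = encoder_input

# Identity weights and zero bias, purely for illustration.
weight = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
bias = [0.0] * 4

pred = linear_decoder(encoded, weight, bias)
loss = l1_loss(pred, patches, masked_idx)
print(f"sequence length: {len(encoder_input)}, masked: {len(masked_idx)}, L1 loss: {loss:.4f}")
```

Note that the encoder input keeps the full sequence length. In MAE, by contrast, the masked patches are dropped before the encoder and only reintroduced in the decoder.
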
- [Barlow Twins: Self-Supervised Learning via Redundancy Reduction, 2021](
- [Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, 2020](
- [DCL: Decoupled Contrastive Learning, 2021](
- [DINO: Emerging Properties in Self-Supervised Vision Transformers, 2021](
- [MAE: Masked Autoencoders Are Scalable Vision Learners, 2021](
- [MSN: Masked Siamese Networks for Label-Efficient Learning, 2022](
- [MoCo: Momentum Contrast for Unsupervised Visual Representation Learning, 2019](
- [NNCLR: Nearest-Neighbor Contrastive Learning of Visual Representations, 2021](
- [SimCLR: A Simple Framework for Contrastive Learning of Visual Representations, 2020](
- [SimMIM: A Simple Framework for Masked Image Modeling, 2021](
- [SimSiam: Exploring Simple Siamese Representation Learning, 2020](
- [SMoG: Unsupervised Visual Representation Learning by Synchronous Momentum Grouping, 2022](
- [SwAV: Unsupervised Learning of Visual Features by Contrasting Cluster Assignments, Caron, M. et al., 2020](
- [VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning, Bardes, A. et al., 2022](