Self-supervised

v1.0.1

A Fast, Performant and Accessible Library for Training SOTA Self Supervised Algorithms!

Algorithms

Now, there are two main modules with implementations of popular **vision** and **multimodal** self-supervised algorithms:

- **Vision**: SimCLR V1/V2, MoCo, BYOL and SwAV.
- **Multimodal**: CLIP and CLIP-MoCo.
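
As a rough sketch, the two groups could be imported along these lines (the exact submodule paths are assumptions based on the vision/multimodal split described above, not verified API):

```python
# Module layout sketch; the submodule paths below are assumptions based on
# the vision/multimodal split described in these release notes.
from self_supervised.vision import simclr, moco, byol, swav
from self_supervised.multimodal import clip, clip_moco
```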

Augmentations

The new `self_supervised.augmentations` module offers helpers for easily constructing augmentation pipelines for self-supervised algorithms. It's fast, extensible, and provides all the proven augmentations from the papers out of the box. It also provides an optimal combination of **torchvision/kornia/fastai** batch augmentations for improving performance and speed.
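
For illustration, building two augmented views per image might look roughly like the sketch below; the helper name `get_multi_aug_pipelines` and its keyword arguments are hypothetical, used only to convey the idea of pipeline construction:

```python
# Hypothetical sketch of constructing SSL augmentation pipelines.
# `get_multi_aug_pipelines` and its arguments are assumptions, not verified API;
# consult the self_supervised.augmentations docs for the real helpers.
from self_supervised.augmentations import get_multi_aug_pipelines

# Two pipelines (one per augmented view), resizing crops to 224x224 and
# applying SimCLR-style color jitter and blur as batch augmentations.
aug_pipelines = get_multi_aug_pipelines(n=2, size=224, jitter=True, blur=True)
```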

Layers

This new module allows all **timm** and **fastai** vision architectures to be used as backbones for training any vision self-supervised algorithm in the library. It supports gradient checkpointing for all **fastai** models, and for the **resnet** and **efficientnet** architectures from **timm**. It also makes it easy to create layers for a downstream classification task and to modify the **MLP** module that is commonly used in self-supervised training frameworks.
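
A minimal sketch of the intended usage, assuming a helper along the lines of `create_encoder` (the name and signature here are assumptions for illustration, not verified API):

```python
# Sketch only: the helper name and signature are assumptions based on the
# description above.
from self_supervised.layers import create_encoder

# Any timm or fastai architecture name can serve as the backbone.
fastai_encoder = create_encoder("xresnet34", pretrained=False)        # fastai arch
timm_encoder = create_encoder("tf_efficientnet_b0_ns", pretrained=True)  # timm arch
```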

Train your own CLIP Model

Support for training CLIP models either from scratch or by fine-tuning them from OpenAI's open-source checkpoints. Currently ViT-B/32 and ViT-L/14 are supported (ResNets are not included due to inferior performance).
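
A rough sketch of what fine-tuning could look like with fastai; apart from the `CLIPTrainer()` callback (named later in these notes), the import path, the `CLIP` constructor and its arguments, and the `dls` object are assumptions used only for illustration:

```python
# Sketch only: the CLIP constructor, its arguments, and the import path are
# assumptions; CLIPTrainer is the callback named in these release notes.
import torch
from fastai.vision.all import Learner, noop
from self_supervised.multimodal.clip import CLIP, CLIPTrainer

model = CLIP(embed_dim=512, image_resolution=224)   # assumed ViT-B/32-style config
# Optionally start from an open-source OpenAI checkpoint instead of scratch:
# model.load_state_dict(torch.load("clip_vitb32.pth"))

# `dls` is assumed to be a fastai DataLoaders yielding (image, tokenized text) pairs.
# The contrastive loss is assumed to be computed inside the callback, hence noop.
learn = Learner(dls, model, loss_func=noop, cbs=[CLIPTrainer()])
learn.fit_flat_cos(5, 1e-4)
```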

Just a Thought: CLIP-MoCo

A custom implementation that combines CLIP with MoCo's queue to reduce the need for large batch sizes during training.

Distributed Training

The CLIP and SimCLR algorithms have distributed training versions, which simply use a distributed implementation of the underlying InfoNCE loss. This allows an increase in the effective batch size / number of negative samples during loss calculation. In experiments, the regular `CLIPTrainer()` callback achieved faster and better convergence than `DistributedCLIPTrainer()`. Distributed callbacks should be used with `DistributedDataParallel`.
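
Continuing the CLIP sketch above, switching to the distributed variant only swaps the callback and wraps training in fastai's distributed context; only the two callback names come from these notes, the rest is an assumption of typical fastai usage:

```python
# Sketch: run under `python -m fastai.launch script.py` (or torchrun) so that
# DistributedDataParallel is active, as the distributed callbacks require.
# The import path is assumed; DistributedCLIPTrainer is named in these notes.
from fastai.distributed import *
from self_supervised.multimodal.clip import DistributedCLIPTrainer

learn = Learner(dls, model, loss_func=noop, cbs=[DistributedCLIPTrainer()])
with learn.distrib_ctx():   # wraps the model in DistributedDataParallel
    learn.fit_flat_cos(5, 1e-4)
```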

Fastai Upgrade

Changes are compatible with the latest fastai release. The library requires the latest **timm** and **fastai** to keep up with current improvements in both, and the tests are written against those versions.

Large Scale and Large Batch Training

`Learner.to_fp16()` is now supported via the callback `order` attribute, allowing the batch size to be doubled and cutting training time by roughly 50%. Gradient checkpointing frees 25-40% of GPU memory; although it trades memory for computation, increasing the batch size to reclaim the freed memory can still yield about a 10% decrease in training time. [ZeRO](https://pytorch.org/docs/master/_modules/torch/distributed/optim/zero_redundancy_optimizer.html#ZeroRedundancyOptimizer) can also be used to save around 40% of GPU memory; in experiments it made training neither faster nor slower (unlike gradient checkpointing), so it is mainly useful for increasing the batch size or training larger models.
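
As a sketch of the memory-side pieces: mixed precision is a single fastai call, and PyTorch's ZeRO optimizer is wired in roughly as below. The `learn` object and the tiny model here are placeholders; the `ZeroRedundancyOptimizer` call itself is standard PyTorch and needs an initialized process group (e.g. via torchrun).

```python
# Mixed precision: one call on an existing fastai Learner (placeholder `learn`).
learn = learn.to_fp16()

# ZeRO sketch in plain PyTorch: shards optimizer state across ranks to save memory.
# Assumes torch.distributed.init_process_group() has already run (e.g. via torchrun).
import torch
import torch.nn as nn
from torch.distributed.optim import ZeroRedundancyOptimizer

model = nn.Linear(512, 128).cuda()   # stand-in for the self-supervised model
optimizer = ZeroRedundancyOptimizer(
    model.parameters(),
    optimizer_class=torch.optim.Adam,
    lr=1e-4,
)
```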

SimCLR V1 & V2

The library provides all the utilities and helpers needed to choose any augmentation pipeline, any **timm** or **fastai** vision model as the backbone, any custom `MLP` layers, and more. In short, it has everything needed to switch from SimCLR V1 to V2, or to your own experimental V3.
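
For instance, a V1-style versus V2-style setup might differ only in the backbone, the projection MLP, and the augmentation strength. The helper names and signatures below (`create_simclr_model`, `get_simclr_aug_pipelines`, the `SimCLR` callback) and the `dls` object are assumptions used for illustration:

```python
# Sketch only: helper names and signatures are assumptions, not verified API.
from fastai.vision.all import Learner, noop
from self_supervised.layers import create_encoder
from self_supervised.vision.simclr import (
    create_simclr_model, get_simclr_aug_pipelines, SimCLR,
)

# V1-ish: ResNet-50 backbone with a small projection head.
# V2-ish: swap in a deeper backbone and/or a deeper projection MLP; the
# training loop stays the same.
encoder = create_encoder("xresnet50", pretrained=False)
model = create_simclr_model(encoder, hidden_size=2048, projection_size=128)
aug_pipelines = get_simclr_aug_pipelines(size=224)

# `dls` is assumed to be a fastai DataLoaders over unlabeled images; the
# InfoNCE loss is assumed to be handled inside the callback, hence noop.
learn = Learner(dls, model, loss_func=noop, cbs=[SimCLR(aug_pipelines, temp=0.07)])
learn.fit_flat_cos(100, 1e-3)
```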

MoCo V1 & V2 (Single and Multi GPU Support)

As with SimCLR, it's simple to switch from MoCo v1 to v2 using the parts of the library, since the core algorithm / loss function stays the same. The MoCo implementation in this library also differs from the official one: it doesn't use Shuffle BN and instead uses both the positives and negatives in the current batch. Experiments show success with this change; see this [issue](https://github.com/facebookresearch/moco/issues/24) for more detail. Shuffle BN depends on `DistributedDataParallel` and therefore requires a multi-GPU environment; without it, users can train on either a single GPU or multiple GPUs.
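
To make that difference concrete, here is a small, self-contained PyTorch sketch of an InfoNCE-style loss that treats the other samples in the current batch as negatives in addition to the queue. It illustrates the idea only and is not the library's exact loss code:

```python
import torch
import torch.nn.functional as F

def infonce_with_batch_negatives(q, k, queue, temp=0.07):
    """InfoNCE where negatives come from both the current batch and the queue.

    q, k:  (N, D) L2-normalized query/key embeddings of the same images
    queue: (K, D) L2-normalized key embeddings from previous batches
    """
    # Similarity of each query to every key in the batch (N, N): diagonal
    # entries are the positives, off-diagonal entries are in-batch negatives.
    batch_logits = q @ k.t()
    # Similarity to the queued negatives (N, K).
    queue_logits = q @ queue.t()
    logits = torch.cat([batch_logits, queue_logits], dim=1) / temp
    # The positive for sample i sits at column i of the concatenated logits.
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)

# Tiny usage example with random embeddings.
q = F.normalize(torch.randn(8, 128), dim=1)
k = F.normalize(torch.randn(8, 128), dim=1)
queue = F.normalize(torch.randn(4096, 128), dim=1)
loss = infonce_with_batch_negatives(q, k, queue)
```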

SwAV Queue

Queue implementation complete for SwAV.



- PyPI link: https://pypi.org/project/self-supervised/1.0.1/
- Changes can be found here: https://github.com/KeremTurgutlu/self_supervised/pull/19.


v1.0.1-doi
A release to trigger the Zenodo webhooks, which should create a DOI for the library.
