## Highlights

### [[BETA](https://pytorch.org/blog/pytorch-feature-classification-changes/#beta)] New transforms API
TorchVision is extending its Transforms API! Here is what’s new:
- You can use them not only for Image Classification but also for Object Detection, Instance & Semantic Segmentation and Video Classification.
- You can use new functional transforms for transforming Videos, Bounding Boxes and Segmentation Masks.
The API is **completely backward compatible** with the previous one; the interface remains the same to ease migration and adoption. We are now releasing this new API as Beta in the `torchvision.transforms.v2` namespace, and we would love to get early feedback from you to improve its functionality. Please [reach out to us](https://github.com/pytorch/vision/issues/6753) if you have any questions or suggestions.
```py
import torchvision.transforms.v2 as transforms

# Exactly the same interface as V1:
trans = transforms.Compose([
    transforms.ColorJitter(contrast=0.5),
    transforms.RandomRotation(30),
    transforms.CenterCrop(480),
])
imgs, bboxes, masks, labels = trans(imgs, bboxes, masks, labels)
```
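The same pipelines now compose with the non-image inputs listed above. As a minimal sketch of the detection use case (assuming the beta `torchvision.datapoints` namespace that ships alongside this API; these names are among the ones that may still change, see the issues linked below), wrapping tensors in datapoint classes lets one pipeline transform an image and its bounding boxes consistently:

```py
import torch
import torchvision.transforms.v2 as transforms
from torchvision import datapoints  # beta namespace, subject to change

# Wrap plain tensors so the v2 transforms know what each input is.
img = datapoints.Image(torch.randint(0, 256, (3, 480, 640), dtype=torch.uint8))
boxes = datapoints.BoundingBox(
    [[10, 20, 110, 120], [50, 60, 200, 180]],
    format=datapoints.BoundingBoxFormat.XYXY,
    spatial_size=(480, 640),  # (H, W) of the image the boxes refer to
)

trans = transforms.Compose([
    transforms.RandomHorizontalFlip(p=1.0),
    transforms.CenterCrop(480),
])
# The flip and crop are applied to the image and the boxes jointly.
img, boxes = trans(img, boxes)
```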
You can read more about these new transforms in our [docs](https://pytorch.org/vision/main/transforms.html), and you can also check out our examples:
- [End-to-end object detection example](https://pytorch.org/vision/stable/auto_examples/plot_transforms_v2_e2e.html#sphx-glr-auto-examples-plot-transforms-v2-e2e-py)
- [Getting started with transforms v2](https://pytorch.org/vision/stable/auto_examples/plot_transforms_v2.html#sphx-glr-auto-examples-plot-transforms-v2-py)
Note that this API is still Beta. **While we do not expect major breaking changes, some APIs may still change based on user feedback**. Please submit any feedback you have on https://github.com/pytorch/vision/issues/6753, and check out https://github.com/pytorch/vision/issues/7319 to learn which APIs we suspect might change in the future.
### [[BETA](https://pytorch.org/blog/pytorch-feature-classification-changes/#beta)] New Video Swin Transformer
We added the Video SwinTransformer model, based on the [Video Swin Transformer](https://arxiv.org/abs/2106.13230) paper.
```py
import torch
from torchvision.models.video import swin3d_t

video = torch.rand(1, 3, 32, 800, 600)

# or swin3d_b, swin3d_s
model = swin3d_t(weights="DEFAULT")
model.eval()
with torch.inference_mode():
    prediction = model(video)
print(prediction)
```
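Each weights enum also bundles its evaluation-time preprocessing and the Kinetics-400 category names, so the logits can be mapped to a human-readable label. A small sketch (the dummy clip shape is illustrative; the preset expects `(T, C, H, W)` frames and returns a `(C, T, H, W)` clip):

```py
import torch
from torchvision.models.video import swin3d_t, Swin3D_T_Weights

weights = Swin3D_T_Weights.DEFAULT
model = swin3d_t(weights=weights)
model.eval()

# The preset resizes, crops and normalizes the clip for us.
preprocess = weights.transforms()
frames = torch.rand(32, 3, 256, 256)  # dummy 32-frame clip, (T, C, H, W)
batch = preprocess(frames).unsqueeze(0)  # add the batch dimension

with torch.inference_mode():
    logits = model(batch)
class_id = logits.softmax(dim=-1).argmax(dim=-1).item()
print(weights.meta["categories"][class_id])
```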
The model has the following accuracies on the Kinetics-400 dataset:
| Model | Acc@1 | Acc@5 |
| --- | ----------- | --------- |