Torchvision

Latest version: v0.20.1



0.13.0

Highlights

Models

Multi-weight support API

0.12.0

Highlights

New Models

Four new model families have been released in the latest version along with pre-trained weights for their variants: FCOS, RAFT, Vision Transformer (ViT) and ConvNeXt.

Object Detection

[FCOS](https://arxiv.org/pdf/1904.01355.pdf) is a popular, fully convolutional, anchor-free model for object detection. In this release we include a community-contributed model implementation as well as pre-trained weights. The model was trained on COCO train2017 and can be used as follows:

```python
import torch
from torchvision import models

x = [torch.rand(3, 224, 224)]
fcos = models.detection.fcos_resnet50_fpn(pretrained=True).eval()
predictions = fcos(x)
```


The box AP of the pre-trained model on COCO val2017 is 39.2 (see [4961](https://github.com/pytorch/vision/pull/4961) for more details).
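
In eval mode, torchvision detection models return one dictionary per input image with `boxes`, `labels` and `scores` entries. The sketch below shows one common way to post-process such an output by keeping only confident detections. The helper name `filter_detections`, the 0.5 threshold and the example values are illustrative assumptions, and the tensors are assumed to have been converted to plain lists (for example with `.tolist()`) so that the snippet runs without downloading any weights.

```python
def filter_detections(prediction, score_threshold=0.5):
    """Keep only detections whose score is at least score_threshold.

    `prediction` mimics one element of the list returned by a torchvision
    detection model, with its tensors converted to plain Python lists.
    """
    kept = [
        i for i, score in enumerate(prediction["scores"])
        if score >= score_threshold
    ]
    return {key: [values[i] for i in kept] for key, values in prediction.items()}


# Made-up values for illustration; boxes are [x1, y1, x2, y2] in pixels.
example = {
    "boxes": [[10.0, 20.0, 110.0, 220.0], [30.0, 40.0, 60.0, 80.0]],
    "labels": [1, 18],
    "scores": [0.92, 0.31],
}
confident = filter_detections(example)
print(confident["labels"])  # [1]
```

A suitable threshold depends on the application, so treat 0.5 only as a starting point.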

We would like to thank [Hu Ye](https://github.com/xiaohu2015) and [Zhiqiang Wang](https://github.com/zhiqwang) for contributing to the model implementation and initial training. This was the first community-contributed model in a long while, and given its success, we decided to use the learnings from this process to write new [model contribution guidelines](https://github.com/pytorch/vision/blob/main/CONTRIBUTING_MODELS.md).

Optical Flow support and RAFT model

Torchvision now supports optical flow! Optical flow models try to predict movement in a video: given two consecutive frames, the model predicts where each pixel of the first frame ends up in the second frame. Check out our [new tutorial on Optical Flow](https://pytorch.org/vision/0.12/auto_examples/plot_optical_flow.html#sphx-glr-auto-examples-plot-optical-flow-py)!
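
To make the idea of a flow field concrete, here is a minimal pure-Python sketch: each pixel of the first frame carries a displacement `(dx, dy)`, and adding it to the pixel's coordinates gives its predicted position in the second frame. The function name and the nested-list layout `flow[y][x] = (dx, dy)` are assumptions chosen for readability, not the tensor layout used by the RAFT model.

```python
def displaced_positions(flow):
    """Map every pixel (x, y) of frame 1 to its predicted position in frame 2.

    `flow[y][x]` is a (dx, dy) displacement in pixels. This nested-list
    layout is an assumption made for readability.
    """
    return {
        (x, y): (x + dx, y + dy)
        for y, row in enumerate(flow)
        for x, (dx, dy) in enumerate(row)
    }


# A 2x2 frame in which every pixel moves one pixel right and half a pixel down.
flow = [
    [(1.0, 0.5), (1.0, 0.5)],
    [(1.0, 0.5), (1.0, 0.5)],
]
positions = displaced_positions(flow)
print(positions[(0, 0)])  # (1.0, 0.5)
print(positions[(1, 1)])  # (2.0, 1.5)
```

Displacements are generally fractional, which is why the destination coordinates are floats rather than pixel indices.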

We implemented a torchscript-compatible [RAFT](https://arxiv.org/abs/2003.12039) model with pre-trained weights (both normal and “small” versions), and added support for [training and evaluating](https://github.com/pytorch/vision/tree/main/references/optical_flow) optical flow models. Our training scripts support distributed training across processes and nodes, leading to much faster training time than the original implementation. We also added 5 new [optical flow datasets](https://pytorch.org/vision/0.12/datasets.html#optical-flow): Flying Chairs, Flying Things, Sintel, Kitti, and HD1K.

![raft](https://github.com/pytorch/vision/releases/download/v0.12.0/raft.png "image_tooltip")

Image Classification

[Vision Transformer](https://arxiv.org/abs/2010.11929) (ViT) and [ConvNeXt](https://arxiv.org/abs/2201.03545) are two popular architectures which can be used as image classifiers or as backbones for downstream vision tasks. In this release we include 8 pre-trained weights for their classification variants. The models were trained on ImageNet and can be used as follows:

```python
import torch
from torchvision import models

x = torch.rand(1, 3, 224, 224)
vit = models.vit_b_16(pretrained=True).eval()
convnext = models.convnext_tiny(pretrained=True).eval()
predictions1 = vit(x)
predictions2 = convnext(x)
```


The accuracies of the pre-trained models obtained on ImageNet val are shown below:

|Model |Acc1 |Acc5 |
|--- |--- |--- |
|vit_b_16|81.072|95.318|
|vit_b_32|75.912|92.466|
|vit_l_16|79.662|94.638|
|vit_l_32|76.972|93.07|
|convnext_tiny|82.52|96.146|
|convnext_small|83.616|96.65|
|convnext_base|84.062|96.87|
|convnext_large|84.414|96.976|

The above models have been trained using an adjusted version of our new [training recipe](https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives/), which allows us to offer models with accuracies significantly higher than the ones reported in the original papers.
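
Acc1 and Acc5 in the table are top-1 and top-5 accuracy: the percentage of validation images whose true class is the highest-scoring prediction, or among the five highest-scoring predictions. The sketch below computes top-k accuracy in plain Python on made-up scores. The function name and data are illustrative assumptions, and with only four classes it uses k=2 in place of k=5.

```python
def topk_accuracy(scores, targets, k=1):
    """Percentage of samples whose true class is among the k highest scores."""
    hits = 0
    for sample_scores, target in zip(scores, targets):
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda c: sample_scores[c],
            reverse=True,
        )
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(targets)


# Made-up scores for 4 samples over 4 classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],  # best guess: class 1
    [0.5, 0.1, 0.3, 0.1],  # best guess: class 0, runner-up: class 2
    [0.3, 0.1, 0.1, 0.5],  # best guess: class 3, runner-up: class 0
    [0.7, 0.1, 0.1, 0.1],  # best guess: class 0
]
targets = [1, 2, 0, 0]

acc1 = topk_accuracy(scores, targets, k=1)
acc2 = topk_accuracy(scores, targets, k=2)
print(acc1, acc2)  # 50.0 100.0
```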

GPU Video Decoding

In this release, we add support for GPU video decoding in the video reading API. To use hardware-accelerated decoding, we just need to pass a cuda device to the video reading API as shown below:

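```python
import torchvision

file_name = "path/to/video.mp4"  # placeholder path, replace with your own video

# Passing a CUDA device enables hardware-accelerated decoding.
reader = torchvision.io.VideoReader(file_name, device='cuda:0')
for frame in reader:
    print(frame)
```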


We also support seeking to any frame or to a keyframe in the video before reading, as shown below:

```python
reader.seek(seek_time)
```
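
The difference between the two seek modes can be pictured with a toy model. In the sketch below, seeking to a keyframe lands on the last keyframe at or before the requested time, while seeking to any frame lands on the first frame at or after it. The function names, the frame rate and the keyframe spacing are hypothetical, and this is only one plausible reading of the two modes, not a description of how torchvision or FFmpeg implement seeking.

```python
from bisect import bisect_right


def seek_keyframe(keyframe_times, seek_time):
    """Return the timestamp of the last keyframe at or before seek_time."""
    index = bisect_right(keyframe_times, seek_time) - 1
    return keyframe_times[max(index, 0)]


def seek_any_frame(frame_times, seek_time):
    """Return the timestamp of the first frame at or after seek_time."""
    for t in frame_times:
        if t >= seek_time:
            return t
    return frame_times[-1]


# Hypothetical 10 fps clip with a keyframe every second.
frame_times = [i / 10 for i in range(30)]
keyframe_times = [0.0, 1.0, 2.0]

print(seek_keyframe(keyframe_times, 1.45))  # 1.0
print(seek_any_frame(frame_times, 1.45))    # 1.5
```

Keyframe seeking is typically cheaper because decoding can start directly at the keyframe, at the cost of landing earlier than requested.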


New Datasets

We have implemented 14 new [classification datasets](https://pytorch.org/vision/0.12/datasets.html#image-classification): CLEVR, GTSRB, FER2013, SUN397, Country211, Flowers102, FGVC Aircraft, OxfordIIITPet, DTD, Food 101, Rendered SST2, Stanford Cars, PCAM, and EuroSAT.

As part of our work on Optical Flow support (see above for more details), we also added 5 new [optical flow datasets](https://pytorch.org/vision/0.12/datasets.html#optical-flow): Flying Chairs, Flying Things, Sintel, Kitti, and HD1K.

Documentation

New documentation layout

We have updated our documentation pages to be more compact and easier to browse. Each function / class is now documented in a separate page, clearing up some space in the per-module pages, and easing the discovery of the proposed APIs. Compare e.g. our [previous docs](https://pytorch.org/vision/0.11/transforms.html) vs the [new ones](https://pytorch.org/vision/0.12/transforms.html). Please let us know if you have any feedback!

Model contribution guidelines

New [model contribution guidelines](https://github.com/pytorch/vision/blob/main/CONTRIBUTING_MODELS.md) have been published following the success of the [FCOS](https://github.com/pytorch/vision/pull/4961) model, which was contributed by the community. These guidelines aim to be an overview of the model contribution process for anyone who would like to suggest, implement and train a new model.

Upcoming Prototype APIs

We are currently working on a prototype API which adds Multi-weight support on all of our model builder methods. This will enable us to offer multiple pre-trained weights, associated with their meta-data and inference transforms. The API is still under review and thus was not included in the release but you can read more about it on our [blogpost](https://pytorch.org/blog/introducing-torchvision-new-multi-weight-support-api/) and provide your feedback on the dedicated [Github issue](https://github.com/pytorch/vision/issues/5088).
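
The general idea of bundling several pre-trained weights with their metadata and inference transforms can be sketched with a plain Python enum. Everything in this snippet is hypothetical: the class names, the URLs, the accuracy numbers and the `build_model` helper are placeholders invented for illustration and do not reproduce the prototype API, which was still under review at the time of this release.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


def resize_to_224(image_size):
    """Stand-in for an inference transform: returns the size it would produce."""
    return (224, 224)


@dataclass(frozen=True)
class WeightsEntry:
    url: str               # where the checkpoint would be downloaded from
    transforms: Callable   # preprocessing matching how the weights were trained
    acc1: float            # metadata: placeholder top-1 accuracy


class HypotheticalModelWeights(Enum):
    # Placeholder URLs and accuracies, not real checkpoints.
    V1 = WeightsEntry("https://example.com/model_v1.pth", resize_to_224, 70.0)
    V2 = WeightsEntry("https://example.com/model_v2.pth", resize_to_224, 75.0)


def build_model(weights=None):
    """Pretend model builder: reports which checkpoint it would load."""
    if weights is None:
        return "randomly initialized"
    return f"loaded from {weights.value.url}"


print(build_model())
print(build_model(HypotheticalModelWeights.V2))
```

Keeping the transform next to the checkpoint means a caller can always retrieve the preprocessing that matches a given set of weights.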

Changes in our deprecation policy

Up until now, torchvision would almost never remove deprecated APIs. In order to be more [aligned and consistent with pytorch core](https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy), we are updating our deprecation policy. We are now following a 2-release deprecation cycle: deprecated APIs will raise a warning for 2 versions, and will be removed after that. To reflect these changes and to smooth the transition, we have decided to:

* Remove all APIs that had been deprecated before or on v0.8, released 1.5 years ago.
* Update the removal timeline of all other deprecated APIs to v0.14, to reflect the new 2-cycle policy starting now in v0.12.
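
A 2-release deprecation cycle can be sketched with a small decorator: the deprecated function keeps working and emits a warning until the release in which it is scheduled for removal. The decorator, the version tuples and `old_helper` are hypothetical names used for illustration. torchvision's own deprecation helpers may differ, for example in the warning category they use.

```python
import functools
import warnings

CURRENT_VERSION = (0, 12)  # pretend this is the installed release


def deprecated(since, removed_in):
    """Warn on every call until `removed_in`, then refuse to run."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if CURRENT_VERSION >= removed_in:
                raise RuntimeError(
                    f"{func.__name__} was removed in {removed_in[0]}.{removed_in[1]}"
                )
            warnings.warn(
                f"{func.__name__} is deprecated since {since[0]}.{since[1]} "
                f"and will be removed in {removed_in[0]}.{removed_in[1]}",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated(since=(0, 12), removed_in=(0, 14))
def old_helper(x):
    return x * 2


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_helper(21)

print(result)                       # 42
print(caught[0].category.__name__)  # DeprecationWarning
```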

Backward-incompatible changes

[models.quantization] Removed the Quantized shufflenet_v2_x1_5 and shufflenet_v2_x2_0 model builders which had no associated weights, rendering them useless. Additionally we added pre-trained weights for the shufflenet_v2_x0_5 quantized variant. ([4854](https://github.com/pytorch/vision/pull/4854))
[ops] Change to stable sort in nms implementations. This change can lead to different behavior in rare cases, so it has been flagged as backwards-incompatible ([4767](https://github.com/pytorch/vision/pull/4767))
[transforms] Changed the center and the parametrization of shear X/Y in Auto Augment transforms to align with the original papers ([5285](https://github.com/pytorch/vision/pull/5285)) ([5384](https://github.com/pytorch/vision/pull/5384))
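
To see why the sort order matters for NMS, consider two heavily overlapping boxes with identical scores: whichever is visited first suppresses the other. The pure-Python sketch below implements greedy NMS with Python's stable `sorted`, so ties are resolved by input order. It illustrates the idea only and is not the torchvision kernel; the function names and box values are made up.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, iou_threshold):
    """Greedy NMS; returns kept indices in decreasing score order.

    sorted() is stable, so boxes with equal scores are visited in input order.
    """
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes with identical scores, plus one separate box.
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.9, 0.8]
print(nms(boxes, scores, iou_threshold=0.5))  # [0, 2]
```

With an unstable sort, box 1 could be visited before box 0, in which case the result would be `[1, 2]` instead. This is the kind of rare behavior difference the entry above refers to.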

Deprecations

Note: in order to be more aligned with pytorch core, we are updating our deprecation policy. Please read more above in the “Highlights” section.

[ops] The `ops.poolers.MultiScaleRoIAlign` public methods `setup_setup_scales`, `convert_to_roi_format`, and `infer_scale` have been deprecated and will be removed in 0.14 (4951) (4810)

New Features

[datasets] New optical flow datasets added: FlyingChairs, Kitti, Sintel, FlyingThings3D, and HD1K (4860) (4845) (4858) (4890) (5004) (4889) (4888) (4870)
[datasets] New classification datasets support for FLAVA: CLEVR, GTSRB, FER2013, SUN397, Country211, Flowers102, FGVC Aircraft, OxfordIIITPet, DTD, Food 101, Rendered SST2, Stanford Cars, PCAM, and EuroSAT (5120) (5130) (5117) (5132) (5138) (5177) (5178) (5116) (5115) (5119) (5220) (5166) (5203) (5114) (5164) (5280)
[models] Add VisionTransformer model (5173) (5210) (5172) (5085) (5226) (5025) (5086) (5159)
[models] Add ConvNeXt model (5330) (5253)
[models] Add RAFT models and support for optical flow model training (5022) (5070) (5174) (5381) (5078) (5076) (5081) (5079) (5026) (5027) (5082) (5060) (4868) (4657) (4732)
[models] Add FCOS model (4961) (5267)
[utils] Add utility to convert optical flow to an image (5134) (5308)
[utils] Add utility to draw keypoints (4216)
[video] Add video GPU decoder (5019) (5191) (5215) (5256) (4474) (3179) (4878) (5328) (5327) (5183) (4947) (5192)

Improvements

[datasets] Migrate mnist dataset from np.frombuffer (4598)
[io, tests] Switch from np.frombuffer to torch.frombuffer (4578)
[models] Update ResNet-50 accuracy with Repeated Augmentation (5201)
[models] Add regnet_y_128gf factory function, and several regnet model weights (5176) (4530)
[models] Adding min_size to classification and video models (5223)
[models] Remove in-place mutation in DefaultBoxGenerator (5279)
[models] Added Dropout parameter to Models Constructors (4580)
[models] Allow to use custom norm_layer (4621)
[models] Add IntermediateLayerGetter on segmentation (5298)
[models] Use FX feature extractor for segm model (4563)
[models, ops, io] Add model, ops and io usage logging (4956) (4735) (4736) (4737) (5044) (4799) (5095) (5038)
[models.quantization] Implement is_qat in TorchVision (5299)
[models.quantization] Cleanup Quantized ShuffleNet (4854)
[models.quantization] Adding new Quantized models (4969)
[ops] [FBcode->GH] Fix missing kernel guards (4620) (4743)
[ops] Expose misc ops at package level (4812)
[ops] Fix giou naming bug (5270)
[ops] Change batched NMS threshold to choose for-loop version (4990)
[ops] Add bias parameter to ConvNormActivation (5012)
[ops] Feature extraction default arguments - ops (4810)
[ops] Change to stable sort in nms implementations (4767)
[reference scripts] Support amp training (4923) (4933) (4994) (4547) (4570)
[reference scripts] Add types and improve descriptions to ArgumentParser parameters (4724)
[reference scripts] Replaced all 'no_grad()' instances with 'inference_mode()' (4629)
[reference scripts] Adding Repeated Augment Sampler (5051)
[reference scripts] Reduce variance of classification references evaluation (4609)
[reference scripts] Avoid inplace modification of target boxes in detection references (5289)
[reference scripts] Allow variable number of repetitions for RA (5084)
[reference scripts, classification] Adding gradient clipping (4824)
[reference scripts, models.quantization] Add --prototype flag to quantization scripts. (5334)
[reference scripts, ops] Additional SOTA ingredients on Classification Recipe (4493)
[transforms] Added center arg to F.affine and RandomAffine ops (5208)
[transforms] Explicitly copying array in pil_to_tensor (4566)
[transforms] Update functional_tensor.py (4852)
[transforms] Add api usage log to transforms (5007)
[utils] Support random colors by default for draw_bounding_boxes (5127)
[utils] Add API usage calls to utils (5077)
Various documentation improvements (4913) (4892) (5305) (5273) (5089) (4653) (5302) (4647) (4922) (5124) (4972) (5165) (4843) (5238) (4846) (4823) (5316) (5195) (5153) (4783) (4798) (4797) (5368) (5037) (4830) (4681) (4579) (4520) (4586) (4536) (4574) (4565) (4822) (5315) (4546) (4522) (5312) (5372) (4833)
[tests] Set seed on several tests to reduce flakiness (4911) (4764) (4762) (4759) (4766) (4763) (4758) (4761)
[tests] Other test improvements (4756) (4775) (4867) (4929) (4632) (5029) (4597)
Added script to sync fbcode changes with main branch (4769)
[ci] Various CI improvements (4662) (4669) (4791) (4626) (5021) (4739) (3973) (4618) (4788) (4946) (5112) (5099) (5288) (5152) (4696) (5122) (4793) (4998) (4498)
[build] Various build improvements (5261) (5190) (4945) (4920) (5024) (4571) (4742) (4944) (4989) (5179) (4516) (4661) (4695) (4939) (4954)
[io] decode_* returns contiguous tensors (4898)
[io] Revert "decode_* returns contiguous tensors (4898)" (4901)

Bug Fixes

[datasets] fix Caltech datasets (4556)
[datasets] fix UCF101 on Windows (5129)
[datasets] remove extracted archive if flag was set (5055)
[datasets] Reverted folder.py back to using complete path to file for make_dataset and is_valid_file rather than just the filename (4885)
[datasets] fix `fromfile` on windows (4980)
[datasets] fix WIDERFace download links (4649)
[datasets] fix target_type selection for Caltech101 (4637)
[io] Skip jpeg comparison tests with PIL (5169)
[io] [Windows] Workaround for loading bundled DLLs (4893)
[models] Adding missing named param check on ViT (5196)
[models] Modifying keypoint_rcnn.py for keypoint_predictor issue (5180)
[models] Fixing bug on SSD backbone freezing (4590)
[models] [FBcode->GH] Removed type annotations from rcnn (4883)
[models.quantization] Amend the weights only if quantize=True (4966)
[models.quantization] fix mobilenetv3 quantization state dict loading (4997)
[ops] Adding masks_to_boxes to `__all__` in ops (4779)
[ops] Update the error message on DeformConv2d (4908)
[ops, onnx] RoiAlign aligned=True (4692)
[reference scripts] Fix reduce_across_processes inconsistent return type (4733)
[reference scripts] Fix bug on EMA n_averaged estimation (4544)
[reference scripts] support random seed for RA sampler (5053)
[reference scripts] fix bug in training model by amp (4874)
[reference scripts, transforms] Fix a bug on RandomZoomOut (5278)
[tests] Skip expected checks for quantized resnet50 due to flakiness (4686)
[transforms] Fix bug on autocontrast when `min==max` (4999)
[transforms] Fix augmentation space to be uint8 compatible (4806)
[utils] Fix `draw_bounding_boxes` and `draw_keypoints` for tensors on GPU (5101) (5102)
[build] fix formatting CIRCLECI_TAG when building docs (4693)
[build] Fix nvjpeg packaging into the wheel (4752)
[build] Switch Android app to pytorch_android stable (4926)
[ci] Add libtinfo5 dependency (4931)
[ci] Revert vit_h_14 as it breaks our CI (5259)
[ci] Remove pager on git diff (4800)
[ci] Fix failing CI job for android (4912)
[ci] Add numpy as explicit dependency to build_cmake.sh (4987)

Code Quality

Various typing improvements (4603) (4172) (4173) (4631) (4619) (4583) (4602) (5182)
Add ufmt (usort + black) as code formatter (4384)
Fix formatting issues (4535) (4747)
Add pre-commit hook to fix line endings (5021)
Various imports cleanups/improvements (4533) (4879)
Use f-strings almost everywhere, and other cleanups by applying pyupgrade (4585)
Update code to Python 3.7 compliance and remove Python 3.6 references (5125) (5161)
Consolidate repr methods throughout the repo (5392)
Set allow_redefinition = True for mypy (4531)
Use `is` to compare type of objects (4605)
Various typos fixed (5031) (5092)
Fix annotations for Python >= 3.8 (5301)
Revamp log api usage method (5072)
[deprecation] Update deprecation messages stating APIs will be removed in 0.14 and remove APIs that were deprecated before 0.8 (5387) (5386)
[build] Updated setup.py to use TorchVersion object for version comparison (4307)
[ops] remove debugging asserts (5332)
[c++frontend] Fix missing Torch includes (5118)
[ci] Cleanup and removing unnecessary references and parameters (4983) (4930) (5042)
[datasets] [FBcode->GH] remove unused requests functionality (5014)
[datasets] allow single extension as str in make_dataset (5229)
[datasets] use helper function to extract archive in CelebA (4557)
[datasets] simplify QMNIST download logic (4562)
[documentation] fix `make html-noplot` docs build command (5389)
[models] Move all weight initializations from private methods to constructors (5331)
[models] simplify model builders (5001)
[models] Replace asserts with ValueErrors (5275)
[models] Use enumerate to get index of ModuleList (4534)
[models] Simplify efficientnet code by removing _efficientnet_conf (4690)
[models] Refactor Segmentation models (4646)
[models] Pass indexing param to meshgrid to avoid warning in detection models (4645)
[models] Refactor the backbone builders of detection (4656)
[models.quantization] Switch torch.quantization to torch.ao.quantization (5296) (4554)
[ops] Fixed unused variables in ops (4666)
[ops] Refactor poolers (4951)
[reference scripts] Simplify the gradient clipping code (4896)
[reference scripts] only set random generator if shuffle=true (5135)
[tests] Refactor BoxOps tests to use parameterize (5380)
[tests] rename TestWeights to appease pytest (5054)
[tests] fix and add test for sequence_to_str (5213)
[tests] remove get_bool_env_var (5222)
[models, tests] remove custom code for model output comparison (4971)
[utils, documentation] Fix annotation of draw_segmentation_masks (4527)
[video] Fix error message in demuxer (5293)

Contributors

We're grateful for our community, which helps us improve torchvision by submitting issues and PRs, and providing feedback and suggestions. The following persons have contributed patches for this release:

Abhijit Deo, Aditya Oke, Alexander Soare, Alexander Unnervik, Allen Goodman, Andrey Talman, Brian Johnson, Bruno Korbar, buckage, Carlosbogo, Chungman Lee, Daniel Falbel, David Fan, Dmytro, Eli Uriegas, Ethan White, Eugene Yurtsev, F-G Fernandez, Fedor, Francisco Massa, Guo, Harish Kulkarni, HeungwooLee, Hu Ye, Jane (Yuan) Xu, Jirka Borovec, Jithun Nair, Joao Gomes, Jopo, Kai Zhang, kbozas, Kevin Tse, Khushi Agrawal, Konstantinos Bozas, Kotchin, Kushashwa Ravi Shrimali, KyleCZH, Mark Harfouche, Marko Kohtala, Masahiro Masuda, Matti Picus, Mengwei Liu, Mohammad (Moe) Rezaalipour, Mriganka Nath, Muhammed Abdullah, Nicolas Granger, Nicolas Hug, Nikita Shulga, peterbell10, Philip Meier, Piyush Singh, Prabhat Roy, ProGamerGov, puhuk, Richard Barnes, rvandeghen, Sai Krishna, Santiago Castro, Saswat Das, Sepehr Sameni, Sergii Khomenko, Stephen Matthews, Sumanth Ratna, Sumukh Aithal, Tal Ben-Nun, Vasilis Vryniotis, vfdev, Xiaolin Wang, Yi Zhang, Yiwen Song, Yoshitomo Matsubara, Yuchen Huang, Yuxin Wu, zhiqiang, and Zhiqiang Wang.

0.11.3

This is a minor release compatible with [PyTorch 1.10.2](https://github.com/pytorch/pytorch/releases/tag/v1.10.2) that includes a minor bug fix.

Highlights

Bug Fixes
- [CI] Skip jpeg comparison tests with PIL (5232)

0.11.2

This minor release bumps the pinned PyTorch version to v1.10.1 and contains some minor bug fixes.

Highlights

Bug Fixes
- [CI] Fix clang_format issue (5061)
- [CI, MOBILE] Fix binary_libtorchvision_ops_android job (5062)
- [CI] Add numpy as explicit dependency to build_cmake.sh (5065)
- [MODELS] Amend the weights only if quantize=True. (5066)
- [TRANSFORMS] Fix augmentation space to be uint8 compatible (5067)
- [DATASETS] Fix WIDERFace download links (5068)
- [BUILD, WINDOWS] Workaround for loading bundled DLLs (5094)

0.11.1

Users reported issues installing torchvision from PyPI. This release updates the dependencies of the wheels to point directly to torch==1.10.0.

0.11.0

This release introduces the RegNet and EfficientNet architectures, a new FX-based utility to perform Feature Extraction, new data augmentation techniques such as RandAugment and TrivialAugment, updated training recipes that support EMA, Label Smoothing, Learning-Rate Warmup, Mixup and Cutmix, and many more.

Highlights

New Models

[RegNet](https://arxiv.org/abs/2003.13678) and [EfficientNet](https://arxiv.org/abs/1905.11946) are two popular architectures that can be scaled to different computational budgets. In this release we include 22 pre-trained weights for their classification variants. The models were trained on ImageNet and can be used as follows:

```python
import torch
from torchvision import models

x = torch.rand(1, 3, 224, 224)

regnet = models.regnet_y_400mf(pretrained=True)
regnet.eval()
predictions = regnet(x)

efficientnet = models.efficientnet_b0(pretrained=True)
efficientnet.eval()
predictions = efficientnet(x)
```


The accuracies of the pre-trained models obtained on ImageNet val are shown below (see [4403](https://github.com/pytorch/vision/pull/4403#issuecomment-930381524), [4530](https://github.com/pytorch/vision/pull/4530#issuecomment-933213238) and [4293](https://github.com/pytorch/vision/pull/4293) for more details):

|Model |Acc1 |Acc5 |
|--- |--- |--- |
|regnet_x_400mf |72.834 |90.95 |
|regnet_x_800mf |75.212 |92.348 |
|regnet_x_1_6gf |77.04 |93.44 |
|regnet_x_3_2gf |78.364 |93.992 |
|regnet_x_8gf |79.344 |94.686 |
|regnet_x_16gf |80.058 |94.944 |
|regnet_x_32gf |80.622 |95.248 |
|regnet_y_400mf |74.046 |91.716 |
|regnet_y_800mf |76.42 |93.136 |
|regnet_y_1_6gf |77.95 |93.966 |
|regnet_y_3_2gf |78.948 |94.576 |
|regnet_y_8gf |80.032 |95.048 |
|regnet_y_16gf |80.424 |95.24 |
|regnet_y_32gf |80.878 |95.34 |
|EfficientNet-B0 |77.692 |93.532 |
|EfficientNet-B1 |78.642 |94.186 |
|EfficientNet-B2 |80.608 |95.31 |
|EfficientNet-B3 |82.008 |96.054 |
|EfficientNet-B4 |83.384 |96.594 |
|EfficientNet-B5 |83.444 |96.628 |
|EfficientNet-B6 |84.008 |96.916 |
|EfficientNet-B7 |84.122 |96.908 |

We would like to thank Ross Wightman and Luke Melas-Kyriazi for contributing the weights of the EfficientNet variants.

FX-based Feature Extraction

A new Feature Extraction method has been added to our utilities. It uses PyTorch FX and enables us to retrieve the outputs of intermediate layers of a network, which is useful for feature extraction and visualization. Here is an example of how to use the new utility:

```python
import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import create_feature_extractor


x = torch.rand(1, 3, 224, 224)

model = resnet50()

return_nodes = {
    "layer4.2.relu_2": "layer4"
}
model2 = create_feature_extractor(model, return_nodes=return_nodes)
intermediate_outputs = model2(x)

print(intermediate_outputs['layer4'].shape)
```



We would like to thank Alexander Soare for developing this utility.

New Data Augmentations

Two new Automatic Augmentation techniques were added: [Rand Augment](https://arxiv.org/abs/1909.13719) and [Trivial Augment](https://arxiv.org/abs/2103.10158). Both methods can be used as drop-in replacements for the AutoAugment technique as seen below:

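```python
from PIL import Image
from torchvision import transforms

image = Image.new("RGB", (320, 320))  # placeholder input, replace with a real image

# Either policy can be applied directly to an image.
t = transforms.RandAugment()
# t = transforms.TrivialAugmentWide()
transformed = t(image)

# Or used as one step of a preprocessing pipeline.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandAugment(),  # or transforms.TrivialAugmentWide()
    transforms.ToTensor(),
])
```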


We would like to thank Samuel G. Müller for contributing Trivial Augment and for his help on refactoring the AA package.

Updated Training Recipes

We have updated our training reference scripts to add support for Exponential Moving Average, Label Smoothing, Learning-Rate Warmup, [Mixup](https://arxiv.org/abs/1710.09412), [Cutmix](https://arxiv.org/abs/1905.04899) and other [SOTA primitives](https://github.com/pytorch/vision/issues/3911). The above enabled us to improve the classification Acc1 of some pre-trained models by [over 4 points](https://github.com/pytorch/vision/issues/3995). A major update of the existing pre-trained weights is expected in the next release.
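
For readers unfamiliar with these ingredients, the sketch below gives textbook formulations of four of them in plain Python on lists of floats. The function names, the `start_factor` parameter and the default values are illustrative assumptions. The reference scripts operate on tensors and may use different variants (for example an adjusted EMA decay), so treat this as a conceptual outline rather than the recipe itself.

```python
def ema_update(ema_params, model_params, decay):
    """Exponential moving average: ema = decay * ema + (1 - decay) * param."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, model_params)]


def smooth_labels(target, num_classes, smoothing):
    """Mix a one-hot target with a uniform distribution over all classes."""
    uniform = smoothing / num_classes
    return [
        (1.0 - smoothing) + uniform if c == target else uniform
        for c in range(num_classes)
    ]


def warmup_lr(base_lr, step, warmup_steps, start_factor=0.1):
    """Linear warmup from start_factor * base_lr to base_lr over warmup_steps."""
    if step >= warmup_steps:
        return base_lr
    factor = start_factor + (1.0 - start_factor) * step / warmup_steps
    return base_lr * factor


def mixup(x1, y1, x2, y2, lam):
    """Blend two samples and their (soft) labels with weight lam."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


ema = ema_update([1.0], [0.0], decay=0.9)
soft_target = smooth_labels(target=1, num_classes=4, smoothing=0.1)
learning_rates = [warmup_lr(0.5, step, warmup_steps=5) for step in range(7)]
mixed_x, mixed_y = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam=0.75)
```

Cutmix follows the same label-mixing rule as Mixup but pastes a rectangular patch from one image into the other instead of blending all pixels.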

Backward-incompatible changes

[models] Use torch instead of scipy for random initialization of inception and googlenet weights (4256)

Deprecations

[models] Deprecate the C++ vision::models namespace (4375)

New Features

[datasets] Add iNaturalist dataset (4123)
[datasets] Add download support and Kinetics 400/600/700 datasets (3680)
[datasets] Added LFW Dataset (4255)
[models] Add FX feature extraction as an alternative to intermediate_layer_getter (4302) (4418)
[models] Add RegNet Architecture in TorchVision (4403) (4530) (4550)
[ops] Add new masks_to_boxes op (4290) (4469)
[ops] Add StochasticDepth implementation (4301)
[reference scripts] Adding Mixup and Cutmix (4379)
[transforms] Integration of TrivialAugment with the current AutoAugment Code (4221)
[transforms] Adding RandAugment implementation (4348)
[models] Add EfficientNet Architecture in TorchVision (4293)

Improvements

Various documentation improvements (4239) (4251) (4275) (4342) (3894) (4159) (4133) (4138) (4089) (3944) (4349) (3754) (4308) (4352) (4318) (4244) (4362) (3863) (4382) (4484) (4503) (4376) (4457) (4505) (4363) (4361) (4337) (4546) (4553) (4565) (4567) (4574) (4575) (4383) (4390) (3409) (4451) (4340) (3967) (4072) (4028) (4132)
[build] Add CUDA-11.3 builds to torchvision (4248)
[ci, tests] Skip some CPU-only tests on CircleCI machines with GPU (4002) (4025) (4062)
[ci] New issue templates (4299)
[ci] Various CI improvements, in particular putting back GPU testing on windows (4421) (4014) (4053) (4482) (4475) (3998) (4388) (4179) (4394) (4162) (4065) (3928) (4081) (4203) (4011) (4055) (4074) (4419) (4067) (4201) (4200) (4202) (4496) (3925)
[ci] ping maintainers in case a PR was not properly labeled (3993) (4012) (4021) (4501)
[datasets] Add bzip2 file compression support to datasets (4097)
[datasets] Faster dataset indexing (3939)
[datasets] Enable logging of internal dataset instantiations. (4319) (4090)
[datasets] Removed copy=False in torch.from_numpy in MNIST to avoid warning (4184)
[io] Add warning for files with corrupt containers (3961)
[models, tests] Add test to check that classification models are FX-compatible (3662)
[tests] Speedup various tests (3929) (3933) (3936)
[models] Allow custom activation in SqueezeExcitation of EfficientNet (4448)
[models] Allow gradient backpropagation through GeneralizedRCNNTransform to inputs (4327)
[ops, tests] Add JIT tests (4472)
[ops] Make StochasticDepth FX-compatible (4373)
[ops] Added backward pass on CPU and CUDA for interpolation with anti-alias option (4208) (4211)
[ops] Small refactoring to support opt mode for torchvision ops (fb internal specific) (4080) (4095)
[reference scripts] Added Exponential Moving Average support to classification reference script (4381) (4406) (4407)
[reference scripts] Adding label smoothing on classification reference (4335)
[reference scripts] Further enhance Classification Reference (4444)
[reference scripts] Replaced to_tensor() with pil_to_tensor() + convert_image_dtype() (4452)
[reference scripts] Update the metrics output on reference scripts (4408)
[reference scripts] Warmup schedulers in References (4411)
[tests] Add check for fx compatibility on segmentation and video models (4131)
[tests] Mock redirection logic for tests (4197)
[tests] Replace set_deterministic with non-deprecated spelling (4212)
[tests] Skip building torchvision with ffmpeg when python==3.9 (4417)
[tests] [jit] Make operation call accept Stack& instead Stack* (63414) (4380)
[tests] make tests that involve GDrive more robust (4454)
[tests] remove dependency for dtype getters (4291)
[transforms] Replaced example usage of ToTensor() by PILToTensor() + ConvertImageDtype() (4494)
[transforms] Explicitly copying array in pil_to_tensor (4566) (4573)
[transforms] Make get_image_size and get_image_num_channels public. (4321)
[transforms] adding gray images support for adjust_contrast and adjust_saturation (4477) (4480)
[utils] Support single color in utils.draw_bounding_boxes (4075)
[video, documentation] Port the video_api.ipynb notebook to the example gallery (4241)
[video, io, tests] Added check for invalid input file (3932)
[video, io] remove deprecated function call (3861) (3989)
[video, tests] Removed test_audio_video_sync as it doesn't work as expected (4050)
[video] Build torchvision with ffmpeg only on Linux and ignore ffmpeg on other platforms (4413, 4410, 4041)

Bug Fixes

[build] Conda: Add numpy dependency (4442)
[build] Explicitly exclude PIL 8.3.0 from compatible dependencies (4148)
[build] More robust version check (4285)
[ci] Fix broken clang format test. (4320)
[ci] Remove mentions of conda-forge (4082)
[ci] fixup '*' -> '/.*/' for CI filter (4059)
[datasets] Fix download from google drive which was downloading empty files in some cases (4109)
[datasets] Fix splitting CelebA dataset (4377)
[datasets] Add support for files with periods in name (4099)
[io, tests] Don't check transparency channel for pil >= 8.3 in test_decode_png (4167)
[io] Fix size_t issues across JPEG versions and platforms (4439)
[io] Raise proper error when decoding 16-bits jpegs (4101)
[io] Unpinned the libjpeg version and fixed jpeg_mem_dest's size type Wind… (4288)
[io] deinterlacing PNG images with read_image (4268)
[io] More robust ffmpeg version query in setup.py (4254)
[io] Fixed read_image bug (3948)
[models] Don't download backbone weights if pretrained=True (4283)
[onnx, tests] Do not disable profiling executor in ONNX tests (4324)
[ops, tests] Fix DeformConvTester::test_backward_cuda by setting threads per block to 512 (3942)
[ops] Fix typing issue to make DeformConv2d scriptable (4079)
[ops] Fixes deform_conv issue with large input/output (4351)
[ops] Resolving tracing problem on StochasticDepth iterator. (4372)
[ops] Port quantize_val and dequantize_val into torchvision to avoid at::native and android xplat incompatibility (4311)
[reference scripts] Fix bug on EMA n_averaged estimation. (4544) (4545)
[tests] Avoid cmyk in nvjpeg tests (4246)
[tests] Catch ValueError due to recent change to torch.testing.assert_close (4165)
[tests] Fix failing tests by catching the proper exception from torch.testing (4121)
[tests] Skip test if connection issues on fate (4284)
[transforms] Fix RandAugment and TrivialAugment bugs (4370)
[transforms] [FBcode->GH] [JIT] Add reference semantics to TorchScript classes (44324) (4166)
[utils] Handle grayscale images on draw_bounding_boxes (4043) (4049)
[video, io] Fixed missing audio with video_reader and pyav backend (3934, 4064)

Code Quality

Various typing improvements (4369) (4168) (4169) (4170) (4171) (4224) (4227) (4395) (4409) (4232) (4234) (4236) (4226) (4416)
Renamed the “master” branch into “main” (4306) (4365)
[ci] (fb-internal only) Allow all torchvision test rules to run with RE (4073)
[ci] add pre-commit hooks for convenient formatting checks (4387)
[ci] Import hipify_python only when needed (4031)
[io] Fixed a couple of typos and removed unnecessary bracket (4345)
[io] use from_blob to avoid memcpy (4118)
[models, ops] Moving common layers to ops (4504)
[models, ops] Replace MobileNetV3's SqueezeExcitation with EfficientNet's one (4487)
[models] Explicitly store a distance value that is reused (4341)
[models] Use torch instead of scipy for random initialization of inception and googlenet weights (4256)
[onnx, tests] Use test images from repo rather than internet for ONNX tests (4176)
[onnx] Import ONNX utils from symbolic_opset11 module (4230)
[ops] Fix clang formatting in deform_conv2d_kernel.cu (3943)
[ops] Update gpu atomics include path (4478) (reverted)
[reference scripts] Cleaned-up coco evaluation code (4453)
[reference scripts] remove unused package in coco_eval.py (4404)
[tests] Ported all tests to pytest (3962) (3996) (3950) (3964) (3957) (3959) (3981) (3952) (3977) (3974) (3976) (3983) (3971) (3988) (3990) (3985) (3984) (4030) (3955) (4008) (4010) (4023) (3954) (4026) (3953) (4047) (4185) (3947) (4045) (4036) (4034) (3978) (4046) (3991) (3930) (4038) (4037) (4215) (3972) (3966) (4114) (4177) (4280) (3946) (4233) (4258) (4035) (4040) (4000) (4196) (3922) (4032)
[tests] Prevent tests from leaking their respective RNG (4497) (3926) (4250)
[tests] Remove TestCase dependency for test_models_detection_anchor_utils.py (4207)
[tests] Removed tests executing deprecated F_t.center/five/ten_crop methods (4479)
[tests] Replace set_deterministic with non-deprecated spelling (4212)
[tests] Remove torchvision/test/fakedata_generation.py (4130)
[transforms, reference scripts] Added PILToTensor and ConvertImageDtype classes in reference scripts and used them to replace ToTensor (4495, 4481)
[transforms] Refactor AutoAugment to support more augmentations. (4338)
[transforms] Replace deprecated torch.lstsq with torch.linalg.lstsq (3918)
[video] Drop virtual from private member functions of Decoder class (4027)
[video] Fixed comparison warnings in audio_stream and video_stream (4007)
[video] Fixed some ffmpeg deprecation warnings in decoder (4003)

Contributors

We're grateful for our community, which helps us improve torchvision by submitting issues and PRs, and providing feedback and suggestions. The following persons have contributed patches for this release:

ABD-01, Adam J. Stewart, Aditya Oke, Alex Lin, Alexander Grund, Alexander Soare, Allen Goodman, Amani Kiruga, Anirudh, Beat Buesser, beet, Bert Maher, Bruno Korbar, Camilo De La Torre, cyy, D. Khuê Lê-Huu, David Fan, DevPranjal, dgenzel, dgenzel2, Dmitriy Genzel, Drishti Bhasin, Edward Z. Yang, Eli Uriegas, F-G Fernandez, Francisco Massa, Gary Miguel, Gaurav7888, IgorSusmelj, Ishan Kumar, Ivan Kobzarev, Jiawei Liu, Jithun Nair, Joao Gomes, Joe Early, Julien RIPOCHE, julienripoche, Kai Zhang, kingyiusuen, Loi Ly, Matti Picus, Meghan Lele, Muhammed Abdullah, Nicolas Hug, Nikita Shulga, ORippler, peterbell10, Philip Meier, Prabhat Roy, puhuk, Rajat Jaiswal, S Harish, Sahil Goyal, Samuel Gabriel, Santiago Castro, Saswat Das, Sepehr Sameni, Shengwei An, Shrill Shrestha, Shruti Pulstya, Sugato Ray, tanvimoharir, Vasilis Vryniotis, Vassilis C. Nicodemou, Vassilis Nicodemou, vfdev-5, Vincent Moens, Vivek Kumar, Yi Zhang, Yiwen Song, Yonghye Kwon, Yuchen Huang, Zhengxu Chen, Zhiqiang Wang, Zhongkai Zhu, zzk1st
