New Model additions
Perceiver
Eight new models are released as part of the Perceiver implementation: `PerceiverModel`, `PerceiverForMaskedLM`, `PerceiverForSequenceClassification`, `PerceiverForImageClassificationLearned`, `PerceiverForImageClassificationFourier`, `PerceiverForImageClassificationConvProcessing`, `PerceiverForOpticalFlow`, `PerceiverForMultimodalAutoencoding`, in PyTorch.
The Perceiver IO model was proposed in [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch,
Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M.
Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
* Add Perceiver IO by NielsRogge in https://github.com/huggingface/transformers/pull/14487
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=perceiver
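As a quick sketch of the new API (using the `deepmind/language-perceiver` checkpoint from the hub link above), the masked-LM variant operates directly on raw UTF-8 bytes rather than learned subwords:

```python
from transformers import PerceiverTokenizer, PerceiverForMaskedLM

# Byte-level masked LM released alongside the paper
tokenizer = PerceiverTokenizer.from_pretrained("deepmind/language-perceiver")
model = PerceiverForMaskedLM.from_pretrained("deepmind/language-perceiver")

text = "This is an incomplete sentence where some words are missing."
# The tokenizer works on raw bytes; inputs are padded to the model's fixed length
inputs = tokenizer(text, padding="max_length", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (1, 2048, 262): one logit per byte position over the 262-entry byte vocabulary
```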
mLUKE
The mLUKE tokenizer is added; it can be used with the multilingual variant of LUKE.
The mLUKE model was proposed in [mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models](https://arxiv.org/abs/2110.08151) by Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka. It's a multilingual extension
of the [LUKE model](https://arxiv.org/abs/2010.01057) trained on the basis of XLM-RoBERTa.
* Add mLUKE by Ryou0634 in https://github.com/huggingface/transformers/pull/14640
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=luke
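A minimal usage sketch (the `studio-ousia/mluke-base` checkpoint name is inferred from the hub link above): the tokenizer takes entity spans alongside the text, and the weights load into the existing LUKE architecture:

```python
from transformers import MLukeTokenizer, LukeModel

tokenizer = MLukeTokenizer.from_pretrained("studio-ousia/mluke-base")
model = LukeModel.from_pretrained("studio-ousia/mluke-base")

text = "ISO 639-3 uses the code fas for the dialects spoken across Iran and Afghanistan."
entity_spans = [(0, 9), (59, 63)]  # character spans of "ISO 639-3" and "Iran"
inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)  # returns both word and entity hidden states
```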
ImageGPT
Three new models are released as part of the ImageGPT integration: `ImageGPTModel`, `ImageGPTForCausalImageModeling`, `ImageGPTForImageClassification`, in PyTorch.
The ImageGPT model was proposed in [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/) by Mark
Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever. ImageGPT (iGPT) is a GPT-2-like
model trained to predict the next pixel value, allowing for both unconditional and conditional image generation.
* Add ImageGPT by NielsRogge in https://github.com/huggingface/transformers/pull/14240
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=imagegpt
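A short sketch with the `openai/imagegpt-small` checkpoint (taken from the hub link above): the feature extractor quantizes pixels to the model's color-cluster vocabulary, so inputs are token ids rather than pixel values:

```python
from transformers import ImageGPTFeatureExtractor, ImageGPTForCausalImageModeling
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = ImageGPTFeatureExtractor.from_pretrained("openai/imagegpt-small")
model = ImageGPTForCausalImageModeling.from_pretrained("openai/imagegpt-small")

# Pixels are mapped to 512 color clusters, giving `input_ids` just like a language model
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # one logit per position over the color-cluster vocabulary
```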
QDQBert
Eight new models are released as part of the QDQBert implementation: `QDQBertModel`, `QDQBertLMHeadModel`, `QDQBertForMaskedLM`, `QDQBertForSequenceClassification`, `QDQBertForNextSentencePrediction`, `QDQBertForMultipleChoice`, `QDQBertForTokenClassification`, `QDQBertForQuestionAnswering`, in PyTorch.
The QDQBERT model is described in [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
* Add QDQBert model and quantization examples of SQUAD task by shangz-ai in https://github.com/huggingface/transformers/pull/14066
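QDQBERT inserts fake-quantization operations (QuantizeLinear/DequantizeLinear pairs) into a standard BERT, so it can load the weights of any BERT checkpoint. A hedged sketch, assuming NVIDIA's `pytorch-quantization` toolkit is installed and using a histogram calibrator for the inputs (the exact descriptors depend on your calibration and deployment setup):

```python
# pip install pytorch-quantization --extra-index-url https://pypi.ngc.nvidia.com
import pytorch_quantization.nn as quant_nn
from pytorch_quantization.tensor_quant import QuantDescriptor
from transformers import BertTokenizerFast, QDQBertModel

# Configure the default input quantizer before instantiating the model
quant_nn.QuantLinear.set_default_quant_desc_input(QuantDescriptor(calib_method="histogram"))

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = QDQBertModel.from_pretrained("bert-base-uncased")  # reuses standard BERT weights

inputs = tokenizer("Quantization-aware BERT inference", return_tensors="pt")
outputs = model(**inputs)
```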
Semantic Segmentation models
*The semantic segmentation models' API is unstable and bound to change between this version and the next.*
The first semantic segmentation models are added. In semantic segmentation, the goal is to predict a class label for every pixel of an image. The models that are added are SegFormer (by NVIDIA) and BEiT (by Microsoft Research). BEiT was already available in the library, but this release includes the model with a semantic segmentation head.
The SegFormer model was proposed in [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203) by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo. The model consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head to achieve great results on image segmentation benchmarks such as ADE20K and Cityscapes.
The BEiT model was proposed in [BEiT: BERT Pre-Training of Image Transformers](https://arxiv.org/abs/2106.08254) by Hangbo Bao, Li Dong, Furu Wei. Rather than pre-training the model to predict the class of an image (as done in the original ViT paper), BEiT models are pre-trained to predict visual tokens from the codebook of OpenAI’s DALL-E model given masked patches.
* Add SegFormer by NielsRogge in https://github.com/huggingface/transformers/pull/14019
* Add BeitForSemanticSegmentation by NielsRogge in https://github.com/huggingface/transformers/pull/14096
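A minimal inference sketch for the new segmentation head, using an ADE20K-fine-tuned SegFormer checkpoint from the hub:

```python
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

name = "nvidia/segformer-b0-finetuned-ade-512-512"
feature_extractor = SegformerFeatureExtractor.from_pretrained(name)
model = SegformerForSemanticSegmentation.from_pretrained(name)

inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
# Logits come out at reduced resolution; upsample and argmax to get per-pixel labels
print(outputs.logits.shape)  # (1, 150, h/4, w/4) for the 150 ADE20K classes
```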
Vision-text dual encoder
Adds the `VisionTextDualEncoder` model in PyTorch and Flax, which can load any pre-trained vision model (ViT, DeiT, BEiT, CLIP's vision model) and text model (BERT, RoBERTa) from the library for vision-text tasks like CLIP.
This model pairs a vision and a text encoder and adds projection layers that map the embeddings into a shared embedding space, which can then be used to align the two modalities.
* VisionTextDualEncoder by patil-suraj in https://github.com/huggingface/transformers/pull/13511
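For example, a ViT encoder can be paired with a BERT encoder as follows (a sketch; the projection layers are newly initialized, so the resulting model needs CLIP-style contrastive fine-tuning before its similarity scores are meaningful):

```python
from transformers import (
    AutoFeatureExtractor,
    AutoTokenizer,
    VisionTextDualEncoderModel,
    VisionTextDualEncoderProcessor,
)

model = VisionTextDualEncoderModel.from_vision_text_pretrained(
    "google/vit-base-patch16-224", "bert-base-uncased"
)
processor = VisionTextDualEncoderProcessor(
    AutoFeatureExtractor.from_pretrained("google/vit-base-patch16-224"),
    AutoTokenizer.from_pretrained("bert-base-uncased"),
)
```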
CodeParrot
CodeParrot, a model trained to generate code, has been open-sourced in the research projects directory by lvwerra.
* Add CodeParrot 🦜 codebase by lvwerra in https://github.com/huggingface/transformers/pull/14536
Language model support for ASR
Language model boosted decoding is added for all CTC models via https://github.com/kensho-technologies/pyctcdecode and https://github.com/kpu/kenlm.
* Add language model support for CTC models by patrickvonplaten in https://github.com/huggingface/transformers/pull/14339
See https://huggingface.co/patrickvonplaten/wav2vec2-xlsr-53-es-kenlm for more information.
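A hedged sketch of LM-boosted decoding with that checkpoint (assuming `pyctcdecode` and `kenlm` are installed, and substituting real 16 kHz audio for the placeholder array):

```python
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2ProcessorWithLM

name = "patrickvonplaten/wav2vec2-xlsr-53-es-kenlm"
processor = Wav2Vec2ProcessorWithLM.from_pretrained(name)
model = Wav2Vec2ForCTC.from_pretrained(name)

speech = np.zeros(16_000, dtype=np.float32)  # placeholder: 1 s of silence at 16 kHz
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# The processor runs pyctcdecode's beam search with the bundled KenLM model
transcription = processor.batch_decode(logits.numpy()).text
```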
Flax-specific additions
Adds Flax versions of the vision encoder-decoder model and of GPT-J.
* Add FlaxVisionEncoderDecoderModel by ydshieh in https://github.com/huggingface/transformers/pull/13359
* FlaxGPTJ by patil-suraj in https://github.com/huggingface/transformers/pull/14396
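As with the PyTorch version, the Flax vision encoder-decoder can be assembled from separately pre-trained parts (a sketch; the cross-attention weights are newly initialized, so the combined model needs fine-tuning before use):

```python
from transformers import FlaxVisionEncoderDecoderModel

# e.g. a ViT encoder with a GPT-2 decoder for image captioning
model = FlaxVisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k", "gpt2"
)
```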
TensorFlow-specific additions
Vision transformers are here! Convnets are so 2012, now that ML is [converging on self-attention as a universal model](https://twitter.com/karpathy/status/1468370605229547522).
* Add TFViTModel by ydshieh in https://github.com/huggingface/transformers/pull/13778
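A quick TF sketch (passing `from_pt=True` as a safeguard in case the checkpoint only hosts PyTorch weights):

```python
from transformers import ViTFeatureExtractor, TFViTModel
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = ViTFeatureExtractor.from_pretrained("google/vit-base-patch16-224-in21k")
model = TFViTModel.from_pretrained("google/vit-base-patch16-224-in21k", from_pt=True)

inputs = feature_extractor(images=image, return_tensors="tf")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, 197, 768): 196 patches + [CLS]
```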
Want to handle real-world tables, where text and data are positioned in a 2D grid? [TAPAS](https://huggingface.co/docs/transformers/model_doc/tapas) is now here for both TensorFlow and PyTorch.
* Tapas tf by kamalkraj in https://github.com/huggingface/transformers/pull/13393
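A minimal TF sketch with a WTQ-fine-tuned checkpoint (the TF TAPAS models additionally require the `tensorflow_probability` package, and all table cells must be strings):

```python
import pandas as pd
from transformers import TapasTokenizer, TFTapasForQuestionAnswering

name = "google/tapas-base-finetuned-wtq"
tokenizer = TapasTokenizer.from_pretrained(name)
model = TFTapasForQuestionAnswering.from_pretrained(name)

table = pd.DataFrame({"City": ["Paris", "Tokyo"], "Population": ["2.1 million", "14 million"]})
inputs = tokenizer(
    table=table,
    queries=["Which city has 14 million people?"],
    padding="max_length",
    return_tensors="tf",
)
outputs = model(**inputs)
```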
Automatic checkpointing and cloud saves to the HuggingFace Hub during training are now live, allowing you to resume training when it's interrupted, even if your initial instance is terminated. This is an area of very active development - watch this space for future developments, including automatic model card creation and more.
* Add model checkpointing to push_to_hub and PushToHubCallback by Rocketknight1 in https://github.com/huggingface/transformers/pull/14492
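A hedged sketch of the callback in a standard Keras fine-tuning loop (`model`, `tokenizer`, and `tf_train_dataset` are assumed to come from your own setup; the repo name is a placeholder):

```python
from transformers.keras_callbacks import PushToHubCallback

callback = PushToHubCallback(
    output_dir="./model_checkpoints",
    save_strategy="epoch",
    checkpoint=True,                    # also upload resumable training state, not just weights
    tokenizer=tokenizer,                # assumed from your fine-tuning setup
    hub_model_id="my-finetuned-model",  # placeholder repo name
)
model.fit(tf_train_dataset, epochs=3, callbacks=[callback])
```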
Auto-processors
A new class to automatically select processors is added: `AutoProcessor`. It can be used for all models that require a processor, in both computer vision and audio.
* Auto processor by sgugger in https://github.com/huggingface/transformers/pull/14465
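It follows the same pattern as the other auto classes, returning the right processor for whatever checkpoint you point it at:

```python
from transformers import AutoProcessor

# Audio checkpoint -> Wav2Vec2Processor
speech_processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")
# Document-image checkpoint -> LayoutLMv2Processor
doc_processor = AutoProcessor.from_pretrained("microsoft/layoutlmv2-base-uncased")
```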
New documentation frontend
A new documentation frontend is out for the `transformers` library! It is better aligned with the rest of our website and includes tools to improve readability. The documentation can now be written in Markdown rather than RST.
* Doc new front by sgugger in https://github.com/huggingface/transformers/pull/14590
LayoutLM Improvements
The LayoutLMv2 feature extractor now supports non-English languages, and LayoutXLM gets its own processor.
* LayoutLMv2FeatureExtractor now supports non-English languages when applying Tesseract OCR. by Xargonus in https://github.com/huggingface/transformers/pull/14514
* Add LayoutXLMProcessor (and LayoutXLMTokenizer, LayoutXLMTokenizerFast) by NielsRogge in https://github.com/huggingface/transformers/pull/14115
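A short sketch of the new processor (Tesseract OCR requires `pytesseract`; the image path is a placeholder):

```python
from PIL import Image
from transformers import LayoutLMv2FeatureExtractor, LayoutXLMProcessor

# The processor bundles the feature extractor (which runs OCR) and the tokenizer
processor = LayoutXLMProcessor.from_pretrained("microsoft/layoutxlm-base")
image = Image.open("document.png").convert("RGB")  # placeholder path
encoding = processor(image, return_tensors="pt")   # input_ids, bbox, and image in one call

# For non-English documents, pass a Tesseract language code to the feature extractor:
feature_extractor = LayoutLMv2FeatureExtractor(ocr_lang="fra")
```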
Trainer Improvements
You can now take advantage of Ampere hardware with the Trainer:
- `--bf16` - do training or eval in mixed precision of bfloat16
- `--bf16_full_eval` - do eval in full bfloat16
- `--tf32` - turn TF32 mode on or off
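These command-line flags map one-to-one onto `TrainingArguments` fields, so the same setup can be expressed in code (a sketch; bf16 requires an Ampere GPU and PyTorch >= 1.10):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    bf16=True,            # mixed-precision bfloat16 training/eval
    bf16_full_eval=True,  # run evaluation fully in bfloat16
    tf32=True,            # enable TF32 mode for matmul/convolution kernels
)
```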
Improvements and bugfixes
* Replace assertions with RuntimeError exceptions by ddrm86 in https://github.com/huggingface/transformers/pull/14186
* Adding `batch_size` support for (almost) all pipelines by Narsil in https://github.com/huggingface/transformers/pull/13724
* Remove n_ctx from configs by thomasw21 in https://github.com/huggingface/transformers/pull/14165
* Add `BlenderbotTokenizerFast` by stancld in https://github.com/huggingface/transformers/pull/13720
* Adding `handle_long_generation` parameters for `text-generation` pipeline. by Narsil in https://github.com/huggingface/transformers/pull/14118
* Fix pipeline tests env and fetch by sgugger in https://github.com/huggingface/transformers/pull/14209
* Generalize problem_type to all sequence classification models by sgugger in https://github.com/huggingface/transformers/pull/14180
* Fixing image segmentation with inference mode. by Narsil in https://github.com/huggingface/transformers/pull/14204
* Add a condition for checking labels by hrxorxm in https://github.com/huggingface/transformers/pull/14211
* Torch 1.10 by LysandreJik in https://github.com/huggingface/transformers/pull/14169
* Add more missing models to models/__init__.py by ydshieh in https://github.com/huggingface/transformers/pull/14177
* Clarify QA examples by NielsRogge in https://github.com/huggingface/transformers/pull/14172
* Fixing `image-segmentation` tests. by Narsil in https://github.com/huggingface/transformers/pull/14223
* Tensor location is already handled by Narsil in https://github.com/huggingface/transformers/pull/14224
* Raising exceptions instead of using assertions for few models by pdcoded in https://github.com/huggingface/transformers/pull/14219
* Fix the write problem in trainer.py comment by wmathor in https://github.com/huggingface/transformers/pull/14202
* [GPTJ] enable common tests and few fixes by patil-suraj in https://github.com/huggingface/transformers/pull/14190
* improving efficiency of mlflow metric logging by wamartin-aml in https://github.com/huggingface/transformers/pull/14232
* Fix generation docstring by qqaatw in https://github.com/huggingface/transformers/pull/14216
* Fix test_configuration_tie in FlaxEncoderDecoderModelTest by ydshieh in https://github.com/huggingface/transformers/pull/14076
* [Tests] Fix DistilHubert path by anton-l in https://github.com/huggingface/transformers/pull/14245
* Add PushToHubCallback in main init by sgugger in https://github.com/huggingface/transformers/pull/14246
* Fixes Beit training for PyTorch 1.10+ by sgugger in https://github.com/huggingface/transformers/pull/14249
* Added Beit model output class by lumliolum in https://github.com/huggingface/transformers/pull/14133
* Update Transformers to huggingface_hub >= 0.1.0 by sgugger in https://github.com/huggingface/transformers/pull/14251
* Add cross attentions to TFGPT2Model by ydshieh in https://github.com/huggingface/transformers/pull/14038
* [Wav2Vec2] Adapt conversion script by patrickvonplaten in https://github.com/huggingface/transformers/pull/14258
* Put `load_image` function in `image_utils.py` & fix image rotation issue by mishig25 in https://github.com/huggingface/transformers/pull/14062
* minimal fixes to run DataCollatorForWholeWordMask with return_tensors="np" and return_tensors="tf" by dwyatte in https://github.com/huggingface/transformers/pull/13891
* Adding support for `truncation` parameter on `feature-extraction` pipeline. by Narsil in https://github.com/huggingface/transformers/pull/14193
* Fix of issue 13327: Wrong weight initialization for TF t5 model by dshirron in https://github.com/huggingface/transformers/pull/14241
* Fixing typo in error message. by Narsil in https://github.com/huggingface/transformers/pull/14226
* Pin Keras cause they messed their release by sgugger in https://github.com/huggingface/transformers/pull/14262
* Quality explain by sgugger in https://github.com/huggingface/transformers/pull/14264
* Add more instructions to the release guide by sgugger in https://github.com/huggingface/transformers/pull/14263
* Fixing slow pipeline tests by Narsil in https://github.com/huggingface/transformers/pull/14260
* Fixing mishandling of `ignore_labels`. by Narsil in https://github.com/huggingface/transformers/pull/14274
* improve rewrite state_dict missing _metadata by changwangss in https://github.com/huggingface/transformers/pull/14276
* Removing Keras version pinning by Rocketknight1 in https://github.com/huggingface/transformers/pull/14280
* Pin TF until tests are fixed by sgugger in https://github.com/huggingface/transformers/pull/14283
* [Hubert Docs] Make sure example uses a fine-tuned model by patrickvonplaten in https://github.com/huggingface/transformers/pull/14291
* Add new LFS prune API by sgugger in https://github.com/huggingface/transformers/pull/14294
* Remove `DPRPretrainedModel` from docs by xhlulu in https://github.com/huggingface/transformers/pull/14300
* Handle long answer needs to be updated. by Narsil in https://github.com/huggingface/transformers/pull/14279
* [tests] Fix SegFormer and BEiT tests by NielsRogge in https://github.com/huggingface/transformers/pull/14289
* Fix typo on PPLM example README by Beomi in https://github.com/huggingface/transformers/pull/14287
* [Marian Conversion] Fix eos_token_id conversion in conversion script by patrickvonplaten in https://github.com/huggingface/transformers/pull/14320
* [Tests] Update audio classification tests to support torch 1.10 by anton-l in https://github.com/huggingface/transformers/pull/14318
* [TFWav2Vec2Model] Fix input shapes in TFWav2Vec2WeightNormConv1D by anton-l in https://github.com/huggingface/transformers/pull/14319
* Fixing tests on master. by Narsil in https://github.com/huggingface/transformers/pull/14317
* Fixing mutable default argument in `pipeline`. by Narsil in https://github.com/huggingface/transformers/pull/14316
* Changed relative imports to absolute to allow convert_graph_to_onnx.py to run as a script. by nbertagnolli in https://github.com/huggingface/transformers/pull/14325
* Expand dynamic supported objects to configs and tokenizers by sgugger in https://github.com/huggingface/transformers/pull/14296
* [deepspeed] Enable multiple test runs on single box, defer to DS_TEST_PORT if set by jeffra in https://github.com/huggingface/transformers/pull/14331
* Small change to Wav2Vec2 model to support Tensor-Parallelism with DeepSpeed by RezaYazdaniAminabadi in https://github.com/huggingface/transformers/pull/14298
* Correct order of overflowing tokens for LayoutLmV2 tokenizer by Apoorvgarg-creator in https://github.com/huggingface/transformers/pull/13495
* Update Seq2Seq QA example script to use SQuAD metric. by karthikrangasai in https://github.com/huggingface/transformers/pull/14335
* remove an irrelevant test from test_modeling_tf_layoutlm by ydshieh in https://github.com/huggingface/transformers/pull/14341
* bump flax version by patil-suraj in https://github.com/huggingface/transformers/pull/14343
* Rewrite guides for fine-tuning with Datasets by stevhliu in https://github.com/huggingface/transformers/pull/13923
* [Bert2Bert] allow bert2bert + relative embeddings by patrickvonplaten in https://github.com/huggingface/transformers/pull/14324
* Support for TF >= 2.7 by sgugger in https://github.com/huggingface/transformers/pull/14345
* `BatchFeature`: Convert `List[np.ndarray]` to `np.ndarray` before converting to pytorch tensors by eladsegal in https://github.com/huggingface/transformers/pull/14306
* Adding some quality of life for `pipeline` function. by Narsil in https://github.com/huggingface/transformers/pull/14322
* Fix fast tokenization problems by qqaatw in https://github.com/huggingface/transformers/pull/13930
* Add notebook INC quantization for text classification tasks by echarlaix in https://github.com/huggingface/transformers/pull/14293
* enhance rewrite state_dict missing _metadata by changwangss in https://github.com/huggingface/transformers/pull/14348
* Fix list index out of range when padding nested empty lists by qqaatw in https://github.com/huggingface/transformers/pull/13876
* [testing] solve the port conflict by stas00 in https://github.com/huggingface/transformers/pull/14362
* Fix Flax params dtype by patil-suraj in https://github.com/huggingface/transformers/pull/13098
* [flax generate] allow passing params to encode by patil-suraj in https://github.com/huggingface/transformers/pull/14370
* Experimenting with adding proper get_config() and from_config() methods by Rocketknight1 in https://github.com/huggingface/transformers/pull/14361
* Fixing requirements for TF LM models and use correct model mappings by Rocketknight1 in https://github.com/huggingface/transformers/pull/14372
* fix loading flax bf16 weights in pt by patil-suraj in https://github.com/huggingface/transformers/pull/14369
* [wav2vec2] fix --gradient_checkpointing by stas00 in https://github.com/huggingface/transformers/pull/13964
* Adding support for raw python `generator` in addition to `Dataset` for pipelines by Narsil in https://github.com/huggingface/transformers/pull/14352
* minor doc fix by patil-suraj in https://github.com/huggingface/transformers/pull/14377
* [Wav2Vec2 Example] Improve fine-tuning script by patrickvonplaten in https://github.com/huggingface/transformers/pull/14373
* Use `AlbertConverter` for FNet instead of using FNet's own converter by qqaatw in https://github.com/huggingface/transformers/pull/14365
* Add support for WMT21 tokenizer in M2M100Tokenizer by patil-suraj in https://github.com/huggingface/transformers/pull/14376
* [M2M100Tokenizer] fix _build_translation_inputs by patil-suraj in https://github.com/huggingface/transformers/pull/14382
* Raise exceptions instead of using asserts in modeling_openai 12789 by nbertagnolli in https://github.com/huggingface/transformers/pull/14386
* [doc] performance and parallelism updates by stas00 in https://github.com/huggingface/transformers/pull/14391
* Quick fix to TF summarization example by Rocketknight1 in https://github.com/huggingface/transformers/pull/14401
* [Speech2Text2] Enable tokenizers by patrickvonplaten in https://github.com/huggingface/transformers/pull/14390
* Fix TFViT by NielsRogge in https://github.com/huggingface/transformers/pull/14399
* Fix weight loading issue by ydshieh in https://github.com/huggingface/transformers/pull/14016
* Replace BertLayerNorm with LayerNorm by eldarkurtic in https://github.com/huggingface/transformers/pull/14385
* [Wav2Vec2] Make sure that gradient checkpointing is only run if needed by patrickvonplaten in https://github.com/huggingface/transformers/pull/14407
* Allow per-version configurations by LysandreJik in https://github.com/huggingface/transformers/pull/14344
* Fix gradient_checkpointing backward compatibility by sgugger in https://github.com/huggingface/transformers/pull/14408
* Add forward method to dummy models by sgugger in https://github.com/huggingface/transformers/pull/14419
* Avoid looping when data exhausted by valentindey in https://github.com/huggingface/transformers/pull/14413
* Debug doc by sgugger in https://github.com/huggingface/transformers/pull/14424
* [Wav2Vec2] Add New Wav2Vec2 Translation by patrickvonplaten in https://github.com/huggingface/transformers/pull/14392
* Improve semantic segmentation models by NielsRogge in https://github.com/huggingface/transformers/pull/14355
* [Gradient checkpoining] Update Wav2Vec scripts by falcaopetri in https://github.com/huggingface/transformers/pull/14036
* [Bart] Fix docs by patrickvonplaten in https://github.com/huggingface/transformers/pull/14434
* [WIP] Ensure TF model configs can be converted to proper JSON by Zahlii in https://github.com/huggingface/transformers/pull/14415
* Recover Deleted XNLI Instructions by Helw150 in https://github.com/huggingface/transformers/pull/14437
* Fix EncoderDecoderModel code example by NielsRogge in https://github.com/huggingface/transformers/pull/14441
* Add a post init method to all models by sgugger in https://github.com/huggingface/transformers/pull/14431
* Fix finite IterableDataset test on multiple GPUs by sgugger in https://github.com/huggingface/transformers/pull/14445
* [Bert, et al] fix early device assignment by stas00 in https://github.com/huggingface/transformers/pull/14447
* Add GitPython to quality tools by LysandreJik in https://github.com/huggingface/transformers/pull/14459
* [ImageGPT] Small fixes by NielsRogge in https://github.com/huggingface/transformers/pull/14460
* [Generation] Allow `inputs_embeds` as an input by patrickvonplaten in https://github.com/huggingface/transformers/pull/14443
* Adding support for `hidden_states` and `attentions` in unbatching support. by Narsil in https://github.com/huggingface/transformers/pull/14420
* add Tuple as possible type hint for EvalPredictions label_ids by ameasure in https://github.com/huggingface/transformers/pull/14473
* Fix dummy objects for quantization by sgugger in https://github.com/huggingface/transformers/pull/14478
* Moving pipeline tests from `Narsil` to `hf-internal-testing`. by Narsil in https://github.com/huggingface/transformers/pull/14463
* Improve `add-new-pipeline` docs a bit by stancld in https://github.com/huggingface/transformers/pull/14485
* [test] add test for --config_overrides by stas00 in https://github.com/huggingface/transformers/pull/14466
* Support for Training with BF16 by JamesDeAntonis in https://github.com/huggingface/transformers/pull/13207
* fixes some key names for in LayoutLMv2 / LayoutXLM tokenizers by valentindey in https://github.com/huggingface/transformers/pull/14493
* Switch from using sum for flattening lists of lists in group_texts by nbroad1881 in https://github.com/huggingface/transformers/pull/14472
* [deepspeed] zero inference by stas00 in https://github.com/huggingface/transformers/pull/14253
* add cache_dir for tokenizer verification loading by vmaryasin in https://github.com/huggingface/transformers/pull/14508
* Fix feature extraction utils import by LysandreJik in https://github.com/huggingface/transformers/pull/14515
* [Tests] Improve vision tests by NielsRogge in https://github.com/huggingface/transformers/pull/14458
* [CI] clear `~/.cache/torch_extensions` between builds by stas00 in https://github.com/huggingface/transformers/pull/14520
* Fix a slow test. by Narsil in https://github.com/huggingface/transformers/pull/14527
* added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error by cfregly in https://github.com/huggingface/transformers/pull/14529
* Quicktour updates by LysandreJik in https://github.com/huggingface/transformers/pull/14533
* Fixes by LysandreJik in https://github.com/huggingface/transformers/pull/14534
* [flax] unfreeze initial cache in gpt models by patil-suraj in https://github.com/huggingface/transformers/pull/14535
* Tokenizers docs: Specify which class contains `__call__` method by xhlulu in https://github.com/huggingface/transformers/pull/14379
* Rename ImageGPT by NielsRogge in https://github.com/huggingface/transformers/pull/14526
* [Generate] Fix generate with inputs_embeds on GPU by patrickvonplaten in https://github.com/huggingface/transformers/pull/14564
* [Flax] token-classification model steps enumerate start from 1 by kamalkraj in https://github.com/huggingface/transformers/pull/14547
* Fix sentinel token IDs in data collator for Flax T5 pretraining script by rahuln in https://github.com/huggingface/transformers/pull/14477
* Fix backend regex by sgugger in https://github.com/huggingface/transformers/pull/14566
* [Flax] Add FlaxBlenderbot by stancld in https://github.com/huggingface/transformers/pull/13633
* Add documentation for multi-label classification by gsnidero in https://github.com/huggingface/transformers/pull/14168
* use functional interface for softmax in attention by t-vi in https://github.com/huggingface/transformers/pull/14198
* Fix mask token handling by qqaatw in https://github.com/huggingface/transformers/pull/14364
* [doc] bf16/tf32 guide by stas00 in https://github.com/huggingface/transformers/pull/14579
* Rename toctree.yml -> _toctree.yml by mishig25 in https://github.com/huggingface/transformers/pull/14594
* Update doc img links by mishig25 in https://github.com/huggingface/transformers/pull/14593
* Adds a git pull instruction to the documentation builder by LysandreJik in https://github.com/huggingface/transformers/pull/14597
* [Flax] Add FlaxBlenderbotSmall by stancld in https://github.com/huggingface/transformers/pull/14576
* Python 3.6 -> Python 3.7 for TF runs by LysandreJik in https://github.com/huggingface/transformers/pull/14598
* change tf.math.divide with int(/) in distilbert model by yis11178 in https://github.com/huggingface/transformers/pull/14600
* fix 14524 (IndexError when mask prob is too low) by nikvaessen in https://github.com/huggingface/transformers/pull/14525
* Improve tokenizer tests by qqaatw in https://github.com/huggingface/transformers/pull/13594
* [CI] move env print to util, add pt, nccl versions by stas00 in https://github.com/huggingface/transformers/pull/14607
* 2022 is the year of multi-modality by LysandreJik in https://github.com/huggingface/transformers/pull/14610
* Fix doc builder by LysandreJik in https://github.com/huggingface/transformers/pull/14616
* [trainer] add tf32-mode control by stas00 in https://github.com/huggingface/transformers/pull/14606
* Make DefaultDataCollator importable from root by Rocketknight1 in https://github.com/huggingface/transformers/pull/14588
* fix a typo by yuchenlin in https://github.com/huggingface/transformers/pull/14626
* updated pytorch token-classification readme by kamalkraj in https://github.com/huggingface/transformers/pull/14624
* Add Flax example tests by patil-suraj in https://github.com/huggingface/transformers/pull/14599
* fix typo by patil-suraj in https://github.com/huggingface/transformers/pull/14635
* add flax example tests in CI workflow by patil-suraj in https://github.com/huggingface/transformers/pull/14637
* [urls to hub] Replace outdated model tags with their now-canonical pipeline types by julien-c in https://github.com/huggingface/transformers/pull/14617
* Update the example of exporting Bart + BeamSearch to ONNX module to resolve comments. by fatcat-z in https://github.com/huggingface/transformers/pull/14310
* Add GPTJForQuestionAnswering by tucan9389 in https://github.com/huggingface/transformers/pull/14503
* doc: mismatch between pooler/d_output by guhur in https://github.com/huggingface/transformers/pull/14641
* fix flax example tests by patil-suraj in https://github.com/huggingface/transformers/pull/14643
* Auto processor fix by LysandreJik in https://github.com/huggingface/transformers/pull/14623
* Fix syntax for class references by sgugger in https://github.com/huggingface/transformers/pull/14644
* Add a job to test the documentation build by sgugger in https://github.com/huggingface/transformers/pull/14645
* fix flax examples tests by patil-suraj in https://github.com/huggingface/transformers/pull/14646
* Use cross_attention_hidden_size in Encoder-Decoder models by ydshieh in https://github.com/huggingface/transformers/pull/14378
* [deepspeed] fix --load_best_model_at_end by stas00 in https://github.com/huggingface/transformers/pull/14652
* quick fix SummarizationPipeline error messages by NouamaneTazi in https://github.com/huggingface/transformers/pull/14618
* Fix a Bug, trainer_seq2seq.py, in the else branch at Line 172, generation_inputs should be a dict by TranSirius in https://github.com/huggingface/transformers/pull/14546
* [trainer] conditional ctx managers into one wrapper by stas00 in https://github.com/huggingface/transformers/pull/14663
* Fixing Dataset for TQA + token-classification. by Narsil in https://github.com/huggingface/transformers/pull/14658
* fix deprecated tf method by ZOHETH in https://github.com/huggingface/transformers/pull/14671
* Fix doc builder by LysandreJik in https://github.com/huggingface/transformers/pull/14676
* [AutoProcessor] Add Wav2Vec2WithLM & small fix by patrickvonplaten in https://github.com/huggingface/transformers/pull/14675
* Added support for other features for already supported models by michaelbenayoun in https://github.com/huggingface/transformers/pull/14358
* Revert "Added support for other features for already supported models" by lewtun in https://github.com/huggingface/transformers/pull/14679
* Convert tutorials by sgugger in https://github.com/huggingface/transformers/pull/14665
* fix: verify jsonlines file in run_translation (14660) by GaurangTandon in https://github.com/huggingface/transformers/pull/14661
* Improvements to Comet Integration by DN6 in https://github.com/huggingface/transformers/pull/14680
* Fixes in init by sgugger in https://github.com/huggingface/transformers/pull/14681
* Revert open-in-colab and add perceiver by sgugger in https://github.com/huggingface/transformers/pull/14683
* Fix wrong checkpoint paths in doc examples by ydshieh in https://github.com/huggingface/transformers/pull/14685
* [bf16 support] tweaks by stas00 in https://github.com/huggingface/transformers/pull/14580
* [trainer] support UserDict inputs (torch-nightly) by stas00 in https://github.com/huggingface/transformers/pull/14688
* Move pyctcdecode by sgugger in https://github.com/huggingface/transformers/pull/14686
* Make MLuke tokenizer tests slow by sgugger in https://github.com/huggingface/transformers/pull/14690
* Fix doc examples: name '...' is not defined by ydshieh in https://github.com/huggingface/transformers/pull/14687
* Add a job to test doc building (for realsies this time) by sgugger in https://github.com/huggingface/transformers/pull/14662
* Fix Perceiver tests by NielsRogge in https://github.com/huggingface/transformers/pull/14703
* add str hub token to repository when provided else fallback to default by philschmid in https://github.com/huggingface/transformers/pull/14682
* Fix typo in toctree by mishig25 in https://github.com/huggingface/transformers/pull/14704
New Contributors
* hrxorxm made their first contribution in https://github.com/huggingface/transformers/pull/14211
* pdcoded made their first contribution in https://github.com/huggingface/transformers/pull/14219
* wmathor made their first contribution in https://github.com/huggingface/transformers/pull/14202
* wamartin-aml made their first contribution in https://github.com/huggingface/transformers/pull/14232
* lumliolum made their first contribution in https://github.com/huggingface/transformers/pull/14133
* dwyatte made their first contribution in https://github.com/huggingface/transformers/pull/13891
* dshirron made their first contribution in https://github.com/huggingface/transformers/pull/14241
* changwangss made their first contribution in https://github.com/huggingface/transformers/pull/14276
* xhlulu made their first contribution in https://github.com/huggingface/transformers/pull/14300
* Beomi made their first contribution in https://github.com/huggingface/transformers/pull/14287
* nbertagnolli made their first contribution in https://github.com/huggingface/transformers/pull/14325
* jeffra made their first contribution in https://github.com/huggingface/transformers/pull/14331
* RezaYazdaniAminabadi made their first contribution in https://github.com/huggingface/transformers/pull/14298
* echarlaix made their first contribution in https://github.com/huggingface/transformers/pull/14293
* valentindey made their first contribution in https://github.com/huggingface/transformers/pull/14413
* Zahlii made their first contribution in https://github.com/huggingface/transformers/pull/14415
* Helw150 made their first contribution in https://github.com/huggingface/transformers/pull/14437
* shangz-ai made their first contribution in https://github.com/huggingface/transformers/pull/14066
* vmaryasin made their first contribution in https://github.com/huggingface/transformers/pull/14508
* cfregly made their first contribution in https://github.com/huggingface/transformers/pull/14529
* Xargonus made their first contribution in https://github.com/huggingface/transformers/pull/14514
* rahuln made their first contribution in https://github.com/huggingface/transformers/pull/14477
* gsnidero made their first contribution in https://github.com/huggingface/transformers/pull/14168
* t-vi made their first contribution in https://github.com/huggingface/transformers/pull/14198
* JamesDeAntonis made their first contribution in https://github.com/huggingface/transformers/pull/13207
* yis11178 made their first contribution in https://github.com/huggingface/transformers/pull/14600
* nikvaessen made their first contribution in https://github.com/huggingface/transformers/pull/14525
* yuchenlin made their first contribution in https://github.com/huggingface/transformers/pull/14626
* Ryou0634 made their first contribution in https://github.com/huggingface/transformers/pull/14640
* NouamaneTazi made their first contribution in https://github.com/huggingface/transformers/pull/14618
* TranSirius made their first contribution in https://github.com/huggingface/transformers/pull/14546
* ZOHETH made their first contribution in https://github.com/huggingface/transformers/pull/14671
**Full Changelog**: https://github.com/huggingface/transformers/compare/v4.12.0...v4.13.0