Transformers

Latest version: v4.46.2

Page 9 of 30

4.31.0

Not secure
New models

Llama v2

Llama 2 was proposed in [Llama 2: Open Foundation and Fine-Tuned Chat Models](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/) by Hugo Touvron et al. It builds upon the Llama architecture, adding Grouped Query Attention for efficient inference.
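
As a rough illustration of the idea behind Grouped Query Attention (not the `transformers` implementation), several query heads share one key/value head, so fewer key/value tensors have to be cached during inference. The function name and the head counts below are assumptions chosen for the example.

```python
def kv_head_for_query_head(q_head: int, num_q_heads: int, num_kv_heads: int) -> int:
    """Return the index of the key/value head that a given query head reads from."""
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    group_size = num_q_heads // num_kv_heads  # query heads per shared key/value head
    return q_head // group_size


# Illustrative sizes only: 8 query heads sharing 2 key/value heads.
mapping = [kv_head_for_query_head(q, 8, 2) for q in range(8)]
print(mapping)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

With `num_kv_heads == num_q_heads` the mapping reduces to ordinary multi-head attention, and with `num_kv_heads == 1` every query head shares a single key/value head.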

* Add support for Llama 2 by ArthurZucker in 24891

Musicgen

The MusicGen model was proposed in the paper [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284) by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.

MusicGen is a single stage auto-regressive Transformer model capable of generating high-quality music samples conditioned on text descriptions or audio prompts. The text descriptions are passed through a frozen text encoder model to obtain a sequence of hidden-state representations. MusicGen is then trained to predict discrete audio tokens, or audio codes, conditioned on these hidden-states. These audio tokens are then decoded using an audio compression model, such as EnCodec, to recover the audio waveform.

Through an efficient token interleaving pattern, MusicGen does not require a self-supervised semantic representation of the text/audio prompts, thus eliminating the need to cascade multiple models to predict a set of codebooks (e.g. hierarchically or upsampling). Instead, it is able to generate all the codebooks in a single forward pass.
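
One interleaving scheme described in the MusicGen paper is a "delay" pattern, in which codebook k is shifted right by k steps so that each decoding step emits one token per codebook. The sketch below illustrates that layout in plain Python under this assumption. It is not the library's code, and `PAD` and `apply_delay_pattern` are names made up for the example.

```python
PAD = -1  # placeholder id for positions with no token (an assumption of this sketch)


def apply_delay_pattern(codes):
    """Shift codebook k right by k steps.

    `codes` holds K rows (one per codebook), each with T token ids.
    The result holds K rows of length T + K - 1.
    """
    num_codebooks = len(codes)
    num_frames = len(codes[0])
    total = num_frames + num_codebooks - 1
    delayed = []
    for k, row in enumerate(codes):
        # k pads in front, then the tokens, then pads to reach the common length
        delayed.append([PAD] * k + list(row) + [PAD] * (total - num_frames - k))
    return delayed


codes = [
    [10, 11, 12],  # codebook 0
    [20, 21, 22],  # codebook 1
    [30, 31, 32],  # codebook 2
]
for row in apply_delay_pattern(codes):
    print(row)
```

Reading the result column by column gives the tokens produced at each step: the first column contains only the first token of codebook 0, the second column adds codebook 1, and so on.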

* Add Musicgen by sanchit-gandhi in 24109

Bark

Bark is a transformer-based text-to-speech model proposed by Suno AI in [suno-ai/bark](https://github.com/suno-ai/bark).

* Add bark by ylacombe in 24086

MMS

The MMS model was proposed in [Scaling Speech Technology to 1,000+ Languages](https://arxiv.org/abs/2305.13516) by Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli.

* Add MMS CTC Fine-Tuning by patrickvonplaten in 24281

EnCodec

The EnCodec neural codec model was proposed in [High Fidelity Neural Audio Compression](https://arxiv.org/abs/2210.13438) by Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi.

* Add EnCodec model by hollance in 23655

InstructBLIP

The InstructBLIP model was proposed in [InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning](https://arxiv.org/abs/2305.06500) by Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, Steven Hoi. InstructBLIP leverages the [BLIP-2](https://huggingface.co/docs/transformers/main/en/model_doc/blip-2) architecture for visual instruction tuning.

* Add InstructBLIP by NielsRogge in 23460

Umt5

The UMT5 model was proposed in [UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining](https://openreview.net/forum?id=kXwdL1cWOAi) by Hyung Won Chung, Xavier Garcia, Adam Roberts, Yi Tay, Orhan Firat, Sharan Narang, Noah Constant.

* [`Umt5`] Add google's umt5 to `transformers` by ArthurZucker in 24477

MRA

The MRA model was proposed in [Multi Resolution Analysis (MRA) for Approximate Self-Attention](https://arxiv.org/abs/2207.10284) by Zhanpeng Zeng, Sourav Pal, Jeffery Kline, Glenn M Fung, and Vikas Singh.

* Add Multi Resolution Analysis (MRA) by novice03 in 24513

ViViT

The ViViT model was proposed in [ViViT: A Video Vision Transformer](https://arxiv.org/abs/2103.15691) by Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lučić, Cordelia Schmid. The paper proposes one of the first successful sets of pure-transformer-based models for video understanding.

* Add ViViT by jegork in 22518

Python 3.7

Python 3.7 reached end-of-life on June 27, 2023 and is no longer supported by the Python Software Foundation, so 4.30.x is the last version of `transformers` to support it.

* ⚠️ Time to say goodbye to py37 by ydshieh in 24091

4.30.2

Not secure
- Fix push to hub by NielsRogge in 24187
- Fix how we detect the TF package by Rocketknight1 in 24255

4.30.1

Not secure
- Fix bnb config json serialization in 24137 by younesbelkada
- Correctly build models and import call_context for older TF versions in 24138 by Rocketknight1
- Fix bugs with trainer in 24134 by pacman100

4.30.0

Not secure
100k

Transformers has just reached 100k stars on GitHub. To celebrate, we are highlighting 100 projects in the vicinity of `transformers` on a new [awesome-transformers](https://github.com/huggingface/transformers/blob/main/awesome-transformers.md) page.

We accept PRs to add projects to the list!

* Top 100 by LysandreJik in 22912
* Add LlamaIndex to awesome-transformers.md by ravi03071991 in 23484
* add cleanlab to awesome-transformers tools list by jwmueller in 23440

4-bit quantization and QLoRA

By leveraging the `bitsandbytes` library by TimDettmers, we add 4-bit support to `transformers` models!
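
The sketch below shows roughly how 4-bit loading can be requested and what it saves on weight storage. It assumes `bitsandbytes`, `accelerate`, and a CUDA GPU are available for the loading function. `estimate_weight_bytes`, `load_in_4bit`, and the 7-billion-parameter figure are illustrative assumptions, and the estimate ignores quantization constants and activations.

```python
def estimate_weight_bytes(num_params: int, bits_per_param: int) -> int:
    """Approximate storage for the weights alone."""
    return num_params * bits_per_param // 8


def load_in_4bit(model_id: str):
    """Load a causal LM with 4-bit weights.

    Imports are local so the estimate above runs without these packages.
    """
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",  # NormalFloat4 data type used by QLoRA
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=quant_config, device_map="auto"
    )


# Illustrative arithmetic for a hypothetical 7-billion-parameter model.
print(estimate_weight_bytes(7_000_000_000, 16))  # 16-bit weights: 14000000000 bytes
print(estimate_weight_bytes(7_000_000_000, 4))   # 4-bit weights: 3500000000 bytes
```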

* 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) by TimDettmers in 23479

Agents

The Agents framework has been improved and continues to be stabilized. Alongside bug fixes, the following important features were added:
- Local agent capabilities, to load a generative model directly from `transformers` instead of relying on APIs.
- Prompts are now hosted on the Hub, so anyone can fork them, update them with their own, and let other community contributors re-use them.
- We add an `AzureOpenAiAgent` class to support Azure OpenAI agents.

* Add local agent by sgugger in 23438
* Enable prompts on the Hub by sgugger in 23662
* Add AzureOpenAiAgent by sgugger in 24058

Safetensors

The `safetensors` library is a safe serialization framework for machine learning tensors. It has been audited and will become the default serialization framework for several organizations (Hugging Face, EleutherAI, Stability AI).

It has now become a core dependency of `transformers`.
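
To give a feel for why the format is considered safe, here is a simplified sketch of the file layout as documented by the `safetensors` project: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. Reading it only parses JSON and slices bytes, with no code execution. The helper names are made up, and the real library handles details (such as alignment and validation) that are skipped here.

```python
import json
import struct


def write_safetensors_like(tensors: dict) -> bytes:
    """Serialize {name: (dtype, shape, raw_bytes)} as header length + JSON header + data."""
    header = {}
    payload = b""
    for name, (dtype, shape, raw) in tensors.items():
        start = len(payload)
        payload += raw
        # Offsets are relative to the start of the data section.
        header[name] = {"dtype": dtype, "shape": shape, "data_offsets": [start, len(payload)]}
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_header(blob: bytes) -> dict:
    """Parse only the JSON header; nothing in the file is executed."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8 : 8 + header_len].decode("utf-8"))


# Two float32 values (1.0 and 2.0) stored as a tensor named "w".
raw = struct.pack("<2f", 1.0, 2.0)
blob = write_safetensors_like({"w": ("F32", [2], raw)})
print(read_header(blob))
```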

* Making `safetensors` a core dependency. by Narsil in 23254

New models

Swiftformer

The SwiftFormer paper introduces a novel efficient additive attention mechanism that effectively replaces the quadratic matrix multiplication operations in the self-attention computation with linear element-wise multiplications. A series of models called ‘SwiftFormer’ is built based on this, which achieves state-of-the-art performance in terms of both accuracy and mobile inference speed. Even their small variant achieves 78.5% top-1 ImageNet1K accuracy with only 0.8 ms latency on iPhone 14, which is more accurate and 2× faster compared to MobileViT-v2.

* Add swiftformer by shehanmunasinghe in 22686

Autoformer

This model augments the Transformer as a deep decomposition architecture, which can progressively decompose the trend and seasonal components during the forecasting process.
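
The Autoformer paper describes the decomposition step as a moving average that extracts the trend, with the remainder treated as the seasonal part. The plain-Python sketch below illustrates that single step under this assumption. The function name and window size are chosen for the example, and this is not the model code.

```python
def decompose(series, window):
    """Split a series into a moving-average trend and the remaining seasonal part."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the average is centered")
    half = window // 2
    # Repeat the edge values so the trend has the same length as the input.
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    trend = [sum(padded[i : i + window]) / window for i in range(len(series))]
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal


series = [1.0, 2.0, 3.0, 4.0, 5.0]
trend, seasonal = decompose(series, window=3)
print(trend)
print(seasonal)
```

By construction, adding `trend` and `seasonal` element by element gives back the original series.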

* [Time-Series] Autoformer model by elisim in 21891

MobileViTv2

MobileViTV2 is the second version of MobileViT, constructed by replacing the multi-headed self-attention in MobileViT with separable self-attention.

* Add MobileViTv2 by shehanmunasinghe in 22820

PerSAM

PerSAM proposes a minimal modification to [SAM](https://huggingface.co/docs/transformers/model_doc/sam) to allow DreamBooth-like personalization, making it possible to segment concepts in new images using just one example.

* Add PerSAM [bis] by NielsRogge in 23659

Timm backbone

We add support for loading `timm` weights within the `AutoBackbone` API in `transformers`. `timm` models can be instantiated through the `TimmBackbone` class, and then used with any vision model that needs a backbone.

* Add TimmBackbone model by amyeroberts in 22619

Image to text pipeline conditional support

We add conditional text generation to the image-to-text pipeline, allowing the model to continue an initial text prompt according to an image.

* [image-to-text pipeline] Add conditional text support + GIT by NielsRogge in 23362

TensorFlow implementations

* Add TensorFlow implementation of EfficientFormer by D-Roberts in 22620

Accelerate Migration

A major rework of the internals of the `Trainer` is underway: it now leverages `accelerate` rather than redefining the same functionality in `transformers`. This should unify both frameworks and lead to increased interoperability and more efficient development.

* Smangrul/accelerate mp integrate by pacman100 in 23148
* Smangrul/accelerate ddp integrate by pacman100 in 23151
* fix trainer slow tests related to hyperparam search by pacman100 in 24011
* remove the extra `accelerator.prepare` by pacman100 in 23914
* move fsdp handling to accelerate by pacman100 in 23158
* shift torch dynamo handling to accelerate by pacman100 in 23168
* accelerate deepspeed and gradient accumulation integrate by pacman100 in 23236
* fix executable batch size issue by pacman100 in 24067
* fix accelerator prepare during eval only mode by pacman100 in 24014
* reset accelerate env variables after each test by pacman100 in 24107
* Fix translation no_trainer by muellerzr in 23407
* Update error message when Accelerate isn't installed by muellerzr in 23373
* Fix parallel mode check by muellerzr in 23409
* Muellerzr fix deepspeed by muellerzr in 23657
* Update all no_trainer with skip_first_batches by muellerzr in 23664
* Fix sagemaker DP/MP by muellerzr in 23681
* Log the right train_batch_size if using auto_find_batch_size and also log the adjusted value seperately. by muellerzr in 23800
* Up pinned accelerate version by muellerzr in 24089
* Move import check to before state reset by muellerzr in 23906
* Upgrade safetensors version by muellerzr in 23911
* Act on deprecations in Accelerate no_trainer examples by muellerzr in 24053
* Oops, missed one by muellerzr in 24054

Bugfixes and improvements

* chore: allow protobuf 3.20.3 requirement by jose-turintech in 22759
* Fix link displayed for custom tools by sgugger in 23274
* Remove missplaced test file by sgugger in 23275
* Bring back the PR `Refactor doctests + add CI` to `main` by ydshieh in 23271
* [`gpt`] Gpt2 fix half precision causal mask by younesbelkada in 23256
* Temporary tolerance fix for flaky whipser PT-TF equiv. test by amyeroberts in 23257
* Add `top_k` argument to post-process of conditional/deformable-DETR by CreatlV in 22787
* `transformers-cli` -> `huggingface-cli` by AlpinDale in 23276
* Temporarily increase tol for PT-FLAX whisper tests by amyeroberts in 23288
* Added missing " in CHAT_PROMPT_TEMPLATE by galatolofederico in 23287
* Update custom_tools.mdx: fix link by mishig25 in 23292
* Update transformers_agents.mdx by mishig25 in 23289
* Convert numpy arrays to lists before saving the evaluation metrics as json by harisankar95 in 23268
* Fix doctest files fetch issue by ydshieh in 23277
* skip `test_run_squad_no_trainer` for now by ydshieh in 23302
* Better check for packages availability by apbard in 23163
* Add gradient_checkpointing parameter to FlaxWhisperEncoder by raghavanone in 23300
* Agents extras by LysandreJik in 23301
* Fix broken links in the agent docs by sgugger in 23297
* Fix typo in gradio-tools docs by freddyaboulton in 23305
* Fix image segmentation tool test by sgugger in 23306
* unpin tf prob by ydshieh in 23293
* Revert "search buffers for dtype" by sgugger in 23308
* Remove `LanguageIdentificationTool` in `__init__.py` as we don't have it yet by ydshieh in 23326
* Fix docker image (caused by `tensorflow_text`) by ydshieh in 23321
* Compute the mask in-place, with less memory reads, and on CUDA on `XLNetLMHeadModel` by lezcano in 23332
* Only add files with modification outside doc blocks by ydshieh in 23327
* [docs] Fix Agents and Tools docstring by stevhliu in 23313
* OR am I crazy? by hwuebben in 23295
* Handle padding warning in generation when using `inputs_embeds` by zrthxn in 23131
* replaced assert with raise ValueError for t5, switch_transformers, pix2struct, mt5, longt5, gptsan_japanese. by susnato in 23273
* Use cu118 with cudnn >= 8.6 in docker file by ydshieh in 23339
* Removing one of the twice defined position_embeddings in LongFormer by GregorySenay in 23343
* Fix issue introduced in PR 23163 by ydshieh in 23363
* Typo suggestion by richardachen in 23360
* Fix some `is_xxx_available` by ydshieh in 23365
* Fix `BigBirdForMaskedLM` doctest by ydshieh in 23369
* Fix `OwlViTForObjectDetection.image_guided_detection` doc example by ydshieh in 23370
* Revert "Only add files with modification outside doc blocks" by ydshieh in 23371
* [Bugfix] `OPTDecoderLayer` does not return attentions when `gradient_checkpointing` and `training` is enabled. by gmlwns2000 in 23367
* Skip failing `AlignModelTest::test_multi_gpu_data_parallel_forward` by ydshieh in 23374
* Fix test typos - audio feature extractors by LWprogramming in 23310
* Added type hints for `Graphormer` pytorch version by dewasahu2003 in 23073
* Replace NumPy Operations with JAX NumPy Equivalents for JIT Compilation Compatibility by gojiteji in 23356
* Use `mkstemp` to replace deprecated `mktemp` by ready-research in 23372
* Fix `RwkvModel` by ydshieh in 23392
* Update `test_batched_inference_image_captioning_conditioned` by ydshieh in 23391
* OPT/BioGPT: Improved attention mask shape exception by gante in 23270
* Fix chat prompt in HFAgent by IvanSedykh in 23335
* 🌐 [i18n-KO] Translated `asr.mdx` to Korean by sim-so in 23106
* Minor fixes in transformers-tools by Wauplin in 23364
* [`Pix2Struct`] Add conditional generation on docstring example by younesbelkada in 23399
* Generate: faster `can_generate` check on TF and Flax by gante in 23398
* [AutoModel] fix `torch_dtype=auto` in `from_pretrained` by stas00 in 23379
* Docs: add link to assisted generation blog post by gante in 23397
* Build with non Python files by sgugger in 23405
* Generate: add test to check KV format by gante in 23403
* Replace appends with list comprehension. by ttsugriy in 23359
* Fix smdistributed check by sgugger in 23414
* Why crash the whole run when HFHub gives a 50x error? by ropoctl in 23320
* Run doctest (in PRs) only when some doc example(s) are modified by ydshieh in 23387
* Update `ConvNextV2ModelIntegrationTest::test_inference_image_classification_head` by ydshieh in 23402
* Fix a typo in HfAgent docstring. by ttsugriy in 23420
* Use dict.items to avoid unnecessary lookups. by ttsugriy in 23415
* Update 3 docker files to use cu118 by ydshieh in 23406
* [`SAM`] fix sam slow test by younesbelkada in 23376
* Return early once stop token is found. by ttsugriy in 23421
* [Reland] search model buffers for dtype as the last resort by cyyever in 23319
* Add Missing tokenization test [electra] by IMvision12 in 22997
* Small fixes and link in the README by LysandreJik in 23428
* TF: embeddings out of bounds check factored into function by gante in 23427
* Update Bigbird Pegasus tests by ydshieh in 23431
* Encoder-Decoder: add informative exception when the decoder is not compatible by gante in 23426
* Remove hardcoded prints in Trainer by hugoabonizio in 23432
* Fix device issue in `SwiftFormerModelIntegrationTest::test_inference_image_classification_head` by ydshieh in 23435
* Generate: skip left-padding tests on old models by gante in 23437
* remove unnecessary print in gpt neox sequence classifier by cfhammill in 23433
* 🌐 [i18n-KO] Translated `tasks/zero_shot_object_detection.mdx` to Korean by HanNayeoniee in 23430
* Fix (skip) a pipeline test for `RwkvModel` by ydshieh in 23444
* Fix DecisionTransformerConfig doctring by joaoareis in 23450
* TF: GPT2 with native embedding layers by gante in 23436
* Make `RwkvModel` accept `attention_mask` but discard it internally by ydshieh in 23442
* Less flaky `test_assisted_decoding_matches_greedy_search` by ydshieh in 23451
* Update tiny models and pipeline tests by ydshieh in 23446
* Properly guard PyTorch stuff by sgugger in 23452
* Add an option to log result from the Agent by sgugger in 23454
* Clean up CUDA kernels by sgugger in 23455
* fix bug in group_texts function, that was inserting short batches by BodaSadalla98 in 23429
* feat: Whisper prompting by connor-henderson in 22496
* README: Fix affiliation for MEGA by julien-c in 23394
* Remove .data usages in optimizations.py by alanwaketan in 23417
* TF port of the Segment Anything Model (SAM) by Rocketknight1 in 22970
* [`RWKV`] Rwkv fix for 8bit inference by younesbelkada in 23468
* Use config to set name and description if not present by sgugger in 23473
* Fix `transformers`' DeepSpeed CI job by ydshieh in 23463
* Fix PretrainedConfig `min_length` docstring by joaoareis in 23471
* Fix: Change tensors to integers for torch.dynamo and torch.compile compatibility by loevlie in 23475
* [`Blip`] Remove redundant shift right by younesbelkada in 23153
* Fix DeepSpeed stuff in the nightly CI by ydshieh in 23478
* Fix confusing `transformers` installation in CI by ydshieh in 23465
* Fix `tests/repo_utils/test_get_test_info.py` by ydshieh in 23485
* Debug example code for MegaForCausalLM by Tylersuard in 23382
* Remove erroneous `img` closing tag by xenova in 23646
* Fix tensor device while attention_mask is not None by zspo in 23538
* Fix accelerate logger bug by younesbelkada in 23650
* Bugfix: LLaMA layer norm incorrectly changes input type and consumers lots of memory by TimDettmers in 23535
* Fix wav2vec2 is_batched check to include 2-D numpy arrays by LWprogramming in 23223
* changing the requirements to a cpu torch version that works by sshahrokhi in 23483
* Fix SAM tests and use smaller checkpoints by Rocketknight1 in 23656
* Update workflow files by ydshieh in 23658
* small fix to remove unused eos in processor when it's not used. by Narsil in 23408
* Fix typo in a parameter name for open llama model by aaalexlit in 23637
* Fix PyTorch SAM tests by ydshieh in 23682
* 🌐 [i18n-KO] Translated `tasks/monocular_depth_estimation.mdx` to Korean by HanNayeoniee in 23621
* Fix a `BridgeTower` test by ydshieh in 23694
* [`SAM`] Fixes pipeline and adds a dummy pipeline test by younesbelkada in 23684
* TF version compatibility fixes by Rocketknight1 in 23663
* [`Blip`] Fix blip doctest by younesbelkada in 23698
* is_batched fix for remaining 2-D numpy arrays by LWprogramming in 23309
* Skip `TFCvtModelTest::test_keras_fit_mixed_precision` for now by ydshieh in 23699
* fix: load_best_model_at_end error when load_in_8bit is True by dkqkxx in 23443
* Fix some docs what layerdrop does by zspo in 23691
* add GPTJ/bloom/llama/opt into model list and enhance the jit support by sywangyi in 23291
* Paged Optimizer + Lion Optimizer for Trainer by TimDettmers in 23217
* Export to ONNX doc refocused on using optimum, added tflite by MKhalusova in 23434
* fix: use bool instead of uint8/byte in Deberta/DebertaV2/SEW-D to make it compatible with TensorRT by uchuhimo in 23683
* fix gptj could not jit.trace in GPU by sywangyi in 23317
* Better TF docstring types by Rocketknight1 in 23477
* Minor awesome-transformers.md fixes by pagarsky in 23453
* TF SAM memory reduction by Rocketknight1 in 23732
* fix: delete duplicate sentences in `document_question_answering.mdx` by jungnerd in 23735
* fix: Whisper generate, move text_prompt_ids trim up for max_new_tokens calculation by connor-henderson in 23724
* Overhaul TF serving signatures + dummy inputs by Rocketknight1 in 23234
* [Whisper] Reduce batch size in tests by sanchit-gandhi in 23736
* Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types by dakinggg in 23725
* Remove the last few TF serving sigs by Rocketknight1 in 23738
* Fix `pip install --upgrade accelerate` command in modeling_utils.py by tloen in 23747
* Fix psuh_to_hub in Trainer when nothing needs pushing by sgugger in 23751
* Revamp test selection for the example tests by sgugger in 23737
* [LongFormer] code nits, removed unused parameters by ArthurZucker in 23749
* Fix is_ninja_available() by niltok in 23752
* [`Nllb-Moe`] Fix nllb moe accelerate issue by younesbelkada in 23758
* [OPT] Doc nit, using fast is fine by ArthurZucker in 23789
* Fix RWKV backward on GPU by sgugger in 23774
* Update trainer.mdx class_weights example by amitportnoy in 23787
* no_cuda does not take effect in non distributed environment by sywangyi in 23795
* Fix no such file or directory error by RissyRan in 23783
* Enable code-specific revision for code on the Hub by sgugger in 23799
* add type hint in pipeline model argument by y3sar in 23740
* TF SAM shape flexibility fixes by Rocketknight1 in 23842
* fix Whisper tests on GPU by hollance in 23753
* 🌐 [i18n-KO] Translated `fast_tokenizers.mdx` to Korean by KIHOON71 in 22956
* [i18n-KO] Translated video_classification.mdx to Korean by KIHOON71 in 23026
* 🌐 [i18n-KO] Translated `troubleshooting.mdx` to Korean by 0525hhgus in 23166
* Adds a FlyteCallback by peridotml in 23759
* Update collating_graphormer.py by clefourrier in 23862
* [LlamaTokenizerFast] nit update `post_processor` on the fly by ArthurZucker in 23855
* 23388 Issue: Update RoBERTa configuration by vijethmoudgalya in 23863
* [from_pretrained] imporve the error message when `_no_split_modules` is not defined by ArthurZucker in 23861
* Editing issue with pickle def with lambda function by Natyren in 23869
* Adds AutoProcessor.from_pretrained support for MCTCTProcessor by Ubadub in 23856
* 🌐 [i18n-KO] Translated `pad_truncation.mdx` to Korean by sim-so in 23823
* Fix bug leading to missing token in GPTSanJapaneseTokenizer by passaglia in 23883
* Fix last instances of kbit -> quantized by sgugger in 23797
* fix(configuration_llama): add `keys_to_ignore_at_inference` to `LlamaConfig` by calico-1226 in 23891
* Fix Trainer when model is loaded on a different GPU by sgugger in 23792
* Support shared tensors by thomasw21 in 23871
* ensure banned_mask and indices in same device by cauyxy in 23901
* Unpin numba by sanchit-gandhi in 23162
* [`bnb`] add warning when no linear by younesbelkada in 23894
* fix: Replace `add_prefix_space` in `get_prompt_ids` with manual space for FastTokenizer compatibility by connor-henderson in 23796
* [`RWKV`] Fix RWKV 4bit by younesbelkada in 23910
* add conditional statement for auxiliary loss calculation by harisankar95 in 23899
* Raise error if loss can't be calculated - ViT MIM by amyeroberts in 23872
* Empty circleci config by sgugger in 23913
* Bug fix - flip_channel_order for channels first images by amyeroberts in 23701
* Re-enable squad test by sgugger in 23912
* Update the update metadata job to use upload_folder by sgugger in 23917
* [PushToHub] Make it possible to upload folders by NielsRogge in 23920
* Skip device placement for past key values in decoder models by sgugger in 23919
* [Flax Whisper] Update decode docstring by sanchit-gandhi in 23908
* Effectively allow `encoder_outputs` input to be a tuple in pix2struct by fxmarty in 23932
* Fix doc string nits by sheonhan in 23929
* Pin rhoknp by sgugger in 23937
* rename DocumentQuestionAnsweringTool parameter input to match docstring by Adam-D-Lewis in 23939
* Update stale.yml to use HuggingFaceBot by LysandreJik in 23941
* Make TF ESM inv_freq non-trainable like PyTorch by Rocketknight1 in 23940
* Revert "Update stale.yml to use HuggingFaceBot" by LysandreJik in 23943
* 23675 Registering Malay language by soongbren in 23689
* Modify device_map behavior when loading a model using from_pretrained by SunMarc in 23922
* use _make_causal_mask in clip/vit models by kashif in 23942
* Fix `ReduceLROnPlateau` object has no attribute 'get_last_lr' by wasupandceacar in 23944
* [MMS] Scaling Speech Technology to 1,000+ Languages | Add attention adapter to Wav2Vec2 by patrickvonplaten in 23813
* add new mms functions to doc by patrickvonplaten in 23954
* 🌐 [i18n-KO] Translated object_detection.mdx to Korean by KIHOON71 in 23164
* Trainer: fixed evaluate raising `KeyError` for ReduceLROnPlateau by claudius-kienle in 23952
* [Whisper Tokenizer] Skip special tokens when decoding with timestamps by sanchit-gandhi in 23945
* Add an option to reduce compile() console spam by Rocketknight1 in 23938
* Added time-series blogs to the models by elisim in 23857
* Fix typo in doc comment of BitsAndBytesConfig by ledyba in 23978
* Skip `test_multi_gpu_data_parallel_forward` for `MobileViTV2ModelTest` by ydshieh in 24017
* Update README.md by ydshieh in 24022
* Auto tokenizer registration by Bearnardd in 23965
* expose safe_serialization argument in the pipeline API by yessenzhar in 23775
* Pix2Struct: fix wrong broadcast axis of attention mask in visual encoder by affjljoo3581 in 23976
* TensorBoard callback no longer adds hparams by bri25yu in 23999
* 🌐 [i18n-KO] Translated `tasks_explained.mdx` to Korean by 0525hhgus in 23844
* Fix `MobileViTV2` checkpoint name by ydshieh in 24018
* Pin `deepspeed` to `0.9.2` for now by ydshieh in 24024
* 🌐 [i18n-KO] Translated `language-modeling.mdx` by wonhyeongseo in 23969
* 🌐 [i18n-KO] Translated `bertology.mdx` to Korean by wonhyeongseo in 23968
* Add check for tied parameters by SunMarc in 24029
* Fixing single candidate_label return. by Narsil in 24023
* Use TruncatedNormal from Keras initializers by hvaara in 24036
* Prevent ZeroDivisionError on `trainer.evaluate` if model and dataset are tiny by tomaarsen in 24049
* Modification of one text example file should trigger said test by sgugger in 24051
* Tiny fix for `check_self_hosted_runner.py` by ydshieh in 24052
* Reduce memory usage in TF building by Rocketknight1 in 24046
* Move TF building to an actual build() method by Rocketknight1 in 23760
* Use new parametrization based weight norm if available by ezyang in 24030
* bring back `filtered_test_list_cross_tests.txt` by ydshieh in 24055
* Fix device placement for model-parallelism in generate for encoder/de… by sgugger in 24025
* Remote code improvements by sgugger in 23959
* Generate: increase left-padding test atol by gante in 23448
* [Wav2Vec2] Fix torch srcipt by patrickvonplaten in 24062
* Add support for non-rust implemented tokenization for `__getitem__` method. by jacklanda in 24039
* Support PEFT models when saving the model using trainer by younesbelkada in 24073
* [`Hub`] Add `safe_serialization` in push_to_hub by younesbelkada in 24074
* Fix `is_optimum_neuron_available` by michaelbenayoun in 23961
* [`bnb`] Fix bnb skip modules by younesbelkada in 24043
* Be nice to TF by ydshieh in 24076
* Make the TF dummies even smaller by Rocketknight1 in 24071
* [doc build] Use secrets by mishig25 in 24079
* Fix expected value in tests of the test fetcher by sgugger in 24077
* Update delete_doc_comment_trigger.yml by mishig25 in 24084
* Do not prepare lr scheduler as it as the right number of steps by sgugger in 24088
* Fix a tiny typo in `WhisperForConditionalGeneration::generate` docstring by sadra-barikbin in 24045
* [`Trainer`] Correct behavior of `_load_best_model` for PEFT models by younesbelkada in 24103

Significant community contributions

The following contributors have made significant changes to the library over the last release:

* shehanmunasinghe
  * Add swiftformer (22686)
  * Add MobileViTv2 (22820)
* TimDettmers
  * Bugfix: LLaMA layer norm incorrectly changes input type and consumers lots of memory (23535)
  * 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (23479)
  * Paged Optimizer + Lion Optimizer for Trainer (23217)
* elisim
  * [Time-Series] Autoformer model (21891)
  * Added time-series blogs to the models (23857)
* KIHOON71
  * 🌐 [i18n-KO] Translated `fast_tokenizers.mdx` to Korean (22956)
  * [i18n-KO] Translated video_classification.mdx to Korean (23026)
  * 🌐 [i18n-KO] Translated object_detection.mdx to Korean (23164)
* D-Roberts
  * Add TensorFlow implementation of EfficientFormer (22620)
* soongbren
  * 23675 Registering Malay language (23689)

4.29.2

Not secure
Fixes the package so non-Python files (like CUDA kernels) are properly included.

4.29.1

Not secure
Reverts a regression in the FSDP integration.
Adds `pip install "transformers[agents]"` to install all the dependencies that agents rely on.
Fixes the documentation about agents.

* Revert "search buffers for dtype" in 23308 by sgugger
* Fix image segmentation tool test in 23306 by sgugger
* Fix typo in gradio-tools docs in 23305 by freddyaboulton
* Fix broken links in the agent docs in 23297 by sgugger
* Agents extras in 23301 by LysandreJik
* Update transformers_agents.mdx in 23289 by mishig25
* Update custom_tools.mdx: fix link in 23292 by mishig25

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.