pre-trained on 1T tokens of text and code data.
<img width="400" alt="zamba" src="https://github.com/user-attachments/assets/a86428b8-4d24-4e5a-bf78-222312693bb2">
* Add Zamba by pglorio in 30950
GLM
The GLM Model was proposed in ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools by GLM Team,
THUDM & ZhipuAI.
The abstract from the paper starts with the following:
We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This
report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B.
![image](https://github.com/user-attachments/assets/2bf08b7e-b352-440e-99a0-ddbe90cb7285)
* add Glm by Cyrilvallez in 33823
Idefics 3
The Idefics3 model was proposed in Building and better understanding vision-language models: insights and future directions by Hugo Laurençon, Andrés Marafioti, Victor Sanh, and Léo Tronchon.
Idefics3 is an adaptation of the Idefics2 model with three main differences:
- It uses Llama3 for the text model.
- It uses an updated processing logic for the images.
- It removes the perceiver.
![image](https://github.com/user-attachments/assets/0804f078-31c0-48b4-8641-ce2166d7efbc)
* Add Idefics 3! by andimarafioti in 32473
PhiMoE
The PhiMoE model was proposed in Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone by Microsoft.
This model is very similar to Mixtral; the main difference is `Phi3LongRoPEScaledRotaryEmbedding`, which is
used to extend the context of the rotary embeddings. The query, key, and value projections are fused, and the MLP's up and gate
projection layers are also fused.
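To make "fused" concrete, here is a minimal, framework-free sketch. It is not the PhiMoE implementation: the helper `matvec`, the toy hidden size, and the random weights are all assumptions chosen for illustration. It only shows that stacking the query, key, and value weight matrices into one matrix and slicing the result gives the same vectors as three separate projections, with a single matrix multiply instead of three.

```python
import random


def matvec(weight, x):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def random_matrix(rows, cols, rng):
    """Build a small random matrix (illustrative weights only)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
hidden_size = 4  # toy size, not a real model dimension
x = [rng.uniform(-1.0, 1.0) for _ in range(hidden_size)]

# Three separate projections, as in an unfused attention layer.
w_q = random_matrix(hidden_size, hidden_size, rng)
w_k = random_matrix(hidden_size, hidden_size, rng)
w_v = random_matrix(hidden_size, hidden_size, rng)

# Fused projection: stack the rows into one (3 * hidden_size) x hidden_size matrix.
w_qkv = w_q + w_k + w_v
qkv = matvec(w_qkv, x)

# Slice the fused output back into query, key, and value.
q = qkv[:hidden_size]
k = qkv[hidden_size:2 * hidden_size]
v = qkv[2 * hidden_size:]

fused_matches_separate = (
    q == matvec(w_q, x) and k == matvec(w_k, x) and v == matvec(w_v, x)
)
print(fused_matches_separate)
```

The same idea applies to the fused up and gate projections of the MLP: one larger weight matrix, one multiply, then a split of the output.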
![image](https://github.com/user-attachments/assets/a7af05d5-9b20-44f7-ab91-b6d3c490fd7b)
* PhiMoE by garg-amit in 33363
Watermarking
This release adds [SynthID](https://www.nature.com/articles/s41586-024-08025-4), a novel state-of-the-art watermarking technique by Google DeepMind. SynthID has a low generation-time computational cost and can be configured to be nearly imperceptible (at the cost of making the watermark harder to detect). The release also includes the code to train and run the corresponding detector, which is itself a machine learning model.
```py
from transformers import AutoModelForCausalLM, AutoTokenizer, SynthIDTextWatermarkingConfig

tokenizer = AutoTokenizer.from_pretrained('google/gemma-2-2b', padding_side="left")
model = AutoModelForCausalLM.from_pretrained('google/gemma-2-2b')

# SynthID Text configuration
watermarking_config = SynthIDTextWatermarkingConfig(
    keys=[654, 400, 836, 123, 340, 443, 597, 160, 57],
    ngram_len=5,
)

# Generation with watermarking
tokenized_prompts = tokenizer(["Once upon a time, "], return_tensors="pt", padding=True)
output_sequences = model.generate(
    **tokenized_prompts, watermarking_config=watermarking_config, do_sample=True, max_new_tokens=10
)
watermarked_text = tokenizer.batch_decode(output_sequences, skip_special_tokens=True)
print(watermarked_text)
```
Docs for applying SynthID watermarking: https://huggingface.co/docs/transformers/internal/generation_utils#transformers.SynthIDTextWatermarkLogitsProcessor
Docs for detecting SynthID watermarking: https://huggingface.co/docs/transformers/internal/generation_utils#transformers.SynthIDTextWatermarkDetector
<img width="750" alt="how-synthid-works-high-level" src="https://github.com/user-attachments/assets/c5702b21-e7e6-490d-8fe6-b73783e78e6b">
* Add SynthID (watermerking by Google DeepMind) by gante in 34350
Quantization
BitNet
[BitNet](https://arxiv.org/abs/2402.17764) is an architecture introduced by Microsoft Research that uses extreme quantization, representing each parameter with only three values: -1, 0, and 1. This results in a model that uses just 1.58 bits per parameter, significantly reducing computational and memory requirements. It replaces traditional Linear layers in Multi-Head Attention and Feed-Forward Networks with specialized layers called BitLinears that use ternary precision (or even binary, in the initial version).
![image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/1.58llm_extreme_quantization/bitlinear.png)
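As a rough illustration of where the three values and the 1.58 figure come from, the sketch below quantizes a small weight matrix to {-1, 0, 1} using a mean-absolute-value scale, which is our reading of the scheme described in the linked paper. The function name `ternary_quantize`, the `eps` default, and the toy weights are assumptions for this example; it is not the code path used by the quantizer added in this release.

```python
import math


def ternary_quantize(weights, eps=1e-5):
    """Scale by the mean absolute weight, round, and clip to {-1, 0, 1}."""
    flat = [w for row in weights for w in row]
    scale = sum(abs(w) for w in flat) / len(flat) + eps
    quantized = [
        [max(-1, min(1, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, scale


# Toy weights, chosen only to show all three output values.
weights = [[0.9, -0.05, -1.3], [0.2, 0.6, -0.4]]
quantized, scale = ternary_quantize(weights)
print(quantized)  # [[1, 0, -1], [0, 1, -1]]

# Three possible values carry log2(3) bits of information per parameter.
bits_per_param = math.log2(3)
print(round(bits_per_param, 2))  # 1.58
```

Multiplying the ternary matrix by `scale` gives a coarse approximation of the original weights, which is why the scale is kept alongside the quantized values.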
* FEAT : Adding BitNet quantization method to HFQuantizer by MekkCyber in 33410
GGUF loading in transformers
More architectures are now supported in our GGUF loader; GGUF files saved with these architectures can now
be loaded directly in transformers to be fine-tuned. We recommend using tooling from llama.cpp to requantize
the models after further training.
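A minimal sketch of loading such a file is shown below. It relies on the `gguf_file` argument of `from_pretrained`; the helper name `load_gguf_checkpoint` is ours, and the repository and file names in the usage comment are placeholders rather than real checkpoints. Running it also assumes the `gguf` Python package is installed.

```python
def load_gguf_checkpoint(repo_id: str, gguf_file: str):
    """Load a tokenizer and a causal LM from a GGUF file in a Hub repo.

    The returned model is a regular transformers model, so it can be
    fine-tuned like any other checkpoint.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
    model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
    return tokenizer, model


# Hypothetical usage; replace both placeholders with a real repo and file name:
# tokenizer, model = load_gguf_checkpoint("<user>/<repo>-GGUF", "<model>.gguf")
```

After fine-tuning, the model can be saved with `save_pretrained` and converted back to GGUF with the llama.cpp tooling mentioned above.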
* Add gguf support for bloom by VladOS95-cyber in 33473
* Add falcon gguf by g-prz in 33437
* Add gguf support for StableLM by VladOS95-cyber in 33793
* Add gguf support for gpt2 by VladOS95-cyber in 34044
* Add GGUF for starcoder2 by VladOS95-cyber in 34094
Notable improvements and additions
Pipeline API synchronisation
We are pushing for a unified inference API across multiple libraries. As part of this, we are cleaning up the input and output signatures for our pipeline classes and deprecating some rarely-used arguments. This is still a work-in-progress, but when it's finished, `transformers` pipelines should exactly match workflows in deployment libraries like [transformers.js](https://github.com/huggingface/transformers.js) or [TGI](https://huggingface.co/docs/text-generation-inference/en/index), allowing you to seamlessly move from development to production.
* Sync video classification pipeline with huggingface_hub spec by Rocketknight1 in 34288
* Image pipelines spec compliance by Rocketknight1 in 33899
* Make ASR pipeline compliant with Hub spec + add tests by Rocketknight1 in 33769
* Cleanup return_text and return_full_text options in TextGenerationPipeline by Rocketknight1 in 33542
* Make audio classification pipeline spec-compliant and add test by Rocketknight1 in 33730
* Sync QuestionAnsweringPipeline by Rocketknight1 in 34039
Also, pipelines now fully support the `Processor` class, used by vision-language models. Expect full pipeline support for chatting with VLMs in the very near future!
* Make `pipeline` able to load `processor` by qubvel in 32514
ExecuTorch compatibility
[ExecuTorch](https://github.com/pytorch/executorch) is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch ecosystem and supports the deployment of PyTorch models with a focus on portability, productivity, and performance.
We are collaborating with the ExecuTorch team so that 🤗 Transformers models can be exported using `torch.export`. The goal of this integration is not only to enable export but also to ensure that the exported artifact can be further lowered and optimized to run efficiently in ExecuTorch, particularly for mobile and edge use cases.
<img width="750" alt="how-executorch-works-high-level" src="https://github.com/user-attachments/assets/e353f9c9-b3e8-4172-86e0-c9b0b1bdd17a">
* Generate using exported model and enable gemma2-2b in ExecuTorch by guangy10 in 33707
* Qwen2.5 is ExecuTorch Compatible by guangy10 in 34102
* Olmo is ExecuTorch Compatible by guangy10 in 34181
* Llama3 and Llama2 are ExecuTorch compatible by guangy10 in 34101
Gradient accumulation bugfix
* Fix Gradient Accumulation issue by ArthurZucker in 34191
* Enable users to use their own loss functions + deal with prefetching for grad accum by muellerzr in 34198
* Enable Gradient Accumulation fix across all models + trainer fully in forward() by muellerzr in 34283
Bugfixes and improvements
* adding positional encoder changes and tests by manuelsh in 32600
* Uniformize kwargs for chameleon processor by leloykun in 32181
* [`MllamaProcessor`] Update errors and API with multiple image by ArthurZucker in 33715
* fix: use correct var names for check_tokenizers script by niqodea in 33702
* Fix docs and docstrings Omdet-Turbo by yonigozlan in 33726
* Fix position embeddings singular/plural by molbap in 33678
* Generate: `can_generate()` recursive check by gante in 33718
* clean_up_tokenization_spaces=False if unset by itazap in 31938
* fix: add docstring for `image_size` in Convnextv2 config by lucianosrp in 33734
* Fix modular model converter unable to generate Processor classes by tonywu71 in 33737
* fix trainer tr_loss add error by Wang-Xiaodong1899 in 33651
* Update Albumentations Versions by vasqu in 33704
* Doc and config mismatch for DeBERTa by fkrasnov2 in 33713
* [`clean_up_tokenization_spaces`] Pl bart was failing, updating by ArthurZucker in 33735
* [`MllamaImageProcessing`] Update doc by ArthurZucker in 33747
* Make siglip examples clearer and error free by jbn in 33667
* Paligemma support for multi-image by zucchini-nlp in 33447
* remove warning v2 by itazap in 33761
* Model addition timeline by LysandreJik in 33762
* Fix typing in `load_balancing_loss_func` function of `modeling_mixtral.py`. by PhilipMay in 33641
* Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 by jerryzh168 in 33456
* Fix typo in documentation by qgallouedec in 33805
* Hqq serialization by mobicham in 33141
* Add Slow CI reminder bot by ydshieh in 33506
* [`modular`] fixes! by ArthurZucker in 33820
* Fix ViT-MAE decoder interpolate by xenova in 33330
* Fixes for issue 33763 in idefics2 model by aroun-coumar in 33766
* Fix link in gguf.md by pogpog in 33768
* minor typo fix by a-r-r-o-w in 33784
* Fix Mamba slow path bug with dtype mismatch. by Adibvafa in 32691
* Fix passing str dtype to static cache by guangy10 in 33741
* fix check for hidden size in text model for deepspeed zero3 auto entries by winglian in 33829
* post reminder comment only once by ydshieh in 33848
* Generate: move llama `prepare_inputs_for_generation` to `GenerationMixin` by gante in 33677
* Refactor image features selection in LlaVa by kenza-bouzid in 33696
* fix: skip dropout in eval for flash_attn in various models by fdschmidt93 in 33844
* add attention weight up-cast to float32 in chameleon by francescortu in 33822
* Workaround for bark issue in pipelines by Rocketknight1 in 33824
* Fix device mismatch errors by zucchini-nlp in 33851
* This PR contains additional changes for 33143 by aroun-coumar in 33581
* Raise `accelerate` dependency error in case of defaulting `low_cpu_mem_usage=True` by kylesayrs in 33830
* Validate the eval dataset in advance. by jackyjinjing in 33743
* Add include_loss_for_metrics by Manalelaidouni in 33088
* Avoid using context that is not accessable from external contributors by ydshieh in 33866
* fix: repair depth estimation multiprocessing by niqodea in 33759
* Move weight initilization deformabledetr by g-prz in 33339
* [Fix] ViViT interpolate_pos_encoding by RUFFY-369 in 33815
* Repo consistency fix after 33339 by amyeroberts in 33873
* Add support for custom inputs and batched inputs in ProcessorTesterMixin by yonigozlan in 33711
* Fix: typo by TrickEye in 33880
* Uniformize model processors by molbap in 31368
* Don't run reminder bot for now by ydshieh in 33883
* populate quantization_config for kv-cache-scheme only configs by horheynm in 33874
* Allow for nightly packages of `compressed_tensors` by kylesayrs in 33828
* Fix kwargs passed by AutoQuantizationConfig.from_pretrained by kylesayrs in 33798
* Add sdpa for DistilBert by OmarManzoor in 33724
* Trainer - deprecate tokenizer for processing_class by amyeroberts in 32385
* [Quantization] Switch to optimum-quanto by SunMarc in 31732
* Optim deformable detr by yonigozlan in 33600
* Handle Trainer `tokenizer` kwarg deprecation with decorator by qubvel in 33887
* rename all test_processing_*.py to test_processor_*.py by yonigozlan in 33878
* uniformize processor Mllama by yonigozlan in 33876
* Fix dt proj bias reassigned by HofitBata in 33314
* Update an keyerror on _save_check_point prevent confusion of missing … by fadingNA in 33832
* VLM Generate: tag `test_static_cache_matches_dynamic` as flaky by gante in 33630
* Migrate the CI runners to the new clusters by glegendre01 in 33849
* Fix module initialization for root module under Zero3 by Ben-Schneider-code in 33632
* Add `SplinterTokenizer` unit test by ariepratama in 32652
* Generate tests: modality-agnostic input preparation by gante in 33685
* Fix: use unidic-lite instead of ipadic as the tokenizer dictionary for Japanese by KanTakahiro in 33372
* [Tests] Diverse Whisper fixes by ylacombe in 33665
* [PEFT] Support low_cpu_mem_usage option for PEFT loading adapters by BenjaminBossan in 33725
* add setter for trainer processor by ArthurZucker in 33911
* Add support for `weights_only` flag when loading state_dict by jerryzh168 in 32481
* Config: lower `save_pretrained` exception to warning by gante in 33906
* Uniformize kwargs for Idefics/2 processors by yonigozlan in 32568
* Remove `logits.float()` by ringohoffman in 33902
* Minor error condition bug fix by htahboub in 33781
* Fix distil whisper segment computation by ylacombe in 33920
* [Doc]: Broken link in Kubernetes doc by saldanhad in 33879
* [i18n-ru] Fixes typo in the README_ru.md by Artanias in 33882
* Ignore keys on `validate_rope` by zucchini-nlp in 33753
* [`PR run-slow`] by ArthurZucker in 33939
* Add a section on writing tool templates to the chat template docs by Rocketknight1 in 33924
* Enables CPU AWQ model with IPEX version. by jiqing-feng in 33460
* 🔴 🚨 Resizing tokens embeddings: initialize from old embeddings' normal distribution. by abuelnasr0 in 33325
* Removed unnecessary transpose in Switch Transformer Routing by karan-uppal3 in 33582
* Fix attn mask ignore logic in training-time trace by zhenglongjiepheonix in 32613
* hot fix `self.position_embeddings->self.position_embedding` by ArthurZucker in 33958
* fix red check-copies by ArthurZucker in 33964
* Cache: revert DynamicCache init for BC by gante in 33861
* Paligemma: fix static cache test by zucchini-nlp in 33941
* Updating `char_to_token` documentation to note behaviour when `trim_offsets` is True by Craigacp in 33919
* add test for Jamba with new model jamba-tiny-dev by yecohn in 33863
* Bug fix gguf qwen2moe by VladOS95-cyber in 33940
* [`TF`] Fix Tensorflow XLA Generation on limited seq_len models by vasqu in 33903
* [WIP] Add Tokenizer for MyT5 Model by tomlimi in 31286
* Add position ids in forward pass to opt model by avishaiElmakies in 33121
* Flash-attn performance: remove cuda sync during inference by Cyrilvallez in 33570
* [Docs] Improve VLM docs by NielsRogge in 33393
* [Docs] Add Developer Guide: How to Hack Any Transformers Model by MagnusS0 in 33979
* [`Red CIs`] Fix hub failures by ArthurZucker in 34001
* Fix Tensor + Embedding error in some cases when using SiglipVisionModel by kaitolucifer in 33994
* properly fix and RUN_SLOW by ArthurZucker in 33965
* Enable customized optimizer for DeepSpeed by dataKim1201 in 32049
* [`pytes collection`] Fix flax test collection by ArthurZucker in 34004
* Fix undefined default_config in configuration_utils.py by mgoin in 33934
* 🌐 [i18n-KO] Translated `gguf.md` to Korean by yijun-lee in 33764
* 🌐 [i18n-KO] Translated `swinv2.md` to Korean by mreraser in 33566
* 🌐 [i18n-KO] Translated `audio_utils.md` to Korean by yijun-lee in 33802
* 🌐 [i18n-KO] Translated `esm.md` to Korean by yijun-lee in 33796
* 🌐 [i18n-KO] Translated `time_series_utils.md` to Korean by yijun-lee in 33806
* 🌐 [i18n-KO] Translated `pipelines_utils.md` to Korean by yijun-lee in 33809
* 🌐 [i18n-KO] Translated `trainer.md` to Korean by yijun-lee in 33797
* 🌐 [i18n-KO] Translated `chameleon.md` to Korean by yijun-lee in 33799
* 🌐 [i18n-KO] Translated `logging.md` to Korean by chhaewxn in 33543
* 🌐 [i18n-KO] Translated `auto.md` to Korean by boyunJang in 33590
* 🌐 [i18n-KO] Translated `swin2sr.md` to Korean by mreraser in 33795
* 🌐 [i18n-KO] Translated `vit.md` to Korean by mreraser in 33884
* 🌐 [i18n-KO] Translated `gemma.md` to Korean by yijun-lee in 33936
* Cache: slight change in naming by zucchini-nlp in 32421
* Add support for __all__ and potentilly deleting functions by ArthurZucker in 33859
* Processors: don't default padding side by zucchini-nlp in 33942
* Add auto model for image-text-to-text by yonigozlan in 32472
* BatchFeature.to() supports non-tensor keys by Rocketknight1 in 33918
* Improve modular converter by Cyrilvallez in 33991
* Fixup DeepSpeed things by muellerzr in 34007
* Fix typing issue by SunMarc in 34012
* fix awq tests due to ipex backend by SunMarc in 34011
* Remove `decoder_config=None` by SunMarc in 34014
* Fix `trainer_seq2seq.py`'s `__init__` type annotations by benglewis in 34021
* 🌐 [i18n-KO] Translated `feature_extractor.md` to Korean by yijun-lee in 33775
* 🌐 [i18n-KO] Translated `bertweet.md` to Korean by ahnjj in 33891
* 🌐 [i18n-KO] Translated `gpt_neox_japanese.md` to Korean by ahnjj in 33894
* 🌐 [i18n-KO] Translated `rag.md` to Korean by chhaewxn in 33989
* 🌐 [i18n-KO] Translated `main_classes/quantization.md` to Korean by fabxoe in 33959
* 🌐 [i18n-KO] Translated `main_classes/configuration.md` to Korean by fabxoe in 33952
* 🌐 [i18n-KO] Translated `model_doc/mamba.md` to Korean by fabxoe in 33626
* 🌐 [i18n-KO] Translated `model_doc/autoformer.md` to Korean by fabxoe in 33574
* 🌐 [i18n-KO] Translated `model_doc/patchtsmixer.md` to Korean by fabxoe in 33587
* 🌐 [i18n-KO] Translated `model_doc/clip.md` to Korean by fabxoe in 33610
* 🌐 [i18n-KO] Translated `model_doc/paligemma.md` to Korean by fabxoe in 33612
* 🌐 [i18n-KO] Translated `model_doc/llama3.md` to Korean by fabxoe in 33635
* 🌐 [i18n-KO] Translated `model_doc/mistral.md` to Korean by fabxoe in 33648
* 🌐 [i18n-KO] Translated `model_doc/cohere.md` to Korean by fabxoe in 33885
* 🌐 [i18n-KO] Translated `model_doc/dbrx.md` to Korean by fabxoe in 33951
* 🌐 [i18n-KO] Translated `model_doc/deberta-v2.md` to Korean by fabxoe in 33968
* 🌐 [i18n-KO] Translated `main_classes/onnx.md` to Korean by fabxoe in 33601
* 🌐 [i18n-KO] Translated `tokenization_utils.md` to Korean by yijun-lee in 33813
* 🌐 [i18n-KO] Translated `swin.md` to Korean by mreraser in 33510
* 🌐 [i18n-KO] Translated `file_utils.md` to Korean by yijun-lee in 33803
* 🌐 [i18n-KO] Translated `openai-gpt.md` to Korean by yijun-lee in 33801
* 🌐 [i18n-KO] Translated `biogpt.md` to Korean by yijun-lee in 33773
* 🌐 [i18n-KO] Translated `blip.md` to Korean by cjfghk5697 in 33515
* 🌐 [i18n-KO] Translated output.md to Korean by 4N3MONE in 33607
* 🌐 [i18n-KO] Translated `image_processing_utils.md` to Korean by yijun-lee in 33804
* 🌐 [i18n-KO] Translated `modular_transformers.md` to Korean by yijun-lee in 33772
* [`Patch helper`] update to not have to checkout main by ArthurZucker in 34006
* Fix Failed tests with mobile bert resize tokens embedding by abuelnasr0 in 33950
* Generate: remove most decoder-only LLMs `prepare_inputs_for_generation` by gante in 33870
* Mllama: fix tests by zucchini-nlp in 34000
* Fix PIL dep for tests by muellerzr in 34028
* 🌐 [i18n-KO] Translated `model_doc/bart.md` to Korean by fabxoe in 33893
* 🌐 [i18n-KO] Translated `model_doc/deberta.md` to Korean by fabxoe in 33967
* 🌐 [i18n-KO] Translated `main_classes/keras_callbacks.md` to Korean by fabxoe in 33955
* 🌐 [i18n-KO] Translated `model_doc/mamba2.md` to Korean by fabxoe in 33629
* 🌐 [i18n-KO] Translated `main_classes/model.md` to Korean by fabxoe in 33606
* 🌐 [i18n-KO] Translated `model_doc/trajectory_transformer.md` to Korean by fabxoe in 33597
* 🌐 [i18n-KO] Translated `model_doc/time_series_transformer.md` to Korean by fabxoe in 33596
* 🌐 [i18n-KO] Translated `model_doc/informer.md` to Korean by fabxoe in 33585
* 🌐 [i18n-KO] Translated `model_doc/graphormer.md` to Korean by fabxoe in 33569
* 🌐 [i18n-KO] Translated `modeling_utils.md` to Korean by yijun-lee in 33808
* 🌐 [i18n-KO] Translated `main_classes/data_collator.md` to Korean by fabxoe in 33954
* 🌐 [i18n-KO] Translated `model_doc/patchtst.md` to Korean by fabxoe in 33589
* 🌐 [i18n-KO] Translated `text_generation.md` to Korean by yijun-lee in 33777
* 🌐 [i18n-KO] Translated `main_classes/callback.md` to Korean by Jwaminju in 33572
* 🌐 [i18n-KO] Translated `generation_utils.md` to Korean by yijun-lee in 33818
* Add Translate docs into Arabic - section files CONCEPTUAL GUIDES by AhmedAlmaghz in 33982
* add sdpa to OPT by avishaiElmakies in 33298
* Phi3: fix attn for sliding window by zucchini-nlp in 33586
* HfArgumentParser: allow for hyhenated field names in long-options by djmarti in 33990
* Fix pipelines tests by qubvel in 34049
* Specifying torch dtype in Qwen2VLForConditionalGeneration by htahboub in 33953
* Universal Assisted Generation: Assisted generation with any assistant model (by Intel Labs) by danielkorat in 33383
* check if eigenvalues of covariance matrix are complex. by abuelnasr0 in 34037
* [Docs] Update compressed_tensors.md by mgoin in 33961
* Fix data_seed unused by MekkCyber in 33731
* [TESTS] ASR pipeline by ylacombe in 33925
* Update Blip2 `is_pipeline_test_to_skip` method signature by qubvel in 34067
* provide trust_remote_code for search feat extractor in model config by eaidova in 34036
* Small Fix to modular converter by MekkCyber in 34051
* Default `synced_gpus` to `True` when using `FullyShardedDataParallel` by ringohoffman in 33483
* Idefics: fix position ids by zucchini-nlp in 33907
* Update SSH workflow file by ydshieh in 34084
* Tests: upcast `logits` to `float()` by gante in 34042
* Fix flax failures by LysandreJik in 33912
* Fix DAC slow tests by ylacombe in 34088
* Fix failing conversion by LysandreJik in 34010
* Fix PushToHubMixin when pusing to a PR revision by Wauplin in 34090
* avoid many failures for ImageGPT by ydshieh in 34071
* Fix NaNs in cost_matrix for mask2former by ducha-aiki in 34074
* Fix flaky tests by zucchini-nlp in 34069
* Generate: move `prepare_inputs_for_generation` in encoder-decoder llms by gante in 34048
* Avoid many test failures for `LlavaNextVideoForConditionalGeneration` by ydshieh in 34070
* refactor: benchmarks by McPatate in 33896
* fix(ci): benchmarks dashboard was failing due to missing quotations by McPatate in 34100
* Generate: Fix modern llm `generate` calls with `synced_gpus` by gante in 34095
* Mistral-related models for QnA by vasqu in 34045
* Fix a typo by PengWeixuan in 34148
* Fixed error message in mllama by dmgcsilva in 34106
* Specify that users should be careful with their own files by LysandreJik in 34153
* Add documentation for docker by ArthurZucker in 33156
* Update README.md with Enterprise Hub by gary149 in 34150
* Idefics: enable generation tests by zucchini-nlp in 34062
* Add sdpa for Vivit by RUFFY-369 in 33757
* Fix FSDP resume Initialization issue by Itssshikhar in 34032
* Fix default behaviour in TextClassificationPipeline for regression problem type by subhalingamd in 34066
* Generate: move `logits` to same device as `input_ids` by gante in 34076
* Add support for inheritance from class with different suffix in modular by yonigozlan in 34077
* Fix optuna ddp hp search by SunMarc in 34073
* [feat] LlavaNext add feature size check to avoid CUDA Runtime Error by laurentd-lunit in 33608
* 🌐 [i18n-KO] Translated `vivit.md` to Korean by mreraser in 33935
* 🌐 [i18n-KO] Translated `gemma2.md` to Korean by yijun-lee in 33937
* 🌐 [i18n-KO] Translated `trainer_utils.md` to Korean by yijun-lee in 33817
* 🌐 [i18n-KO] Translated `blip-2.md` to Korean by cjfghk5697 in 33516
* IDEFICS: support inputs embeds by zucchini-nlp in 34043
* [fix] fix token healing tests and usage errors by alpertunga-bile in 33931
* Revert `accelerate` error caused by `46d09af` by steveepreston in 34197
* Fix wrong name for llava onevision and qwen2_vl in tokenization auto by yonigozlan in 34177
* Avoid using torch's Tensor or PIL's Image in chat template utils if not available by RezaRahemtola in 34165
* Revert "Fix FSDP resume Initialization issue" by SunMarc in 34193
* Update `trainer._get_eval_sampler()` to support `group_by_length` arg by larin92 in 33514
* Fix warning message for fp32_cpu_offloading in bitsandbytes configs by amosyou in 34079
* Ping team members for new failed tests in daily CI by ydshieh in 34171
* fix(Wav2Vec2ForCTC): torch export by chrsmcgrr in 34023
* Fix for tokenizer.apply_chat_template with continue_final_message=True by schoennenbeck in 34214
* removes decord by vrnvu in 33987
* Fix bus error when using GPT2 on M1 macs by chanind in 34031
* Generate: visit non-llm `prepare_inputs_for_generation` by gante in 34199
* Support Llama 3.2 conversion (text models) by pcuenca in 33778
* Fix-red-ci by ArthurZucker in 34230
* BLIP: fix input expansion logic by zucchini-nlp in 34225
* Fix broken test decorator `require_torch_up_to_2_accelerators` by byi8220 in 34201
* Informative 2 by LysandreJik in 34154
* Fix UDOP dtype issue by Rocketknight1 in 34180
* Only cast logits to float when computing loss by ringohoffman in 34147
* Generation tests: don't rely on main input name by zucchini-nlp in 34228
* Change Paligemma import logging to work with modular by yonigozlan in 34211
* Add DetrImageProcessorFast by yonigozlan in 34063
* Add a doc section on writing generation prompts by Rocketknight1 in 34248
* Fix method name which changes in tutorial by andimarafioti in 34252
* Attn implementation for composite models by zucchini-nlp in 32238
* VLM: add more modularity by zucchini-nlp in 34175
* T5 compile compatibilty by zucchini-nlp in 34089
* [docs] Fix GenerationConfig params by stevhliu in 34299
* Fix Korean doc _toctree.yml by regisss in 34293
* Update PR templates by SunMarc in 34065
* [RT-DETR] Fix onnx inference bug for Optype (Where) by YHallouard in 33877
* Fix FA2 attention for models supporting sliding window by Cyrilvallez in 34093
* Fix: tensor of examples of the same length triggers invalid stacking by pbelcak in 34166
* Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies by alex-bene in 32550
* Add option for running ffmpeg_microphone_live as a background process by mikamerath in 32838
* Feature: Add `MLFLOW_MAX_LOG_PARAMS` to `MLflowCallback` by cecheta in 34279
* Fix continue_final_message for image-text-to-text chat templates by yonigozlan in 34236
* fix error in _get_eval_sampler when group_by_length enabled by akakakakakaa in 34237
* [docs] fix typo by faaany in 34235
* 🌐 [i18n-KO] Translated `executorch.md` to Korean by ahnjj in 33888
* 🌐 [i18n-KO] Translated `bert japanese.md` to Korean by ahnjj in 33890
* 🌐 [i18n-KO] Translated `model_doc/bartpho.md` to Korean by Jwaminju in 33981
* Example doc for token classification of Llama and Dependent/Copied Models by h3110Fr13nd in 34139
* [docs] Fix Korean toctree by stevhliu in 34324
* Added Deberta model type support by FilipposVentirozos in 34308
Significant community contributions
The following contributors have made significant changes to the library over the last release:
* manuelsh
    * adding positional encoder changes and tests (32600)
* ArthurZucker
    * [`MllamaProcessor`] Update errors and API with multiple image (33715)
    * [`clean_up_tokenization_spaces`] Pl bart was failing, updating (33735)
    * [`MllamaImageProcessing`] Update doc (33747)
    * [`modular`] fixes! (33820)
    * add setter for trainer processor (33911)
    * [`PR run-slow`] (33939)
    * hot fix `self.position_embeddings->self.position_embedding` (33958)
    * fix red check-copies (33964)
    * [`Red CIs`] Fix hub failures (34001)
    * properly fix and RUN_SLOW (33965)
    * [`pytes collection`] Fix flax test collection (34004)
    * Add support for __all__ and potentilly deleting functions (33859)
    * [`Patch helper`] update to not have to checkout main (34006)
    * Add documentation for docker (33156)
    * Fix Gradient Accumulation issue (34191)
    * Fix-red-ci (34230)
* molbap
    * Fix position embeddings singular/plural (33678)
    * Uniformize model processors (31368)
* vasqu
    * Update Albumentations Versions (33704)
    * [`TF`] Fix Tensorflow XLA Generation on limited seq_len models (33903)
    * Mistral-related models for QnA (34045)
* VladOS95-cyber
    * Add gguf support for bloom (33473)
    * Bug fix gguf qwen2moe (33940)
    * Add gguf support for StableLM (33793)
    * Add gguf support for gpt2 (34044)
    * Add GGUF for starcoder2 (34094)
* ydshieh
    * Add Slow CI reminder bot (33506)
    * post reminder comment only once (33848)
    * Avoid using context that is not accessable from external contributors (33866)
    * Don't run reminder bot for now (33883)
    * Update SSH workflow file (34084)
    * avoid many failures for ImageGPT (34071)
    * Avoid many test failures for `LlavaNextVideoForConditionalGeneration` (34070)
    * Ping team members for new failed tests in daily CI (34171)
* amyeroberts
    * Repo consistency fix after 33339 (33873)
    * Trainer - deprecate tokenizer for processing_class (32385)
* ylacombe
    * [Tests] Diverse Whisper fixes (33665)
    * Fix distil whisper segment computation (33920)
    * [TESTS] ASR pipeline (33925)
    * Fix DAC slow tests (34088)
    * Moshi integration (33624)
* ringohoffman
    * Remove `logits.float()` (33902)
    * Default `synced_gpus` to `True` when using `FullyShardedDataParallel` (33483)
    * Only cast logits to float when computing loss (34147)
* garg-amit
    * PhiMoE (33363)
* pglorio
    * Add Zamba (30950)
* tomlimi
    * [WIP] Add Tokenizer for MyT5 Model (31286)
* yijun-lee
    * 🌐 [i18n-KO] Translated `gguf.md` to Korean (33764)
    * 🌐 [i18n-KO] Translated `audio_utils.md` to Korean (33802)
    * 🌐 [i18n-KO] Translated `esm.md` to Korean (33796)
    * 🌐 [i18n-KO] Translated `time_series_utils.md` to Korean (33806)
    * 🌐 [i18n-KO] Translated `pipelines_utils.md` to Korean (33809)
    * 🌐 [i18n-KO] Translated `trainer.md` to Korean (33797)
    * 🌐 [i18n-KO] Translated `chameleon.md` to Korean (33799)
    * 🌐 [i18n-KO] Translated `gemma.md` to Korean (33936)
    * 🌐 [i18n-KO] Translated `feature_extractor.md` to Korean (33775)
    * 🌐 [i18n-KO] Translated `tokenization_utils.md` to Korean (33813)
    * 🌐 [i18n-KO] Translated `file_utils.md` to Korean (33803)
    * 🌐 [i18n-KO] Translated `openai-gpt.md` to Korean (33801)
    * 🌐 [i18n-KO] Translated `biogpt.md` to Korean (33773)
    * 🌐 [i18n-KO] Translated `image_processing_utils.md` to Korean (33804)
    * 🌐 [i18n-KO] Translated `modular_transformers.md` to Korean (33772)
    * 🌐 [i18n-KO] Translated `modeling_utils.md` to Korean (33808)
    * 🌐 [i18n-KO] Translated `text_generation.md` to Korean (33777)
    * 🌐 [i18n-KO] Translated `generation_utils.md` to Korean (33818)
    * 🌐 [i18n-KO] Translated `gemma2.md` to Korean (33937)
    * 🌐 [i18n-KO] Translated `trainer_utils.md` to Korean (33817)
* fabxoe
    * 🌐 [i18n-KO] Translated `main_classes/quantization.md` to Korean (33959)
    * 🌐 [i18n-KO] Translated `main_classes/configuration.md` to Korean (33952)
    * 🌐 [i18n-KO] Translated `model_doc/mamba.md` to Korean (33626)
    * 🌐 [i18n-KO] Translated `model_doc/autoformer.md` to Korean (33574)
    * 🌐 [i18n-KO] Translated `model_doc/patchtsmixer.md` to Korean (33587)
    * 🌐 [i18n-KO] Translated `model_doc/clip.md` to Korean (33610)
    * 🌐 [i18n-KO] Translated `model_doc/paligemma.md` to Korean (33612)
    * 🌐 [i18n-KO] Translated `model_doc/llama3.md` to Korean (33635)
    * 🌐 [i18n-KO] Translated `model_doc/mistral.md` to Korean (33648)
    * 🌐 [i18n-KO] Translated `model_doc/cohere.md` to Korean (33885)
    * 🌐 [i18n-KO] Translated `model_doc/dbrx.md` to Korean (33951)
    * 🌐 [i18n-KO] Translated `model_doc/deberta-v2.md` to Korean (33968)
    * 🌐 [i18n-KO] Translated `main_classes/onnx.md` to Korean (33601)
    * 🌐 [i18n-KO] Translated `model_doc/bart.md` to Korean (33893)
    * 🌐 [i18n-KO] Translated `model_doc/deberta.md` to Korean (33967)
    * 🌐 [i18n-KO] Translated `main_classes/keras_callbacks.md` to Korean (33955)
    * 🌐 [i18n-KO] Translated `model_doc/mamba2.md` to Korean (33629)
    * 🌐 [i18n-KO] Translated `main_classes/model.md` to Korean (33606)
    * 🌐 [i18n-KO] Translated `model_doc/trajectory_transformer.md` to Korean (33597)
    * 🌐 [i18n-KO] Translated `model_doc/time_series_transformer.md` to Korean (33596)
    * 🌐 [i18n-KO] Translated `model_doc/informer.md` to Korean (33585)
    * 🌐 [i18n-KO] Translated `model_doc/graphormer.md` to Korean (33569)
    * 🌐 [i18n-KO] Translated `main_classes/data_collator.md` to Korean (33954)
    * 🌐 [i18n-KO] Translated `model_doc/patchtst.md` to Korean (33589)
* MekkCyber
    * FEAT : Adding BitNet quantization method to HFQuantizer (33410)
    * Fix data_seed unused (33731)
    * Small Fix to modular converter (34051)
* AhmedAlmaghz
    * Add Translate docs into Arabic - section files CONCEPTUAL GUIDES (33982)
* alex-bene
    * Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies (32550)