Transformers


4.10.2

Not secure
- [Wav2Vec2] Fix dtype 64 bug 13517 (patrickvonplaten)

4.10.1

Not secure
- [Wav2Vec2] Fix normalization for non-padded tensors 13512 (patrickvonplaten)
- Fixing backward compatibility for non prefixed tokens (B-, I-). 13493 (Narsil)
- Fixing 13381 13400 (Narsil)

4.10.0

Not secure
LayoutLM-v2 and LayoutXLM

Four new models are released as part of the LayoutLMv2 implementation: `LayoutLMv2ForSequenceClassification`, `LayoutLMv2Model`, `LayoutLMv2ForTokenClassification` and `LayoutLMv2ForQuestionAnswering`, in PyTorch.

The LayoutLMv2 model was proposed in [LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding](https://arxiv.org/abs/2012.14740) by Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou. LayoutLMv2 improves [LayoutLM](https://huggingface.co/transformers/model_doc/layoutlm.html) to obtain state-of-the-art results across several document image understanding benchmarks.

- Add LayoutLMv2 + LayoutXLM 12604 (NielsRogge)

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=layoutlmv2
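
As a brief, hedged sketch of instantiating one of the new classes (the `microsoft/layoutlmv2-base-uncased` checkpoint name is an assumption; any checkpoint from the Hub filter above should work the same way):

```py
from transformers import LayoutLMv2ForSequenceClassification

# Assumed checkpoint name; LayoutLMv2 also needs extra dependencies
# (e.g. detectron2 for its visual backbone) to be installed.
model = LayoutLMv2ForSequenceClassification.from_pretrained(
    "microsoft/layoutlmv2-base-uncased", num_labels=2
)
```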

BEiT

Three new models are released as part of the BEiT implementation: `BeitModel`, `BeitForMaskedImageModeling`, and `BeitForImageClassification`, in PyTorch.

The BEiT model was proposed in [BEiT: BERT Pre-Training of Image Transformers](https://arxiv.org/abs/2106.08254) by Hangbo Bao, Li Dong and Furu Wei. Inspired by BERT, BEiT is the first paper that makes self-supervised pre-training of Vision Transformers (ViTs) outperform supervised pre-training. Rather than pre-training the model to predict the class of an image (as done in the [original ViT paper](https://arxiv.org/abs/2010.11929)), BEiT models are pre-trained to predict visual tokens from the codebook of OpenAI’s [DALL-E model](https://arxiv.org/abs/2102.12092) given masked patches.

- Add BEiT 12994 (NielsRogge)

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=beit
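
A hedged sketch of image classification with the new classes, assuming the `microsoft/beit-base-patch16-224` checkpoint (fine-tuned on ImageNet) and the COCO sample image commonly used in the docs:

```py
import requests
from PIL import Image
from transformers import BeitFeatureExtractor, BeitForImageClassification

# Assumed checkpoint name; see the Hub filter above for alternatives.
feature_extractor = BeitFeatureExtractor.from_pretrained("microsoft/beit-base-patch16-224")
model = BeitForImageClassification.from_pretrained("microsoft/beit-base-patch16-224")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Predict the ImageNet class of the image.
inputs = feature_extractor(images=image, return_tensors="pt")
logits = model(**inputs).logits
print(model.config.id2label[int(logits.argmax(-1))])
```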

Speech improvements

The Wav2Vec2 and HuBERT models now have a sequence classification head available.

- Add Wav2Vec2 & Hubert ForSequenceClassification 13153 (anton-l)
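
A hedged sketch of the new head in use, assuming fine-tuning starts from the `facebook/wav2vec2-base` checkpoint (the classification head itself is randomly initialized until trained):

```py
import numpy as np
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

# Assumed base checkpoint; the new head is initialized on top of it.
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2ForSequenceClassification.from_pretrained("facebook/wav2vec2-base", num_labels=2)

# One second of silence at 16 kHz stands in for a real waveform.
waveform = np.zeros(16000, dtype=np.float32)
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")
logits = model(**inputs).logits  # shape: (batch_size, num_labels)
```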

DeBERTa in TensorFlow (kamalkraj)

The DeBERTa and DeBERTa-v2 models have been converted from PyTorch to TensorFlow.

- Deberta tf 12972 (kamalkraj)
- Deberta_v2 tf 13120 (kamalkraj)
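
A hedged sketch of the TensorFlow port, assuming the `microsoft/deberta-base` checkpoint (pass `from_pt=True` if a given checkpoint only ships PyTorch weights):

```py
from transformers import DebertaTokenizer, TFDebertaModel

tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
model = TFDebertaModel.from_pretrained("microsoft/deberta-base")

inputs = tokenizer("DeBERTa now runs natively in TensorFlow.", return_tensors="tf")
outputs = model(inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```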

Flax model additions

EncoderDecoder, DistilBERT, and ALBERT now have support in Flax!

- FlaxEncoderDecoder allowing Bert2Bert and Bert2GPT2 in Flax 13008 (ydshieh)
- FlaxDistilBERT 13324 (kamalkraj)
- FlaxAlBERT 13294 (kamalkraj)
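
A hedged sketch of the new Flax classes, assuming standard Hub checkpoints (add `from_pt=True` for checkpoints that only ship PyTorch weights):

```py
from transformers import FlaxAlbertModel, FlaxDistilBertModel, FlaxEncoderDecoderModel

# Warm-start a Bert2GPT2 encoder-decoder entirely in Flax.
bert2gpt2 = FlaxEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")

# Standalone Flax versions of DistilBERT and ALBERT.
distilbert = FlaxDistilBertModel.from_pretrained("distilbert-base-uncased")
albert = FlaxAlbertModel.from_pretrained("albert-base-v2")
```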

TensorFlow examples

A new example has been added in TensorFlow: multiple choice!
Data collators are now framework-agnostic and work with TensorFlow and NumPy in addition to PyTorch.

- Add TF multiple choice example 12865 (Rocketknight1)
- TF/Numpy variants for all DataCollator classes 13105 (Rocketknight1)
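
A hedged sketch of the framework-agnostic collators; the assumption here is that the target framework is selected through a `return_tensors` argument on the collator:

```py
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# "tf", "np" or "pt" selects the tensor type of the returned batch (assumed interface).
tf_collator = DataCollatorWithPadding(tokenizer, return_tensors="tf")
np_collator = DataCollatorWithPadding(tokenizer, return_tensors="np")

features = [tokenizer("a short example"), tokenizer("a slightly longer example sentence")]
batch = tf_collator(features)  # dict of tf.Tensor, padded to the longest sequence in the batch
```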


Auto API refactor

The Auto APIs have been disentangled from the other model modules of the Transformers library, so you can now safely import the Auto classes without importing every model (and potentially hitting errors when your setup is incompatible with one specific model). The actual model classes are only imported when needed, as illustrated below.

- Disentangle auto modules from other modeling files 13023 (sgugger)
- Fix AutoTokenizer when no fast tokenizer is available 13336 (sgugger)
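
A minimal illustration of the lazy behavior (nothing changes in the API itself, only the import mechanics):

```py
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Importing the Auto classes above no longer pulls in every model module.
# Only the BERT modeling code is loaded once the checkpoint below resolves to a BERT config.
config = AutoConfig.from_pretrained("bert-base-cased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")
```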

Slight breaking change

When loading certain kinds of corrupted model state dictionaries, the `PreTrainedModel.from_pretrained` method would sometimes silently ignore weights. It now raises a real error instead.

- Fix from_pretrained with corrupted state_dict 12939 (sgugger)

General improvements and bugfixes

- Improving pipeline tests 12784 (Narsil)
- Pin git python to <3.1.19 12858 (patrickvonplaten)
- [tests] fix logging_steps requirements 12860 (stas00)
- [Sequence Feature Extraction] Add truncation 12804 (patrickvonplaten)
- add `classifier_dropout` to classification heads 12794 (PhilipMay)
- Fix barrier for SM distributed 12853 (sgugger)
- Add possibility to ignore imports in test_fetcher 12801 (sgugger)
- Add accelerate to examples requirements 12888 (sgugger)
- Fix documentation of BigBird tokenizer 12889 (sgugger)
- Better heuristic for token-classification pipeline. 12611 (Narsil)
- Fix push_to_hub for TPUs 12895 (sgugger)
- `Seq2SeqTrainer` set max_length and num_beams only when non None 12899 (cchen-dialpad)
- [FLAX] Minor fixes in CLM example 12914 (stefan-it)
- Correct validation_split_percentage argument from int (ex:5) to float (0.05) 12897 (Elysium1436)
- Fix typo in the example of MobileBertForPreTraining 12919 (buddhics)
- Add option to set max_len in run_ner 12929 (sgugger)
- Fix QA examples for roberta tokenizer 12928 (sgugger)
- Print defaults when using --help for scripts 12930 (sgugger)
- Fix StoppingCriteria ABC signature 12918 (willfrey)
- Add missing classmethod decorators 12927 (willfrey)
- fix distiller.py 12910 (chutaklee)
- Update generation_logits_process.py 12901 (willfrey)
- Update generation_logits_process.py 12900 (willfrey)
- Update tokenization_auto.py 12896 (willfrey)
- Fix docstring typo in tokenization_auto.py 12891 (willfrey)
- [Flax] Correctly Add MT5 12988 (patrickvonplaten)
- ONNX v2 raises an Exception when using PyTorch < 1.8.0 12933 (mfuntowicz)
- Moving feature-extraction pipeline to new testing scheme 12843 (Narsil)
- Add CpmTokenizerFast 12938 (JetRunner)
- fix typo in gradient_checkpointing arg 12855 (21jun)
- Log Azure ML metrics only for rank 0 12766 (harshithapv)
- Add substep end callback method 12951 (wulu473)
- Add multilingual documentation support 12952 (JetRunner)
- Fix division by zero in NotebookProgressPar 12953 (sgugger)
- [FLAX] Minor fixes in LM example 12947 (stefan-it)
- Prevent `Trainer.evaluate()` crash when using only tensorboardX 12963 (aphedges)
- Fix typo in example of DPRReader 12954 (tadejsv)
- Place BigBirdTokenizer in sentencepiece-only objects 12975 (sgugger)
- fix typo in example/text-classification README 12974 (fullyz)
- Fix template for inputs docstrings 12976 (sgugger)
- fix `Trainer.train(resume_from_checkpoint=False)` is causing an exception 12981 (PhilipMay)
- Cast logits from bf16 to fp32 at the end of TF_T5 12332 (szutenberg)
- Update CANINE test 12453 (NielsRogge)
- pad_to_multiple_of added to DataCollatorForWholeWordMask 12999 (Aktsvigun)
- [Flax] Align jax flax device name 12987 (patrickvonplaten)
- [Flax] Correct flax docs 12782 (patrickvonplaten)
- T5: Create position related tensors directly on device instead of CPU 12846 (armancohan)
- Skip ProphetNet test 12462 (LysandreJik)
- Create perplexity.rst 13004 (sashavor)
- GPT-Neo ONNX export 12911 (michaelbenayoun)
- Update generate method - Fix floor_divide warning 13013 (nreimers)
- [Flax] Correct pt to flax conversion if from base to head 13006 (patrickvonplaten)
- [Flax T5] Speed up t5 training 13012 (patrickvonplaten)
- FX submodule naming fix 13016 (michaelbenayoun)
- T5 with past ONNX export 13014 (michaelbenayoun)
- Fix ONNX test: Put smaller ALBERT model 13028 (LysandreJik)
- Tpu tie weights 13030 (sgugger)
- Use min version for huggingface-hub dependency 12961 (lewtun)
- tfhub.de -> tfhub.dev 12565 (abhishekkrthakur)
- [Flax] Refactor gpt2 & bert example docs 13024 (patrickvonplaten)
- Add MBART to models exportable with ONNX 13049 (LysandreJik)
- Add to ONNX docs 13048 (LysandreJik)
- Fix small typo in M2M100 doc 13061 (SaulLu)
- Add try-except for torch_scatter 13040 (JetRunner)
- docs: add HuggingArtists to community notebooks 13050 (AlekseyKorshuk)
- Fix ModelOutput instantiation from dictionaries 13067 (sgugger)
- Roll out the test fetcher on push tests 13055 (sgugger)
- Fix fallback of test_fetcher 13071 (sgugger)
- Revert to all tests while we debug what's wrong 13072 (sgugger)
- Use original key for label in DataCollatorForTokenClassification 13057 (ibraheem-moosa)
- [Doctest] Setup, quicktour and task_summary 13078 (sgugger)
- Add VisualBERT demo notebook 12263 (gchhablani)
- Install git 13091 (LysandreJik)
- Fix classifier dropout in AlbertForMultipleChoice 13087 (ibraheem-moosa)
- Doctests job 13088 (LysandreJik)
- Fix VisualBert Embeddings 13017 (gchhablani)
- Proper import for unittest.mock.patch 13085 (sgugger)
- Reactivate test fetchers on scheduled tests with proper git install 13097 (sgugger)
- Change a parameter name in FlaxBartForConditionalGeneration.decode() 13074 (ydshieh)
- [Flax/JAX] Run jitted tests at every commit 13090 (patrickvonplaten)
- Rely on huggingface_hub for common tools 13100 (sgugger)
- [FlaxCLIP] allow passing params to image and text feature methods 13099 (patil-suraj)
- Ci last fix 13103 (sgugger)
- Improve type checker performance 13094 (bschnurr)
- Fix VisualBERT docs 13106 (gchhablani)
- Fix CircleCI nightly tests 13113 (sgugger)
- Create py.typed 12893 (willfrey)
- Fix flax gpt2 hidden states 13109 (ydshieh)
- Moving fill-mask pipeline to new testing scheme 12943 (Narsil)
- Fix omitted lazy import for xlm-prophetnet 13052 (minwhoo)
- Fix classifier dropout in bertForMultipleChoice 13129 (mandelbrot-walker)
- Fix frameworks table so it's alphabetical 13118 (osanseviero)
- [Feature Processing Sequence] Remove duplicated code 13051 (patrickvonplaten)
- Ci continue through smi failure 13140 (LysandreJik)
- Fix missing `seq_len` in `electra` model when `inputs_embeds` is used. 13128 (sararb)
- Optimizes ByT5 tokenizer 13119 (Narsil)
- Add splinter 12955 (oriram)
- [AutoFeatureExtractor] Fix loading of local folders if config.json exists 13166 (patrickvonplaten)
- Fix generation docstrings regarding input_ids=None 12823 (jvamvas)
- Update namespaces inside torch.utils.data to the latest. 13167 (qqaatw)
- Fix the loss calculation of ProphetNet 13132 (StevenTang1998)
- Fix LUKE tests 13183 (NielsRogge)
- Add min and max question length options to TapasTokenizer 12803 (NielsRogge)
- SageMaker: Fix sagemaker DDP & metric logs 13181 (philschmid)
- correcting group beam search function output score bug 13211 (sourabh112)
- Change how "additional_special_tokens" argument in the ".from_pretrained" method of the tokenizer is taken into account 13056 (SaulLu)
- remove unwanted control-flow code from DeBERTa-V2 13145 (kamalkraj)
- Fix load_tf_weights alias. 13159 (qqaatw)
- Add RemBert to AutoTokenizer 13224 (LysandreJik)
- Allow local_files_only for fast pretrained tokenizers 13225 (BramVanroy)
- fix `AutoModel.from_pretrained(..., torch_dtype=...)` 13209 (stas00)
- Fix broken links in Splinter documentation 13237 (oriram)
- Custom errors and BatchSizeError 13184 (AmbiTyga)
- Bump notebook from 6.1.5 to 6.4.1 in /examples/research_projects/lxmert 13226 (dependabot[bot])
- Update generation_logits_process.py 12671 (willfrey)
- Remove side effects of disabling gradient computation 13257 (LysandreJik)
- Replace assert statement with if condition and raise ValueError 13263 (nishprabhu)
- Better notification service 13267 (LysandreJik)
- Fix failing Hubert test 13261 (LysandreJik)
- Add CLIP tokenizer to AutoTokenizer 13258 (LysandreJik)
- Some `model_type`s cannot be in the mapping 13259 (LysandreJik)
- Add require flax to MT5 Flax test 13260 (LysandreJik)
- Migrating conversational pipeline tests to new testing format 13114 (Narsil)
- fix `tokenizer_class_from_name` for models with `-` in the name 13251 (stas00)
- Add error message concerning revision 13266 (BramVanroy)
- Move `image-classification` pipeline to new testing 13272 (Narsil)
- [Hotfix] Fixing the test (warnings was incorrect.) 13278 (Narsil)
- Moving question_answering tests to the new testing scheme. Had to tweak a little some ModelTesterConfig for pipelines. 13277 (Narsil)
- Moving `summarization` pipeline to new testing format. 13279 (Narsil)
- Moving `table-question-answering` pipeline to new testing. 13280 (Narsil)
- Moving `table-question-answering` pipeline to new testing 13281 (Narsil)
- Hotfixing master tests. 13282 (Narsil)
- Moving `text2text-generation` to new pipeline testing mechanism 13283 (Narsil)
- Add DINO conversion script 13265 (NielsRogge)
- Moving `text-generation` pipeline to new testing framework. 13285 (Narsil)
- Moving `token-classification` pipeline to new testing. 13286 (Narsil)
- examples: add keep_linebreaks option to CLM examples 13150 (stefan-it)
- Moving `translation` pipeline to new testing scheme. 13297 (Narsil)
- Fix BeitForMaskedImageModeling 13275 (NielsRogge)
- Moving `zero-shot-classification` pipeline to new testing. 13299 (Narsil)
- Fixing mbart50 with `return_tensors` argument too. 13301 (Narsil)
- [Flax] Correct all return tensors to numpy 13307 (patrickvonplaten)
- examples: only use keep_linebreaks when reading TXT files 13320 (stefan-it)
- Slow tests - run rag token in half precision 13304 (patrickvonplaten)
- [Slow tests] Disable Wav2Vec2 pretraining test for now 13303 (patrickvonplaten)
- Announcing the default model used by the pipeline (with a link). 13276 (Narsil)
- use float 16 in causal mask and masked bias 13194 (hwijeen)
- ✨ add citation file 13214 (flaxel)
- Improve documentation of pooler_output in ModelOutput 13228 (navjotts)
- fix: typo spelling grammar 13212 (slowy07)
- Check None before going through iteration 13250 (qqaatw)
- Use existing functionality for 13251 13333 (sgugger)
- neptune.ai logger: add ability to connect to a neptune.ai run 13319 (fcakyon)
- Update label2id in the model config for run_glue 13334 (sgugger)
- :bug: fix small model card bugs 13310 (nateraw)
- Fall back to `observed_batch_size` when the `dataloader` does not know the `batch_size`. 13188 (mbforbes)
- Fixes 12941 where use_auth_token not been set up early enough 13205 (bennimmo)
- Correct wrong function signatures on the docs website 13198 (qqaatw)
- Fix release utils 13337 (sgugger)
- Add missing module __spec__ 13321 (laurahanu)
- Use DS callable API to allow hf_scheduler + ds_optimizer 13216 (tjruwase)
- Tests fetcher tests 13340 (sgugger)
- [Testing] Add Flax Tests on GPU, Add Speech and Vision to Flax & TF tests 13313 (patrickvonplaten)
- Fixing a typo in the data_collator documentation 13309 (Serhiy-Shekhovtsov)
- Add GPT2ForTokenClassification 13290 (tucan9389)
- Doc mismatch fixed 13345 (Apoorvgarg-creator)
- Handle nested dict/lists of tensors as inputs in the Trainer 13338 (sgugger)
- [doc] correct TP implementation resources 13248 (stas00)
- Fix minor typo in parallelism doc 13289 (jaketae)
- Set missing seq_length variable when using inputs_embeds with ALBERT & Remove code duplication 13152 (olenmg)
- TF CLM example fix typo 13002 (Rocketknight1)
- Add generate kwargs to Seq2SeqTrainingArguments 13339 (sgugger)

4.9.2

Not secure
- Tpu tie weights 13030 (sgugger)
- ONNX fixes & examples: 13048, 13049, 13028, 13014, 12911 (mfuntowicz, michaelbenayoun, LysandreJik)
- Fix push_to_hub for TPUs 12895 (sgugger)

4.9.1

Not secure
- Fix barrier for SM distributed 12853 (sgugger)

4.9.0

Not secure
v4.9.0: TensorFlow examples, CANINE, tokenizer training, ONNX rework

ONNX rework

This version introduces a new package, `transformers.onnx`, which can be used to export models to ONNX. Unlike the previous implementation, this approach is designed as an easily extendable package where users may define their own ONNX configurations and export the models they wish to export.

```bash
python -m transformers.onnx --model=bert-base-cased onnx/bert-base-cased/

Validating ONNX model...
  -[✓] ONNX model outputs' name match reference model ({'pooler_output', 'last_hidden_state'})
  - Validating ONNX Model output "last_hidden_state":
    -[✓] (2, 8, 768) matches (2, 8, 768)
    -[✓] all values close (atol: 0.0001)
  - Validating ONNX Model output "pooler_output":
    -[✓] (2, 768) matches (2, 768)
    -[✓] all values close (atol: 0.0001)
All good, model saved at: onnx/bert-base-cased/model.onnx
```


- [RFC] Laying down building stone for more flexible ONNX export capabilities 11786 (mfuntowicz)

CANINE model

Four new models are released as part of the CANINE implementation: `CanineForSequenceClassification`, `CanineForMultipleChoice`, `CanineForTokenClassification` and `CanineForQuestionAnswering`, in PyTorch.

The CANINE model was proposed in [CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation](https://arxiv.org/abs/2103.06874) by Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting. It’s among the first papers that train a Transformer without using an explicit tokenization step (such as Byte Pair Encoding (BPE), WordPiece, or SentencePiece). Instead, the model is trained directly at a Unicode character level. Training at a character level inevitably comes with a longer sequence length, which CANINE solves with an efficient downsampling strategy, before applying a deep Transformer encoder.

- Add CANINE 12024 (NielsRogge)

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=canine
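
A hedged sketch of CANINE's character-level interface, assuming the `google/canine-s` checkpoint from the Hub filter above (the classification head is randomly initialized until fine-tuned):

```py
from transformers import CanineForSequenceClassification, CanineTokenizer

tokenizer = CanineTokenizer.from_pretrained("google/canine-s")
model = CanineForSequenceClassification.from_pretrained("google/canine-s", num_labels=2)

# No subword vocabulary: the tokenizer works directly on Unicode characters.
inputs = tokenizer(["Life is like a box of chocolates."], padding=True, return_tensors="pt")
logits = model(**inputs).logits  # shape: (batch_size, num_labels)
```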

Tokenizer training

This version introduces a new method to train a tokenizer from scratch based on the configuration of an existing tokenizer.

```py
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("wikitext", name="wikitext-2-raw-v1", split="train")

# We train on batches of 1,000 texts at a time.
batch_size = 1000
corpus = (dataset[i : i + batch_size]["text"] for i in range(0, len(dataset), batch_size))

tokenizer = AutoTokenizer.from_pretrained("gpt2")
new_tokenizer = tokenizer.train_new_from_iterator(corpus, vocab_size=20000)
```
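
Continuing from the snippet above, the retrained tokenizer behaves like any other fast tokenizer and can be saved and reloaded (the local directory name is arbitrary):

```py
new_tokenizer.save_pretrained("my-new-tokenizer")

reloaded = AutoTokenizer.from_pretrained("my-new-tokenizer")
print(reloaded("A sample sentence to encode.").input_ids)
```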


- Easily train a new fast tokenizer from a given one - tackle the special tokens format (str or AddedToken) 12420 (SaulLu)
- Easily train a new fast tokenizer from a given one 12361 (sgugger)

TensorFlow examples

The `TFTrainer` is now deprecated in favor of native Keras training. Version v4.9.0 also completes a long rework of the TensorFlow examples, making them more Keras-idiomatic, clearer, and more robust; a minimal Keras-style sketch follows the list below.

- NER example for Tensorflow 12469 (Rocketknight1)
- TF summarization example 12617 (Rocketknight1)
- Adding TF translation example 12667 (Rocketknight1)
- Deprecate TFTrainer 12706 (Rocketknight1)
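
A minimal sketch of the Keras-native workflow the reworked examples follow; the tiny in-memory dataset is purely illustrative, and real examples build proper `tf.data` pipelines:

```py
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

# Tiny in-memory dataset purely for illustration.
texts = ["a great movie", "a terrible movie"]
labels = np.array([1, 0])
features = dict(tokenizer(texts, padding=True, return_tensors="np"))

# Plain Keras compile/fit replaces the deprecated TFTrainer.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
model.fit(features, labels, epochs=1, batch_size=2)
```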

TensorFlow implementations

HuBERT is now implemented in TensorFlow:

- Add TFHubertModel 12206 (will-rice)

Breaking changes

When `load_best_model_at_end` was set to `True` in the `TrainingArguments`, a different `save_strategy` and `eval_strategy` was accepted, but the `save_strategy` was silently overwritten by the `eval_strategy` (tracking the best model requires an evaluation at every save). This caused a lot of confusion, with users not understanding why the script was not doing what it was told, so this situation now raises an error asking you to set `save_strategy` and `eval_strategy` to the same value; when that value is `"steps"`, `save_steps` must also be a round multiple of `eval_steps`.
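
A hedged sketch of settings that satisfy the new check, assuming the `evaluation_strategy` argument name used by `TrainingArguments` at the time:

```py
from transformers import TrainingArguments

# Both strategies match and save_steps is a round multiple of eval_steps,
# so load_best_model_at_end can always compare against a fresh evaluation.
args = TrainingArguments(
    output_dir="out",
    load_best_model_at_end=True,
    evaluation_strategy="steps",
    save_strategy="steps",
    eval_steps=500,
    save_steps=1000,
)
```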

General improvements and bugfixes

- UpdateDescription of TrainingArgs param save_strategy 12328 (sam-qordoba)
- [Deepspeed] new docs 12077 (stas00)
- [ray] try fixing import error 12338 (richardliaw)
- [examples/Flax] move the examples table up 12341 (patil-suraj)
- Fix torchscript tests 12336 (LysandreJik)
- Add flax/jax quickstart 12342 (marcvanzee)
- Fixed a typo in readme 12356 (MichalPitr)
- Fix exception in prediction loop occurring for certain batch sizes 12350 (jglaser)
- Add FlaxBigBird QuestionAnswering script 12233 (vasudevgupta7)
- Replace NotebookProgressReporter by ProgressReporter in Ray Tune run 12357 (krfricke)
- [examples] remove extra white space from log format 12360 (stas00)
- fixed multiplechoice tokenization 12362 (cronoik)
- [trainer] add main_process_first context manager 12351 (stas00)
- [Examples] Replicates the new --log_level feature to all trainer-based pytorch 12359 (bhadreshpsavani)
- [Examples] Update Example Template for `--log_level` feature 12365 (bhadreshpsavani)
- [Examples] Replace `print` statement with `logger.info` in QA example utils 12368 (bhadreshpsavani)
- Onnx export v2 fixes 12388 (LysandreJik)
- [Documentation] Warn that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers 12371 (ionicsolutions)
- Update run_mlm.py 12344 (TahaAslani)
- Add possibility to maintain full copies of files 12312 (sgugger)
- [CI] add dependency table sync verification 12364 (stas00)
- [Examples] Added context manager to datasets map 12367 (bhadreshpsavani)
- [Flax community event] Add more description to readme 12398 (patrickvonplaten)
- Remove the need for `einsum` in Albert's attention computation 12394 (mfuntowicz)
- [Flax] Adapt flax examples to include `push_to_hub` 12391 (patrickvonplaten)
- Tensorflow LM examples 12358 (Rocketknight1)
- [Deepspeed] match the trainer log level 12401 (stas00)
- [Flax] Add T5 pretraining script 12355 (patrickvonplaten)
- [models] respect dtype of the model when instantiating it 12316 (stas00)
- Rename detr targets to labels 12280 (NielsRogge)
- Add out of vocabulary error to ASR models 12288 (will-rice)
- Fix TFWav2Vec2 SpecAugment 12289 (will-rice)
- [example/flax] add summarization readme 12393 (patil-suraj)
- [Flax] Example scripts - correct weight decay 12409 (patrickvonplaten)
- fix ids_to_tokens naming error in tokenizer of deberta v2 12412 (hjptriplebee)
- Minor fixes in original RAG training script 12395 (shamanez)
- Added talks 12415 (suzana-ilic)
- [modelcard] fix 12422 (stas00)
- Add option to save on each training node 12421 (sgugger)
- Added to talks section 12433 (suzana-ilic)
- Fix default bool in argparser 12424 (sgugger)
- Add default bos_token and eos_token for tokenizer of deberta_v2 12429 (hjptriplebee)
- fix typo in mt5 configuration docstring 12432 (fcakyon)
- Add to talks section 12442 (suzana-ilic)
- [JAX/Flax readme] add philosophy doc 12419 (patil-suraj)
- [Flax] Add wav2vec2 12271 (patrickvonplaten)
- Add test for a WordLevel tokenizer model 12437 (SaulLu)
- [Flax community event] How to use hub during training 12447 (patrickvonplaten)
- [Wav2Vec2, Hubert] Fix ctc loss test 12458 (patrickvonplaten)
- Comment fast GPU TF tests 12452 (LysandreJik)
- Fix training_args.py barrier for torch_xla 12464 (jysohn23)
- Added talk details 12465 (suzana-ilic)
- Add TPU README 12463 (patrickvonplaten)
- Import check_inits handling of duplicate definitions. 12467 (Iwontbecreative)
- Validation split added: custom data files sgugger, patil-suraj 12407 (Souvic)
- Fixing bug with param count without embeddings 12461 (TevenLeScao)
- [roberta] fix lm_head.decoder.weight ignore_key handling 12446 (stas00)
- Rework notebooks and move them to the Notebooks repo 12471 (sgugger)
- fixed typo in flax-projects readme 12466 (mplemay)
- Fix TAPAS test uncovered by 12446 12480 (LysandreJik)
- Add guide on how to build demos for the Flax sprint 12468 (osanseviero)
- Add `Repository` import to the FLAX example script 12501 (LysandreJik)
- [examples/flax] clip style image-text training example 12491 (patil-suraj)
- [Flax] Fix wav2vec2 pretrain arguments 12498 (Wikidepia)
- [Flax] ViT training example 12300 (patil-suraj)
- Fix order of state and input in Flax Quickstart README 12510 (navjotts)
- [Flax] Dataset streaming example 12470 (patrickvonplaten)
- [Flax] Correct flax training scripts 12514 (patrickvonplaten)
- [Flax] Correct logging steps flax 12515 (patrickvonplaten)
- [Flax] Fix another bug in logging steps 12516 (patrickvonplaten)
- [Wav2Vec2] Flax - Adapt wav2vec2 script 12520 (patrickvonplaten)
- [Flax] Fix hybrid clip 12519 (patil-suraj)
- [RoFormer] Fix some issues 12397 (JunnYu)
- FlaxGPTNeo 12493 (patil-suraj)
- Updated README 12540 (suzana-ilic)
- Edit readme 12541 (SaulLu)
- implementing tflxmertmodel integration test 12497 (sadakmed)
- [Flax] Adapt examples to be able to use eval_steps and save_steps 12543 (patrickvonplaten)
- [examples/flax] add adafactor optimizer 12544 (patil-suraj)
- [Flax] Add FlaxMBart 12236 (stancld)
- Add a warning for broken ProphetNet fine-tuning 12511 (JetRunner)
- [trainer] add option to ignore keys for the train function too (11719) 12551 (shabie)
- MLM training fails with no validation file(same as 12406 for pytorch now) 12517 (Souvic)
- [Flax] Allow retraining from save checkpoint 12559 (patrickvonplaten)
- Adding prepare_decoder_input_ids_from_labels methods to all TF ConditionalGeneration models 12560 (Rocketknight1)
- Remove tf.roll wherever not needed 12512 (szutenberg)
- Double check for attribute num_examples 12562 (sgugger)
- [examples/hybrid_clip] fix loading clip vision model 12566 (patil-suraj)
- Remove logging of GPU count etc from run_t5_mlm_flax.py 12569 (ibraheem-moosa)
- raise exception when arguments to pipeline are incomplete 12548 (hwijeen)
- Init pickle 12567 (sgugger)
- Fix group_lengths for short datasets 12558 (sgugger)
- Don't stop at num_epochs when using IterableDataset 12561 (sgugger)
- Fixing the pipeline optimization by reindexing targets (V2) 12330 (Narsil)
- Fix MT5 init 12591 (sgugger)
- [model.from_pretrained] raise exception early on failed load 12574 (stas00)
- [doc] fix broken ref 12597 (stas00)
- Add Flax sprint project evaluation section 12592 (osanseviero)
- This will reduce "Already borrowed error": 12550 (Narsil)
- [Flax] Add flax marian 12595 (patrickvonplaten)
- [Flax] Fix cur step flax examples 12608 (patrickvonplaten)
- Simplify unk token 12582 (sgugger)
- Fix arg count for partial functions 12609 (sgugger)
- Pass `model_kwargs` when loading a model in `pipeline()` 12449 (aphedges)
- [Flax] Fix mt5 auto 12612 (patrickvonplaten)
- [Flax Marian] Add marian flax example 12614 (patrickvonplaten)
- [FLax] Fix marian docs 2 12615 (patrickvonplaten)
- [debugging utils] minor doc improvements 12525 (stas00)
- [doc] DP/PP/TP/etc parallelism 12524 (stas00)
- [doc] fix anchor 12620 (stas00)
- [Examples][Flax] added test file in summarization example 12630 (bhadreshpsavani)
- Point to the right file for hybrid CLIP 12599 (edugp)
- [flax]fix jax array type check 12638 (patil-suraj)
- Add tokenizer_file parameter to PreTrainedTokenizerFast docstring 12624 (lewisbails)
- Skip TestMarian_MT_EN 12649 (LysandreJik)
- The extended trainer tests should require torch 12650 (LysandreJik)
- Pickle auto models 12654 (sgugger)
- Pipeline should be agnostic 12656 (LysandreJik)
- Fix transfo xl integration test 12652 (LysandreJik)
- Remove SageMaker documentation 12657 (philschmid)
- Fixed docs 12646 (KickItLikeShika)
- fix typo in modeling_t5.py docstring 12640 (PhilipMay)
- Translate README.md to Simplified Chinese 12596 (JetRunner)
- Fix typo in README_zh-hans.md 12663 (JetRunner)
- Updates timeline for project evaluation 12660 (osanseviero)
- [WIP] Patch BigBird tokenization test 12653 (LysandreJik)
- `encode_plus()` shouldn't run for W2V2CTC 12655 (LysandreJik)
- Add ByT5 option to example run_t5_mlm_flax.py 12634 (mapmeld)
- Wrong model is used in example, should be character instead of subword model 12676 (jsteggink)
- [Blenderbot] Fix docs 12227 (patrickvonplaten)
- Add option to load a pretrained model with mismatched shapes 12664 (sgugger)
- Fix minor docstring typos. 12682 (qqaatw)
- [tokenizer.prepare_seq2seq_batch] change deprecation to be easily actionable 12669 (stas00)
- [Flax Generation] Correct inconsistencies PyTorch/Flax 12662 (patrickvonplaten)
- [Deepspeed] adapt multiple models, add zero_to_fp32 tests 12477 (stas00)
- Add timeout to CI. 12684 (LysandreJik)
- Fix Tensorflow Bart-like positional encoding 11897 (JunnYu)
- [Deepspeed] non-native optimizers are mostly ok with zero-offload 12690 (stas00)
- Fix multiple choice doc examples 12679 (sgugger)
- Provide mask_time_indices to `_mask_hidden_states` to avoid double masking 12692 (mfuntowicz)
- Update TF examples README 12703 (Rocketknight1)
- Fix uninitialized variables when `config.mask_feature_prob > 0` 12705 (mfuntowicz)
- Only test the files impacted by changes in the diff 12644 (sgugger)
- flax model parallel training 12590 (patil-suraj)
- [test] split test into 4 sub-tests to avoid timeout 12710 (stas00)
- [trainer] release tmp memory in checkpoint load 12718 (stas00)
- [Flax] Correct shift labels for seq2seq models in Flax 12720 (patrickvonplaten)
- Fix typo in Speech2TextForConditionalGeneration example 12716 (will-rice)
- Init adds its own files as impacted 12709 (sgugger)
- LXMERT integration test typo 12736 (LysandreJik)
- Fix AutoModel tests 12733 (LysandreJik)
- Skip test while the model is not available 12739 (LysandreJik)
- Skip test while the model is not available 12740 (LysandreJik)
- Translate README.md to Traditional Chinese 12701 (qqaatw)
- Fix MBart failing test 12737 (LysandreJik)
- Patch T5 device test 12742 (LysandreJik)
- Fix DETR integration test 12734 (LysandreJik)
- Fix led torchscript 12735 (LysandreJik)
- Remove framework mention 12731 (LysandreJik)
- [doc] parallelism: Which Strategy To Use When 12712 (stas00)
- [doc] performance: batch sizes 12725 (stas00)
- Replace specific tokenizer in log message by AutoTokenizer 12745 (SaulLu)
- [Wav2Vec2] Correctly pad mask indices for PreTraining 12748 (patrickvonplaten)
- [doc] testing: how to trigger a self-push workflow 12724 (stas00)
- add intel-tensorflow-avx512 to the candidates 12751 (zzhou612)
- [flax/model_parallel] fix typos 12757 (patil-suraj)
- Turn on eval mode when exporting to ONNX 12758 (mfuntowicz)
- Preserve `list` type of `additional_special_tokens` in `special_token_map` 12759 (SaulLu)
- [Wav2Vec2] Padded vectors should not allowed to be sampled 12764 (patrickvonplaten)
- Add tokenizers class mismatch detection between `cls` and checkpoint 12619 (europeanplaice)
- Fix push_to_hub docstring and make it appear in doc 12770 (sgugger)
- [ray] Fix `datasets_modules` ImportError with Ray Tune 12749 (Yard1)
- Longer timeout for slow tests 12779 (LysandreJik)
- Enforce eval and save strategies are compatible when --load_best_model_at_end 12786 (sgugger)
- [CIs] add troubleshooting docs 12791 (stas00)
- Fix Padded Batch Error 12282 12487 (will-rice)
- Flax MLM: Allow validation split when loading dataset from local file 12689 (fgaim)
- [Longformer] Correct longformer docs 12809 (patrickvonplaten)
- [CLIP/docs] add and fix examples 12810 (patil-suraj)
- [trainer] sanity checks for `save_steps=0|None` and `logging_steps=0` 12796 (stas00)
- Expose get_config() on ModelTesters 12812 (LysandreJik)
- Refactor slow sentencepiece tokenizers. 11716 (PhilipMay)
- Refer warmup_ratio when setting warmup_num_steps. 12818 (tsuchm)
- Add versioning system to fast tokenizer files 12713 (sgugger)
- Add _CHECKPOINT_FOR_DOC to all models 12811 (LysandreJik)
