Accelerate


0.10.0

This release adds two major new features: the DeepSpeed integration has been revamped to match the one in the Transformers Trainer, unlocking multiple new options, and the TPU integration has been sped up.

This version also officially drops support for Python 3.6 and requires Python 3.7+.

DeepSpeed integration revamp

Users can now specify a DeepSpeed config file when they want to use DeepSpeed, which unlocks many new options. More details in the new [documentation](https://huggingface.co/docs/accelerate/deepspeed).

* Migrate HFDeepSpeedConfig from trfrs to accelerate by pacman100 in 432
* DeepSpeed Revamp by pacman100 in 405
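As a rough illustration of what such a config file can contain, here is a minimal ZeRO stage-2 sketch built as a Python dict. The field names follow DeepSpeed's JSON schema; the values are placeholders, not tuned recommendations.

```python
import json

# A minimal, illustrative DeepSpeed ZeRO stage-2 config. Field names
# follow DeepSpeed's JSON schema; the values are placeholders, not
# tuned recommendations.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu"},
    },
}

# Write it out so the file path can be supplied when configuring Accelerate.
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```

The `"auto"` values are resolved by the Hugging Face integration at runtime rather than by DeepSpeed itself.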

TPU speedup

If you're using TPUs we have sped up the dataloaders and models quite a bit, on top of a few bug fixes.

* Revamp TPU internals to be more efficient + enable mixed precision types by muellerzr in 441

What's new?

* Fix docstring by muellerzr in 447
* Add psutil as dependency by sgugger in 445
* fix fsdp torch version dependency by pacman100 in 437
* Create Gradient Accumulation Example by muellerzr in 431
* init by muellerzr in 429
* Introduce `no_sync` context wrapper + clean up some more warnings for DDP by muellerzr in 428
* updating tests to resolve runner failures wrt deepspeed revamp by pacman100 in 427
* Fix secrets in Docker workflow by muellerzr in 426
* Introduce a Dependency Checker to trigger new Docker Builds on main by muellerzr in 424
* Enable slow tests nightly by muellerzr in 421
* Push out python 3.6 + fix all tests related to the upgrade by muellerzr in 420
* Speedup main CI by muellerzr in 419
* Switch to evaluate for metrics by sgugger in 417
* Create an issue template for Accelerate by muellerzr in 415
* Introduce post-merge runners by muellerzr in 416
* Fix debug_launcher issues by muellerzr in 413
* Use main egg by muellerzr in 414
* Introduce nightly runners by muellerzr in 410
* Update requirements to pin tensorboard and include psutil by muellerzr in 408
* Fix CUDA examples tests by muellerzr in 407
* Move datasets and transformers to under func by muellerzr in 411
* Fix CUDA Dockerfile by muellerzr in 409
* Hotfix all failing GPU tests by muellerzr in 401
* improve metrics logged in examples by pacman100 in 399
* Refactor offload_state_dict and fix in offload_weight by sgugger in 398
* Refactor version checking into a utility by muellerzr in 395
* Include fastai in frameworks by muellerzr in 396
* Add packaging to requirements by muellerzr in 394
* Better dispatch for submodules by sgugger in 392
* Build Docker Images nightly by muellerzr in 391
* Small bugfix for the stalebot workflow by muellerzr in 390
* Introduce stalebot by muellerzr in 387
* Create Dockerfiles for Accelerate by muellerzr in 377
* Mix precision -> Mixed precision by muellerzr in 388
* Fix OneCycle step length when in multiprocess by muellerzr in 385

0.9.0

This release offers no significant new API; it exists mainly to provide access to some utilities needed by Transformers.

* Handle deprecation errors in launch by muellerzr in 360
* Update launchers.py by tmabraham in 363
* fix tracking by pacman100 in 361
* Remove tensor call by muellerzr in 365
* Add a utility for writing a barebones config file by muellerzr in 371
* fix deepspeed model saving by pacman100 in 370
* deepspeed save model temp fix by pacman100 in 374
* Refactor tests to use accelerate launch by muellerzr in 373
* fix zero stage-1 by pacman100 in 378
* fix shuffling for ShufflerIterDataPipe instances by loubnabnl in 376
* Better check for deepspeed availability by sgugger in 379
* Refactor some parts in utils by sgugger in 380

0.8.0

Big model inference

To handle very large models, new functionality has been added in Accelerate:
- a context manager to initialize empty models
- a function to load a sharded checkpoint directly on the right devices
- a set of custom hooks that allow execution of a model split on different devices, as well as CPU or disk offload
- a magic method that auto-determines a device map for a given model, maximizing use of available GPU memory and RAM before falling back to disk offload as a last resort
- a function that wraps the last three features in one simple call (`load_checkpoint_and_dispatch`)

See more in the [documentation](https://huggingface.co/docs/accelerate/main/en/big_modeling).

* Big model inference by sgugger in 345
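The device-map idea can be sketched in plain Python: a toy greedy allocator that assigns layers in order to the first device with room, spilling to CPU and then disk. This is only an illustration of the concept, not Accelerate's actual `infer_auto_device_map` implementation (the layer names, sizes, and budgets below are made up).

```python
# Toy greedy device-map allocator. Devices are tried fastest-first
# (GPUs, then "cpu", then "disk"); "disk" accepts everything as the
# last resort. Illustrative only, not the real Accelerate logic.
def infer_device_map(layer_sizes, max_memory):
    # layer_sizes: {layer_name: size}; max_memory: {device: capacity}
    remaining = dict(max_memory)
    device_map = {}
    for name, size in layer_sizes.items():
        for device, free in remaining.items():
            if size <= free or device == "disk":
                device_map[name] = device
                remaining[device] = free - size
                break
    return device_map

layers = {"embed": 4, "block.0": 6, "block.1": 6, "head": 4}
budget = {"cuda:0": 10, "cpu": 6, "disk": float("inf")}
print(infer_device_map(layers, budget))
```

With these toy numbers, the first two layers fill the GPU, the next spills to CPU, and the head ends up on disk.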

What's new

* Create peak_memory_uasge_tracker.py by pacman100 in 336
* Fixed a typo to enable running accelerate correctly by Idodox in 339
* Introduce multiprocess logger by muellerzr in 337
* Refactor utils into its own module by muellerzr in 340
* Improve num_processes question in CLI by muellerzr in 343
* Handle Manual Wrapping in FSDP. Minor fix of fsdp example. by pacman100 in 342
* Better prompt for number of training devices by muellerzr in 344
* Fix prompt for num_processes by pacman100 in 347
* Fix sample calculation in examples by muellerzr in 352
* Fixing metric eval in distributed setup by pacman100 in 355
* DeepSpeed and FSDP plugin support through script by pacman100 in 356

0.7.1

- Fix fdsp config in cluster [331](https://github.com/huggingface/accelerate/pull/331)
- Add guards for batch size finder [334](https://github.com/huggingface/accelerate/pull/334)
- Patchfix infinite loop [335](https://github.com/huggingface/accelerate/pull/335)

0.7.0

Logging API

Use any of your favorite logging libraries (TensorBoard, Wandb, CometML...) with just a few lines of code inside your training scripts with Accelerate. All details are in the [documentation](https://huggingface.co/docs/accelerate/tracking).

* Add logging capabilities by muellerzr in https://github.com/huggingface/accelerate/pull/293
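The pattern behind this API can be sketched with the standard library alone: a single `log()` call fans out to every configured tracker. This is a stdlib-only stand-in (the `StdoutTracker` class and this simplified `Accelerator` are made up for illustration), not the real Accelerate implementation, which wraps the TensorBoard, wandb, and CometML writers behind one interface.

```python
# Stdlib-only sketch of the unified-tracker idea: one log() call fans
# out to every configured backend. Both classes here are illustrative
# stand-ins, not Accelerate's real API.
class StdoutTracker:
    def log(self, values, step):
        print(f"step {step}: {values}")

class Accelerator:
    def __init__(self, trackers):
        self.trackers = trackers

    def log(self, values, step):
        # Fan the same metrics out to every registered tracker.
        for tracker in self.trackers:
            tracker.log(values, step)

accelerator = Accelerator(trackers=[StdoutTracker()])
accelerator.log({"loss": 0.42}, step=1)
```

Swapping or adding a backend then means registering another tracker object, with no change to the training loop.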

Support for FSDP (Fully Sharded Data Parallel)

PyTorch recently released a new model wrapper for sharded DDP training called [FSDP](https://pytorch.org/docs/stable/fsdp.html). This release adds support for it (note that it doesn't work with mixed precision yet). See all caveats in the [documentation](https://huggingface.co/docs/accelerate/fsdp).

* PyTorch FSDP Feature Incorporation by pacman100 in https://github.com/huggingface/accelerate/pull/321

Batch size finder

Say goodbye to the CUDA OOM errors with the new `find_executable_batch_size` decorator. Just decorate your training function and pick a starting batch size, then let Accelerate do the rest.

* Add a memory-aware decorator for CUDA OOM avoidance by muellerzr in https://github.com/huggingface/accelerate/pull/324

Examples revamp

The [Accelerate examples](https://github.com/huggingface/accelerate/tree/main/examples) are now split in two: in the base folder you will find very simple NLP and computer vision examples, as well as complete versions incorporating all the features. You can also browse the examples in the `by_feature` subfolder, which shows exactly what code to add for each given feature (checkpointing, tracking, cross-validation, etc.).

* Refactor Examples by Feature by muellerzr in https://github.com/huggingface/accelerate/pull/312

What's Changed
* Document save/load state by muellerzr in https://github.com/huggingface/accelerate/pull/290
* Refactor precisions to its own enum by muellerzr in https://github.com/huggingface/accelerate/pull/292
* Load model and optimizer states on CPU to avoid OOMs by sgugger in https://github.com/huggingface/accelerate/pull/299
* Fix example for datasets v2 by sgugger in https://github.com/huggingface/accelerate/pull/298
* Leave default as None in `mixed_precision` for launch command by sgugger in https://github.com/huggingface/accelerate/pull/300
* Pass `lr_scheduler` to `Accelerator.prepare` by sgugger in https://github.com/huggingface/accelerate/pull/301
* Create new TestCase classes and clean up W&B tests by muellerzr in https://github.com/huggingface/accelerate/pull/304
* Have custom trackers work with the API by muellerzr in https://github.com/huggingface/accelerate/pull/305
* Write tests for comet_ml by muellerzr in https://github.com/huggingface/accelerate/pull/306
* Fix training in DeepSpeed by sgugger in https://github.com/huggingface/accelerate/pull/308
* Update example scripts by muellerzr in https://github.com/huggingface/accelerate/pull/307
* Use --no_local_rank for DeepSpeed launch by sgugger in https://github.com/huggingface/accelerate/pull/309
* Fix Accelerate CLI CPU option + small fix for W&B tests by muellerzr in https://github.com/huggingface/accelerate/pull/311
* Fix DataLoader sharding for deepspeed in accelerate by m3rlin45 in https://github.com/huggingface/accelerate/pull/315
* Create a testing framework for example scripts and fix current ones by muellerzr in https://github.com/huggingface/accelerate/pull/313
* Refactor Tracker logic and write guards for logging_dir by muellerzr in https://github.com/huggingface/accelerate/pull/316
* Create Cross-Validation example by muellerzr in https://github.com/huggingface/accelerate/pull/317
* Create alias for Accelerator.free_memory by muellerzr in https://github.com/huggingface/accelerate/pull/318
* fix typo in docs of accelerate tracking by loubnabnl in https://github.com/huggingface/accelerate/pull/320
* Update examples to show how to deal with extra validation copies by muellerzr in https://github.com/huggingface/accelerate/pull/319
* Fixup all checkpointing examples by muellerzr in https://github.com/huggingface/accelerate/pull/323
* Introduce reduce operator by muellerzr in https://github.com/huggingface/accelerate/pull/326

New Contributors
* m3rlin45 made their first contribution in https://github.com/huggingface/accelerate/pull/315
* loubnabnl made their first contribution in https://github.com/huggingface/accelerate/pull/320
* pacman100 made their first contribution in https://github.com/huggingface/accelerate/pull/321

**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.6.0...v0.7.0

0.6.2

The launcher was ignoring the mixed precision attribute of the config since v0.6.0. This patch fixes that.
