Safetensors default
As of this release, `safetensors` is the default format used when saving, wherever applicable! To read more about safetensors and why it is safer than pickle-based formats such as `torch.save`, see the [safetensors repository](https://github.com/huggingface/safetensors).
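To make the safety argument concrete, here is a minimal standard-library sketch of why loading an untrusted pickle is risky. It does not use Accelerate or safetensors; the `Payload` class is a hypothetical name for illustration. The point is that a pickle stream can instruct the loader to call an arbitrary function, which a format that stores only tensor data and metadata cannot do.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle stream asks the loader to call eval()."""

    def __reduce__(self):
        # pickle records "call eval with this argument"; the call runs at load time.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading executes eval("6 * 7") instead of rebuilding a Payload instance.
loaded = pickle.loads(blob)
print(type(loaded).__name__, loaded)  # prints "int 42"
```

A harmless expression is used here, but the same mechanism can run any callable, which is why checkpoints from untrusted sources are better distributed as safetensors files.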
New Experiment Trackers
This release has two new experiment trackers, ClearML and DVCLive!
To use them, pass `clearml` or `dvclive` to `log_with` when creating the `Accelerator`. h/t to eugen-ajechiloae-clearml and dberenbaum
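A minimal sketch of what this could look like with the DVCLive tracker, assuming `accelerate` and `dvclive` are installed. The project name, the config values, and the helper `example_metrics` are placeholders invented for illustration; the import is kept inside the function so the snippet can be loaded even where Accelerate is not available.

```python
def example_metrics(step: int) -> dict:
    """Placeholder metrics; a real loop would log its actual loss here."""
    return {"train_loss": 1.0 / (step + 1)}


def run_with_tracker(tracker_name: str = "dvclive") -> None:
    # Imported lazily so this module loads without accelerate installed.
    from accelerate import Accelerator

    accelerator = Accelerator(log_with=tracker_name)
    # "example_project" and the config dict are placeholder values.
    accelerator.init_trackers("example_project", config={"lr": 1e-3, "epochs": 2})

    for step in range(3):
        accelerator.log(example_metrics(step), step=step)

    # Lets the tracker flush and close its run.
    accelerator.end_training()
```

Calling `run_with_tracker()` from a training script would start a run, log three steps, and close it.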
DeepSpeed
* Accelerate's DeepSpeed integration now supports NPU devices, h/t to statelesshz
* DeepSpeed can now be launched via `accelerate` on single-GPU setups
FSDP
FSDP was heavily refactored so that its interface is exactly the same as in every other scenario when using `accelerate`. You no longer need to call `accelerator.prepare()` twice!
Other useful enhancements
* We now try to disable P2P communications on consumer GPUs from the 3090 series onward, and raise an error when that is not possible. Without this, users were seeing timeouts and similar failures because NVIDIA dropped P2P support on these cards. When using `accelerate launch`, P2P is disabled automatically; if we detect that it is still enabled in a distributed setup using 3090s or newer, we raise an error.
* When calling `.gather()`, we now explicitly raise an error if the tensors are on different devices (for now this check applies only on CUDA)
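For scripts started without `accelerate launch`, the P2P behavior described above can be approximated by hand. The sketch below is an illustration under stated assumptions, not Accelerate's implementation: the marker list `AFFECTED_GPU_MARKERS` and both function names are hypothetical, while `NCCL_P2P_DISABLE` is the NCCL environment variable that turns off peer-to-peer transport.

```python
import os

# Assumed substrings of affected GPU names; adjust to your hardware.
AFFECTED_GPU_MARKERS = ("RTX 3090", "RTX 40")


def needs_p2p_disabled(device_names):
    """Return True if any device name contains one of the assumed markers."""
    return any(
        marker in name for name in device_names for marker in AFFECTED_GPU_MARKERS
    )


def p2p_safe_env(device_names, base_env=None):
    """Copy the environment, adding NCCL_P2P_DISABLE=1 for affected GPUs."""
    env = dict(os.environ if base_env is None else base_env)
    if needs_p2p_disabled(device_names):
        env["NCCL_P2P_DISABLE"] = "1"
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when spawning worker processes; the device names would typically come from something like `torch.cuda.get_device_name(i)`.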
Bug fixes
* Fixed a bug that caused dataloaders to not shuffle despite `shuffle=True` when using multiple GPUs and the new `SeedableRandomSampler`.
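To illustrate the idea behind a seedable sampler, here is a standard-library sketch. It is not Accelerate's `SeedableRandomSampler`; the class name `SeedableShuffleSampler` and the choice to derive each epoch's seed as `seed + epoch` are assumptions made for the example. It shows the two properties that matter on multiple GPUs: every process with the same seed draws the same permutation, and the permutation changes from one epoch to the next.

```python
import random


class SeedableShuffleSampler:
    """Hypothetical sampler that yields a reproducible permutation per epoch."""

    def __init__(self, num_samples, seed=0):
        self.num_samples = num_samples
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch):
        self.epoch = epoch

    def __iter__(self):
        # A fresh generator per epoch keeps all processes in lockstep.
        rng = random.Random(self.seed + self.epoch)
        indices = list(range(self.num_samples))
        rng.shuffle(indices)
        # Advance the epoch so the next pass uses a different permutation.
        self.epoch += 1
        yield from indices

    def __len__(self):
        return self.num_samples
```

Two samplers built with the same seed stand in for two processes: iterating each once gives identical index orders, and a second pass over either gives a different order.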
General Changelog
* Add logs offloading by SunMarc in https://github.com/huggingface/accelerate/pull/2075
* Add ClearML tracker by eugen-ajechiloae-clearml in https://github.com/huggingface/accelerate/pull/2034
* CRITICAL: fix failing ci by muellerzr in https://github.com/huggingface/accelerate/pull/2088
* Fix flag typo by kuza55 in https://github.com/huggingface/accelerate/pull/2090
* Fix batch sampler by muellerzr in https://github.com/huggingface/accelerate/pull/2097
* fixed ip address typo by Fluder-Paradyne in https://github.com/huggingface/accelerate/pull/2099
* Fix memory leak in fp8 causing OOM (and potentially 3x vRAM usage) by muellerzr in https://github.com/huggingface/accelerate/pull/2089
* fix warning when offload by SunMarc in https://github.com/huggingface/accelerate/pull/2105
* Always use SeedableRandomSampler by muellerzr in https://github.com/huggingface/accelerate/pull/2110
* Fix issue with tests by muellerzr in https://github.com/huggingface/accelerate/pull/2111
* Make SeedableRandomSampler the default always by muellerzr in https://github.com/huggingface/accelerate/pull/2117
* Use "and" instead of comma in Bibtex citation by qgallouedec in https://github.com/huggingface/accelerate/pull/2119
* Add explicit error if empty batch received by YuryYakhno in https://github.com/huggingface/accelerate/pull/2115
* Allow for ACCELERATE_SEED env var by muellerzr in https://github.com/huggingface/accelerate/pull/2126
* add DeepSpeed support for NPU by statelesshz in https://github.com/huggingface/accelerate/pull/2054
* Sync states for npu fsdp by jq460494839 in https://github.com/huggingface/accelerate/pull/2113
* Fix import error when torch>=2.0.1 and torch.distributed is disabled by natsukium in https://github.com/huggingface/accelerate/pull/2121
* Make safetensors the default by muellerzr in https://github.com/huggingface/accelerate/pull/2120
* Raise error when saving with param on meta device by SunMarc in https://github.com/huggingface/accelerate/pull/2132
* Leave native `save` as `False` by muellerzr in https://github.com/huggingface/accelerate/pull/2138
* fix retie_parameters by SunMarc in https://github.com/huggingface/accelerate/pull/2137
* Deal with shared memory scenarios by muellerzr in https://github.com/huggingface/accelerate/pull/2136
* specify config file path on README by kwonmha in https://github.com/huggingface/accelerate/pull/2140
* Fix safetensors contiguous by SunMarc in https://github.com/huggingface/accelerate/pull/2145
* Fix more tests by muellerzr in https://github.com/huggingface/accelerate/pull/2146
* [docs] fixed a couple of broken links by MKhalusova in https://github.com/huggingface/accelerate/pull/2147
* [docs] troubleshooting guide by MKhalusova in https://github.com/huggingface/accelerate/pull/2133
* [Docs] fix doc typos by kashif in https://github.com/huggingface/accelerate/pull/2150
* Add note about GradientState being in-sync with the dataloader by default by muellerzr in https://github.com/huggingface/accelerate/pull/2134
* Deprecated runner stuff by muellerzr in https://github.com/huggingface/accelerate/pull/2152
* Add examples to tests by muellerzr in https://github.com/huggingface/accelerate/pull/2131
* Disable pypi for merge workflows + fix trainer tests by muellerzr in https://github.com/huggingface/accelerate/pull/2153
* Adds dvclive tracker by dberenbaum in https://github.com/huggingface/accelerate/pull/2139
* check port availability only in main deepspeed/torchrun launcher by Jingru in https://github.com/huggingface/accelerate/pull/2078
* Do not attempt to pad nested tensors by frankier in https://github.com/huggingface/accelerate/pull/2041
* Add warning for problematic libraries by muellerzr in https://github.com/huggingface/accelerate/pull/2151
* Add ZeRO++ to DeepSpeed usage docs by SumanthRH in https://github.com/huggingface/accelerate/pull/2166
* Fix Megatron-LM Arguments Bug by yuanenming in https://github.com/huggingface/accelerate/pull/2168
* Fix non persistant buffer dispatch by SunMarc in https://github.com/huggingface/accelerate/pull/1941
* Updated torchrun instructions by TJ-Solergibert in https://github.com/huggingface/accelerate/pull/2096
* New CI Runners by muellerzr in https://github.com/huggingface/accelerate/pull/2087
* Revert "New CI Runners" by muellerzr in https://github.com/huggingface/accelerate/pull/2172
* [Working again] New CI by muellerzr in https://github.com/huggingface/accelerate/pull/2173
* fsdp refactoring by pacman100 in https://github.com/huggingface/accelerate/pull/2177
* Pin DVC by muellerzr in https://github.com/huggingface/accelerate/pull/2196
* Apply DVC warning to Accelerate by muellerzr in https://github.com/huggingface/accelerate/pull/2197
* Explicitly disable P2P using `launch`, and pick up in `state` if a user will face issues. by muellerzr in https://github.com/huggingface/accelerate/pull/2195
* Better error when device mismatches when calling gather() on CUDA by muellerzr in https://github.com/huggingface/accelerate/pull/2180
* unpins dvc by dberenbaum in https://github.com/huggingface/accelerate/pull/2200
* Assemble state dictionary for offloaded models by blbadger in https://github.com/huggingface/accelerate/pull/2156
* Allow deepspeed without distributed launcher by pacman100 in https://github.com/huggingface/accelerate/pull/2204
New Contributors
* eugen-ajechiloae-clearml made their first contribution in https://github.com/huggingface/accelerate/pull/2034
* kuza55 made their first contribution in https://github.com/huggingface/accelerate/pull/2090
* Fluder-Paradyne made their first contribution in https://github.com/huggingface/accelerate/pull/2099
* YuryYakhno made their first contribution in https://github.com/huggingface/accelerate/pull/2115
* jq460494839 made their first contribution in https://github.com/huggingface/accelerate/pull/2113
* kwonmha made their first contribution in https://github.com/huggingface/accelerate/pull/2140
* dberenbaum made their first contribution in https://github.com/huggingface/accelerate/pull/2139
* Jingru made their first contribution in https://github.com/huggingface/accelerate/pull/2078
* frankier made their first contribution in https://github.com/huggingface/accelerate/pull/2041
* yuanenming made their first contribution in https://github.com/huggingface/accelerate/pull/2168
* TJ-Solergibert made their first contribution in https://github.com/huggingface/accelerate/pull/2096
* blbadger made their first contribution in https://github.com/huggingface/accelerate/pull/2156
**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.24.1...v0.25.0