Core
* Accelerate can now optimize NUMA affinity, which can help increase throughput on NVIDIA multi-GPU systems. To enable it, either follow the prompt during `accelerate config`, set the `ACCELERATE_CPU_AFFINITY=1` environment variable, or call `set_numa_affinity` manually:
```python
from accelerate.utils import set_numa_affinity

# For GPU 0
set_numa_affinity(0)
```
Big thanks to stas00 for the recommendation, request, and feedback during development.
* Allow for setting deterministic algorithms in `set_seed` by muellerzr in https://github.com/huggingface/accelerate/pull/2569
* Fixed the test script for TPU v2/v3 by vanbasten23 in https://github.com/huggingface/accelerate/pull/2542
* Cambricon MLU device support introduced by huismiling in https://github.com/huggingface/accelerate/pull/2552
* `PartialState` and `AcceleratorState` were heavily refactored to make future-proofing easier and to simplify adding new devices by muellerzr in https://github.com/huggingface/accelerate/pull/2576
* Fixed a reproducibility issue in distributed environments with Dataloader shuffling when using `BatchSamplerShard` by universuen in https://github.com/huggingface/accelerate/pull/2584
* `notebook_launcher` can now use multiple GPUs in Google Colab when running on a custom instance that provides them by StefanTodoran in https://github.com/huggingface/accelerate/pull/2561
Big Model Inference
* Add a log message for RTX 4000 series GPUs when performing multi-GPU inference with `device_map`, a setup that can hang, by SunMarc in https://github.com/huggingface/accelerate/pull/2557
* Fix `load_checkpoint_in_model` behavior when unexpected keys are in the checkpoint by fxmarty in https://github.com/huggingface/accelerate/pull/2588
DeepSpeed
* Fix an issue with the mapping of `main_process_ip` to `master_addr` when the DeepSpeed launcher is not `standard` by asdfry in https://github.com/huggingface/accelerate/pull/2495
* Improve DeepSpeed environment generation by checking for bad keys, by muellerzr and ricklamers in https://github.com/huggingface/accelerate/pull/2565
* We now support custom DeepSpeed env files. As with plain `deepspeed`, set the path with the `DS_ENV_FILE` environment variable by muellerzr in https://github.com/huggingface/accelerate/pull/2566
* Resolve ZeRO-3 Initialization Failure in already-started distributed environments by sword865 in https://github.com/huggingface/accelerate/pull/2578
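
The custom env file support above could be used roughly as follows. This is only a sketch: the file location and the variables written to it are hypothetical, and the one-`VAR=VALUE`-per-line layout is assumed to match DeepSpeed's usual `.deepspeed_env` convention.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical file location and variables, chosen only for illustration.
env_file = Path(tempfile.mkdtemp()) / "custom_deepspeed_env"

# Assumed format: one VAR=VALUE entry per line.
env_file.write_text("NCCL_DEBUG=INFO\nNCCL_SOCKET_IFNAME=eth0\n")

# Point DeepSpeed at the custom file. This must be set in the environment
# of whatever process starts the launcher.
os.environ["DS_ENV_FILE"] = str(env_file)
```

In practice the variable would more commonly be exported in the shell before running `accelerate launch`; setting it from Python only affects the current process and its children.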
What's Changed
* Fix test_script.py on TPU v2/v3 by vanbasten23 in https://github.com/huggingface/accelerate/pull/2542
* Add mapping `main_process_ip` and `master_addr` when not using standard as deepspeed launcher by asdfry in https://github.com/huggingface/accelerate/pull/2495
* split_between_processes for Dataset by geronimi73 in https://github.com/huggingface/accelerate/pull/2433
* Include working driver check by muellerzr in https://github.com/huggingface/accelerate/pull/2558
* 🚨🚨🚨Move to using tags rather than latest for docker images and consolidate image repos 🚨 🚨🚨 by muellerzr in https://github.com/huggingface/accelerate/pull/2554
* Add Cambricon MLU accelerator support by huismiling in https://github.com/huggingface/accelerate/pull/2552
* Add NUMA affinity control for NVIDIA GPUs by muellerzr in https://github.com/huggingface/accelerate/pull/2535
* Add log message for RTX 4000 series when performing multi-gpu inference with device_map by SunMarc in https://github.com/huggingface/accelerate/pull/2557
* Improve deepspeed env gen by muellerzr in https://github.com/huggingface/accelerate/pull/2565
* Allow for setting deterministic algorithms by muellerzr in https://github.com/huggingface/accelerate/pull/2569
* Unpin deepspeed by muellerzr in https://github.com/huggingface/accelerate/pull/2570
* Rm uv install by muellerzr in https://github.com/huggingface/accelerate/pull/2577
* Allow for custom deepspeed env files by muellerzr in https://github.com/huggingface/accelerate/pull/2566
* [docs] Missing functions from API by stevhliu in https://github.com/huggingface/accelerate/pull/2580
* Update data_loader.py to Ensure Reproducibility in Multi-Process Environments with Dataloader Shuffle by universuen in https://github.com/huggingface/accelerate/pull/2584
* Refactor affinity and make it stateful by muellerzr in https://github.com/huggingface/accelerate/pull/2579
* Refactor and improve model estimator tool by muellerzr in https://github.com/huggingface/accelerate/pull/2581
* Fix `load_checkpoint_in_model` behavior when unexpected keys are in the checkpoint by fxmarty in https://github.com/huggingface/accelerate/pull/2588
* Guard stateful objects by muellerzr in https://github.com/huggingface/accelerate/pull/2572
* Expound PartialState docstring by muellerzr in https://github.com/huggingface/accelerate/pull/2589
* [docs] Fix kwarg docstring by stevhliu in https://github.com/huggingface/accelerate/pull/2590
* Allow notebook_launcher to launch to multiple GPUs from Colab by StefanTodoran in https://github.com/huggingface/accelerate/pull/2561
* Fix warning log for unused checkpoint keys by fxmarty in https://github.com/huggingface/accelerate/pull/2594
* Resolve ZeRO-3 Initialization Failure in Pre-Set Torch Distributed Environments (huggingface/transformers#28803) by sword865 in https://github.com/huggingface/accelerate/pull/2578
* Refactor PartialState and AcceleratorState by muellerzr in https://github.com/huggingface/accelerate/pull/2576
* Allow for force unwrapping by muellerzr in https://github.com/huggingface/accelerate/pull/2595
* Pin hub for tests by muellerzr in https://github.com/huggingface/accelerate/pull/2608
* Default false for trust_remote_code by muellerzr in https://github.com/huggingface/accelerate/pull/2607
* fix llama example for pippy by SunMarc in https://github.com/huggingface/accelerate/pull/2616
* Fix links in Quick Tour by muellerzr in https://github.com/huggingface/accelerate/pull/2617
* Link to bash in env reporting by muellerzr in https://github.com/huggingface/accelerate/pull/2623
* Unpin hub by muellerzr in https://github.com/huggingface/accelerate/pull/2625
New Contributors
* asdfry made their first contribution in https://github.com/huggingface/accelerate/pull/2495
* geronimi73 made their first contribution in https://github.com/huggingface/accelerate/pull/2433
* huismiling made their first contribution in https://github.com/huggingface/accelerate/pull/2552
* universuen made their first contribution in https://github.com/huggingface/accelerate/pull/2584
* StefanTodoran made their first contribution in https://github.com/huggingface/accelerate/pull/2561
* sword865 made their first contribution in https://github.com/huggingface/accelerate/pull/2578
**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.28.0...v0.29.0