Accelerate

Latest version: v1.1.1


Page 2 of 16

0.33.0

MUSA backend support and bugfixes

A small release this month, focused on added backend support and bug fixes:
* Support MUSA (Moore Threads GPU) backend in accelerate by fmo-mt in https://github.com/huggingface/accelerate/pull/2917
* Allow multiple processes per device by cifkao in https://github.com/huggingface/accelerate/pull/2916
* Add `torch.float8_e4m3fn` format `dtype_byte_size` by SunMarc in https://github.com/huggingface/accelerate/pull/2945
* Properly handle Params4bit in set_module_tensor_to_device by matthewdouglas in https://github.com/huggingface/accelerate/pull/2934
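
One plausible reading of the `dtype_byte_size` item above: a helper that infers a dtype's size from the bit width at the end of its name needs an explicit entry for names such as `torch.float8_e4m3fn`, which do not end in digits. The sketch below illustrates that idea on plain strings. It is not the library's code, and `SPECIAL_BYTE_SIZES` and `byte_size_from_name` are hypothetical names.

```python
import re

# Hypothetical table: dtype names whose size cannot be read off a trailing
# bit width. Values are sizes in bytes.
SPECIAL_BYTE_SIZES = {
    "torch.float8_e4m3fn": 1,
}


def byte_size_from_name(dtype_name: str) -> int:
    """Return the per-element size in bytes for a dtype given by name."""
    if dtype_name in SPECIAL_BYTE_SIZES:
        return SPECIAL_BYTE_SIZES[dtype_name]
    # Names such as "torch.float16" end in their bit width.
    match = re.search(r"[^\d](\d+)$", dtype_name)
    if match is None:
        raise ValueError(f"cannot infer size of {dtype_name!r}")
    return int(match.group(1)) // 8


print(byte_size_from_name("torch.float16"))        # 2
print(byte_size_from_name("torch.float8_e4m3fn"))  # 1
```

Without the explicit entry, the float8 name would fall through to the `ValueError` branch, because its name ends in `fn` rather than a number.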

What's Changed
* [tests] fix bug in torch_device by faaany in https://github.com/huggingface/accelerate/pull/2909
* Fix slowdown on init with `device_map="auto"` by muellerzr in https://github.com/huggingface/accelerate/pull/2914
* fix: bug where `multi_gpu` was being set and warning being printed even with `num_processes=1` by HarikrishnanBalagopal in https://github.com/huggingface/accelerate/pull/2921
* Better error when a bad directory is given for weight merging by muellerzr in https://github.com/huggingface/accelerate/pull/2852
* add xpu device check before moving tensor directly to xpu device by faaany in https://github.com/huggingface/accelerate/pull/2928
* Add huggingface_hub version to setup.py by nullquant in https://github.com/huggingface/accelerate/pull/2932
* Correct loading of models with shared tensors when using accelerator.load_state() by jkuntzer in https://github.com/huggingface/accelerate/pull/2875
* Hotfix PyTorch Version Installation in CI Workflow for Minimum Version Matrix by yhna940 in https://github.com/huggingface/accelerate/pull/2889
* Fix import test by muellerzr in https://github.com/huggingface/accelerate/pull/2931
* Consider pynvml available when installed through the nvidia-ml-py distribution by matthewdouglas in https://github.com/huggingface/accelerate/pull/2936
* Improve test reliability for Accelerator.free_memory() by matthewdouglas in https://github.com/huggingface/accelerate/pull/2935
* delete CCL env var setting by Liangliang-Ma in https://github.com/huggingface/accelerate/pull/2927
* feat(ci): add `pip` caching in CI by SauravMaheshkar in https://github.com/huggingface/accelerate/pull/2952
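
The `pynvml` item above turns on the difference between an import name and a distribution name: the importable module `pynvml` can be provided by more than one installable distribution. A minimal sketch of such a check, using only the standard library, might look like this. The helper name `is_available` is hypothetical and this is not the library's implementation.

```python
import importlib.metadata
import importlib.util


def is_available(import_name, distribution_names):
    """Return True if `import_name` can be imported and at least one of
    the listed distributions is installed."""
    if importlib.util.find_spec(import_name) is None:
        return False
    for dist in distribution_names:
        try:
            importlib.metadata.version(dist)
            return True
        except importlib.metadata.PackageNotFoundError:
            continue
    return False


# Accept the module whether it came from either distribution.
print(is_available("pynvml", ["pynvml", "nvidia-ml-py"]))
```

The printed value depends on what is installed in the current environment.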

New Contributors
* HarikrishnanBalagopal made their first contribution in https://github.com/huggingface/accelerate/pull/2921
* fmo-mt made their first contribution in https://github.com/huggingface/accelerate/pull/2917
* nullquant made their first contribution in https://github.com/huggingface/accelerate/pull/2932
* cifkao made their first contribution in https://github.com/huggingface/accelerate/pull/2916
* jkuntzer made their first contribution in https://github.com/huggingface/accelerate/pull/2875
* matthewdouglas made their first contribution in https://github.com/huggingface/accelerate/pull/2936
* Liangliang-Ma made their first contribution in https://github.com/huggingface/accelerate/pull/2927
* SauravMaheshkar made their first contribution in https://github.com/huggingface/accelerate/pull/2952

**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.32.1...v0.33.0

0.32.0

Core
* Utilize shard saving from the `huggingface_hub` rather than our own implementation (https://github.com/huggingface/accelerate/pull/2795)
* Refactor logging to use logger in `dispatch_model` (https://github.com/huggingface/accelerate/pull/2855)
* The `Accelerator.step` number is now restored when using `save_state` and `load_state` (https://github.com/huggingface/accelerate/pull/2765)
* A new profiler has been added, allowing users to collect performance metrics during model training and inference, including detailed analysis of execution time and memory consumption. The resulting traces can then be viewed in Chrome's tracing tool. Read more about it [here](https://huggingface.co/docs/accelerate/usage_guides/profiler) (https://github.com/huggingface/accelerate/pull/2883)
* Reduced import times for `import accelerate` and any other major core import by 68%; it should now take only slightly longer than `import torch` (https://github.com/huggingface/accelerate/pull/2845)
* Fixed a bug in `get_backend` and added a `clear_device_cache` utility (https://github.com/huggingface/accelerate/pull/2857)
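
To check an import-time claim like the one in the list above on your own machine, one rough approach is to time a fresh interpreter with and without the import. The sketch below uses only the standard library. The function names are hypothetical, and `json` is a stand-in module so the example runs anywhere.

```python
import subprocess
import sys
import time


def time_in_fresh_interpreter(statement: str) -> float:
    """Run `statement` in a new Python process and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", statement], check=True)
    return time.perf_counter() - start


def import_overhead(module: str) -> float:
    """Estimate import cost as (startup + import) minus bare startup."""
    baseline = time_in_fresh_interpreter("pass")
    with_import = time_in_fresh_interpreter(f"import {module}")
    return with_import - baseline


if __name__ == "__main__":
    # Swap in `torch` or `accelerate` where those packages are installed.
    print(f"import json: {import_overhead('json') * 1000:.1f} ms")
```

Single measurements are noisy, so repeat the call a few times and compare medians. For a per-module breakdown, Python's built-in `python -X importtime -c "import json"` prints the cumulative time of every nested import.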

Distributed Data Parallelism
* Introduce DDP communication hooks to have more flexibility in how gradients are communicated across workers, overriding the standard `allreduce`. (https://github.com/huggingface/accelerate/pull/2841)
* Make `log_line_prefix_template` optional in the `notebook_launcher` (https://github.com/huggingface/accelerate/pull/2888)

FSDP
* If the output directory doesn't exist when using `accelerate merge-weights`, one will be automatically created (https://github.com/huggingface/accelerate/pull/2854)
* When merging weights, the default is now `.safetensors` (https://github.com/huggingface/accelerate/pull/2853)

XPU
* Migrate to pytorch's native XPU backend on `torch>=2.4` (https://github.com/huggingface/accelerate/pull/2825)
* Add a `require_triton` test decorator and enable `test_dynamo` to work on XPU (https://github.com/huggingface/accelerate/pull/2878)
* Fixed `load_state_dict` not working on `xpu` and refined the XPU `safetensors` version check (https://github.com/huggingface/accelerate/pull/2879)

XLA
* Added support for XLA Dynamo backends for both training and inference (https://github.com/huggingface/accelerate/pull/2892)

Examples
* Added a new multi-cpu SLURM example using `accelerate launch` (https://github.com/huggingface/accelerate/pull/2902)

Full Changelog
* Use shard saving from huggingface_hub by SunMarc in https://github.com/huggingface/accelerate/pull/2795
* doc: fix link by imba-tjd in https://github.com/huggingface/accelerate/pull/2844
* Revert "Slight rename" by SunMarc in https://github.com/huggingface/accelerate/pull/2850
* remove warning hook added during dispatch_model by SunMarc in https://github.com/huggingface/accelerate/pull/2843
* Remove underlines between badges by novialriptide in https://github.com/huggingface/accelerate/pull/2851
* Auto create dir when merging FSDP weights by helloworld1 in https://github.com/huggingface/accelerate/pull/2854
* Add DDP Communication Hooks by yhna940 in https://github.com/huggingface/accelerate/pull/2841
* Refactor logging to use logger in `dispatch_model` by panjd123 in https://github.com/huggingface/accelerate/pull/2855
* xpu: support xpu backend from stock pytorch (>=2.4) by dvrogozh in https://github.com/huggingface/accelerate/pull/2825
* Drop torch re-imports in npu and mlu paths by dvrogozh in https://github.com/huggingface/accelerate/pull/2856
* Default FSDP weights merge to safetensors by helloworld1 in https://github.com/huggingface/accelerate/pull/2853
* [tests] fix bug in `test_tracking.ClearMLTest` by faaany in https://github.com/huggingface/accelerate/pull/2863
* [tests] use `torch_device` instead of `0` for device check by faaany in https://github.com/huggingface/accelerate/pull/2861
* [tests] skip bnb-related tests instead of failing on xpu by faaany in https://github.com/huggingface/accelerate/pull/2860
* Potentially fix tests by muellerzr in https://github.com/huggingface/accelerate/pull/2862
* [tests] enable XPU backend for `test_zero3_integration` by faaany in https://github.com/huggingface/accelerate/pull/2864
* Support saving and loading of step while saving and loading state by bipinKrishnan in https://github.com/huggingface/accelerate/pull/2765
* Add Profiler Support for Performance Analysis by yhna940 in https://github.com/huggingface/accelerate/pull/2883
* Speed up imports and add a CI by muellerzr in https://github.com/huggingface/accelerate/pull/2845
* Make `log_line_prefix_template` Optional in Elastic Launcher for Backward Compatibility by yhna940 in https://github.com/huggingface/accelerate/pull/2888
* Add XLA Dynamo backends for training and inference by johnsutor in https://github.com/huggingface/accelerate/pull/2892
* Added a MultiCPU SLURM example using Accelerate Launch and MPIRun by okhleif-IL in https://github.com/huggingface/accelerate/pull/2902
* make more cuda-only tests device-agnostic by faaany in https://github.com/huggingface/accelerate/pull/2876
* fix mlu device longTensor bugs by huismiling in https://github.com/huggingface/accelerate/pull/2887
* add `require_triton` and enable `test_dynamo` work on xpu by faaany in https://github.com/huggingface/accelerate/pull/2878
* fix `load_state_dict` for xpu and refine xpu safetensor version check by faaany in https://github.com/huggingface/accelerate/pull/2879
* Fix get_backend bug and add clear_device_cache function by NurmaU in https://github.com/huggingface/accelerate/pull/2857

New Contributors
* McPatate made their first contribution in https://github.com/huggingface/accelerate/pull/2836
* imba-tjd made their first contribution in https://github.com/huggingface/accelerate/pull/2844
* novialriptide made their first contribution in https://github.com/huggingface/accelerate/pull/2851
* panjd123 made their first contribution in https://github.com/huggingface/accelerate/pull/2855
* dvrogozh made their first contribution in https://github.com/huggingface/accelerate/pull/2825
* johnsutor made their first contribution in https://github.com/huggingface/accelerate/pull/2892
* okhleif-IL made their first contribution in https://github.com/huggingface/accelerate/pull/2902
* NurmaU made their first contribution in https://github.com/huggingface/accelerate/pull/2857

**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.31.0...v0.32.0

0.31.0

Core
* Set `timeout` default to PyTorch defaults based on backend by muellerzr in https://github.com/huggingface/accelerate/pull/2758
* fix duplicate elements in split_between_processes by hkunzhe in https://github.com/huggingface/accelerate/pull/2781
* Add Elastic Launch Support to `notebook_launcher` by yhna940 in https://github.com/huggingface/accelerate/pull/2788
* Fix Wrong use of sync_gradients used to implement sync_each_batch by fabianlim in https://github.com/huggingface/accelerate/pull/2790
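
The `split_between_processes` fix above concerns dividing a list of inputs across processes so that every element lands on exactly one of them. A minimal sketch of such a split follows, assuming contiguous shards with any leftover elements going to the lowest-ranked processes. The function name `split_for_process` is hypothetical and this is not the library's code.

```python
def split_for_process(items, num_processes, process_index):
    """Return the contiguous slice of `items` owned by `process_index`."""
    per_process, extras = divmod(len(items), num_processes)
    # The first `extras` processes each take one additional element.
    start = process_index * per_process + min(process_index, extras)
    end = start + per_process + (1 if process_index < extras else 0)
    return items[start:end]


prompts = ["a", "b", "c", "d", "e"]
shards = [split_for_process(prompts, 3, rank) for rank in range(3)]
print(shards)  # [['a', 'b'], ['c', 'd'], ['e']]
```

Concatenating the shards in rank order reproduces the original list, which is a quick way to confirm that nothing was duplicated or dropped.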

FSDP
* Introduce shard-merging util for FSDP by muellerzr in https://github.com/huggingface/accelerate/pull/2772
* Enable sharded state dict + offload to cpu resume by muellerzr in https://github.com/huggingface/accelerate/pull/2762
* Enable config for fsdp activation checkpointing by helloworld1 in https://github.com/huggingface/accelerate/pull/2779

Megatron
* Upgrade huggingface's megatron to nvidia's megatron when using MegatronLMPlugin by zhangsheng377 in https://github.com/huggingface/accelerate/pull/2501

What's Changed
* Add feature to allow redirecting std streams into log files when using torchrun as the launcher. by lyuwen in https://github.com/huggingface/accelerate/pull/2740
* Update modeling.py by adding try-catch section to skip the unavailable devices by MeVeryHandsome in https://github.com/huggingface/accelerate/pull/2681
* Fixed the problem of incorrect conditional judgment statement when configuring enable_cpu_affinity by statelesshz in https://github.com/huggingface/accelerate/pull/2748
* Fix stacklevel in `logging` to log the actual user call site (instead of the call site inside the logger wrapper) of log functions by luowyang in https://github.com/huggingface/accelerate/pull/2730
* LOMO / FIX: Support multiple optimizers by younesbelkada in https://github.com/huggingface/accelerate/pull/2745
* Fix max_memory assignment by SunMarc in https://github.com/huggingface/accelerate/pull/2751
* Fix duplicate environment variable check in multi-cpu condition by yhna940 in https://github.com/huggingface/accelerate/pull/2752
* Simplify CLI args validation and ensure CLI args take precedence over config file. by Iain-S in https://github.com/huggingface/accelerate/pull/2757
* Fix sagemaker config by muellerzr in https://github.com/huggingface/accelerate/pull/2753
* fix cpu omp num threads set by jiqing-feng in https://github.com/huggingface/accelerate/pull/2755
* Revert "Simplify CLI args validation and ensure CLI args take precedence over config file." by muellerzr in https://github.com/huggingface/accelerate/pull/2763
* Enable sharded cpu resume by muellerzr in https://github.com/huggingface/accelerate/pull/2762
* Sets default to PyTorch defaults based on backend by muellerzr in https://github.com/huggingface/accelerate/pull/2758
* optimize get_module_leaves speed by BBuf in https://github.com/huggingface/accelerate/pull/2756
* fix minor typo by TemryL in https://github.com/huggingface/accelerate/pull/2767
* Fix small edge case in get_module_leaves by SunMarc in https://github.com/huggingface/accelerate/pull/2774
* Skip deepspeed test by SunMarc in https://github.com/huggingface/accelerate/pull/2776
* Enable config for fsdp activation checkpointing by helloworld1 in https://github.com/huggingface/accelerate/pull/2779
* Add arg from CLI to fix failing test by muellerzr in https://github.com/huggingface/accelerate/pull/2783
* Skip tied weights disk offload test by SunMarc in https://github.com/huggingface/accelerate/pull/2782
* Introduce shard-merging util for FSDP by muellerzr in https://github.com/huggingface/accelerate/pull/2772
* FIX / FSDP : Guard fsdp utils for earlier PyTorch versions by younesbelkada in https://github.com/huggingface/accelerate/pull/2794
* Upgrade huggingface's megatron to nvidia's megatron when using MegatronLMPlugin by zhangsheng377 in https://github.com/huggingface/accelerate/pull/2501
* Fixup CLI test by muellerzr in https://github.com/huggingface/accelerate/pull/2796
* fix duplicate elements in split_between_processes by hkunzhe in https://github.com/huggingface/accelerate/pull/2781
* Add Elastic Launch Support to `notebook_launcher` by yhna940 in https://github.com/huggingface/accelerate/pull/2788
* Fix Wrong use of sync_gradients used to implement sync_each_batch by fabianlim in https://github.com/huggingface/accelerate/pull/2790
* Fix type in accelerator.py by qgallouedec in https://github.com/huggingface/accelerate/pull/2800
* fix comet ml test by SunMarc in https://github.com/huggingface/accelerate/pull/2804
* New template by muellerzr in https://github.com/huggingface/accelerate/pull/2808
* Fix access error for torch.mps when using torch==1.13.1 on macOS by SunMarc in https://github.com/huggingface/accelerate/pull/2806
* 4-bit quantization meta device bias loading bug by SunMarc in https://github.com/huggingface/accelerate/pull/2805
* State dictionary retrieval from offloaded modules by blbadger in https://github.com/huggingface/accelerate/pull/2619
* add cuda dep for a test by SunMarc in https://github.com/huggingface/accelerate/pull/2820
* Remove out-dated xpu device check code in `get_balanced_memory` by faaany in https://github.com/huggingface/accelerate/pull/2826
* Fix DeepSpeed config validation error by changing `stage3_prefetch_bucket_size` value to an integer by adk9 in https://github.com/huggingface/accelerate/pull/2814
* Improve test speeds by up to 30% in multi-gpu settings by muellerzr in https://github.com/huggingface/accelerate/pull/2830
* monitor-interval, take 2 by muellerzr in https://github.com/huggingface/accelerate/pull/2833
* Optimize the megatron plugin by zhangsheng377 in https://github.com/huggingface/accelerate/pull/2822
* fix fstr format by Jintao-Huang in https://github.com/huggingface/accelerate/pull/2810

New Contributors
* lyuwen made their first contribution in https://github.com/huggingface/accelerate/pull/2740
* MeVeryHandsome made their first contribution in https://github.com/huggingface/accelerate/pull/2681
* luowyang made their first contribution in https://github.com/huggingface/accelerate/pull/2730
* Iain-S made their first contribution in https://github.com/huggingface/accelerate/pull/2757
* BBuf made their first contribution in https://github.com/huggingface/accelerate/pull/2756
* TemryL made their first contribution in https://github.com/huggingface/accelerate/pull/2767
* helloworld1 made their first contribution in https://github.com/huggingface/accelerate/pull/2779
* hkunzhe made their first contribution in https://github.com/huggingface/accelerate/pull/2781
* adk9 made their first contribution in https://github.com/huggingface/accelerate/pull/2814
* Jintao-Huang made their first contribution in https://github.com/huggingface/accelerate/pull/2810

**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.30.1...v0.31.0

0.30.1

Patchfix
* Fix duplicate environment variable check in multi-cpu condition thanks to yhna940 in https://github.com/huggingface/accelerate/pull/2752
* Fix missing values in the SageMaker config that prevented launching, in https://github.com/huggingface/accelerate/pull/2753
* Fix CPU OMP num threads setting thanks to jiqing-feng in https://github.com/huggingface/accelerate/pull/2755
* Fix FSDP checkpoints being unable to resume when using offloading and sharded weights, caused by a CUDA OOM when loading the optimizer and model, in https://github.com/huggingface/accelerate/pull/2762
* Fix an incorrect conditional check when configuring `enable_cpu_affinity` thanks to statelesshz in https://github.com/huggingface/accelerate/pull/2748
* Fix stacklevel in logging to log the actual user call site (instead of the call site inside the logger wrapper) of log functions thanks to luowyang in https://github.com/huggingface/accelerate/pull/2730
* Fix support for multiple optimizers when using LOMO thanks to younesbelkada in https://github.com/huggingface/accelerate/pull/2745

**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.30.0...v0.30.1

0.30.0

Core
* We've simplified the `tqdm` wrapper to make it a full passthrough. Instead of `tqdm(main_process_only, *args)`, it is now just `tqdm(*args)`, and you can pass `main_process_only` as a keyword argument.
* We've added support for advanced optimizer usage:
  * Schedule free optimizer introduced by [Meta](https://github.com/facebookresearch/schedule_free/tree/main) by muellerzr in https://github.com/huggingface/accelerate/pull/2631
  * LOMO optimizer introduced by [OpenLMLab](https://github.com/OpenLMLab/LOMO) by younesbelkada in https://github.com/huggingface/accelerate/pull/2695
* Enable BF16 autocast to everything during FP8 and enable FSDP by muellerzr in https://github.com/huggingface/accelerate/pull/2655
* Support dataloader send_to_device calls to use non-blocking by drhead in https://github.com/huggingface/accelerate/pull/2685
* allow gather_for_metrics to be more flexible by SunMarc in https://github.com/huggingface/accelerate/pull/2710
* Add `cann` version info to command accelerate env for NPU by statelesshz in https://github.com/huggingface/accelerate/pull/2689
* Add MLU rng state setter by ArthurinRUC in https://github.com/huggingface/accelerate/pull/2664
* device agnostic testing for hooks&utils&big_modeling by statelesshz in https://github.com/huggingface/accelerate/pull/2602
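
The `tqdm` change at the top of this list amounts to a passthrough wrapper: positional arguments go straight to the underlying progress bar, and the main-process switch becomes a keyword. The sketch below shows that calling pattern with a stand-in backend so it runs without extra packages. `fake_progress_bar`, `progress`, and the explicit `process_index` parameter are hypothetical, and the keyword is named `main_process_only` after the argument in the old call form.

```python
def fake_progress_bar(iterable, **options):
    """Stand-in for a real progress-bar class; records how it was called."""
    return {"items": list(iterable), "options": options}


def progress(*args, main_process_only=True, process_index=0, **kwargs):
    """Forward everything to the backend, silencing non-main processes."""
    # Respect an explicit disable=... passed by the caller.
    disable = kwargs.pop("disable", False)
    if main_process_only and not disable:
        disable = process_index != 0
    return fake_progress_bar(*args, disable=disable, **kwargs)


bar = progress(range(3), desc="train", process_index=1)
print(bar["options"])  # {'disable': True, 'desc': 'train'}
```

Because the switch is keyword-only, existing calls that pass just an iterable keep working unchanged.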

Documentation
* Through collaboration between fabianlim (lead contributor), stas00, pacman100, and muellerzr, we have a new concept guide for FSDP and DeepSpeed that details how the two interoperate and explains clearly how each of them works. This was a monumental effort by fabianlim to make everything as accurate as possible for users. I highly recommend visiting this new documentation, available [here](https://huggingface.co/docs/accelerate/concept_guides/fsdp_and_deepspeed).
* New distributed inference examples have been added thanks to SunMarc in https://github.com/huggingface/accelerate/pull/2672
* Fixed some docs for using internal trackers by brentyi in https://github.com/huggingface/accelerate/pull/2650

DeepSpeed
* Accelerate can now handle MoE models when using deepspeed, thanks to pacman100 in https://github.com/huggingface/accelerate/pull/2662
* Allow "auto" for gradient clipping in YAML by regisss in https://github.com/huggingface/accelerate/pull/2649
* Introduce a `deepspeed`-specific Docker image by muellerzr in https://github.com/huggingface/accelerate/pull/2707. To use it, pull the `gpu-deepspeed` tag: `docker pull huggingface/accelerate:cuda-deepspeed-nightly`

Megatron
* Megatron plugin can support NPU by zhangsheng377 in https://github.com/huggingface/accelerate/pull/2667


Big Modeling
* Add strict arg to load_checkpoint_and_dispatch by SunMarc in https://github.com/huggingface/accelerate/pull/2641
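
As a rough illustration of what a `strict` flag usually means when loading a checkpoint, the sketch below compares the keys of a model's state with those of a checkpoint and raises only in strict mode. It uses plain dictionaries in place of tensors. `load_weights` is a hypothetical helper, not the signature of `load_checkpoint_and_dispatch`.

```python
def load_weights(model_state, checkpoint, strict=False):
    """Copy matching entries from `checkpoint` into `model_state`.

    Returns (missing, unexpected) key lists. With strict=True, any
    mismatch raises instead of being silently tolerated.
    """
    missing = sorted(set(model_state) - set(checkpoint))
    unexpected = sorted(set(checkpoint) - set(model_state))
    if strict and (missing or unexpected):
        raise KeyError(f"missing={missing}, unexpected={unexpected}")
    for name in model_state:
        if name in checkpoint:
            model_state[name] = checkpoint[name]
    return missing, unexpected


model = {"a": 0, "b": 0}
ckpt = {"a": 1, "c": 2}
print(load_weights(model, ckpt))  # (['b'], ['c'])
```

Calling `load_weights(model, ckpt, strict=True)` on the same inputs raises `KeyError`, which surfaces a mismatched checkpoint early instead of leaving some weights untouched.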

Bug Fixes
* Fix up state with xla + performance regression by muellerzr in https://github.com/huggingface/accelerate/pull/2634
* Parenthesis on xpu_available by muellerzr in https://github.com/huggingface/accelerate/pull/2639
* Fix `is_train_batch_min` type in DeepSpeedPlugin by yhna940 in https://github.com/huggingface/accelerate/pull/2646
* Fix backend check by jiqing-feng in https://github.com/huggingface/accelerate/pull/2652
* Fix the rng states of sampler's generator to be synchronized for correct sharding of dataset across GPUs by pacman100 in https://github.com/huggingface/accelerate/pull/2694
* Block AMP for MPS device by SunMarc in https://github.com/huggingface/accelerate/pull/2699
* Fixed issue when doing multi-gpu training with bnb when the first gpu is not used by SunMarc in https://github.com/huggingface/accelerate/pull/2714
* Fixup `free_memory` to deal with garbage collection by muellerzr in https://github.com/huggingface/accelerate/pull/2716
* Fix sampler serialization failing by SunMarc in https://github.com/huggingface/accelerate/pull/2723
* Fix deepspeed offload device type in the arguments to be more accurate by yhna940 in https://github.com/huggingface/accelerate/pull/2717

Full Changelog
* Schedule free optimizer support by muellerzr in https://github.com/huggingface/accelerate/pull/2631
* Fix up state with xla + performance regression by muellerzr in https://github.com/huggingface/accelerate/pull/2634
* Parenthesis on xpu_available by muellerzr in https://github.com/huggingface/accelerate/pull/2639
* add third-party device prefix to `execution_device` by faaany in https://github.com/huggingface/accelerate/pull/2612
* add strict arg to load_checkpoint_and_dispatch by SunMarc in https://github.com/huggingface/accelerate/pull/2641
* device agnostic testing for hooks&utils&big_modeling by statelesshz in https://github.com/huggingface/accelerate/pull/2602
* Docs fix for using internal trackers by brentyi in https://github.com/huggingface/accelerate/pull/2650
* Allow "auto" for gradient clipping in YAML by regisss in https://github.com/huggingface/accelerate/pull/2649
* Fix `is_train_batch_min` type in DeepSpeedPlugin by yhna940 in https://github.com/huggingface/accelerate/pull/2646
* Don't use deprecated `Repository` anymore by Wauplin in https://github.com/huggingface/accelerate/pull/2658
* Fix test_from_pretrained_low_cpu_mem_usage_measured failure by yuanwu2017 in https://github.com/huggingface/accelerate/pull/2644
* Add MLU rng state setter by ArthurinRUC in https://github.com/huggingface/accelerate/pull/2664
* fix backend check by jiqing-feng in https://github.com/huggingface/accelerate/pull/2652
* Megatron plugin can support NPU by zhangsheng377 in https://github.com/huggingface/accelerate/pull/2667
* Revert "fix backend check" by muellerzr in https://github.com/huggingface/accelerate/pull/2669
* `tqdm`: `*args` should come ahead of `main_process_only` by rb-synth in https://github.com/huggingface/accelerate/pull/2654
* Handle MoE models with DeepSpeed by pacman100 in https://github.com/huggingface/accelerate/pull/2662
* Fix deepspeed moe test with version check by pacman100 in https://github.com/huggingface/accelerate/pull/2677
* Pin DS...again.. by muellerzr in https://github.com/huggingface/accelerate/pull/2679
* fix backend check by jiqing-feng in https://github.com/huggingface/accelerate/pull/2670
* Deprecate tqdm args + slight logic tweaks by muellerzr in https://github.com/huggingface/accelerate/pull/2673
* Enable BF16 autocast to everything during FP8 + some tweaks to enable FSDP by muellerzr in https://github.com/huggingface/accelerate/pull/2655
* Fix the rng states of sampler's generator to be synchronized for correct sharding of dataset across GPUs by pacman100 in https://github.com/huggingface/accelerate/pull/2694
* Simplify test logic by pacman100 in https://github.com/huggingface/accelerate/pull/2697
* Add source code for DataLoader Animation by muellerzr in https://github.com/huggingface/accelerate/pull/2696
* Block AMP for MPS device by SunMarc in https://github.com/huggingface/accelerate/pull/2699
* Do a pip freeze during workflows by muellerzr in https://github.com/huggingface/accelerate/pull/2704
* add cann version info to command accelerate env by statelesshz in https://github.com/huggingface/accelerate/pull/2689
* Add version checks for the import of DeepSpeed moe utils by pacman100 in https://github.com/huggingface/accelerate/pull/2705
* Change dataloader send_to_device calls to non-blocking by drhead in https://github.com/huggingface/accelerate/pull/2685
* add distributed examples by SunMarc in https://github.com/huggingface/accelerate/pull/2672
* Add diffusers to req by muellerzr in https://github.com/huggingface/accelerate/pull/2711
* fix bnb multi gpu training by SunMarc in https://github.com/huggingface/accelerate/pull/2714
* allow gather_for_metrics to be more flexible by SunMarc in https://github.com/huggingface/accelerate/pull/2710
* Add Upcasting for FSDP in Mixed Precision. Add Concept Guide for FSDP and DeepSpeed. by fabianlim in https://github.com/huggingface/accelerate/pull/2674
* Segment out a deepspeed docker image by muellerzr in https://github.com/huggingface/accelerate/pull/2707
* Fixup `free_memory` to deal with garbage collection by muellerzr in https://github.com/huggingface/accelerate/pull/2716
* fix sampler serialization by SunMarc in https://github.com/huggingface/accelerate/pull/2723
* Fix sampler failing test by SunMarc in https://github.com/huggingface/accelerate/pull/2728
* Docs: Fix build main documentation by SunMarc in https://github.com/huggingface/accelerate/pull/2729
* Fix Documentation in FSDP and DeepSpeed Concept Guide by fabianlim in https://github.com/huggingface/accelerate/pull/2725
* Fix deepspeed offload device type by yhna940 in https://github.com/huggingface/accelerate/pull/2717
* FEAT: Add LOMO optimizer by younesbelkada in https://github.com/huggingface/accelerate/pull/2695
* Fix tests on main by muellerzr in https://github.com/huggingface/accelerate/pull/2739

New Contributors
* brentyi made their first contribution in https://github.com/huggingface/accelerate/pull/2650
* regisss made their first contribution in https://github.com/huggingface/accelerate/pull/2649
* yhna940 made their first contribution in https://github.com/huggingface/accelerate/pull/2646
* Wauplin made their first contribution in https://github.com/huggingface/accelerate/pull/2658
* ArthurinRUC made their first contribution in https://github.com/huggingface/accelerate/pull/2664
* jiqing-feng made their first contribution in https://github.com/huggingface/accelerate/pull/2652
* zhangsheng377 made their first contribution in https://github.com/huggingface/accelerate/pull/2667
* rb-synth made their first contribution in https://github.com/huggingface/accelerate/pull/2654
* drhead made their first contribution in https://github.com/huggingface/accelerate/pull/2685

**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.29.3...v0.30.0

0.29.3

* Fixes issue with backend refactor not working on CPU-based distributed environments by jiqing-feng: https://github.com/huggingface/accelerate/pull/2670
* Fixes issue where `load_checkpoint_and_dispatch` needs a `strict` argument by SunMarc: https://github.com/huggingface/accelerate/pull/2641

**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.29.2...v0.29.3
