Deepspeed

Latest version: v0.16.2

0.3.15

Not secure
* [ZeRO-Infinity](https://www.microsoft.com/en-us/research/blog/zero-infinity-and-deepspeed-unlocking-unprecedented-model-scale-for-deep-learning-training/) release allowing nvme offload and more!
* Deprecated `cpu_offload` in the config JSON in favor of the new offload sections; see the [JSON docs](https://www.deepspeed.ai/docs/config-json/#zero-optimizations-for-fp16-training) for details and the config sketch after this list.
* Automatic external parameter registration, more details in the [ZeRO 3 docs](https://deepspeed.readthedocs.io/en/latest/zero3.html#registering-external-parameters).
* Several bug fixes for ZeRO stage 3
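
As a rough illustration of the `cpu_offload` deprecation, the sketch below moves the old flag into the newer `offload_optimizer`/`offload_param` sections described in the DeepSpeed JSON config docs. The stage, batch size, paths, and other values are placeholders, not settings taken from this release.

```python
# Minimal config sketch, assuming the offload_optimizer/offload_param
# sections from the DeepSpeed config docs; all concrete values are
# illustrative placeholders. The dict can be written to a JSON config
# file or passed to deepspeed.initialize.
ds_config = {
    "train_batch_size": 16,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 3,
        # Replaces the deprecated "cpu_offload": true flag.
        "offload_optimizer": {"device": "cpu"},
        # ZeRO-Infinity can also target NVMe, e.g.:
        # "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
    },
}
```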

0.3.14

Not secure
Notes to come

0.3.13

Not secure
Combined release notes since Jan 12th v0.3.10 release

* ZeRO 3 Offload (834)
  * More detailed notes to come

0.3.10

Not secure
Combined release notes since November 12th v0.3.1 release

* Various updates to torch.distributed initialization
  * New `deepspeed.init_distributed` API, 608, 645, 644 (see the sketch after this list)
  * Improved AzureML support for patching torch.distributed backend, 542
  * Simplify dist init and only init if needed 553
* Transformer kernel updates
  * Support for different hidden dimensions 559
  * Support arbitrary sequence-length 587
* Elastic training support (602)
  * NOTE: More details to come on this feature; it is currently still in initial piloting.
* Module replacement support 586
  * NOTE: Will be used more and documented in the short term to help automatically inject/replace deepspeed ops into client models.
* 528 removes dependencies psutil and cpufeature
* Various ZeRO 1 and 2 bug fixes and updates: 531, 532, 545, 548
* 543 makes checkpoints backwards compatible with the older deepspeed v0.2 version
* Add static_loss_scale support to unfused optimizer 546
* Bug fix for norm calculation in absence of model parallel group 551
* Switch CI from Azure Pipelines to GitHub Actions
* Deprecate client ability to disable gradient reduction 552
* Bug fix for tracking optimizer step in cpu-adam when loading checkpoint 564
* Improved support for Ampere architecture 572, 570, 577, 578, 591, 642
* Fix potential random layout inconsistency issues in sparse attention modules 534
* Supported customizing kwargs for lr_scheduler 584
* Support deepspeed.initialize with dict configuration instead of arg 632 (see the sketch after this list)
* Allow DeepSpeed models to be initialized with optimizer=None 469
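
A minimal sketch tying together the `deepspeed.init_distributed` API, dict-based configuration, and `optimizer=None` initialization mentioned above. The config values are placeholders, and older releases may expect the dict under a `config_params` keyword rather than `config`.

```python
import torch
import deepspeed

# Explicitly initialize the distributed backend (new API in this release);
# deepspeed.initialize will otherwise set it up on demand.
deepspeed.init_distributed(dist_backend="nccl")

model = torch.nn.Linear(512, 512)

# DeepSpeed config passed as a Python dict instead of a JSON file path.
# All values below are illustrative placeholders.
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# optimizer=None: DeepSpeed builds the optimizer from the "optimizer"
# section of the config.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
    optimizer=None,
)
```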


Special thanks to our contributors in this release
stas00, gcooper-isi, g-karthik, sxjscience, brettkoonce, carefree0910, Justin1904, harrydrippin

0.3.1

Not secure
Updates
* Efficient and robust compressed training through [progressive layer dropping](https://www.deepspeed.ai/news/2020/10/28/progressive-layer-dropping-news.html)
* JIT compilation of C++/CUDA extensions
* Python-only install support, ~10x faster install time
* PyPI hosted installation via `pip install deepspeed`
* Removed apex dependency
* Bug fixes for ZeRO-offload and CPU-Adam
* Transformer support for dynamic sequence length (424)
* Linear warmup+decay lr schedule (414) (see the config sketch after this list)
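
A hedged sketch of how the linear warmup+decay schedule might appear in a DeepSpeed config, assuming the `WarmupDecayLR` scheduler type and parameter names from the config documentation; the step counts and learning rates are placeholders.

```python
# Illustrative config fragment only; values are placeholders.
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 3e-4}},
    "scheduler": {
        # Assumed scheduler name: linearly warms the LR up to warmup_max_lr
        # over warmup_num_steps, then decays it over total_num_steps.
        "type": "WarmupDecayLR",
        "params": {
            "warmup_min_lr": 0.0,
            "warmup_max_lr": 3e-4,
            "warmup_num_steps": 1000,
            "total_num_steps": 10000,
        },
    },
}
```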

0.3.0

New features
* [DeepSpeed: Extreme-scale model training for everyone](linklink)
* [Powering 10x longer sequences and 6x faster execution through DeepSpeed Sparse Attention](https://www.deepspeed.ai/news/2020/09/08/sparse-attention-news.html)
* [Training a trillion parameters with pipeline parallelism](https://www.deepspeed.ai/news/2020/09/08/pipeline-parallelism.html)
* [Up to 5x less communication and 3.4x faster training through 1-bit Adam](https://www.deepspeed.ai/news/2020/09/08/onebit-adam-news.html)
* [10x bigger model training on a single GPU with ZeRO-Offload](https://www.deepspeed.ai/news/2020/09/08/ZeRO-Offload.html)

Software improvements
* Refactor codebase to make a cleaner distinction between ops/runtime/zero/etc.
* Conditional op builds
  * Not all users should have to spend time building transformer kernels if they don't want to use them.
  * To ensure DeepSpeed is portable across environments, some features require unique dependencies that not everyone will be able to or want to install.
* DeepSpeed launcher supports different backends in addition to pdsh, such as Open MPI and MVAPICH.
