DeepSpeed

Latest version: v0.16.2


0.14.1

What's Changed
* Update version.txt after 0.14.0 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5238
* Fp6 blog chinese by xiaoxiawu-microsoft in https://github.com/microsoft/DeepSpeed/pull/5239
* Add contributed HW support into README by delock in https://github.com/microsoft/DeepSpeed/pull/5240
* Set tp world size to 1 in ckpt load, if MPU is not provided by samadejacobs in https://github.com/microsoft/DeepSpeed/pull/5243
* Make op builder detection adapt to accelerator change by delock in https://github.com/microsoft/DeepSpeed/pull/5206
* Replace HIP_PLATFORM_HCC with HIP_PLATFORM_AMD by rraminen in https://github.com/microsoft/DeepSpeed/pull/5264
* Add CI for Habana Labs HPU/Gaudi2 by loadams in https://github.com/microsoft/DeepSpeed/pull/5244
* Fix attention mask handling in the Hybrid Engine Bloom flow by deepcharm in https://github.com/microsoft/DeepSpeed/pull/5101
* Skip 1Bit Compression and sparsegrad tests for HPU. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5270
* Enabled LMCorrectness inference tests on HPU. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5271
* Added HPU backend support for torch.compile tests. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5269
* Average only valid part of the ipg buffer. by BacharL in https://github.com/microsoft/DeepSpeed/pull/5268
* Add HPU accelerator support in unit tests. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5162
* Fix loading a universal checkpoint by tohtana in https://github.com/microsoft/DeepSpeed/pull/5263
* Add Habana Gaudi2 CI badge to the README by loadams in https://github.com/microsoft/DeepSpeed/pull/5286
* Add intel gaudi to contributed HW in README by BacharL in https://github.com/microsoft/DeepSpeed/pull/5300
* Fixed Accelerate Link by wkaisertexas in https://github.com/microsoft/DeepSpeed/pull/5314
* Enable mixtral 8x7b autotp by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5257
* support bf16_optimizer moe expert parallel training and moe EP grad_scale/grad_norm fix by inkcherry in https://github.com/microsoft/DeepSpeed/pull/5259
* fix comms dtype by mayank31398 in https://github.com/microsoft/DeepSpeed/pull/5297
* Modified regular expression by igeni in https://github.com/microsoft/DeepSpeed/pull/5306
* Docs typos fix and grammar suggestions by Gr0g0 in https://github.com/microsoft/DeepSpeed/pull/5322
* Added Gaudi2 CI tests. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5275
* Improve universal checkpoint by tohtana in https://github.com/microsoft/DeepSpeed/pull/5289
* Increase coverage for HPU by loadams in https://github.com/microsoft/DeepSpeed/pull/5324
* Add NFS path check for default deepspeed triton cache directory by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/5323
* Correct typo in checking on bf16 unit test support by loadams in https://github.com/microsoft/DeepSpeed/pull/5317
* Make NFS warning print only once by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/5345
* resolve KeyError: 'PDSH_SSH_ARGS_APPEND' by Lzhang-hub in https://github.com/microsoft/DeepSpeed/pull/5318
* BF16 optimizer: Clear lp grads after updating hp grads in hook by YangQun1 in https://github.com/microsoft/DeepSpeed/pull/5328
* Fix sort of zero checkpoint files by tohtana in https://github.com/microsoft/DeepSpeed/pull/5342
* Add `distributed_port` for `deepspeed.initialize` by LZHgrla in https://github.com/microsoft/DeepSpeed/pull/5260 (see the sketch after this list)
* [fix] fix typo s/simultanenously /simultaneously by digger-yu in https://github.com/microsoft/DeepSpeed/pull/5359
* Update container version for Gaudi2 CI by raza-sikander in https://github.com/microsoft/DeepSpeed/pull/5360
* compute global norm on device by BacharL in https://github.com/microsoft/DeepSpeed/pull/5125
* logger update with torch master changes by rogerxfeng8 in https://github.com/microsoft/DeepSpeed/pull/5346
* Ensure capacity does not exceed number of tokens by jeffra in https://github.com/microsoft/DeepSpeed/pull/5353
* Update workflows that use cu116 to cu117 by loadams in https://github.com/microsoft/DeepSpeed/pull/5361
* FP [6,8,12] quantizer op by jeffra in https://github.com/microsoft/DeepSpeed/pull/5336
* CPU SHM based inference_all_reduce improve by delock in https://github.com/microsoft/DeepSpeed/pull/5320
* Auto convert moe param groups by jeffra in https://github.com/microsoft/DeepSpeed/pull/5354
* Support MoE for pipeline models by mosheisland in https://github.com/microsoft/DeepSpeed/pull/5338
* Update pytest and transformers with fixes for pytest>= 8.0.0 by loadams in https://github.com/microsoft/DeepSpeed/pull/5164
* Increase CI coverage for Gaudi2 accelerator. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5358
* Add CI for Intel XPU/Max1100 by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5376
* Update path name on xpu-max1100.yml, add badge in README by loadams in https://github.com/microsoft/DeepSpeed/pull/5386
* Update checkout action on workflows on ubuntu 20.04 by loadams in https://github.com/microsoft/DeepSpeed/pull/5387
* Cleanup required_torch_version code and references. by loadams in https://github.com/microsoft/DeepSpeed/pull/5370
* Update README.md for intel XPU support by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5389
* Optimize the fp-dequantizer to get high memory-BW utilization by RezaYazdaniAminabadi in https://github.com/microsoft/DeepSpeed/pull/5373
* Removal of cuda hardcoded string with get_device function by raza-sikander in https://github.com/microsoft/DeepSpeed/pull/5351
* Add custom reshaping for universal checkpoint by tohtana in https://github.com/microsoft/DeepSpeed/pull/5390
* fix pageable h2d memcpy by GuanhuaWang in https://github.com/microsoft/DeepSpeed/pull/5301
* stage3: efficient compute of scaled_global_grad_norm by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5256
* Fix the FP6 kernels compilation problem on non-Ampere GPUs. by JamesTheZ in https://github.com/microsoft/DeepSpeed/pull/5333
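
PR 5260 in the list above adds a `distributed_port` keyword to `deepspeed.initialize`, so the rendezvous port can be chosen directly instead of only through the launcher or environment variables. Below is a minimal sketch of how it might be used; the model, config values, and port number are placeholders, not part of the PR.

```python
# Minimal sketch (placeholder model/config): choose the rendezvous port explicitly
# via the distributed_port keyword added to deepspeed.initialize in PR 5260.
import torch
import deepspeed

model = torch.nn.Linear(16, 16)
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-3}},
    "zero_optimization": {"stage": 2},
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
    distributed_port=29501,  # any free port works
)
```

Run under a normal launcher, e.g. `deepspeed --num_gpus=1 train.py`.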

New Contributors
* vshekhawat-hlab made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5270
* wkaisertexas made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5314
* igeni made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5306
* Gr0g0 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5322
* Lzhang-hub made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5318
* YangQun1 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5328
* raza-sikander made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5360
* rogerxfeng8 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5346
* JamesTheZ made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5333

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.14.0...v0.14.1

0.14.0

New Features
* [DeepSpeed-FP6: The Power of FP6-Centric Serving for Large Language Models.](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-fp6/03-05-2024)
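
FP6 serving is consumed through DeepSpeed-MII/FastGen rather than a new top-level DeepSpeed API. A rough sketch of what that looks like follows; the model name is a placeholder, and the `quantization_mode` keyword and its value are assumptions to verify against the linked blog.

```python
# Rough sketch (verify the exact keyword in the FP6 blog): serve a model with
# FP6 weight-only quantization through DeepSpeed-MII's non-persistent pipeline.
import mii

pipe = mii.pipeline(
    "NousResearch/Llama-2-70b-hf",   # placeholder model id
    quantization_mode="wf6af16",     # assumed name: FP6 weights, FP16 activations
)
outputs = pipe(["DeepSpeed is", "Seattle is"], max_new_tokens=64)
print(outputs)
```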


What's Changed
* Update version.txt after 0.13.5 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5229
* MOE gate fixes and enhancements by mosheisland in https://github.com/microsoft/DeepSpeed/pull/5156
* FP6 quantization end-to-end. by loadams in https://github.com/microsoft/DeepSpeed/pull/5234
* FP6 blog by loadams in https://github.com/microsoft/DeepSpeed/pull/5235


**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.13.5...v0.14.0

0.13.5

What's Changed
* Update version.txt after 0.13.4 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5196
* Fix assertion to run pipeline engine with a compiled module by tohtana in https://github.com/microsoft/DeepSpeed/pull/5197
* Allow specifying MII branch on MII CI by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5208
* [zero++] Synchronize at the end of secondary partitioning and simplify the logic by ByronHsu in https://github.com/microsoft/DeepSpeed/pull/5216
* Add fp16 support of Qwen1.5 models (0.5B to 72B) to DeepSpeed-FastGen by ZonePG in https://github.com/microsoft/DeepSpeed/pull/5219
* Rename nv-torch-latest-cpu workflow to cpu-torch-latest by loadams in https://github.com/microsoft/DeepSpeed/pull/5226
* Fix moe cpu offload by RezaYazdaniAminabadi in https://github.com/microsoft/DeepSpeed/pull/5220
* Use `deepspeed.comm` instead of `torch.distributed` by jinyouzhi in https://github.com/microsoft/DeepSpeed/pull/5225 (see the sketch after this list)
* fix fused_qkv model accuracy issue by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5217
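
PR 5225 in the list above replaces direct `torch.distributed` calls with `deepspeed.comm`, which mirrors the same collective API while staying accelerator-agnostic. A minimal sketch of the mirrored interface, assuming a distributed launch (the tensor and its size are placeholders):

```python
# Minimal sketch: deepspeed.comm exposes torch.distributed-style collectives
# (all_reduce, get_rank, get_world_size, ...) behind the active accelerator backend.
import torch
import deepspeed
import deepspeed.comm as dist
from deepspeed.accelerator import get_accelerator

deepspeed.init_distributed()            # picks the right backend for the accelerator
t = torch.ones(4, device=get_accelerator().current_device_name())
dist.all_reduce(t)                      # same call shape as torch.distributed.all_reduce
if dist.get_rank() == 0:
    print("world size:", dist.get_world_size(), "reduced:", t)
```

Run it with a distributed launcher, e.g. `deepspeed --num_gpus=2 demo.py`.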


**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.13.4...v0.13.5

0.13.4

What's Changed
* Update version.txt after v0.13.3 release by loadams in https://github.com/microsoft/DeepSpeed/pull/5185
* Fixes for `--extra-index-url` by loadams in https://github.com/microsoft/DeepSpeed/pull/5183
* allow debug/experimental compiler backends by tohtana in https://github.com/microsoft/DeepSpeed/pull/5191
* Disable ninja by default by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5194
* [CPUAdam] Update full_precision_optimizer_states in docstring by rohan-varma in https://github.com/microsoft/DeepSpeed/pull/5181
* Add script to check for `--extra-index-url` by loadams in https://github.com/microsoft/DeepSpeed/pull/5184

New Contributors
* rohan-varma made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5181

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.13.3...v0.13.4

0.13.3

What's Changed
* Update version.txt after 0.13.2 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5119
* Stop tracking backward chain of broadcast (ZeRO3) by tohtana in https://github.com/microsoft/DeepSpeed/pull/5113
* [NPU] ZeRO-Infinity feature compatibility by misstek in https://github.com/microsoft/DeepSpeed/pull/5077
* BF16 optimizer: Improve device utilization by immediate grad update by deepcharm in https://github.com/microsoft/DeepSpeed/pull/4975
* removed if condition in `if collate_fn is None` by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5107
* disable compile tests for torch<2.1 by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5121
* Update inference test model names by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5127
* Fix issue with zero-sized file after merging file on curriculum `map_reduce` by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5106
* Update return codes in PyTest to properly error out if tests fail by loadams in https://github.com/microsoft/DeepSpeed/pull/5122
* add missing methods to MPS_Accelerator by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5134
* Solve tensor vs numpy dtype conflicts in data efficiency map-reduce. by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5108
* Fix broadcast deadlock for incomplete batches in data sample for data analysis by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5117
* Avoid zero-sized microbatches for incomplete minibatches when doing curriculum learning by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5118
* remove mandatory `index` key from output of `metric_function` in `DataAnalysis` map operation by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5112
* tensorboard logging: avoid item() outside gas to improve performance by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5135
* Check overflow on device without host synchronization for each tensor by BacharL in https://github.com/microsoft/DeepSpeed/pull/5115
* Update nv-inference torch version by loadams in https://github.com/microsoft/DeepSpeed/pull/5128
* Method `run_map_reduce` to fix errors when running `run_map` followed by `run_reduce` by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5131
* Added missing `isinstance` check in PR 5112 by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5142
* Fix UserWarning: The torch.cuda.*DtypeTensor constructors are no long… by ShukantPal in https://github.com/microsoft/DeepSpeed/pull/5018
* TestEmptyParameterGroup: replace fusedAdam with torch.optim.AdamW by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5139
* Update deprecated HuggingFace function by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5144
* Pin to PyTest 8.0.0 by loadams in https://github.com/microsoft/DeepSpeed/pull/5163
* get_grad_norm_direct: fix a case of empty norm group by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5148
* Distributed in-memory map-reduce for data analyzer by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5129
* DeepSpeedZeroOptimizer_Stage3: remove cuda specific optimizer by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5138
* MOE: Fix save checkpoint when TP > 1 by mosheisland in https://github.com/microsoft/DeepSpeed/pull/5157
* Fix gradient clipping by tohtana in https://github.com/microsoft/DeepSpeed/pull/5150
* Use ninja to speed up build by jinzhen-lin in https://github.com/microsoft/DeepSpeed/pull/5088
* Update flops profiler to handle attn and __matmul__ by KimmiShi in https://github.com/microsoft/DeepSpeed/pull/4724 (see the sketch after this list)
* Fix allreduce for BF16 and ZeRO0 by tohtana in https://github.com/microsoft/DeepSpeed/pull/5170
* Write multiple items to output file at once, in distributed data analyzer. by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5169
* Fix typos in blogs/ by jinyouzhi in https://github.com/microsoft/DeepSpeed/pull/5172
* Inference V2 Human Eval by lekurile in https://github.com/microsoft/DeepSpeed/pull/4804
* Reduce ds_id name length by jomayeri in https://github.com/microsoft/DeepSpeed/pull/5176
* Switch cpu-inference workflow from --extra-index-url to --index-url by loadams in https://github.com/microsoft/DeepSpeed/pull/5182
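
PR 4724 in the list above teaches the flops profiler to count attention and `__matmul__` operations. A small sketch of invoking the profiler on an attention-bearing module; the model and input shape are placeholders.

```python
# Minimal sketch: measure FLOPs/MACs/params with DeepSpeed's flops profiler,
# which after PR 4724 also accounts for attention and __matmul__ ops.
import torch
from deepspeed.profiling.flops_profiler import get_model_profile

model = torch.nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
flops, macs, params = get_model_profile(
    model=model,
    input_shape=(1, 128, 256),   # (batch, seq_len, d_model); a dummy input is generated
    print_profile=True,          # print the per-module breakdown
    detailed=True,
)
print(flops, macs, params)
```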

New Contributors
* bm-synth made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5107
* KimmiShi made their first contribution in https://github.com/microsoft/DeepSpeed/pull/4724

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.13.2...v0.13.3

0.13.2

What's Changed
* Update version.txt after 0.13.1 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5002
* Support `exclude_frozen_parameters` for `save_16bit_model` by LZHgrla in https://github.com/microsoft/DeepSpeed/pull/4999 (see the sketch after this list)
* Allow nightly tests dispatch by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5014
* Enable hpz based on secondary tensor presence by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/4906
* Enable workflow dispatch on all workflows by loadams in https://github.com/microsoft/DeepSpeed/pull/5016
* [minor] improve code quality and readability by ByronHsu in https://github.com/microsoft/DeepSpeed/pull/5011
* Update falcon fused type order by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5007
* Fix error report of DSElasticAgent._set_master_addr_port() by RobinDong in https://github.com/microsoft/DeepSpeed/pull/4985
* DS 4993 662 : autotune single node hostfile bugfix by oushu1zhangxiangxuan1 in https://github.com/microsoft/DeepSpeed/pull/4996
* [minor] Improve logging for multiprocesses by ByronHsu in https://github.com/microsoft/DeepSpeed/pull/5004
* deepspeed/launcher: add launcher_helper as each rank's start portal by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/4699
* Graph capture support on HPU accelerators by deepcharm in https://github.com/microsoft/DeepSpeed/pull/5013
* launcher/launcher_helper.py: fix PMI name and add EnvironmentError by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/5025
* Remove MI100 badge from landing page by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5036
* Remove coverage reports from workflows and fix for inference CI by loadams in https://github.com/microsoft/DeepSpeed/pull/5028
* Remove Megatron-DeepSpeed CI workflow by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5038
* Fix P40 CI failures by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5037
* Fix for nightly torch CI by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5039
* Fix nv-accelerate and nv-torch-latest-v100. by loadams in https://github.com/microsoft/DeepSpeed/pull/5035
* update inference pages to point to FastGen by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5029
* launcher_helper: enable fds passing by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/5042
* Fix nv-torch-latest-cpu CI by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5045
* [NPU] Add NPU to support hybrid engine by CurryRice233 in https://github.com/microsoft/DeepSpeed/pull/4831
* MoE type hints by ringohoffman in https://github.com/microsoft/DeepSpeed/pull/5043
* [doc] update inference related docs from `mp_size` to `tensor_parallel` for TP by yundai424 in https://github.com/microsoft/DeepSpeed/pull/5048
* Fix broken model names in inference CI by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5053
* [NPU] Change log level to debug by CurryRice233 in https://github.com/microsoft/DeepSpeed/pull/5051
* Delay reduce-scatter for ZeRO3 leaf modules by tohtana in https://github.com/microsoft/DeepSpeed/pull/5008
* Optimize grad_norm calculations by reducing device/host dependency by nelyahu in https://github.com/microsoft/DeepSpeed/pull/4974
* load linear layer weight with given dtype by polisettyvarma in https://github.com/microsoft/DeepSpeed/pull/4044
* Update import for changes to latest diffusers by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5065
* adding hccl to init_distributed function description by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5034
* [Zero++ qgZ] Fall back to reduce_scatter if `tensor.numel() % (2 * global_world_size) != 0` by ByronHsu in https://github.com/microsoft/DeepSpeed/pull/5056
* Make batch size documentation clearer by segyges in https://github.com/microsoft/DeepSpeed/pull/5072
* [doc/1-line change] default stage3_param_persistence_threshold is wrong in the doc by ByronHsu in https://github.com/microsoft/DeepSpeed/pull/5073
* Further refactor deepspeed.moe.utils + deepspeed.moe.layer type hints by ringohoffman in https://github.com/microsoft/DeepSpeed/pull/5060
* Fix verification for ZeRO3 leaf module by tohtana in https://github.com/microsoft/DeepSpeed/pull/5074
* Stop tracking backward chain of broadcast in initialization by tohtana in https://github.com/microsoft/DeepSpeed/pull/5075
* Update torch version for nv-torch-latest-cpu by loadams in https://github.com/microsoft/DeepSpeed/pull/5086
* Add backwards compatibility w/ older versions of diffusers (<0.25.0) by lekurile in https://github.com/microsoft/DeepSpeed/pull/5083
* Enable torch.compile with ZeRO (Experimental) by tohtana in https://github.com/microsoft/DeepSpeed/pull/4878
* Update nv-accelerate to latest torch by loadams in https://github.com/microsoft/DeepSpeed/pull/5040
* HPU Accelerator: fix supported_dtypes API by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5094
* [NPU] replace 'cuda' with get_accelerator().device_name() by minchao-sun in https://github.com/microsoft/DeepSpeed/pull/5095
* optimize clip_grad_norm_ function by mmhab in https://github.com/microsoft/DeepSpeed/pull/4915
* [xs] fix ZEROPP convergence test by yundai424 in https://github.com/microsoft/DeepSpeed/pull/5061
* Switch hasattr check from compile to compiler by loadams in https://github.com/microsoft/DeepSpeed/pull/5096
* Split is_synchronized_device api to multiple apis by BacharL in https://github.com/microsoft/DeepSpeed/pull/5026
* 47% FastGen speedup for low workload - refactor allocator by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/5090
* Support `exclude_frozen_parameters` for `zero_to_fp32.py` script by andstor in https://github.com/microsoft/DeepSpeed/pull/4979 (see the sketch after this list)
* Fix alignment of optimizer states when loading by tohtana in https://github.com/microsoft/DeepSpeed/pull/5105
* Skip Triton import for AMD by lekurile in https://github.com/microsoft/DeepSpeed/pull/5110
* Add HIP conversion file outputs to .gitignore by lekurile in https://github.com/microsoft/DeepSpeed/pull/5111
* Remove optimizer step on initialization by tohtana in https://github.com/microsoft/DeepSpeed/pull/5104
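
PRs 4999 and 4979 in the list above both add an `exclude_frozen_parameters` switch for weight export: the first on the engine's 16-bit save path, the second on the offline ZeRO-to-fp32 converter. A minimal sketch of the engine path; the model, config, and paths are placeholders, and a distributed launch on fp16-capable hardware is assumed.

```python
# Minimal sketch (placeholder model/config/paths): save consolidated 16-bit weights
# while skipping frozen parameters, using the switch added in PR 4999.
import torch
import deepspeed

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.Linear(8, 2))
for p in model[0].parameters():
    p.requires_grad = False                      # freeze the first layer

engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=[p for p in model.parameters() if p.requires_grad],
    config={
        "train_batch_size": 4,
        "fp16": {"enabled": True},
        "optimizer": {"type": "AdamW", "params": {"lr": 1e-3}},
        "zero_optimization": {"stage": 2},
    },
)
engine.save_16bit_model("checkpoints/export", "pytorch_model.bin",
                        exclude_frozen_parameters=True)
```

The offline converter gained the matching flag, roughly `python zero_to_fp32.py <checkpoint_dir> <output_file> --exclude_frozen_parameters` (PR 4979).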

New Contributors
* ByronHsu made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5011
* RobinDong made their first contribution in https://github.com/microsoft/DeepSpeed/pull/4985
* oushu1zhangxiangxuan1 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/4996
* yundai424 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5048
* segyges made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5072
* andstor made their first contribution in https://github.com/microsoft/DeepSpeed/pull/4979

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.13.1...v0.13.2
