DeepSpeed

Latest version: v0.16.2

0.9.3

Not secure
What's Changed
* Enable auto TP policy for llama model by jianan-gu in https://github.com/microsoft/DeepSpeed/pull/3170
* Allow users to use mis-matched CUDA versions by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3436
* Hybrid Engine Refactor and Llama Inference Support by cmikeh2 in https://github.com/microsoft/DeepSpeed/pull/3425
* add sharded checkpoint loading for AutoTP path to reduce the peak mem… by sywangyi in https://github.com/microsoft/DeepSpeed/pull/3102
* launcher/multinode_runner.py: mapping env variables by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/3372
* Update automatic-tensor-parallelism.md by sywangyi in https://github.com/microsoft/DeepSpeed/pull/3198
* Build: Update license in setup by PabloEmidio in https://github.com/microsoft/DeepSpeed/pull/3484
* Doc corrections by goodship1 in https://github.com/microsoft/DeepSpeed/pull/3435
* Fix spelling errors in comments and documents by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3486
* Fix spelling error in function GetMaxTokenLength() by luliyucoordinate in https://github.com/microsoft/DeepSpeed/pull/3482
* Fix a type error on bf16+Pipeline Parallelism by ys950902 in https://github.com/microsoft/DeepSpeed/pull/3441
* Fix spelling errors in DeepSpeed codebase by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3494
* fix spelling error with docs/index.md by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3443
* delete the line to keep user_zero_stages by MrZhengXin in https://github.com/microsoft/DeepSpeed/pull/3473
* Update Inference Engine checkpoint loading + meta tensor assertions by lekurile in https://github.com/microsoft/DeepSpeed/pull/2940
* Fix regression in sharded checkpoint loading in the AutoTP path caused by the deletion of qkv_copy(), and add a UT case for sharded checkpoint loading in AutoTP by sywangyi in https://github.com/microsoft/DeepSpeed/pull/3457
* Add snip_momentum structured pruning which supports higher sparse ratio by ftian1 in https://github.com/microsoft/DeepSpeed/pull/3300
* Update README.md by goodship1 in https://github.com/microsoft/DeepSpeed/pull/3504
* Hybrid Engine Fix Llama by lekurile in https://github.com/microsoft/DeepSpeed/pull/3505
* fix spelling error with deepspeed/runtime/ by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3509
* Skip autoTP if tp_size is 1 by molly-smith in https://github.com/microsoft/DeepSpeed/pull/3449
* Changing monitor loss to aggregate loss over gradient accumulation steps by jomayeri in https://github.com/microsoft/DeepSpeed/pull/3428
* change actions/checkout@v2 to v3 by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3526
* fix typo with docs/ by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3523
* Doc updates by goodship1 in https://github.com/microsoft/DeepSpeed/pull/3520
* Fix bug in Hybrid Engine by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3497
* Fix wrong passing of offload_optimizer_config to DeepSpeedZeRoOffload by mmhab in https://github.com/microsoft/DeepSpeed/pull/3420
* Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2 by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/2999
* share inflight registry between PartitionedParameterCoordinators by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/3462
* Syncing FusedAdam with new Apex features by jomayeri in https://github.com/microsoft/DeepSpeed/pull/3434
* fix typo in comments with deepspeed/ by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3537
* [ROCm] Hip headers fix by rraminen in https://github.com/microsoft/DeepSpeed/pull/3532
* [CPU] Support Intel CPU inference by delock in https://github.com/microsoft/DeepSpeed/pull/3041
* Clone tensors to avoid torch.save bloat by tjruwase in https://github.com/microsoft/DeepSpeed/pull/3348
* Fix attribute error when loading FusedAdamBuilder() by rraminen in https://github.com/microsoft/DeepSpeed/pull/3527
* fix typo by inkcherry in https://github.com/microsoft/DeepSpeed/pull/3559
* Fixing bf16 test by jomayeri in https://github.com/microsoft/DeepSpeed/pull/3551
* Fix Hybrid Engine for BLOOM by lekurile in https://github.com/microsoft/DeepSpeed/pull/3580
* Fix op_builder against PyTorch nightly by malfet in https://github.com/microsoft/DeepSpeed/pull/3596
* data efficiency bug fix, avoid invalid range step size by conglongli in https://github.com/microsoft/DeepSpeed/pull/3609
* DS init should not broadcast or move zero.Init models by tjruwase in https://github.com/microsoft/DeepSpeed/pull/3611
* Expose Consecutive Hysteresis to Users by Quentin-Anthony in https://github.com/microsoft/DeepSpeed/pull/3553
* Align InferenceEngine to store ms in _model_times by HolyFalafel in https://github.com/microsoft/DeepSpeed/pull/3501
* AISC launcher fixes by jeffra in https://github.com/microsoft/DeepSpeed/pull/3637
* stage3.py: do not scale if gradient_predivide_factor is 1.0 by guoyejun in https://github.com/microsoft/DeepSpeed/pull/3630
* Add Ascend NPU accelerator support by CurryRice233 in https://github.com/microsoft/DeepSpeed/pull/3595
* Skip tests on docs-only changes by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3651
* Update megatron.md by wjessup in https://github.com/microsoft/DeepSpeed/pull/3641
* Typo Correction by MicahZoltu in https://github.com/microsoft/DeepSpeed/pull/3621
* deepspeed/comm/comm.py: fix typo of warning message by guoyejun in https://github.com/microsoft/DeepSpeed/pull/3636
* Fix RuntimeError when using ZeRO Stage3 with mpu: #3564 by eggiter in https://github.com/microsoft/DeepSpeed/pull/3565
* Allow dict datatype for checkpoints (inference) by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3007
* fix typo with deepspeed/ by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3547
* flops_profiler: add option recompute_fwd_factor for the case of activation c… by guoyejun in https://github.com/microsoft/DeepSpeed/pull/3362
* fix typo deepspeed/runtime by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3663
* Refactor check_enabled root validator in DeepSpeedMonitorConfig by bgr8 in https://github.com/microsoft/DeepSpeed/pull/3616

New Contributors
* jianan-gu made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3170
* YizhouZ made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3372
* PabloEmidio made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3484
* luliyucoordinate made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3482
* ys950902 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3441
* MrZhengXin made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3473
* ftian1 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3300
* mmhab made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3420
* malfet made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3596
* HolyFalafel made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3501
* CurryRice233 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3595
* wjessup made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3641
* MicahZoltu made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3621
* eggiter made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3565
* bgr8 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3616

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.9.2...v0.9.3
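
The monitor-loss change in 0.9.3 (pull/3428) reports loss aggregated over gradient accumulation steps rather than per micro-batch. The snippet below is a minimal sketch of that idea in plain Python, not DeepSpeed's code: the function name `aggregate_monitor_loss`, the `gas` parameter, and the choice of a mean (rather than a sum) are all assumptions made for illustration.

```python
from typing import Iterable, List


def aggregate_monitor_loss(micro_losses: Iterable[float], gas: int) -> List[float]:
    """Return one averaged loss per full window of `gas` micro-batch losses.

    Hypothetical helper: `gas` stands for gradient accumulation steps, and
    averaging over the window is an assumption, not DeepSpeed's exact rule.
    """
    if gas < 1:
        raise ValueError("gas must be a positive integer")
    reported: List[float] = []
    window: List[float] = []
    for loss in micro_losses:
        window.append(loss)
        if len(window) == gas:
            # A full accumulation window has been seen: report its mean.
            reported.append(sum(window) / gas)
            window.clear()
    # Losses from a trailing, incomplete window are not reported.
    return reported
```

With `gas=2`, four micro-batch losses produce two reported values, one per optimizer step.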

0.9.2

Not secure
What's Changed
* MiCS implementation by zarzen in https://github.com/microsoft/DeepSpeed/pull/2964
* Fix formatting by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3343
* [ROCm] Hipify cooperative_groups headers by rraminen in https://github.com/microsoft/DeepSpeed/pull/3323
* Diffusers 0.15.0 bug fix by molly-smith in https://github.com/microsoft/DeepSpeed/pull/3345
* Print default values for DeepSpeed --help by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3347
* add bf16 cuda kernel support by dc3671 in https://github.com/microsoft/DeepSpeed/pull/3092
* README.md: Update MosaicML docs link by kobindra in https://github.com/microsoft/DeepSpeed/pull/3344
* hybrid_engine: check tuple size when fusing lora params by adammoody in https://github.com/microsoft/DeepSpeed/pull/3311
* fix mpich launcher issue in multi-node by sywangyi in https://github.com/microsoft/DeepSpeed/pull/3078
* Update DS-Chat issue template by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3368
* add deepspeed chat blog links, add tags by conglongli in https://github.com/microsoft/DeepSpeed/pull/3369
* Fix redundant shared_params in zero_to_fp32.py by ShijieZZZZ in https://github.com/microsoft/DeepSpeed/pull/3149
* fixing default communication_data_type for bfloat16_enabled and docs by clumsy in https://github.com/microsoft/DeepSpeed/pull/3370
* Auto TP Tutorial with T5 Example by molly-smith in https://github.com/microsoft/DeepSpeed/pull/2962
* stage_1_and_2.py: do gradient scale only for fp16 by guoyejun in https://github.com/microsoft/DeepSpeed/pull/3166
* Fix memory leak in zero2 contiguous gradients by hablb in https://github.com/microsoft/DeepSpeed/pull/3306
* remove megatron-lm, no longer pip installable by jeffra in https://github.com/microsoft/DeepSpeed/pull/3389
* Fix pipeline module evaluation when contiguous activation checkpoin… by hablb in https://github.com/microsoft/DeepSpeed/pull/3005
* doc updates by goodship1 in https://github.com/microsoft/DeepSpeed/pull/3415
* Save tensors in context of memory_efficient_linear by tohtana in https://github.com/microsoft/DeepSpeed/pull/3413
* Add HE support for the rest of model containers by RezaYazdaniAminabadi in https://github.com/microsoft/DeepSpeed/pull/3191
* Update PyTorch Lightning/DeepSpeed examples links by loadams in https://github.com/microsoft/DeepSpeed/pull/3424
* Fix `PipelineEngine.eval_batch` result by nrailgun in https://github.com/microsoft/DeepSpeed/pull/3316
* OPT Activation Function Hotfix by cmikeh2 in https://github.com/microsoft/DeepSpeed/pull/3400
* Add ZeRO 1 support to PP for BF16. by jomayeri in https://github.com/microsoft/DeepSpeed/pull/3399
* [zero_to_fp32] fix shared param recovery by stas00 in https://github.com/microsoft/DeepSpeed/pull/3407
* Adagrad support in ZeRO by jomayeri in https://github.com/microsoft/DeepSpeed/pull/3401
* Update 2020-09-09-sparse-attention.md by goodship1 in https://github.com/microsoft/DeepSpeed/pull/3432

New Contributors
* dc3671 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3092
* kobindra made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3344
* hablb made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3306
* nrailgun made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3316

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.9.1...v0.9.2

0.9.1

Not secure
What's Changed
* Update DS-Chat docs for v0.9.0 by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3216
* Update DeepSpeed-Chat docs with latest changes to scripts by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3219
* Nested zero.Init() and dynamically defined model class by tohtana in https://github.com/microsoft/DeepSpeed/pull/2989
* Update torch version check in building sparse_attn by loadams in https://github.com/microsoft/DeepSpeed/pull/3152
* Fix for Stable Diffusion by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3218
* [update] reference in cifar-10 by dtunai in https://github.com/microsoft/DeepSpeed/pull/3212
* [fp16/doc] correct initial_scale_power default value by stas00 in https://github.com/microsoft/DeepSpeed/pull/3275
* update link to PL docs by Borda in https://github.com/microsoft/DeepSpeed/pull/3237
* fix typo in autotuner.py by eltociear in https://github.com/microsoft/DeepSpeed/pull/3269
* improving int4 asymmetric quantization accuracy by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/3190
* Update install.sh by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3270
* Fix cupy install version detection by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3276
* [ROCm] temporary workaround till __double2half support enabled in HIP by bmedishe in https://github.com/microsoft/DeepSpeed/pull/3236
* Fix pydantic and autodoc_pydantic version to <2.0.0 until support is added. by loadams in https://github.com/microsoft/DeepSpeed/pull/3290
* Add contribution images to readme by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3282
* remove `torch.cuda.is_available()` check when compiling ops by jinzhen-lin in https://github.com/microsoft/DeepSpeed/pull/3085
* Update MI200 workflow to install apex with changes from pip by loadams in https://github.com/microsoft/DeepSpeed/pull/3294
* Add pre-compiling ops test by loadams in https://github.com/microsoft/DeepSpeed/pull/3277
* Update README.md by digger-yu in https://github.com/microsoft/DeepSpeed/pull/3315
* Update Dockerfile to use python 3.6 specifically by bobowwb in https://github.com/microsoft/DeepSpeed/pull/3298
* zero3 checkpoint frozen params by tjruwase in https://github.com/microsoft/DeepSpeed/pull/3205
* Fix for dist not being initialized when constructing main config by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3324
* Fix missing scale attributes for GPTJ by cmikeh2 in https://github.com/microsoft/DeepSpeed/pull/3256
* Explicitly check for OPT activation function by cmikeh2 in https://github.com/microsoft/DeepSpeed/pull/3278

New Contributors
* dtunai made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3212
* Borda made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3237
* digger-yu made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3270
* bmedishe made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3236
* jinzhen-lin made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3085
* bobowwb made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3298

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.9.0...v0.9.1

0.9.0

Not secure
New features
* 🚀 [DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat) 🚀

What's Changed
* [docs] add MCR-DL paper to readme/docs by Quentin-Anthony in https://github.com/microsoft/DeepSpeed/pull/3066
* Several fixes to unblock CI by loadams in https://github.com/microsoft/DeepSpeed/pull/3047
* Assert mp_size is factor of model dimensions by molly-smith in https://github.com/microsoft/DeepSpeed/pull/2891
* [CI] follow-up fixes by jeffra in https://github.com/microsoft/DeepSpeed/pull/3072
* fix return prev key and value, added strides to from_blob by mzusman in https://github.com/microsoft/DeepSpeed/pull/2828
* Remove bf16 from inference config dtype enum by molly-smith in https://github.com/microsoft/DeepSpeed/pull/3010
* Softmax Scheduling Cleanup by cmikeh2 in https://github.com/microsoft/DeepSpeed/pull/3046
* Fix nebula in save_16bit_model issue by FreyaRao in https://github.com/microsoft/DeepSpeed/pull/3023
* Allow lists by satpalsr in https://github.com/microsoft/DeepSpeed/pull/3042
* Goodbye Torch 1.8 by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3082
* Empty ZeRO3 partition cache by tjruwase in https://github.com/microsoft/DeepSpeed/pull/3060
* pre-commit check for torch.cuda in code by delock in https://github.com/microsoft/DeepSpeed/pull/2981
* Move cuda check into utils by loadams in https://github.com/microsoft/DeepSpeed/pull/3074
* update yapf version and style settings by jeffra in https://github.com/microsoft/DeepSpeed/pull/3098
* Fix comms benchmark import issues and support MPI/slurm launching by Quentin-Anthony in https://github.com/microsoft/DeepSpeed/pull/2932
* Disable Stage 1&2 CPUAdam pathways by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3097
* ♻️ replace deprecated functions for communication by mayank31398 in https://github.com/microsoft/DeepSpeed/pull/2995
* Make fp32 default communication data type by tjruwase in https://github.com/microsoft/DeepSpeed/pull/2970
* Update DeepSpeed copyright license to Apache 2.0 by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3111
* Add Full Apache License by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3119
* VL MoE Blog by yaozhewei in https://github.com/microsoft/DeepSpeed/pull/3120
* Update SD triton version in requirements-sd.txt by lekurile in https://github.com/microsoft/DeepSpeed/pull/3135
* Fix launch issue by tjruwase in https://github.com/microsoft/DeepSpeed/pull/3137
* Fix CI badges by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3138
* Optimize Softmax Kernel by molly-smith in https://github.com/microsoft/DeepSpeed/pull/3112
* Use generic O_DIRECT by tjruwase in https://github.com/microsoft/DeepSpeed/pull/3115
* Enable autoTP for bloom by sywangyi in https://github.com/microsoft/DeepSpeed/pull/3035
* [cleanup] remove `pass` calls where they aren't needed by stas00 in https://github.com/microsoft/DeepSpeed/pull/2826
* [ci] `nv-transformers-v100` - use the same torch version as transformers CI by stas00 in https://github.com/microsoft/DeepSpeed/pull/3096
* Fixes code and tests skipping/asserting incorrectly on torch 2+. by loadams in https://github.com/microsoft/DeepSpeed/pull/3136
* fix example symlink about DeepSpeed+AzureML by EeyoreLee in https://github.com/microsoft/DeepSpeed/pull/3127
* Remove Extra Bracket by VHellendoorn in https://github.com/microsoft/DeepSpeed/pull/3101
* Recover shared parameters by ShijieZZZZ in https://github.com/microsoft/DeepSpeed/pull/3033
* Fix for Diffusers 0.14.0 by molly-smith in https://github.com/microsoft/DeepSpeed/pull/3142
* Fix copyright check, add copyright replace script by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3141
* Update curriculum-learning.md by goodship1 in https://github.com/microsoft/DeepSpeed/pull/3031
* Remove benchmark code by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/3157
* fixing a bug in CPU Adam and Adagrad by xiexbing in https://github.com/microsoft/DeepSpeed/pull/3109
* op_builder: conditionally compute relative path for hip compiled files by adammoody in https://github.com/microsoft/DeepSpeed/pull/3095
* zero.Init() should pin params in GPU memory as requested by tjruwase in https://github.com/microsoft/DeepSpeed/pull/2953
* deepspeed/runtime/utils.py: reset_peak_memory_stats when empty cache by guoyejun in https://github.com/microsoft/DeepSpeed/pull/2803
* Add DeepSpeed-Chat Blogpost by awan-10 in https://github.com/microsoft/DeepSpeed/pull/3185
* [docs] add run command for 13b by awan-10 in https://github.com/microsoft/DeepSpeed/pull/3187
* add news item. by awan-10 in https://github.com/microsoft/DeepSpeed/pull/3188
* DeepSpeed Chat by tjruwase in https://github.com/microsoft/DeepSpeed/pull/3186
* Fix references to figures by tohtana in https://github.com/microsoft/DeepSpeed/pull/3189
* Fix typo by zhouzaida in https://github.com/microsoft/DeepSpeed/pull/3183
* Fix typo by dawei-wang in https://github.com/microsoft/DeepSpeed/pull/3164
* ChatGPT Chinese blog by yaozhewei in https://github.com/microsoft/DeepSpeed/pull/3193
* Add Japanese version of ChatGPT-like pipeline blog by tohtana in https://github.com/microsoft/DeepSpeed/pull/3194
* fix hero figure by conglongli in https://github.com/microsoft/DeepSpeed/pull/3199
* feat: Add support for `NamedTuple` when sharding parameters [#3029] by AlexanderVanEck in https://github.com/microsoft/DeepSpeed/pull/3037
* fix license badge by conglongli in https://github.com/microsoft/DeepSpeed/pull/3200
* Update AMD workflows by loadams in https://github.com/microsoft/DeepSpeed/pull/3179
* [CPU support] Optionally bind each rank to different cores on host by delock in https://github.com/microsoft/DeepSpeed/pull/2881

New Contributors
* mzusman made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2828
* FreyaRao made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3023
* sywangyi made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3035
* EeyoreLee made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3127
* VHellendoorn made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3101
* goodship1 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3031
* zhouzaida made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3183
* dawei-wang made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3164
* AlexanderVanEck made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3037

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.8.3...v0.9.0
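
One 0.9.0 entry (pull/2891) asserts that `mp_size` is a factor of the model dimensions. The sketch below shows what such a divisibility check could look like. It is an illustration only: the helper name `check_mp_size` and the two dimensions it inspects (`hidden_size`, `num_heads`) are assumptions, and the changelog does not say which dimensions DeepSpeed actually checks.

```python
def check_mp_size(mp_size: int, hidden_size: int, num_heads: int) -> None:
    """Raise ValueError unless every given dimension splits evenly over mp_size.

    Hypothetical helper; the checked dimensions are assumptions for illustration.
    """
    if mp_size < 1:
        raise ValueError("mp_size must be a positive integer")
    for name, dim in (("hidden_size", hidden_size), ("num_heads", num_heads)):
        # Each model-parallel rank needs an equal, whole-number share.
        if dim % mp_size != 0:
            raise ValueError(f"{name}={dim} is not divisible by mp_size={mp_size}")
```

Failing early with a named dimension is easier to debug than a shape mismatch that surfaces later inside a sharded layer.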

0.8.3

Not secure
What's Changed
* [deepspeed/autotuner] Bug fix for skipping mbs on gas by rahilbathwal5 in https://github.com/microsoft/DeepSpeed/pull/2171
* Fix issue between our abstract accelerator and colossalai's version of op_builder by jeffra in https://github.com/microsoft/DeepSpeed/pull/2963
* [zero] prevent poor configs from running w. zero-offload by jeffra in https://github.com/microsoft/DeepSpeed/pull/2971
* Fix Meta Tensor checkpoint load for OPT models by lekurile in https://github.com/microsoft/DeepSpeed/pull/2990
* ckpt: create directories in checkpoint_engine by adammoody in https://github.com/microsoft/DeepSpeed/pull/2988
* Fix buffer size for pipeline parallel and communication schedule by tohtana in https://github.com/microsoft/DeepSpeed/pull/2862
* [docs] add new paper to readme/docs by jeffra in https://github.com/microsoft/DeepSpeed/pull/3018
* fix language by stas00 in https://github.com/microsoft/DeepSpeed/pull/3019
* BF Optimizer Attribute Checks by jomayeri in https://github.com/microsoft/DeepSpeed/pull/3022
* [logger] implement `logger.warning_once` by stas00 in https://github.com/microsoft/DeepSpeed/pull/3021
* Convert model parameters from generator to list. by jomayeri in https://github.com/microsoft/DeepSpeed/pull/3017
* Improve loss overflow logs by Quentin-Anthony in https://github.com/microsoft/DeepSpeed/pull/3008
* Fix Broken Links by satpalsr in https://github.com/microsoft/DeepSpeed/pull/3048

New Contributors
* satpalsr made their first contribution in https://github.com/microsoft/DeepSpeed/pull/3048

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.8.2...v0.8.3
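
The 0.8.3 notes mention a `logger.warning_once` helper (pull/3021). A common way to build such a helper is to memoize a thin wrapper around `logging.Logger.warning`, so that repeated identical messages are emitted only once. The sketch below uses that approach with the standard library; the logger name `"example"` and the single-argument signature are assumptions, and this is not presented as DeepSpeed's implementation.

```python
import functools
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("example")


@functools.lru_cache(maxsize=None)
def warning_once(message: str) -> None:
    """Emit `message` at WARNING level the first time it is seen.

    lru_cache remembers earlier arguments, so a repeated call with the
    same message returns from the cache without logging again.
    """
    logger.warning(message)
```

Because the cache is keyed on the message text, two different messages are each logged once, while a message repeated inside a training loop appears only on its first occurrence.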

0.8.2

Not secure
What's Changed
* add auto-generated PR workflow by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/2822
* Fix typo in auto-sync workflow by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/2850
* Fix example command for building wheel with dev version specified. by loadams in https://github.com/microsoft/DeepSpeed/pull/2815
* Create tensor parallelism blog/tutorial by molly-smith in https://github.com/microsoft/DeepSpeed/pull/2766
* Data efficiency library update by conglongli in https://github.com/microsoft/DeepSpeed/pull/2866
* Make z3 respect comm dtype by tjruwase in https://github.com/microsoft/DeepSpeed/pull/2807
* Automatic Tensor Parallelism Blog Links by molly-smith in https://github.com/microsoft/DeepSpeed/pull/2877
* Check device count before running dist tests by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/2799
* AutoTP tutorial web formatting and news by molly-smith in https://github.com/microsoft/DeepSpeed/pull/2883
* Remove deprecated `torch._six` imports by yasyf in https://github.com/microsoft/DeepSpeed/pull/2863
* Reduce I/O size by tjruwase in https://github.com/microsoft/DeepSpeed/pull/2814
* add missing license info to top of all source code by jeffra in https://github.com/microsoft/DeepSpeed/pull/2889
* Enable tensor fragments for zero 2 & 3 by tjruwase in https://github.com/microsoft/DeepSpeed/pull/2727
* better eval sampler for val or test dataset by mayank31398 in https://github.com/microsoft/DeepSpeed/pull/2907
* using container when loading inference checkpoints by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/2875
* Fix CPUAdam for when `vendor_id_raw` is not provided by FarzanT in https://github.com/microsoft/DeepSpeed/pull/2836
* Fix Bloom logits mismatch by molly-smith in https://github.com/microsoft/DeepSpeed/pull/2851
* Fixes `AttributeError` in #2853 by saforem2 in https://github.com/microsoft/DeepSpeed/pull/2854
* Add MPICH Multinode Runner by inkcherry in https://github.com/microsoft/DeepSpeed/pull/2839
* TP unsupported models and assertions by molly-smith in https://github.com/microsoft/DeepSpeed/pull/2810
* AutoTP Assert Kernel Injection Support by molly-smith in https://github.com/microsoft/DeepSpeed/pull/2939
* Check for local CUDA graphs when enable_cuda_graph=True by lekurile in https://github.com/microsoft/DeepSpeed/pull/2941
* Improve overflow handling by tjruwase in https://github.com/microsoft/DeepSpeed/pull/2944
* [RFC] add device abstraction to allow other device than CUDA be used by delock in https://github.com/microsoft/DeepSpeed/pull/2221
* deepspeed.init_distributed() support for TCP protocols by noabauma in https://github.com/microsoft/DeepSpeed/pull/2905

New Contributors
* HeyangQin made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2799
* yasyf made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2863
* mayank31398 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2907
* FarzanT made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2836
* saforem2 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2854
* noabauma made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2905

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.8.1...v0.8.2
