DeepSpeed

Latest version: v0.16.2


Page 2 of 17

0.15.1

Not secure

What's Changed
* Update version.txt after 0.15.0 release by loadams in https://github.com/microsoft/DeepSpeed/pull/6403
* Fix Type Mismatch by jomayeri in https://github.com/microsoft/DeepSpeed/pull/6410
* Fix redundant seq data parallel grp argument in Z3/MiCS by samadejacobs in https://github.com/microsoft/DeepSpeed/pull/5352
* add Huawei Ascend NPU setup guide by xuedinge233 in https://github.com/microsoft/DeepSpeed/pull/6445
* Add documentation for launcher without SSH by dogacancolak-kensho in https://github.com/microsoft/DeepSpeed/pull/6455
* Dtype support check for accelerator in UTs by raza-sikander in https://github.com/microsoft/DeepSpeed/pull/6360
* Store/Load CIFAR from local/offline by raza-sikander in https://github.com/microsoft/DeepSpeed/pull/6390
* Add the accelerator setup guide link in Getting Started page by rogerxfeng8 in https://github.com/microsoft/DeepSpeed/pull/6452
* Allow triton==3.0.x for fp_quantizer by siddartha-RE in https://github.com/microsoft/DeepSpeed/pull/6447
* Change GDS to 1 AIO thread by jomayeri in https://github.com/microsoft/DeepSpeed/pull/6459
* [CCL] fix condition issue in ccl.py by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/6443
* Avoid gds build errors on ROCm by rraminen in https://github.com/microsoft/DeepSpeed/pull/6456
* TestLowCpuMemUsage UT get device by device_name by raza-sikander in https://github.com/microsoft/DeepSpeed/pull/6397
* Add workflow to build DS without torch to better test before releases by loadams in https://github.com/microsoft/DeepSpeed/pull/6450
* Fix patch for parameter partitioning in zero.Init() by tohtana in https://github.com/microsoft/DeepSpeed/pull/6388
* Add default value to "checkpoint_folder" in "load_state_dict" of bf16_optimizer by ljcc0930 in https://github.com/microsoft/DeepSpeed/pull/6446
* DeepNVMe tutorial by tjruwase in https://github.com/microsoft/DeepSpeed/pull/6449
* bf16_optimizer: fixes to different grad acc dtype by nelyahu in https://github.com/microsoft/DeepSpeed/pull/6485
* print warning if actual triton cache dir is on NFS, not just for default by jrandall in https://github.com/microsoft/DeepSpeed/pull/6487
* DS_BUILD_OPS should build only compatible ops by tjruwase in https://github.com/microsoft/DeepSpeed/pull/6489
* Safe usage of popen by tjruwase in https://github.com/microsoft/DeepSpeed/pull/6490
* Handle an edge case where `CUDA_HOME` is not defined on ROCm systems by amorehead in https://github.com/microsoft/DeepSpeed/pull/6488

New Contributors
* xuedinge233 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6445
* siddartha-RE made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6447
* ljcc0930 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6446
* jrandall made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6487
* amorehead made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6488

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.15.0...v0.15.1

0.15.0

Not secure

What's Changed
* Update version.txt after 0.14.5 release by loadams in https://github.com/microsoft/DeepSpeed/pull/5982
* move pynvml install to setup.py by Rohan138 in https://github.com/microsoft/DeepSpeed/pull/5840
* add moe topk(k>2) gate support by inkcherry in https://github.com/microsoft/DeepSpeed/pull/5881
* Move inf_or_nan_tracker to cpu for cpu offload by BacharL in https://github.com/microsoft/DeepSpeed/pull/5826
* Enable dynamic shapes for pipeline parallel engine inputs by tohtana in https://github.com/microsoft/DeepSpeed/pull/5481
* Add and Remove ZeRO 3 Hooks by jomayeri in https://github.com/microsoft/DeepSpeed/pull/5658
* DeepNVMe GDS by jomayeri in https://github.com/microsoft/DeepSpeed/pull/5852
* Pin transformers version on nv-nightly by loadams in https://github.com/microsoft/DeepSpeed/pull/6002
* DeepSpeed on Windows blog by tjruwase in https://github.com/microsoft/DeepSpeed/pull/6364
* Bug Fix #5880 by jomayeri in https://github.com/microsoft/DeepSpeed/pull/6378
* Update linear.py compatible with torch 2.4.0 by terry-for-github in https://github.com/microsoft/DeepSpeed/pull/5811
* GDS Swapping Fix by jomayeri in https://github.com/microsoft/DeepSpeed/pull/6386
* Long sequence parallelism (Ulysses) integration with HuggingFace by samadejacobs in https://github.com/microsoft/DeepSpeed/pull/5774
* reduce cpu host overhead when using moe by ranzhejiang in https://github.com/microsoft/DeepSpeed/pull/5578
* fix fp16 Qwen2 series model to DeepSpeed-FastGen by ZonePG in https://github.com/microsoft/DeepSpeed/pull/6028
* Add Japanese translation of Windows support blog by tohtana in https://github.com/microsoft/DeepSpeed/pull/6394
* Correct op_builder path to xpu files for trigger XPU tests by loadams in https://github.com/microsoft/DeepSpeed/pull/6398
* add pip install cutlass version check by GuanhuaWang in https://github.com/microsoft/DeepSpeed/pull/6393
* [XPU] API align with new intel pytorch extension release by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/6395
* Pydantic v2 migration by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5167
* Fix torch check by loadams in https://github.com/microsoft/DeepSpeed/pull/6402

New Contributors
* Rohan138 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5840
* terry-for-github made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5811
* ranzhejiang made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5578

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.14.5...v0.15.0

0.14.5

Not secure

What's Changed
* Update version.txt after 0.14.4 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5694
* Fixed Windows inference build. by costin-eseanu in https://github.com/microsoft/DeepSpeed/pull/5609
* Fix memory leak from _hp_mapping by chiragjn in https://github.com/microsoft/DeepSpeed/pull/5643
* Bug fix for the "Link bit16 and fp32 parameters in partition" by U-rara in https://github.com/microsoft/DeepSpeed/pull/5681
* [CPU] add fp16 support to shm inference_all_reduce by delock in https://github.com/microsoft/DeepSpeed/pull/5669
* Universal checkpoint for zero stage 3 by xylian86 in https://github.com/microsoft/DeepSpeed/pull/5475
* inference unit test injectionPolicy split world_size to multiple tests by oelayan7 in https://github.com/microsoft/DeepSpeed/pull/5687
* ENV var added for recaching in INF Unit tests by raza-sikander in https://github.com/microsoft/DeepSpeed/pull/5688
* Disable nvtx decorator to avoid graph break by tohtana in https://github.com/microsoft/DeepSpeed/pull/5697
* Add an argument to enable the injection of missing state during the conversion of universal checkpoints by xylian86 in https://github.com/microsoft/DeepSpeed/pull/5608
* Change source of CPUAdam for xpu accelerator by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5703
* Add additional paths to trigger xpu tests by loadams in https://github.com/microsoft/DeepSpeed/pull/5707
* Update XPU docker version by loadams in https://github.com/microsoft/DeepSpeed/pull/5712
* update xpu fusedadam opbuilder for pytorch 2.3 by baodii in https://github.com/microsoft/DeepSpeed/pull/5702
* DeepSpeed Universal Checkpointing: Blog and Tutorial by samadejacobs in https://github.com/microsoft/DeepSpeed/pull/5711
* UCP Chinese Blog by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/5713
* Fix tutorial links by samadejacobs in https://github.com/microsoft/DeepSpeed/pull/5714
* Update node16 check on self-hosted runners and remove python 3.6 by loadams in https://github.com/microsoft/DeepSpeed/pull/5756
* fix the missing argument in test and typo by xylian86 in https://github.com/microsoft/DeepSpeed/pull/5730
* [INF] Enable torch compile for inference by oelayan7 in https://github.com/microsoft/DeepSpeed/pull/5612
* Update checkout action for nv-human-eval workflow by loadams in https://github.com/microsoft/DeepSpeed/pull/5757
* Add Windows scripts (deepspeed, ds_report). by costin-eseanu in https://github.com/microsoft/DeepSpeed/pull/5699
* Unit Test: Add error handling for rate limit exceeded in model list by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/5715
* Fix memory leak for pipelined optimizer swapper by mauryaavinash95 in https://github.com/microsoft/DeepSpeed/pull/5700
* Remove duplicated variable by xu-song in https://github.com/microsoft/DeepSpeed/pull/5727
* Fix phi3 mini 128k load error by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5765
* [CPU] Allow deepspeed.comm.inference_all_reduce in torch.compile graph by delock in https://github.com/microsoft/DeepSpeed/pull/5604
* Added wrappers for hpu tensors based on dtype by deepcharm in https://github.com/microsoft/DeepSpeed/pull/5771
* [bugfix] promote state in bf16_optimizer by billishyahao in https://github.com/microsoft/DeepSpeed/pull/5767
* Launcher mode with SSH bypass by dogacancolak-kensho in https://github.com/microsoft/DeepSpeed/pull/5728
* Update the list of supported models in the Chinese README of fastgen by beep-bebop in https://github.com/microsoft/DeepSpeed/pull/5773
* Add support for Microsoft Phi-3 model to DeepSpeed-FastGen by adk9 in https://github.com/microsoft/DeepSpeed/pull/5559
* Misplaced global variable `warned` by anferico in https://github.com/microsoft/DeepSpeed/pull/5725
* Fixes for latest Huggingface_hub changes on modelId -> id by loadams in https://github.com/microsoft/DeepSpeed/pull/5789
* reduce all-to-all communication volume when both expert and non-expert are tensor-parallel by taozhiwei in https://github.com/microsoft/DeepSpeed/pull/5626
* Update Ubuntu version for running python tests by loadams in https://github.com/microsoft/DeepSpeed/pull/5783
* fix: quantization with DeepSpeed HE by Atry in https://github.com/microsoft/DeepSpeed/pull/5624
* [INF] Add Qwen2RMSNorm to loaded layers in auto_tp by oelayan7 in https://github.com/microsoft/DeepSpeed/pull/5786
* Add chatglm2 & chatglm3 autotp by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5540
* Add new autotp supported model in doc by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5785
* Fix accuracy error of NPUFusedAdam by penn513 in https://github.com/microsoft/DeepSpeed/pull/5777
* Update torch version in cpu-torch-latest and nv-torch-latest-v100 tests to 2.4 by loadams in https://github.com/microsoft/DeepSpeed/pull/5797
* move is_checkpointable call reducing torch.compile Graph breaks by NirSonnenschein in https://github.com/microsoft/DeepSpeed/pull/5759
* Unpin transformers version by loadams in https://github.com/microsoft/DeepSpeed/pull/5650
* Update other workflows to run on Ubuntu 22.04 by loadams in https://github.com/microsoft/DeepSpeed/pull/5798
* [XPU] Use host time to replace xpu time when IPEX version is lower than 2.5. by ys950902 in https://github.com/microsoft/DeepSpeed/pull/5796
* Update MII tests to pull correct torchvision by loadams in https://github.com/microsoft/DeepSpeed/pull/5800
* Add fp8-fused gemm kernel by sfc-gh-reyazda in https://github.com/microsoft/DeepSpeed/pull/5764
* Add doc of compressed backend in Onebit optimizers by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5782
* fix: handle exception when loading cache file in test_inference.py by HeyangQin in https://github.com/microsoft/DeepSpeed/pull/5802
* Pin transformers version for MII tests by loadams in https://github.com/microsoft/DeepSpeed/pull/5807
* Fix op_builder for CUDA 12.5 by keshavkowshik in https://github.com/microsoft/DeepSpeed/pull/5806
* Find ROCm on Fedora by trixirt in https://github.com/microsoft/DeepSpeed/pull/5705
* Fix CPU Adam JIT compilation by lekurile in https://github.com/microsoft/DeepSpeed/pull/5780
* GDS AIO Blog by jomayeri in https://github.com/microsoft/DeepSpeed/pull/5817
* [ROCm] Get rocm version from /opt/rocm/.info/version by rraminen in https://github.com/microsoft/DeepSpeed/pull/5815
* sequence parallel with communication overlap by inkcherry in https://github.com/microsoft/DeepSpeed/pull/5691
* Update to ROCm6 by loadams in https://github.com/microsoft/DeepSpeed/pull/5491
* Add fp16 support of Qwen1.5MoE models (A2.7B) to DeepSpeed-FastGen by ZonePG in https://github.com/microsoft/DeepSpeed/pull/5403
* Use accelerator to replace cuda in setup and runner by Andy666G in https://github.com/microsoft/DeepSpeed/pull/5769
* Link GDS blog to site by tjruwase in https://github.com/microsoft/DeepSpeed/pull/5820
* Non-reentrant checkpointing hook fix by ic-synth in https://github.com/microsoft/DeepSpeed/pull/5781
* Fix NV references by tjruwase in https://github.com/microsoft/DeepSpeed/pull/5821
* Fix docs building guide by tjruwase in https://github.com/microsoft/DeepSpeed/pull/5825
* Update clang-format version from 16 to 18. by loadams in https://github.com/microsoft/DeepSpeed/pull/5839
* Add Japanese translation of DeepNVMe blog by tohtana in https://github.com/microsoft/DeepSpeed/pull/5845
* Fix the bug of deepspeed sequence parallel working with batch size larger than 1 by YJHMITWEB in https://github.com/microsoft/DeepSpeed/pull/5823
* Upgrade HPU image to v1.16.2. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5610
* OptimizedLinear updates by jeffra in https://github.com/microsoft/DeepSpeed/pull/5791
* Log operator warnings only in verbose mode by tjruwase in https://github.com/microsoft/DeepSpeed/pull/5917
* Use `torch.nan_to_num` to replace the numpy wrapper by jinyouzhi in https://github.com/microsoft/DeepSpeed/pull/5877
* [Zero2] Reduce the unnecessary all-reduce when tensor size is 0. by ys950902 in https://github.com/microsoft/DeepSpeed/pull/5868
* Update container version for Gaudi2 CI by raza-sikander in https://github.com/microsoft/DeepSpeed/pull/5937
* Fix missing ds_id bug by tjruwase in https://github.com/microsoft/DeepSpeed/pull/5824
* Update LR scheduler configuration by xiyang-aads-lilly in https://github.com/microsoft/DeepSpeed/pull/5846
* HPUAccelerator: remove support in set_visible_devices_envs by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5929
* Z3: optimizations for grad norm calculation and gradient clipping by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5504
* Update xpu-max1100.yml with new config and add some tests by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5668
* Add accelerator setup guides by delock in https://github.com/microsoft/DeepSpeed/pull/5827
* Allow accelerator to instantiate the device by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5255

New Contributors
* U-rara made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5681
* xylian86 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5475
* mauryaavinash95 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5700
* billishyahao made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5767
* dogacancolak-kensho made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5728
* beep-bebop made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5773
* anferico made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5725
* Atry made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5624
* sfc-gh-reyazda made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5764
* keshavkowshik made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5806
* trixirt made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5705
* Andy666G made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5769
* ic-synth made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5781
* xiyang-aads-lilly made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5846

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.14.4...v0.14.5

0.14.4

Not secure

What's Changed
* Update version.txt after 0.14.3 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5651
* [CPU] SHM based allreduce improvement for small message size by delock in https://github.com/microsoft/DeepSpeed/pull/5571
* _exec_forward_pass: place zeros(1) on the same device as the param by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5576
* [XPU] adapt lazy_call func to different versions by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/5670
* fix IDEX dependence in xpu accelerator by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5666
* Remove compile wrapper to simplify access to model attributes by tohtana in https://github.com/microsoft/DeepSpeed/pull/5581
* Fix hpZ with zero element by samadejacobs in https://github.com/microsoft/DeepSpeed/pull/5652
* Fixing the reshape bug in sequence parallel alltoall, which corrupted all QKV data by YJHMITWEB in https://github.com/microsoft/DeepSpeed/pull/5664
* enable yuan autotp & add conv tp by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5428
* Fix latest pytorch '_get_socket_with_port' import error by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5654
* Fix numpy upgrade to 2.0.0 BUFSIZE import error by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5680
* Update BUFSIZE to come from autotuner's constants.py, not numpy by loadams in https://github.com/microsoft/DeepSpeed/pull/5686
* [XPU] support op builder from intel_extension_for_pytorch kernel path by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/5425

New Contributors
* YJHMITWEB made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5664

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.14.3...v0.14.4

0.14.3

Not secure

What's Changed
* Update version.txt after 0.14.2 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5458
* Add getter and setter methods for compile_backend across accelerators. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5299
* Fix torch.compile error for PyTorch v2.3 by tohtana in https://github.com/microsoft/DeepSpeed/pull/5463
* Revert "stage3: efficient compute of scaled_global_grad_norm (#5256)" by lekurile in https://github.com/microsoft/DeepSpeed/pull/5461
* Update ds-chat CI workflow paths to include zero stage 1-3 files by lekurile in https://github.com/microsoft/DeepSpeed/pull/5462
* Update with ops not supported on Windows by loadams in https://github.com/microsoft/DeepSpeed/pull/5468
* fix: swapping order of parameters in create_dir_symlink method. by alvieirajr in https://github.com/microsoft/DeepSpeed/pull/5465
* Un-pin torch version in nv-torch-latest back to latest and skip test_compile_zero tests on v100 by loadams in https://github.com/microsoft/DeepSpeed/pull/5459
* re-introduce: stage3: efficient compute of scaled_global_grad_norm by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5493
* Fix crash when creating Torch tensor on NPU with device=get_accelerator().current_device() by harygo2 in https://github.com/microsoft/DeepSpeed/pull/5464
* Fix compile wrapper by BacharL in https://github.com/microsoft/DeepSpeed/pull/5455
* enable phi3_mini autotp by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5501
* Fused adam for HPU by BacharL in https://github.com/microsoft/DeepSpeed/pull/5500
* [manifest] update manifest to add hpp file in csrc. by ys950902 in https://github.com/microsoft/DeepSpeed/pull/5522
* enable phi2 autotp by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5436
* Switch pynvml to nvidia-ml-py by loadams in https://github.com/microsoft/DeepSpeed/pull/5529
* Switch from double quotes to match single quotes by loadams in https://github.com/microsoft/DeepSpeed/pull/5530
* [manifest] update manifest to add hpp file in deepspeed. by ys950902 in https://github.com/microsoft/DeepSpeed/pull/5533
* New integration - CometMonitor by alexkuzmik in https://github.com/microsoft/DeepSpeed/pull/5466
* Improve _configure_optimizer() final optimizer log by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5528
* Enhance testing: Skip fused_optimizer tests if not supported. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5159
* Skip the UT cases that use unimplemented op builders. by foin6 in https://github.com/microsoft/DeepSpeed/pull/5372
* rocblas -> hipblas changes for ROCm by rraminen in https://github.com/microsoft/DeepSpeed/pull/5401
* Rocm warp size fix by rraminen in https://github.com/microsoft/DeepSpeed/pull/5402
* CPUAdam fp16 and bf16 support by BacharL in https://github.com/microsoft/DeepSpeed/pull/5409
* Optimize zero3 fetch params using all_reduce by deepcharm in https://github.com/microsoft/DeepSpeed/pull/5420
* Fix the TypeError for XPU Accelerator by shiyang-weng in https://github.com/microsoft/DeepSpeed/pull/5531
* Fix RuntimeError for moe on XPU: tensors found at least two devices by shiyang-weng in https://github.com/microsoft/DeepSpeed/pull/5519
* Remove synchronize calls from allgather params by BacharL in https://github.com/microsoft/DeepSpeed/pull/5516
* Avoid overwrite of compiled module wrapper attributes by deepcharm in https://github.com/microsoft/DeepSpeed/pull/5549
* Small typos in functions set_none_gradients_to_zero by TravelLeraLone in https://github.com/microsoft/DeepSpeed/pull/5557
* Adapt doc for #4405 by oraluben in https://github.com/microsoft/DeepSpeed/pull/5552
* Update to HF_HOME from TRANSFORMERS_CACHE by loadams in https://github.com/microsoft/DeepSpeed/pull/4816
* [INF] DSAttention allow input_mask to have false as value by oelayan7 in https://github.com/microsoft/DeepSpeed/pull/5546
* Add throughput timer configuration by deepcharm in https://github.com/microsoft/DeepSpeed/pull/5363
* Add Ulysses DistributedAttention compatibility by Kwen-Chen in https://github.com/microsoft/DeepSpeed/pull/5525
* Add hybrid_engine.py as path to trigger the DS-Chat GH workflow by lekurile in https://github.com/microsoft/DeepSpeed/pull/5562
* Update HPU docker version by loadams in https://github.com/microsoft/DeepSpeed/pull/5566
* Rename files in fp_quantize op from quantize.* to fp_quantize.* by loadams in https://github.com/microsoft/DeepSpeed/pull/5577
* [MiCS] Remove the handle print on DeepSpeed side by ys950902 in https://github.com/microsoft/DeepSpeed/pull/5574
* Update to fix sidebar over text by loadams in https://github.com/microsoft/DeepSpeed/pull/5567
* DeepSpeedCheckpoint: support custom final ln idx by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5506
* Update minor CUDA version compatibility by adk9 in https://github.com/microsoft/DeepSpeed/pull/5591
* Add slide deck for meetup in Japan by tohtana in https://github.com/microsoft/DeepSpeed/pull/5598
* Fixed the Windows build. by costin-eseanu in https://github.com/microsoft/DeepSpeed/pull/5596
* estimate_zero2_model_states_mem_needs: fixing memory estimation by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5099
* Fix cuda hardcode for inference woq by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5565
* fix sequence parallel(Ulysses) grad scale for zero0 by inkcherry in https://github.com/microsoft/DeepSpeed/pull/5555
* Add Compressedbackend for Onebit optimizers by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5473
* Updated hpu-gaudi2 tests content. by vshekhawat-hlab in https://github.com/microsoft/DeepSpeed/pull/5622
* Pin transformers version for MII tests by loadams in https://github.com/microsoft/DeepSpeed/pull/5629
* WA for Torch-compile-Z3-act-apt accuracy issue from the Pytorch repo by NirSonnenschein in https://github.com/microsoft/DeepSpeed/pull/5590
* stage_1_and_2: optimize clip calculation to use clamp by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5632
* Fix overlap communication of ZeRO stage 1 and 2 by penn513 in https://github.com/microsoft/DeepSpeed/pull/5606
* fixes in _partition_param_sec function by mmhab in https://github.com/microsoft/DeepSpeed/pull/5613
* assumption of torch.initial_seed function accepting seed arg in DeepSpeedAccelerator abstract class is incorrect by polisettyvarma in https://github.com/microsoft/DeepSpeed/pull/5569
* pipe/_exec_backward_pass: fix immediate grad update by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5605
* Monitor was always enabled causing performance degradation by deepcharm in https://github.com/microsoft/DeepSpeed/pull/5633

New Contributors
* alvieirajr made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5465
* harygo2 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5464
* alexkuzmik made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5466
* foin6 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5372
* shiyang-weng made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5531
* TravelLeraLone made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5557
* oraluben made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5552
* Kwen-Chen made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5525
* adk9 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5591
* costin-eseanu made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5596
* NirSonnenschein made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5590
* penn513 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5606

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.14.2...v0.14.3

0.14.2

Not secure

What's Changed
* Update version.txt after 0.14.1 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5413
* Remove dtype(fp16) condition check for residual_add unit test by raza-sikander in https://github.com/microsoft/DeepSpeed/pull/5329
* [XPU] Use non_daemonic_proc by default on XPU device by ys950902 in https://github.com/microsoft/DeepSpeed/pull/5412
* Fix a convergence issue in TP topology caused by incorrect grad_norm. by inkcherry in https://github.com/microsoft/DeepSpeed/pull/5411
* Update 'create-pr' action in release workflow to latest by loadams in https://github.com/microsoft/DeepSpeed/pull/5415
* Update engine.py to avoid torch warning by etiennebonnafoux in https://github.com/microsoft/DeepSpeed/pull/5408
* Update _sidebar.scss by fasterinnerlooper in https://github.com/microsoft/DeepSpeed/pull/5293
* Add more tests into XPU CI by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5427
* [CPU] Support SHM based inference_all_reduce in TorchBackend by delock in https://github.com/microsoft/DeepSpeed/pull/5391
* Add required paths to trigger AMD tests on PRs by loadams in https://github.com/microsoft/DeepSpeed/pull/5406
* Bug fix in `split_index` method by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5292
* Parallel map step for `DistributedDataAnalyzer` map-reduce by bm-synth in https://github.com/microsoft/DeepSpeed/pull/5291
* Selective dequantization by RezaYazdaniAminabadi in https://github.com/microsoft/DeepSpeed/pull/5375
* Fix sorting of shard optimizer states files for universal checkpoint by tohtana in https://github.com/microsoft/DeepSpeed/pull/5395
* add device config env for the accelerator by shiyuan680 in https://github.com/microsoft/DeepSpeed/pull/5396
* 64bit indexing fused adam by garrett4wade in https://github.com/microsoft/DeepSpeed/pull/5187
* Improve parallel process of universal checkpoint conversion by tohtana in https://github.com/microsoft/DeepSpeed/pull/5343
* set the default to use set_to_none for clearing gradients in BF16 optimizer. by inkcherry in https://github.com/microsoft/DeepSpeed/pull/5434
* OptimizedLinear implementation by jeffra in https://github.com/microsoft/DeepSpeed/pull/5355
* Update README.md by Jhonso7393 in https://github.com/microsoft/DeepSpeed/pull/5453
* Update PyTest torch version to match PyTorch latest official (2.3.0) by loadams in https://github.com/microsoft/DeepSpeed/pull/5454

New Contributors
* etiennebonnafoux made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5408
* fasterinnerlooper made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5293
* shiyuan680 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5396
* garrett4wade made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5187
* Jhonso7393 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5453

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.14.1...v0.14.2
