What's Changed

* Update version.txt after 0.14.3 release by mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5651
* [CPU] SHM based allreduce improvement for small message size by delock in https://github.com/microsoft/DeepSpeed/pull/5571
* _exec_forward_pass: place zeros(1) on the same device as the param by nelyahu in https://github.com/microsoft/DeepSpeed/pull/5576
* [XPU] adapt lazy_call func to different versions by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/5670
* fix IDEX dependence in xpu accelerator by Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5666
* Remove compile wrapper to simplify access to model attributes by tohtana in https://github.com/microsoft/DeepSpeed/pull/5581
* Fix hpZ with zero element by samadejacobs in https://github.com/microsoft/DeepSpeed/pull/5652
* Fixing the reshape bug in sequence parallel alltoall, which corrupted all QKV data by YJHMITWEB in https://github.com/microsoft/DeepSpeed/pull/5664
* enable yuan autotp & add conv tp by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5428
* Fix latest pytorch '_get_socket_with_port' import error by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5654
* Fix numpy upgrade to 2.0.0 BUFSIZE import error by Yejing-Lai in https://github.com/microsoft/DeepSpeed/pull/5680
* Update BUFSIZE to come from autotuner's constants.py, not numpy by loadams in https://github.com/microsoft/DeepSpeed/pull/5686
* [XPU] support op builder from intel_extension_for_pytorch kernel path by YizhouZ in https://github.com/microsoft/DeepSpeed/pull/5425

New Contributors
* YJHMITWEB made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5664

**Full Changelog**: https://github.com/microsoft/DeepSpeed/compare/v0.14.3...v0.14.4