Intel-optimization-for-horovod

Latest version: v0.28.1.6

0.28.1.5

Changed

- Upgraded the driver and oneAPI versions.
- Turned on TensorFlow NextPluggableDevice by default.
- Fixed in-place reduce_scatter to match the oneCCL API change.

0.28.1.4

Added

- Added support for asynchronous wait on ccl::event.
- Added TensorFlow NextPluggableDevice support for Intel GPU devices.
- Enabled TensorFlow AllReduceXLAOp for Intel GPU devices.

Changed

- Skipped the PyTorch AllReduce bf16 gradient unit test when rank > 2 because of an accuracy issue.

0.28.1.3

Changed

- Updated driver version to LTS-803.

0.28.1.2

Added

- Added support for an empty input buffer in the AlltoAll primitive.
- Enabled TorusAllreduce for Intel GPU devices.
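
To illustrate what an empty input buffer means for an all-to-all exchange, here is a minimal pure-Python sketch. It is a toy model and not the Horovod or oneCCL implementation; the function name `simulate_alltoall` and the `splits` layout are assumptions made for this example.

```python
def simulate_alltoall(send_buffers, splits):
    """Toy all-to-all exchange (hypothetical helper, not a Horovod API).

    send_buffers[r] is the list of elements held by rank r.
    splits[r][d] is how many consecutive elements rank r sends to rank d.
    Returns recv_buffers, where recv_buffers[d] holds the chunks received
    by rank d, ordered by source rank.
    """
    world_size = len(send_buffers)
    recv_buffers = [[] for _ in range(world_size)]
    for src in range(world_size):
        if sum(splits[src]) != len(send_buffers[src]):
            raise ValueError(f"splits for rank {src} do not cover its buffer")
        offset = 0
        for dst in range(world_size):
            count = splits[src][dst]
            # An empty buffer simply contributes zero-length chunks.
            recv_buffers[dst].extend(send_buffers[src][offset:offset + count])
            offset += count
    return recv_buffers


# Rank 1 participates with an empty input buffer.
send = [[1, 2, 3], [], [4, 5]]
splits = [[1, 1, 1], [0, 0, 0], [0, 2, 0]]
received = simulate_alltoall(send, splits)
```

In this model a rank with no data still takes part in the collective, and the other ranks receive nothing from it.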

Changed

- Updated the PyTorch ResNet50 example with Intel GPU support.
- Added ReduceScatter support in the PyTorch unit tests.

Fixed

- Set a special value range for half and bf16 tensors in multi-card unit tests.
- Fixed the GCC 13 CPU build issue.
- Fixed the oneCCL link path in CMakeLists.txt.
- Replaced '/gpu' with '/xpu' in TensorFlow unit tests.
- Fixed a bug in the GPU check condition in PyTorch sync_batch_norm.
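
The '/gpu' to '/xpu' replacement in the list above can be pictured with a small sketch. The helper name `to_xpu_device` is hypothetical, and the pattern only covers the short '/gpu' and '/gpu:N' forms named in the changelog entry.

```python
import re


def to_xpu_device(device: str) -> str:
    """Rewrite a '/gpu' device component to '/xpu' (hypothetical helper).

    Only matches '/gpu' when it ends the string or is followed by ':',
    so unrelated names such as '/gpus' are left untouched.
    """
    return re.sub(r"/gpu(?=:|$)", "/xpu", device)


device_name = to_xpu_device("/gpu:0")
```

A test suite could apply such a helper once, where device strings are built, rather than editing each literal by hand.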

0.28.1

Fixed

- Fixed build with gcc 12. ([3925](https://github.com/horovod/horovod/pull/3925))
- PyTorch: Fixed build on ROCm. ([3928](https://github.com/horovod/horovod/pull/3928))
- TensorFlow: Fixed local_rank_op. ([3940](https://github.com/horovod/horovod/pull/3940))

0.28.1.0

Added

- Added support for torch conv3d with channels_last_3d format.

Changed

- Refined the batch memory copy kernel, added padding support to align with the public logic, and updated the corresponding cases.
- Rebased the code onto the public v0.28.1 release.
- Aligned the installation method with public Horovod.
- Refined BroadcastInplaceOp for TensorFlow.
- Enabled the public Horovod TensorFlow examples for IOH (Intel Optimization for Horovod).
- Temporarily skipped the accuracy check for bf16/fp16 on ranks > 2, because it is not yet clear how the threshold should change as the rank count increases.
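
As a rough illustration of why a fixed accuracy threshold is hard to choose for bf16 as the number of ranks grows, the sketch below emulates bfloat16 rounding in pure Python and sums one value per rank. This is a toy model under stated assumptions: the helper names are hypothetical, the accumulation order is sequential, and nothing here reflects how Horovod or oneCCL actually reduce tensors.

```python
import struct


def to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of the float32 encoding.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bf16_sum(per_rank_values):
    """Toy allreduce-sum: add one value per rank, rounding to bf16 each step."""
    acc = 0.0
    for value in per_rank_values:
        acc = to_bf16(acc + to_bf16(value))
    return acc


def relative_error(per_rank_values):
    """Relative error of the bf16 sum against a double-precision sum."""
    exact = sum(per_rank_values)
    if exact == 0:
        return abs(bf16_sum(per_rank_values))
    return abs(bf16_sum(per_rank_values) - exact) / abs(exact)


small_world = bf16_sum([1.0] * 4)
large_world = bf16_sum([1.0] * 300)
```

In this model, four ranks contributing 1.0 sum exactly to 4.0, while 300 ranks stall at 256.0 because bfloat16 has only 8 significand bits. The error therefore depends on both the rank count and the data, which is consistent with the difficulty described in the entry above.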

Fixed

- Fixed SDL warning.
- Fixed hvd.join with allreduce.
- Fixed scale factor related accuracy issue for bf16/fp16.
- Fixed cpu_operation to use MPI instead of CCL when Intel GPU is enabled.
