Bagua

Latest version: v0.9.2

0.9.2

Bug Fixes

Python

- Fix qadam NaN problem (654)
- Fix failure to compile Aluminum

0.9.1

Bug Fixes

Python

- Revert "fix: to_bagua_tensor compatibility with torch 1.6.0 (355)"

Features

Python, core

- improve NCCL lib version check (525)
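
As an illustration of why a library version check needs numeric comparison, here is a minimal sketch. It is not Bagua's actual check; the helper names `parse_version` and `meets_minimum` are hypothetical, and the version strings are made-up examples.

```python
def parse_version(text):
    """Turn a dotted version string such as "2.10.3" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(found, minimum):
    """Return True when `found` is at least `minimum`, compared numerically."""
    return parse_version(found) >= parse_version(minimum)


# Comparing the raw strings would rank "2.9.6" above "2.10.3",
# because "9" sorts after "1" character by character.
old_ok = meets_minimum("2.9.6", "2.10")
new_ok = meets_minimum("2.10.3", "2.10")
print(old_ok, new_ok)
```

Comparing integer tuples keeps 2.9 below 2.10, which matters for any check that distinguishes NCCL releases before and after 2.10.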

0.9.0

Bug Fixes

Other

- Reuse fused parameter tensors in fuse_step (410)
- Call step closure in qadam optimizer step (432)
- Fix need_reset condition (454)
- Do negotiation in async native op (447)
- Fix find_unused_parameters (452)
- Fix qadam non-deterministic (459)
- Add `LIBRARY_PATH` env in `install_master.sh` (465)
- Fix typo in `install_master.sh` (471)

Python

- Fix CUDA 11.5 being unable to get nccl package (415)
- Fix process group compatibility with torch 1.6.0 (413)
- Fix random CI failures (445)
- Fix async algorithm (479)

Features

Core

- Initial support for C interface (325)

Other

- Support NODE_RANK environment variable (426)
- Choose bagua service port dynamically (431)
- Use bagua_module_name to identify different modules (438)
- Add algorithm registry (433)
- Add compatibility for NCCL version under 2.10 (449)
- Add broadcast object api (437)
- Support qadam in fused optimizer (477)
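
One common way to choose a service port dynamically is to let the operating system assign a free one. The sketch below shows that general technique only; it is an assumption about the approach, not Bagua's implementation, and `find_free_port` is a hypothetical helper name.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))            # port 0 means "pick any free port"
        return sock.getsockname()[1]    # the port the OS actually assigned


port = find_free_port()
print(port)
```

The socket is closed before the port is returned, so another process could claim the port before it is reused. A service that needs a guarantee should keep the bound socket open instead.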

Python

- Support PyTorch DDP compatible distributed training API (312)
- Support torch-api-compatible all_reduce (377)
- Associate PyTorch Process Group with Bagua Process Group using cache (402)
- Support find_unused_parameters on BaguaDDP (409)
- Add `BAGUA_AUTOTUNE_SERVER_WAIT_TIME` env (474)

0.8.2

Bug Fixes

Other

- Fix fused optimizer OOM and make it stateless (207)
- Fix `to_bagua_tensor` compatibility with torch 1.6.0 (355)

Python

- Use separate process group for async communication thread to avoid potential hangs (298)
- Do not fail if checkpoints path exist (305)
- Fix is_moe_param (306)
- Change `to_bagua_tensor` API to support PyTorch 1.10 (338)
- Fix fused optimizer with multiple param groups (356)

Features

Python

- Support switching between different algorithms (299)
- Separate algorithm declaration and implementation (246)

Python, core

- Support process group in `with_bagua`, support hierarchical communication in bytegrad algorithm (300)
- Support mutable bucket tensors (271)
- Support all_to_all_single (361)

0.8.1

Features

Other

- Use single bucket for decentralized algorithm to improve performance (275)
- Support process group (228)
- Add barrier api (290)

Python

- Support moe (208)
- Support checkpointing for moe (242)

0.8.0

Bug Fixes

CI

- Only run publish once on git tag

Core

- Fix compressed buffer that could not be scattered to an odd number of ranks

Other

- Fix ci pypi versioning
- Remove __init__.py and python __version__, use cargo version
- Move import bagua_install_library to install library function
- Merge bagua_install_library and setup.py, remove nccl<=2.6 support
- Fix alltoall_v parameter (17)
- Reduce and allgather python interface
- Fix decompress incorrect pointer and typo in error msg
- Fix python gil deadlock during getting data ptr
- Fix benchmark script requirements
- Fix alltoall_v parameter types (27)
- Always mark bagua padding tensor as ready
- Make compress/decompress of BaguaTensor `method` string consistent (33)
- Fix scatter and reduce_scatter implementation (40)
- Subtract overflow error for decentralized op (39)
- Fix QADAM params (17)
- Fix assert precision (18)
- Replace mutex with atomic bool for async op and add Aluminum submodule update (67)
- Fix duplicated dependency downloading during installation (77)
- Fix async algorithm aborting and hanging (78, 81)
- Fix qadam algorithm call (20)
- Fix missing symbols in the zip library (24)
- Fix random autotune server hang (206)
- Fix bagua-net library path mismatch; make `--enable_bagua_net` argument style consistent with other args (218)

Python

- Fix random autotune-service hang
- Handle conflicts caused by sklearn upgrade (225)

Features

CI

- Only publish pypi for master commits

Other

- Add async model average algorithm (110)
- Add cached dataset wrapper (148)
- Support sync batchnorm (151)
- Add `--enable-bagua-net` option in launcher (183)
- Add pytorch examples for MNIST, ImageNet, SQuAD training (1)
- Add requirements.txt, only download dataset on local rank 0 (2)
- Add python packaging related files
- Add `__version__` variable
- Install nccl deps in bagua core and add generated `__version__` variable
- Add version.py placeholder to prevent file not found error
- Initial support for python op (2)
- Add 5 min timeout for buckets' comm op (5)
- Replace NCCL with Aluminum (7)
- Add synthetic benchmark script (5)
- Add elastic training example (7)
- Support alltoall_v (vector alltoall) (14)
- Add reduce and allgather python interface
- Support reduce and allgather op with Reduction op enum
- Support creating BaguaTensor by passing torch tensor directly (19)
- Compatible mode for getting pytorch tensor info with Python interpreter
- Better debug log including tensor info when executing ops
- Add native low precision decentralized operator (26)
- Add (scatter, gather, scatter_reduce) and all inplace version communication primitives (37)
- Make full precision decentralized op stateless (36)
- Add communication_primitives example (12)
- Use nccl 2.10 avg op for all algorithms using averaging (46, 45)
- Add opentelemetry to report tensor ready order (42)
- Add deterministic flag (15)
- Add native async model average algorithm (41)
- Add examples for async model average algorithm (14)
- Support packet splitting and multi-stream parallel transmission (5)
- Support ncclnet v3 and remove the dependency on nccl in the installation environment (17)
- Add sync interval param to async examples (19)
- Support tokio backend (21)
- Support bagua-net (89)

Python

- Broadcast scalars for optimizers (202)
