Megablocks

Latest version: v0.8.0



0.5.0

What's New

Several improvements that avoid CPU <> GPU device synchronizations, plus GLU support and support for some new models 👀

What's Changed
* Update version by mvpatel2000 in https://github.com/stanford-futuredata/megablocks/pull/36
* Avoid duplicate `.cpu()` call by mvpatel2000 in https://github.com/stanford-futuredata/megablocks/pull/37
* Have megablocks rely on torch default precision by mvpatel2000 in https://github.com/stanford-futuredata/megablocks/pull/39
* Add GLU support by sashaDoubov in https://github.com/stanford-futuredata/megablocks/pull/38
* Enable generic dimensionality for input by vchiley in https://github.com/stanford-futuredata/megablocks/pull/41
* Removing an extra size call by bcui19 in https://github.com/stanford-futuredata/megablocks/pull/43
* Fix bug in topology kernel for ffn_hidden_size>4096. by tgale96 in https://github.com/stanford-futuredata/megablocks/pull/47
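
For readers unfamiliar with the GLU (gated linear unit) feed-forward variant mentioned above, here is a minimal dependency-free sketch of the general technique. It is not MegaBlocks' implementation or API: the function names (`glu_ffn`, `matvec`, `gelu`), the weight names, and the choice of a tanh-approximated GELU as the gate activation are all assumptions made for illustration.

```python
import math


def gelu(x):
    # Tanh approximation of GELU; the activation choice is an assumption.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def matvec(w, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def glu_ffn(x, w_gate, w_up, w_down, act=gelu):
    # Hypothetical GLU feed-forward block for a single token vector:
    #   hidden = act(w_gate @ x) * (w_up @ x)   (elementwise product)
    #   output = w_down @ hidden
    gate = [act(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


if __name__ == "__main__":
    x = [1.0, 2.0]
    w_gate = [[1.0, 0.0], [0.0, 1.0]]
    w_up = [[1.0, 1.0], [1.0, -1.0]]
    w_down = [[1.0, 1.0]]
    print(glu_ffn(x, w_gate, w_up, w_down))
```

Compared with a plain two-layer MLP, the gated form adds a second input projection whose output scales the first elementwise, so each expert carries three weight matrices instead of two.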

New Contributors
* sashaDoubov made their first contribution in https://github.com/stanford-futuredata/megablocks/pull/38
* bcui19 made their first contribution in https://github.com/stanford-futuredata/megablocks/pull/43

**Full Changelog**: https://github.com/stanford-futuredata/megablocks/compare/v0.4.0...v0.5.0

0.4.0

What's Changed
* Unpack saved context once by mvpatel2000 in https://github.com/stanford-futuredata/megablocks/pull/33
* Refactoring class hierarchy for FSDP wrapping by tgale96 in https://github.com/stanford-futuredata/megablocks/pull/34

**Full Changelog**: https://github.com/stanford-futuredata/megablocks/compare/v0.3.3...v0.4.0

0.3.3

What's Changed
* Enable running MegaBlocks MoE without bias by vchiley in https://github.com/stanford-futuredata/megablocks/pull/31

**Full Changelog**: https://github.com/stanford-futuredata/megablocks/compare/v0.3.2...v0.3.3

0.3.2

What's Changed

- Support for bfloat16
- Optimizations for top_k > 1
- Support for fully-sharded data parallelism
- Support tensor model parallelism when expert_parallel_world_size > num_experts
- Optimizations for activation memory
- Support activation quantization (thanks dblalock!)
- Optimizations for SM90 (Hopper)
- Lots of bug fixes, cleanup and small optimizations

New Contributors
* vchiley made their first contribution in https://github.com/stanford-futuredata/megablocks/pull/9
* deepakn94 made their first contribution in https://github.com/stanford-futuredata/megablocks/pull/16
* b-chu made their first contribution in https://github.com/stanford-futuredata/megablocks/pull/19

**Full Changelog**: https://github.com/stanford-futuredata/megablocks/compare/v0.1...v0.3.2

0.1

Initial release documenting repository state prior to MLSys'23 camera-ready publication.
