Added
- Unified distributed layers
- MoE support
- DevOps tools such as GitHub Actions and code review automation
- New official project website
Changed
- Refactored the APIs for usability, flexibility, and modularity
- Adapted PyTorch AMP for tensor parallelism
- Refactored utilities for tensor parallelism and pipeline parallelism
- Separated benchmarks and examples into independent repositories
- Updated pipeline parallelism to support non-interleaved and interleaved versions
- Refactored installation scripts for convenience
Fixed
- ZeRO level 3 runtime error
- Incorrect calculation in gradient clipping