Opening Thoughts
Thank you, everyone! Your overwhelming support continues to fuel our passion for innovation. With your engagement, we've pushed the boundaries further in this release!
| **We are hosting our 1st IRL event, 'Scaling AI Infra - GPUs, Kernels, LLMs and More'. We will discuss Liger-Kernel and invite speakers to talk about DeepSpeed, SGLang, and the TensorCore team. Please RSVP at [our event page](https://scalingaiinfragpuskernelsllmsa.splashthat.com).** | [<img width="1280" alt="Screenshot 2024-09-13 at 2 39 20 PM" src="https://github.com/user-attachments/assets/b733d388-7234-4d6e-9c53-44c1c9f8c96b">](https://scalingaiinfragpuskernelsllmsa.splashthat.com) |
| --------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ |
What's New
🌐 Large Vision Language Model Support
Welcome Qwen2-VL, our first venture into large vision language models! This extends Liger kernels beyond text-only LLMs to multimodal training.
✨ Patch Kernels on Model Instances
Our latest API update accepts either a model name string or a model instance as input, which streamlines integration with Hugging Face's SFT trainer. You can now patch Liger kernels into a model before it is instantiated or after it already exists.
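
The difference between patching the modeling code and patching an existing instance can be sketched in plain Python. This is only an illustration of the general monkeypatching idea: `RMSNorm`, `FastRMSNorm`, `Model`, `modeling`, `patch_modeling_code`, and `patch_instance` are hypothetical stand-ins, not Liger-Kernel's actual classes or functions.

```python
import types


class RMSNorm:
    """Hypothetical stand-in for a framework's stock layer."""

    def forward(self, x):
        return f"stock({x})"


class FastRMSNorm(RMSNorm):
    """Hypothetical stand-in for an optimized drop-in replacement."""

    def forward(self, x):
        return f"fast({x})"


# Hypothetical "modeling module": models look up their layer classes here.
modeling = types.SimpleNamespace(RMSNorm=RMSNorm)


class Model:
    def __init__(self):
        # The layer class is resolved when the model is constructed.
        self.norm = modeling.RMSNorm()


def patch_modeling_code():
    """Class-level patch: only affects models created afterwards."""
    modeling.RMSNorm = FastRMSNorm


def patch_instance(model):
    """Instance-level patch: rebinds forward on a layer that already exists."""
    model.norm.forward = types.MethodType(FastRMSNorm.forward, model.norm)


existing = Model()          # created before any patching
patch_instance(existing)    # patched in place, no re-instantiation needed
print(existing.norm.forward("x"))
```

Class-level patching has to run before the model is built, whereas instance-level patching also works on a model that was already loaded, for example one handed to a trainer.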
🚀 SWIFT Trainer Integration
We're excited to be integrated into the [SWIFT Trainer Framework](https://github.com/modelscope/ms-swift). This integration reflects our commitment to giving the community tools that improve training efficiency across all supported models.
🔧 New Kernels and Features
**KL Divergence Kernel**: Dive deeper into model behaviors with our new KL divergence kernel, perfect for those needing model distillation, alignment, and beyond.
**Experimental Kernel for Embedding**: Explore acceleration possibilities with our experimental kernel that optimizes embedding operations.
**Extended Cross Entropy Functionality**: Now we support label smoothing and sum reduction, enabling more robust training and flexible loss calculations for neural networks.
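
For readers who want the math behind these losses, here is a pure-Python reference sketch of cross entropy with label smoothing and a selectable reduction, plus KL divergence. It assumes the common definitions (smoothed target = `(1 - eps) * one_hot + eps / num_classes`, and `KL(P || Q) = sum(p * (log p - log q))`); the function names and signatures are illustrative assumptions and do not reflect the Triton kernels or their API.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def cross_entropy(batch_logits, targets, label_smoothing=0.0, reduction="mean"):
    """Reference cross entropy with label smoothing and mean/sum reduction."""
    losses = []
    for logits, target in zip(batch_logits, targets):
        logp = log_softmax(logits)
        nll = -logp[target]                 # loss against the hard label
        uniform = -sum(logp) / len(logp)    # loss against a uniform target
        losses.append((1.0 - label_smoothing) * nll + label_smoothing * uniform)
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)
    raise ValueError(f"unsupported reduction: {reduction!r}")


def kl_div(log_q, p):
    """KL(P || Q) for one distribution, given log-probs of Q and probs of P."""
    # Terms with p == 0 contribute nothing, so they are skipped.
    return sum(pi * (math.log(pi) - lq) for pi, lq in zip(p, log_q) if pi > 0.0)


logits = [[5.0, 0.0], [0.0, 0.0]]
targets = [0, 1]
plain = cross_entropy(logits, targets)
smoothed = cross_entropy(logits, targets, label_smoothing=0.1)
total = cross_entropy(logits, targets, reduction="sum")
print(plain, smoothed, total)
```

Label smoothing penalizes over-confident predictions, which is why the smoothed loss is larger for the confident first row, while `sum` reduction simply skips the division by batch size.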
Get Involved and Stay Tuned
Join us on our journey! Connect with us in our channel on the CUDA MODE Discord server, and don't forget to follow our official account on X for the latest updates: https://x.com/liger_kernel.
A Look Ahead
We're not stopping here! Looking forward, we plan to expand our support to include even more model families and to explore further optimizations and innovative features. Your feedback is invaluable, so please keep it coming as we shape the future of Liger together!
🌟 Acknowledgments
Your contributions make a difference! Thanks to everyone who has starred, contributed, and provided feedback. Each contribution enriches our community and helps us grow stronger together.
What's Changed
* Skip Tests for GPUs Not Supporting `bf16` by austin362667 in https://github.com/linkedin/Liger-Kernel/pull/159
* [Operators] LayerNorm Kernels + LigerLayerNorm by AndreSlavescu in https://github.com/linkedin/Liger-Kernel/pull/169
* README: ensure modeling code is patched before model instantiation by tmm1 in https://github.com/linkedin/Liger-Kernel/pull/170
* Updated wave snippet to use AutoLigerKernelForCausalLM by shimizust in https://github.com/linkedin/Liger-Kernel/pull/181
* [Documentation] LayerNorm added to README by AndreSlavescu in https://github.com/linkedin/Liger-Kernel/pull/180
* Remove torch compile from benchmark scripts by shimizust in https://github.com/linkedin/Liger-Kernel/pull/183
* Update release guide by yundai424 in https://github.com/linkedin/Liger-Kernel/pull/167
* Extract forward/backward core computation bits outside of torch autograd context for easy reuse by qingquansong in https://github.com/linkedin/Liger-Kernel/pull/178
* custom Embedding kernel by AndreSlavescu in https://github.com/linkedin/Liger-Kernel/pull/135
* Feat/functional api by S1ro1 in https://github.com/linkedin/Liger-Kernel/pull/172
* [feat] FusedLinearCrossEntropy support for Mixtral by ryankert01 in https://github.com/linkedin/Liger-Kernel/pull/136
* [Docs] Update README to include LigerEmbedding by AndreSlavescu in https://github.com/linkedin/Liger-Kernel/pull/186
* compute quantiles for memory usage by kvignesh1420 in https://github.com/linkedin/Liger-Kernel/pull/187
* TypoFixed repo_foward -> rope_forward by LucioPalmucci in https://github.com/linkedin/Liger-Kernel/pull/191
* Switch Lightning 1 GPU example to Qwen2 0.5B instruct model with 1024 max seq length by qingquansong in https://github.com/linkedin/Liger-Kernel/pull/193
* [BUILD] Add pyproject.toml by AndreSlavescu in https://github.com/linkedin/Liger-Kernel/pull/150
* ci fix by AndreSlavescu in https://github.com/linkedin/Liger-Kernel/pull/202
* Update the casting logic of RMSNorm by lancerts in https://github.com/linkedin/Liger-Kernel/pull/201
* Update test_rms_norm.py by lancerts in https://github.com/linkedin/Liger-Kernel/pull/203
* Refactored benchmark tests by shimizust in https://github.com/linkedin/Liger-Kernel/pull/196
* Update layer_norm.py by lancerts in https://github.com/linkedin/Liger-Kernel/pull/207
* Uplift kernel APIs to top level by austin362667 in https://github.com/linkedin/Liger-Kernel/pull/210
* Feat: Kl Divergence kernel by S1ro1 in https://github.com/linkedin/Liger-Kernel/pull/194
* minor refactor of rms and layernorm by lancerts in https://github.com/linkedin/Liger-Kernel/pull/213
* Fix compatibility issue on triton=2.3.1 by Tcc0403 in https://github.com/linkedin/Liger-Kernel/pull/219
* Elaborate ack section by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/222
* Add license in ack section by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/224
* Reference Unsloth in header by momochen in https://github.com/linkedin/Liger-Kernel/pull/216
* Add label smoothing for cross entropy by Tcc0403 in https://github.com/linkedin/Liger-Kernel/pull/198
* Added HF use-case benchmark script by shimizust in https://github.com/linkedin/Liger-Kernel/pull/223
* (fix) fix pyproject.toml by wizyoung in https://github.com/linkedin/Liger-Kernel/pull/218
* Update swiglu and geglu forward: zeros_like -> empty_like by IvanYashchuk in https://github.com/linkedin/Liger-Kernel/pull/217
* add repr infomation for layer_norm and rms_norm by wizyoung in https://github.com/linkedin/Liger-Kernel/pull/220
* (fix) fix pyproject.toml by wizyoung in https://github.com/linkedin/Liger-Kernel/pull/226
* Refactor/benchmarking visualizer by S1ro1 in https://github.com/linkedin/Liger-Kernel/pull/212
* Feat: add kl div to readme by S1ro1 in https://github.com/linkedin/Liger-Kernel/pull/229
* Monkeypatch for Qwen2-VL by tyler-romero in https://github.com/linkedin/Liger-Kernel/pull/175
* Optimize fused_linear_cross_entropy when weight does not require grads by hansonw in https://github.com/linkedin/Liger-Kernel/pull/237
* SWIFT Trainer Integration by tastelikefeet in https://github.com/linkedin/Liger-Kernel/pull/240
* Add label smoothing to FLCE and unit tests by Tcc0403 in https://github.com/linkedin/Liger-Kernel/pull/244
* Restore monkey patched modules by austin362667 in https://github.com/linkedin/Liger-Kernel/pull/232
* Support for patching post-model initialization by shimizust in https://github.com/linkedin/Liger-Kernel/pull/199
* Reduction support for CrossEntropy and Division by 0 Fix by shivam15s in https://github.com/linkedin/Liger-Kernel/pull/153
* Release Liger-Kernel version 0.3.0 by qingquansong in https://github.com/linkedin/Liger-Kernel/pull/246
New Contributors
* austin362667 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/159
* tmm1 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/170
* S1ro1 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/172
* ryankert01 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/136
* kvignesh1420 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/187
* LucioPalmucci made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/191
* momochen made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/216
* wizyoung made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/218
* IvanYashchuk made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/217
* hansonw made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/237
* tastelikefeet made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/240
**Full Changelog**: https://github.com/linkedin/Liger-Kernel/compare/v0.2.1...v0.3.0