Highlights
1. **Post-Training Loss**: Introducing the first open-source optimized post-training losses in Liger Kernel, with ~80% memory reduction. This release covers DPO, CPO, ORPO, SimPO, JSD, and more. No more OOM nightmares for post-training ML researchers!
<img src="https://github.com/user-attachments/assets/19efbd07-f70b-4573-a3be-33fa80e7c4e1" alt="image" width="400">
2. **AMD CI**: With AMD’s generous sponsorship of MI300s, we’ve integrated them into our CI. Special thanks to Embedded LLM for building the AMD CI infrastructure. (#428)
3. **XPU Support**: In collaboration with Intel, we now support XPU, with performance gains comparable to those seen on other vendors' hardware. (#407)
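
Several of the PRs below refer to "chunked" loss functions. As a rough illustration of the general chunking idea only, and not Liger Kernel's actual implementation, the sketch below computes a mean negative log-likelihood two ways in plain Python: once with logits for every token held at the same time, and once chunk by chunk so that only `chunk_size` rows of logits exist at any moment. All function names here (`project`, `nll_full`, `nll_chunked`) are hypothetical and chosen for this example.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def project(hidden, weight):
    """Project one hidden vector (length d) onto a vocab of V rows (each length d)."""
    return [sum(h * w for h, w in zip(hidden, w_row)) for w_row in weight]


def nll_full(hiddens, weight, targets):
    """Reference version: logits for all N tokens are alive at once (N x V values)."""
    logits = [project(h, weight) for h in hiddens]
    losses = [-log_softmax(row)[t] for row, t in zip(logits, targets)]
    return sum(losses) / len(targets)


def nll_chunked(hiddens, weight, targets, chunk_size):
    """Chunked version: at most chunk_size x V logits are alive at any time."""
    total = 0.0
    for start in range(0, len(hiddens), chunk_size):
        chunk_h = hiddens[start:start + chunk_size]
        chunk_t = targets[start:start + chunk_size]
        logits = [project(h, weight) for h in chunk_h]
        total += sum(-log_softmax(row)[t] for row, t in zip(logits, chunk_t))
        # `logits` is rebound on the next iteration, so the old chunk can be freed.
    return total / len(targets)


# Toy data (made up for this example): 5 tokens, hidden size 2, vocab size 3.
hiddens = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.4, 0.2], [0.7, 0.7]]
weight = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
targets = [0, 2, 1, 1, 0]

print(nll_full(hiddens, weight, targets))
print(nll_chunked(hiddens, weight, targets, chunk_size=2))
```

Both calls produce the same loss up to floating-point rounding; the difference is only in how many logits are live at once, which is the property that matters when the vocabulary is large.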
What's Changed
* Adds the CPO Alignment Loss Function by pramodith in https://github.com/linkedin/Liger-Kernel/pull/382
* Qwen2-VL Training Example w/ Liger by tyler-romero in https://github.com/linkedin/Liger-Kernel/pull/389
* Support Qwen2-VL's multimodal RoPE implementation by li-plus in https://github.com/linkedin/Liger-Kernel/pull/384
* add xpu device support for `rms_norm` by faaany in https://github.com/linkedin/Liger-Kernel/pull/379
* fix qwen2 import failure in test by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/394
* Add Chunked SimPO Loss by pramodith in https://github.com/linkedin/Liger-Kernel/pull/386
* Add script to reproducibly run examples on Modal by tyler-romero in https://github.com/linkedin/Liger-Kernel/pull/397
* add nn.module support for chunked loss function by shivam15s in https://github.com/linkedin/Liger-Kernel/pull/402
* Generalize JSD to FKL/RKL by yundai424 in https://github.com/linkedin/Liger-Kernel/pull/393
* Enable keyword arguments for liger functional by hongpeng-guo in https://github.com/linkedin/Liger-Kernel/pull/400
* add reference model logps to chunkedloss interface and fix dpo loss fn by shivam15s in https://github.com/linkedin/Liger-Kernel/pull/405
* Optimize CE Loss by casting dtype to float32 inside kernel by pramodith in https://github.com/linkedin/Liger-Kernel/pull/406
* Xpu support by mgrabban in https://github.com/linkedin/Liger-Kernel/pull/407
* Fix `get_batch_loss_metrics` comments by austin362667 in https://github.com/linkedin/Liger-Kernel/pull/413
* Add rebuild to CI by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/415
* Fix os env by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/416
* Adjust QWEN2 VL Loss `rtol` by austin362667 in https://github.com/linkedin/Liger-Kernel/pull/412
* [tiny] Add QwQ to readme (same arch as Qwen2) by tyler-romero in https://github.com/linkedin/Liger-Kernel/pull/424
* Enhance Cross Entropy Softcap Unit Test by austin362667 in https://github.com/linkedin/Liger-Kernel/pull/423
* Add ORPO Trainer + support HF metrics directly from chunked loss functions + fixes to avoid torch compile recompilations by shivam15s in https://github.com/linkedin/Liger-Kernel/pull/429
* Add Build Success/Fail Badge by hebiao064 in https://github.com/linkedin/Liger-Kernel/pull/431
* Switch amd-ci to use MI300X runner. by saienduri in https://github.com/linkedin/Liger-Kernel/pull/428
* [CI] rename ci and add cron job for amd by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/433
* [CI] shorten ci name by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/434
* update ci icon on readme by bboyleonp666 in https://github.com/linkedin/Liger-Kernel/pull/440
* Introduce Knowledge Distillation Base by austin362667 in https://github.com/linkedin/Liger-Kernel/pull/432
* [AMD] [CI] Clean up `amd-ci` by tjtanaa in https://github.com/linkedin/Liger-Kernel/pull/436
* Add xpu in env report by abhilash1910 in https://github.com/linkedin/Liger-Kernel/pull/443
* Specify scheduled CI in AMD badge by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/446
* improve code quality for chunk loss by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/448
* Add paper link and formula for preference loss by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/449
* Make kernel doc lean by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/450
* Fix LigerCrossEntropyLoss Reduction Behavior for "None" Mode by hebiao064 in https://github.com/linkedin/Liger-Kernel/pull/435
* add eng blog by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/452
* add chunked loss to readme by shivam15s in https://github.com/linkedin/Liger-Kernel/pull/453
* change chunked readme by shivam15s in https://github.com/linkedin/Liger-Kernel/pull/454
* add sponsorship and collab by ByronHsu in https://github.com/linkedin/Liger-Kernel/pull/457
* version bump to 0.5.0 by shivam15s in https://github.com/linkedin/Liger-Kernel/pull/455
* Add HIP (ROCm) and Liger Kernel to env report by Comet0322 in https://github.com/linkedin/Liger-Kernel/pull/456
New Contributors
* li-plus made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/384
* faaany made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/379
* hongpeng-guo made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/400
* mgrabban made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/407
* hebiao064 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/431
* saienduri made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/428
* bboyleonp666 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/440
* abhilash1910 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/443
* Comet0322 made their first contribution in https://github.com/linkedin/Liger-Kernel/pull/456