- Supports recent merging methods, including Mixture-of-Experts (MoE) and layer-wise merging
- Flexible choice of merging method for each layer (see the layer-wise merging sketch after this list)
- Base models supported: [Llama](https://llama.meta.com/) and [Mistral](https://huggingface.co/docs/transformers/en/model_doc/mistral)
- Trainers supported: 🤗 [Trainer](https://huggingface.co/docs/transformers/en/main_classes/trainer), [SFTTrainer](https://huggingface.co/docs/trl/en/sft_trainer)
- Devices supported: CPU, MPS, GPU
- Training choices: fine-tune only the router of MoE layers, or fully fine-tune the merged LLM (see the router fine-tuning sketch after this list)
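
To illustrate the layer-wise merging idea, here is a minimal sketch that uniformly averages two same-architecture checkpoints tensor by tensor. The model IDs are placeholders, and this uniform average is only one possible strategy; a per-layer merging configuration would pick a different rule for each layer.

```python
# Minimal layer-wise merging sketch: average the weights of two
# same-architecture checkpoints tensor by tensor. Model IDs below are
# placeholder assumptions, not identifiers from this repository.
import torch
from transformers import AutoModelForCausalLM

model_a = AutoModelForCausalLM.from_pretrained("expert-a")  # placeholder
model_b = AutoModelForCausalLM.from_pretrained("expert-b")  # placeholder

state_b = model_b.state_dict()
merged_state = {
    name: (tensor + state_b[name]) / 2  # uniform 50/50 average per tensor
    for name, tensor in model_a.state_dict().items()
}
model_a.load_state_dict(merged_state)
model_a.save_pretrained("merged-model")
```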
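
For the router-only training choice, here is a hedged sketch in plain PyTorch/Transformers that freezes everything except the router parameters. The name filter (`"gate"`/`"router"`) is an assumption about how the MoE routers are named in the merged checkpoint; adjust it to the model's actual parameter names.

```python
# Router-only fine-tuning sketch: freeze all parameters, then unfreeze
# only those whose names match the (assumed) router naming convention.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("path/to/merged-moe-model")  # placeholder

for name, param in model.named_parameters():
    # Assumption: router/gating weights contain "gate" or "router" in their names.
    param.requires_grad = any(key in name for key in ("gate", "router"))

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable:,}")
```

Freezing by parameter name keeps the optimizer state small, which is why router-only training is much cheaper than fully fine-tuning the merged LLM.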
## 🔧 Fixes & Refactoring