New major features:
1. Support for LoRA for the following model architectures: llama3, llama3.1, granite (GPTBigCode and LlamaForCausalLM), mistral, mixtral, and allam
2. Support for QLoRA for the following model architectures: llama3, granite (GPTBigCode and LlamaForCausalLM), mistral, and mixtral
3. Addition of a post-processing function to format tuned adapters as required by vLLM for inference. Refer to the [README](https://github.com/foundation-model-stack/fms-hf-tuning?tab=readme-ov-file#post-processing-needed-for-inference-on-vllm) for how to run it as a script. When tuning on the image, post-processing can be enabled using the flag `lora_post_process_for_vllm`. See the [build README](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/build/README.md#configuration) for details on how to set this flag.
4. Enablement of new flags for throughput improvements: `padding_free` to process multiple examples without adding padding tokens, `multipack` for multi-GPU training to balance the number of tokens processed on each device, and `fast_kernels` for optimized tuning with fused operations and Triton kernels. See the [README](https://github.com/foundation-model-stack/fms-hf-tuning?tab=readme-ov-file#fms-acceleration) for details on how to set these flags and their use cases; a hedged launch sketch follows this list.
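A minimal launch sketch for the throughput flags above. The `tuning/sft_trainer.py` entry point is assumed from the repository layout, and the example model, dataset path, and flag values are illustrative placeholders rather than documented defaults:

```python
# Illustrative sketch only: the flag names come from this release, but the entry
# point, example model, dataset path, and flag values are assumptions.
import subprocess

cmd = [
    "python", "tuning/sft_trainer.py",                            # assumed entry point
    "--model_name_or_path", "ibm-granite/granite-3b-code-base",   # example model
    "--training_data_path", "train.jsonl",                        # example dataset
    "--output_dir", "./out",
    "--padding_free", "huggingface",            # process examples without padding tokens (value assumed)
    "--multipack", "16",                        # balance tokens per GPU in multi-GPU runs (value assumed)
    "--fast_kernels", "True", "True", "True",   # fused operations and Triton kernels (values assumed)
]
subprocess.run(cmd, check=True)
```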
Dependency upgrades:
1. Upgraded `transformers` to version 4.44.2, needed for tuning all models
2. Upgraded `accelerate` to version 0.33, needed for tuning all models. Version 0.34.0 has a bug affecting FSDP.
API/interface changes:
1. The `train()` API now returns a tuple of the trainer instance and a dict of additional metadata; a hedged usage sketch follows.
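A minimal sketch of the new return value, assuming the `tuning.sft_trainer` module and the config dataclasses described in the project README; argument values are placeholders:

```python
# Sketch of the new train() return signature; module and dataclass names follow the
# project README and may differ slightly between versions.
from tuning import sft_trainer
from tuning.config import configs

model_args = configs.ModelArguments(model_name_or_path="ibm-granite/granite-3b-code-base")
data_args = configs.DataArguments(training_data_path="train.jsonl")
train_args = configs.TrainingArguments(output_dir="./out", num_train_epochs=1)

# train() now returns a tuple: the trainer instance plus a dict of additional metadata.
trainer, additional_metadata = sft_trainer.train(model_args, data_args, train_args)
```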
Additional features and fixes:
1. Support for resuming tuning from an existing checkpoint. Refer to the [README](https://github.com/foundation-model-stack/fms-hf-tuning?tab=readme-ov-file#resuming-tuning-from-checkpoints) for how to use the flag. The flag `resume_training` defaults to `True`; a hedged sketch follows this list.
2. Addition of a default pad token to the tokenizer when the `EOS` and `PAD` tokens are equal, to improve training quality (see the tokenizer sketch after this list).
3. JSON compatibility for input datasets. See the [docs](https://github.com/foundation-model-stack/fms-hf-tuning?tab=readme-ov-file#data-format) for details on data formats.
4. Fix to not resize the embedding layer by default; the embedding layer can still be resized as needed using the flag `embedding_size_multiple_of`.
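A hedged sketch of checkpoint resumption (item 1 above), assuming `resume_training` is accepted alongside the other training arguments; the values are placeholders:

```python
# Assumption: resume_training is passed with the training arguments. It defaults to
# True, so re-running with the same output_dir continues from the latest checkpoint.
from tuning.config import configs

train_args = configs.TrainingArguments(
    output_dir="./out",        # same output_dir as the interrupted run
    resume_training=True,      # set to False to start a fresh run instead of resuming
)
```

And a standalone illustration of the pad-token behavior (item 2 above) using the standard `transformers` tokenizer API; fms-hf-tuning applies this internally, the snippet only shows the idea:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")  # example model
if tok.pad_token is None or tok.pad_token_id == tok.eos_token_id:
    # Give the tokenizer a dedicated pad token so padding is not conflated with EOS.
    tok.add_special_tokens({"pad_token": "<pad>"})
```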
Full list of what's changed
* fix: do not resize embedding layer by default by kmehant in https://github.com/foundation-model-stack/fms-hf-tuning/pull/310
* fix: logger is unbound error by HarikrishnanBalagopal in https://github.com/foundation-model-stack/fms-hf-tuning/pull/308
* feat: Enable JSON dataset compatibility by willmj in https://github.com/foundation-model-stack/fms-hf-tuning/pull/297
* doc: How to tune LoRA lm_head by aluu317 in https://github.com/foundation-model-stack/fms-hf-tuning/pull/305
* docs: Add findings from exploration into model tuning performance degradation by willmj in https://github.com/foundation-model-stack/fms-hf-tuning/pull/315
* fix: warnings about casing when building the Docker image by HarikrishnanBalagopal in https://github.com/foundation-model-stack/fms-hf-tuning/pull/318
* fix: need to pass skip_prepare_dataset for pretokenized dataset due to breaking change in HF SFTTrainer by HarikrishnanBalagopal in https://github.com/foundation-model-stack/fms-hf-tuning/pull/326
* feat: install fms-acceleration to enable qlora by anhuong in https://github.com/foundation-model-stack/fms-hf-tuning/pull/284
* feat: Migrating the trainer controller to python logger by seshapad in https://github.com/foundation-model-stack/fms-hf-tuning/pull/309
* fix: remove fire ported from Hari's PR 303 by HarikrishnanBalagopal in https://github.com/foundation-model-stack/fms-hf-tuning/pull/324
* dep: cap transformers version due to FSDP bug by anhuong in https://github.com/foundation-model-stack/fms-hf-tuning/pull/335
* deps: Add protobuf to support aLLaM models by willmj in https://github.com/foundation-model-stack/fms-hf-tuning/pull/336
* fix: add enable_aim build args in all stages needed by anhuong in https://github.com/foundation-model-stack/fms-hf-tuning/pull/337
* fix: remove lm_head post processing by Abhishek-TAMU in https://github.com/foundation-model-stack/fms-hf-tuning/pull/333
* doc: Add qLoRA README by aluu317 in https://github.com/foundation-model-stack/fms-hf-tuning/pull/322
* feat: Add deps to evaluate qLora tuned model by aluu317 in https://github.com/foundation-model-stack/fms-hf-tuning/pull/312
* feat: Add support for smoothly resuming training from a saved checkpoint by Abhishek-TAMU in https://github.com/foundation-model-stack/fms-hf-tuning/pull/300
* ci: add a github workflow to label pull requests based on their title by HarikrishnanBalagopal in https://github.com/foundation-model-stack/fms-hf-tuning/pull/298
* fix: Addition of default pad token in tokenizer when EOS and PAD token are equal by Abhishek-TAMU in https://github.com/foundation-model-stack/fms-hf-tuning/pull/343
* feat: Add DataClass Arguments to Activate Padding-Free and MultiPack Plugin and FastKernels by achew010 in https://github.com/foundation-model-stack/fms-hf-tuning/pull/280
* fix: cap transformers at v4.44 by anhuong in https://github.com/foundation-model-stack/fms-hf-tuning/pull/349
* fix: utilities to post process checkpoint for LoRA by Ssukriti in https://github.com/foundation-model-stack/fms-hf-tuning/pull/338
* feat: Add post processing logic to accelerate launch by willmj in https://github.com/foundation-model-stack/fms-hf-tuning/pull/351
* build: install additional fms-acceleration plugins by anhuong in https://github.com/foundation-model-stack/fms-hf-tuning/pull/350
* fix: unable to find output_dir in multi-GPU during resume_from_checkpoint check by Abhishek-TAMU in https://github.com/foundation-model-stack/fms-hf-tuning/pull/352
* fix: check for wte.weight along with embed_tokens.weight by willmj in https://github.com/foundation-model-stack/fms-hf-tuning/pull/356
* release: merge set of changes for v2.0.0 by Abhishek-TAMU in https://github.com/foundation-model-stack/fms-hf-tuning/pull/357
* build(deps): unset hardcoded trl version to get latest updates by anhuong in https://github.com/foundation-model-stack/fms-hf-tuning/pull/358
New Contributors
* achew010 made their first contribution in https://github.com/foundation-model-stack/fms-hf-tuning/pull/280
**Full Changelog**: https://github.com/foundation-model-stack/fms-hf-tuning/compare/v1.2.2...v2.0.0