Mamba Model Support
- Add mamba and cuda dependencies to support tuning Mamba and Jamba models.
Configuration Updates
- Add a new `add_special_tokens` argument that adds special tokens to the tokenizer's vocabulary.
Data Preprocessor Updates
- Enable `streaming` in `DataSetConfig` to load large datasets using HF `IterableDataset`.
- Add support for `chat_template` in `DataPreProcessorConfig`. Fix a parsing issue when passing `chat_template` via CLI args.
- Add a tokenizer data handler that gives users direct control over the dataset and over truncating samples.
- Fix a multi-process broadcast error when running the data preprocessor.
- Update the data handler backend to support handler types beyond Map style, and add support for Filter type data handlers.
- Fix a data handler to string conversion bug.
- Allow users to use `tools` and `documents` in chat templates, and to pass additional data arguments to the data handler so that it can process a subfield of a data sample.
- Add a data preprocessing script that decouples the data preprocessor from tuning. Users can execute [`offline_data_processing.py`](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/scripts/offline_data_processing.py) to run the data preprocessor standalone.
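
The difference between Map and Filter style handlers, and why lazy (streaming) iteration helps with large datasets, can be sketched in plain Python. This is a conceptual illustration only and not the fms-hf-tuning API: the names `apply_handlers`, `truncate_text`, `drop_short`, and `example_stream` are hypothetical, and a generator stands in for an HF iterable dataset.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Sample = Dict[str, str]
# Each handler is a (kind, function) pair; kind is "map" or "filter".
Handler = Tuple[str, Callable]


def truncate_text(sample: Sample, max_chars: int = 10) -> Sample:
    """Map-style handler (hypothetical): returns a transformed copy of the sample."""
    return {**sample, "text": sample["text"][:max_chars]}


def drop_short(sample: Sample, min_chars: int = 3) -> bool:
    """Filter-style handler (hypothetical): returns True to keep the sample."""
    return len(sample["text"]) >= min_chars


def apply_handlers(samples: Iterable[Sample], handlers: List[Handler]) -> Iterator[Sample]:
    """Lazily apply handlers in order, one sample at a time."""
    for sample in samples:
        keep = True
        for kind, fn in handlers:
            if kind == "map":
                sample = fn(sample)
            elif kind == "filter":
                if not fn(sample):
                    keep = False
                    break
            else:
                raise ValueError(f"unknown handler kind: {kind}")
        if keep:
            yield sample


def example_stream() -> Iterator[Sample]:
    """Stand-in for a large dataset that is read lazily instead of loaded fully."""
    for text in ["hi", "hello world, this is long", "okay"]:
        yield {"text": text}


# Filter first, then truncate; nothing is computed until the iterator is consumed.
processed = list(
    apply_handlers(example_stream(), [("filter", drop_short), ("map", truncate_text)])
)
```

A Map handler returns a new sample, while a Filter handler returns a boolean that decides whether the sample is kept. Because `apply_handlers` is a generator, only one sample is held in memory at a time, which is the property streaming relies on.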
Dependency Updates
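- simpleeval: upper bound raised from <1.0 to <2.0 (lower bound >=0.9.13 unchanged)
- transformers: upgraded from 4.48.1 to 4.49
- datasets: upper bound raised from <3.0 to <4.0 (lower bound >=2.15.0 unchanged)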
Additional changes
- Support automatic Hugging Face checkpointing for ScatterMoE. The converted checkpoint can be found in the `hf_converted_checkpoint` folder within every saved checkpoint directory.
- Add a GitHub Action to free up disk space.
- Remove deprecated `push_to_hub_token` to resolve warning.
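
To locate the converted ScatterMoE checkpoints after a run, something like the following sketch may help. It assumes that checkpoints are saved as `checkpoint-*` directories under the output directory (the usual Hugging Face Trainer convention); the helper name `find_converted_checkpoints` is hypothetical and not part of fms-hf-tuning.

```python
from pathlib import Path
from typing import List


def find_converted_checkpoints(output_dir: str) -> List[Path]:
    """Return the hf_converted_checkpoint folders found inside checkpoint directories."""
    root = Path(output_dir)
    # Assumption: each saved checkpoint lives in a directory named checkpoint-*.
    return sorted(
        ckpt / "hf_converted_checkpoint"
        for ckpt in root.glob("checkpoint-*")
        if (ckpt / "hf_converted_checkpoint").is_dir()
    )
```

Checkpoint directories without a converted folder are skipped, so the result lists only the paths that exist.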
Full list of Changes
* build(deps): Update simpleeval requirement from <1.0,>=0.9.13 to >=0.9.13,<2.0 by dependabot in https://github.com/foundation-model-stack/fms-hf-tuning/pull/369
* feat: Enable streaming in data preprocessor by willmj in https://github.com/foundation-model-stack/fms-hf-tuning/pull/437
* feat: Support for add special tokens via cli args by YashasviChaurasia in https://github.com/foundation-model-stack/fms-hf-tuning/pull/473
* feat: add support for chat template from data config by YashasviChaurasia in https://github.com/foundation-model-stack/fms-hf-tuning/pull/474
* feat: Add tokenizer data handler. by dushyantbehl in https://github.com/foundation-model-stack/fms-hf-tuning/pull/487
* build(deps): upgrade transformers to 4.49 by anhuong in https://github.com/foundation-model-stack/fms-hf-tuning/pull/485
* fix: use main_process_first instead of broadcast_object_list by willmj in https://github.com/foundation-model-stack/fms-hf-tuning/pull/458
* feat: Update data handler backend and introduce filter based handlers. by dushyantbehl in https://github.com/foundation-model-stack/fms-hf-tuning/pull/488
* fix: data handler to string conversion bug by dushyantbehl in https://github.com/foundation-model-stack/fms-hf-tuning/pull/490
* feat: add sum op for trainer controller by kmehant in https://github.com/foundation-model-stack/fms-hf-tuning/pull/491
* feat: support moe hf chkpt by kmehant in https://github.com/foundation-model-stack/fms-hf-tuning/pull/486
* chore: Add a GH runner to free up disk space by aluu317 in https://github.com/foundation-model-stack/fms-hf-tuning/pull/496
* build(deps): Update datasets requirement from <3.0,>=2.15.0 to >=2.15.0,<4.0 by dependabot in https://github.com/foundation-model-stack/fms-hf-tuning/pull/340
* fix: Remove deprecated push_to_hub_token to resolve warning by Luka-D in https://github.com/foundation-model-stack/fms-hf-tuning/pull/419
* feat: Add tools and documents usage in chat template by dushyantbehl in https://github.com/foundation-model-stack/fms-hf-tuning/pull/495
* test:Addition of data preprocessing script, decoupling data preprocessor from the tuning by Abhishek-TAMU in https://github.com/foundation-model-stack/fms-hf-tuning/pull/459
* fix: bug which caused test case failures after PR merge by dushyantbehl in https://github.com/foundation-model-stack/fms-hf-tuning/pull/500
* build(deps): changes needed to support mamba/jamba model by anhuong in https://github.com/foundation-model-stack/fms-hf-tuning/pull/400
**Full Changelog**: https://github.com/foundation-model-stack/fms-hf-tuning/compare/v2.6.0...v2.7.1