What's New
Features
- **MLflow Logging Backend** (#680)
  - Added `MLflowHandler` class for logging training metrics to MLflow
  - New `TrainingArgs` fields: `mlflow_tracking_uri`, `mlflow_experiment_name`, `mlflow_run_name`
  - Added `wandb_project`, `wandb_entity`, `wandb_run_name` fields for W&B configuration
  - Added `tensorboard_log_dir` field for configurable TensorBoard log directory
  - New optional install targets: `requirements-mlflow.txt`, `requirements-wandb.txt`, `requirements-tensorboard.txt`
- **Transformers v5 Compatibility** (#681)
  - Updated tokenizer API calls to use `extra_special_tokens` instead of `additional_special_tokens`
  - Suppressed verbose `httpx` HTTP request logs from `huggingface_hub`
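
The new logging fields listed above could be grouped roughly as follows. This is a minimal sketch, not the actual `TrainingArgs` definition: the field names come from these notes, while the `LoggingFields` class, the `configured_backends` helper, the `Optional[str]` types, the `None` defaults, and the example URI are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class LoggingFields:
    """Hypothetical stand-in for the logging-related TrainingArgs fields."""

    # Field names are taken from the release notes.
    # The Optional[str] type and None default are assumptions.
    mlflow_tracking_uri: Optional[str] = None
    mlflow_experiment_name: Optional[str] = None
    mlflow_run_name: Optional[str] = None
    wandb_project: Optional[str] = None
    wandb_entity: Optional[str] = None
    wandb_run_name: Optional[str] = None
    tensorboard_log_dir: Optional[str] = None


def configured_backends(args: LoggingFields) -> list[str]:
    """Return the backend prefixes (mlflow, wandb, tensorboard) that have a field set.

    Hypothetical helper: it only inspects the field-name prefix and does not
    reflect how the library decides which logging handlers to enable.
    """
    seen: list[str] = []
    for f in fields(args):
        backend = f.name.split("_", 1)[0]
        if getattr(args, f.name) is not None and backend not in seen:
            seen.append(backend)
    return seen


# Example values are placeholders, not defaults from the library.
example = LoggingFields(
    mlflow_tracking_uri="http://localhost:5000",
    tensorboard_log_dir="./runs",
)
print(configured_backends(example))
```

Check the library's `config.py` for the real types and defaults before relying on any of these.
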
Bug Fixes
- **HYBRID_SHARD Failure Fix** (#682)
  - Added detection for FSDP configurations where `world_size < num_devices_per_node`
  - Automatically falls back to `FULL_SHARD` with a warning when `HYBRID_SHARD` would fail
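
The fallback described above can be sketched in plain Python. This is an illustration of the stated condition only, not the code from `accelerator.py`: the `resolve_sharding_strategy` function, its string-based strategy names, and the warning text are assumptions.

```python
import warnings


def resolve_sharding_strategy(
    requested: str, world_size: int, num_devices_per_node: int
) -> str:
    """Return the sharding strategy to use, falling back when HYBRID_SHARD would fail.

    Hypothetical helper: strategies are plain strings here, whereas the real
    code presumably works with the FSDP sharding strategy type.
    """
    if requested == "HYBRID_SHARD" and world_size < num_devices_per_node:
        # Condition taken from the release notes: fewer processes than
        # devices on the node, so fall back and warn instead of failing.
        warnings.warn(
            f"HYBRID_SHARD requested with world_size={world_size} < "
            f"num_devices_per_node={num_devices_per_node}; using FULL_SHARD instead.",
            stacklevel=2,
        )
        return "FULL_SHARD"
    return requested


# Two processes on an eight-device node trigger the fallback.
print(resolve_sharding_strategy("HYBRID_SHARD", world_size=2, num_devices_per_node=8))
```

Any other strategy, or a `HYBRID_SHARD` request where `world_size` is at least `num_devices_per_node`, passes through unchanged in this sketch.
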
Development
- **Tox-UV Integration** (#676)
  - Added `tox-uv` as a tox requirement with `uv-venv-runner`
  - Updated GitHub workflows to use `uv` for package installation
  - Replaced `pip install` with `uv pip install` in CI workflows
What's Changed
* adds integration for tox-uv and updates workflows to use tox-uv by RobotSail in https://github.com/instructlab/training/pull/676
* Add transformers v5 compatibility by Maxusmusti in https://github.com/instructlab/training/pull/681
* Fix HYBRID_SHARD failure when world_size < available GPUs by rtj1 in https://github.com/instructlab/training/pull/682
* Add MLflow support and expose logging configuration in TrainingArgs by RobotSail in https://github.com/instructlab/training/pull/680
New Contributors
* rtj1 made their first contribution in https://github.com/instructlab/training/pull/682 🎉
Files Changed
18 files changed with 482 insertions and 83 deletions:
- Core training modules: `logger.py`, `config.py`, `accelerator.py`, `data_process.py`, `tokenizer_utils.py`, `main_ds.py`
- New requirements files for optional logging backends
- Updated CI workflows and tox configuration
**Full Changelog:** https://github.com/instructlab/training/compare/v0.13.0...v0.14.0