New Features
Open Source Embedding + Contrastive Code (https://github.com/mosaicml/llm-foundry/pull/1615)
LLM Foundry now supports finetuning embedding models with a contrastive loss. Negative passages for the loss can be either randomly selected or pre-defined. For more information, see the [readme](https://github.com/mosaicml/llm-foundry/blob/main/llmfoundry/models/llm_embed/README.md).
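To make the idea concrete, here is a minimal, dependency-free sketch of one common contrastive objective (InfoNCE over cosine similarities) together with random negative selection. It is illustrative only and is not LLM Foundry's implementation or API: the function names `info_nce_loss` and `sample_random_negatives`, the `temperature` default, and the toy vectors are all assumptions made for this example. Consult the readme for the actual configuration options.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce_loss(query, positive, negatives, temperature=0.05):
    """Hypothetical helper: InfoNCE loss for a single query.

    The positive passage sits at index 0 of the logits, so the loss is
    the cross-entropy of a softmax over [positive, *negatives].
    """
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, neg) / temperature for neg in negatives]
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[0]


def sample_random_negatives(corpus, positive_index, k, seed=0):
    """Hypothetical helper: pick k passages other than the positive one."""
    rng = random.Random(seed)
    candidates = [p for i, p in enumerate(corpus) if i != positive_index]
    return rng.sample(candidates, k)


# Toy 2-d "embeddings": the first passage matches the query.
corpus = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.0]

# Randomly selected negatives.
random_negs = sample_random_negatives(corpus, positive_index=0, k=2)
loss_random = info_nce_loss(query, corpus[0], random_negs)

# Pre-defined negatives: the caller supplies them directly.
loss_predefined = info_nce_loss(query, corpus[0], [corpus[1]])
```

In both cases the loss is small because the query is aligned with its positive passage; swapping the positive and a negative would make it large. The only difference between the two modes in this sketch is where the list of negatives comes from.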
PyTorch 2.5.1 (https://github.com/mosaicml/llm-foundry/pull/1665)
This release updates LLM Foundry to PyTorch 2.5.1, bringing support for its new features and optimizations.
Improved error messages (https://github.com/mosaicml/llm-foundry/pull/1657, https://github.com/mosaicml/llm-foundry/pull/1660, https://github.com/mosaicml/llm-foundry/pull/1623, https://github.com/mosaicml/llm-foundry/pull/1625)
Several error messages have been improved, making user errors easier to debug.
What's Changed
* Update mcli examples to use 0.14.0 by irenedea in https://github.com/mosaicml/llm-foundry/pull/1624
* Open Source Embedding + Contrastive Code by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1615
* Catch delta table not found error by milocress in https://github.com/mosaicml/llm-foundry/pull/1625
* Add Mlflow 403 PL UserError by mattyding in https://github.com/mosaicml/llm-foundry/pull/1623
* Catches when data prep cluster fails to start by milocress in https://github.com/mosaicml/llm-foundry/pull/1628
* Bump mlflow max version by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1629
* add another cluster connection failure wrapper by milocress in https://github.com/mosaicml/llm-foundry/pull/1630
* Add MLflow `log_model` option by nancyhung in https://github.com/mosaicml/llm-foundry/pull/1544
* Move loss generating token counting to the dataloader by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1632
* Bump databricks-connect from 14.1.0 to 15.4.3 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1636
* Fix dataset download location by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1639
* Revert "Bump databricks-connect from 14.1.0 to 15.4.3" by XiaohanZhangCMU in https://github.com/mosaicml/llm-foundry/pull/1640
* Bump transformers version by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1631
* Fix gpu tests test_tp_train and test_huggingface_conversion_callback_interval by irenedea in https://github.com/mosaicml/llm-foundry/pull/1642
* Update datasets requirement from <2.20,>=2.19 to >=2.20.0,<2.21 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1330
* Add max shard size to transformers save_pretrained by b-chu in https://github.com/mosaicml/llm-foundry/pull/1648
* Update huggingface-hub requirement from <0.25,>=0.19.0 to >=0.19.0,<0.27 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1652
* Update accelerate requirement from <0.34,>=0.25 to >=0.25,<1.2 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1633
* Catch Delta Table Not Found by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1653
* Add Exception for missing UC column by milocress in https://github.com/mosaicml/llm-foundry/pull/1654
* Infer step size for Embeddings by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1647
* Pin FAv2 by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1656
* Retry catching BlockingIOError by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1657
* Catch bad data prep by milocress in https://github.com/mosaicml/llm-foundry/pull/1644
* Update pytest-cov requirement from <6,>=4 to >=4,<7 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1663
* Bump coverage[toml] from 7.6.1 to 7.6.4 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1650
* Move transform_model_pre_registration in hf_checkpointer by irenedea in https://github.com/mosaicml/llm-foundry/pull/1664
* Catch Cluster Permissions Error by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1660
* Mosaicml version bump by j316chuck in https://github.com/mosaicml/llm-foundry/pull/1661
* Changes for removing unused terms in CE loss fn by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1643
* Update setuptools requirement from <68.0.0 to <76.0.0 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1662
* Bump docker version to torch 2.5.1 by j316chuck in https://github.com/mosaicml/llm-foundry/pull/1665
* Bump ubuntu 22.04 + torch 2.5.1 by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1666
New Contributors
* mattyding made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1623
**Full Changelog**: https://github.com/mosaicml/llm-foundry/compare/v0.14.5...v0.15.0