llm-foundry

Latest version: v0.18.0


0.13.1

What's Changed
* Add configurability to HF checkpointer timeout by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1599

**Full Changelog**: https://github.com/mosaicml/llm-foundry/compare/v0.13.0...v0.13.1

0.13.0

🛠️ Bug Fixes & Cleanup
PyTorch 2.4 Checkpointing (1569, 1581, 1583)
Resolved issues related to checkpointing for Curriculum Learning (CL) callbacks.

🔧 Dependency Updates
Bumped tiktoken from 0.4.0 to 0.8.0 (1572)
Updated onnxruntime from 1.19.0 to 1.19.2 (1590)

What's Changed
* Update mcli yamls by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1552
* Use `allenai/c4` instead of `c4` dataset by eitanturok in https://github.com/mosaicml/llm-foundry/pull/1554
* Tensor Parallelism by eitanturok in https://github.com/mosaicml/llm-foundry/pull/1521
* Insufficient Permissions Error when trying to access table by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1555
* Add NoOp optimizer by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1560
* Deterministic GCRP Errors by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1559
* Simplify CL API by b-chu in https://github.com/mosaicml/llm-foundry/pull/1510
* Reapply 1389 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1561
* Add dataset swap callback by b-chu in https://github.com/mosaicml/llm-foundry/pull/1536
* Add error to catch more unknown example types by milocress in https://github.com/mosaicml/llm-foundry/pull/1562
* Add FileExtensionNotFoundError by b-chu in https://github.com/mosaicml/llm-foundry/pull/1564
* Add InvalidConversationError by b-chu in https://github.com/mosaicml/llm-foundry/pull/1565
* Release docker img by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1547
* Revert FT dataloader changes from 1561, keep 1564 by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1566
* Cleanup TP by eitanturok in https://github.com/mosaicml/llm-foundry/pull/1556
* Changes for dataset swap callback by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1569
* Do not consider run_name when auto-detecting autoresume by irenedea in https://github.com/mosaicml/llm-foundry/pull/1571
* Allow parameters with requires_grad=False in meta init by sashaDoubov in https://github.com/mosaicml/llm-foundry/pull/1567
* Bump tiktoken from 0.4.0 to 0.8.0 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1572
* Add extensions to FinetuningFileNotFoundError by b-chu in https://github.com/mosaicml/llm-foundry/pull/1578
* Handle long file names in convert text to mds by irenedea in https://github.com/mosaicml/llm-foundry/pull/1579
* Set streaming log level by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1582
* Fix pytorch checkpointing for CL callback by b-chu in https://github.com/mosaicml/llm-foundry/pull/1581
* Fix pytorch checkpointing for CL callback by b-chu in https://github.com/mosaicml/llm-foundry/pull/1583
* Error if filtered dataset contains 0 examples by irenedea in https://github.com/mosaicml/llm-foundry/pull/1585
* Change cluster errors from NetworkError to UserError by irenedea in https://github.com/mosaicml/llm-foundry/pull/1586
* Do not autoresume if a default name is set, only on user defined ones by irenedea in https://github.com/mosaicml/llm-foundry/pull/1588
* Bump onnxruntime from 1.19.0 to 1.19.2 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1590
* Make FinetuningStreamingDataset parameters more flexible by XiaohanZhangCMU in https://github.com/mosaicml/llm-foundry/pull/1580
* Add build callback tests by irenedea in https://github.com/mosaicml/llm-foundry/pull/1577
* Bump version to 0.14.0.dev0 by irenedea in https://github.com/mosaicml/llm-foundry/pull/1587
* Fix typo in eval code by using 'fsdp' instead of 'fsdp_config' by irenedea in https://github.com/mosaicml/llm-foundry/pull/1593


**Full Changelog**: https://github.com/mosaicml/llm-foundry/compare/v0.12.0...v0.13.0

0.12.0

New Features

PyTorch 2.4 (1505)
This release updates LLM Foundry to the PyTorch 2.4 release, bringing with it support for the new features and optimizations in PyTorch 2.4.

Extensibility improvements (1450, 1449, 1468, 1467, 1478, 1493, 1495, 1511, 1512, 1527)
Numerous improvements to the extensibility of the modeling and data loading code, enabling easier reuse for subclassing and extending. Please see the linked PRs for more details on each change.

Improved error messages (1457, 1459, 1519, 1518, 1522, 1534, 1548, 1551)
Various error messages were improved, making user errors clearer to debug.

Sliding window in torch attention (1455)
We've added support for sliding window attention to the reference attention implementation, allowing easier testing and comparison against more optimized attention variants.
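The masking idea behind sliding window attention can be sketched in a few lines of plain Python (this is an illustration of the concept, not the llm-foundry implementation): each query position attends only to the most recent `window_size` positions, up to and including itself.

```python
# Illustrative sketch: causal sliding window attention restricts query
# position i to keys j in the window (i - window_size, i].

def sliding_window_mask(seq_len: int, window_size: int) -> list[list[bool]]:
    """Return a boolean mask where mask[i][j] is True if position i may attend to j."""
    return [
        [i - window_size < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(seq_len=5, window_size=2)
# Each row allows at most `window_size` positions, ending at the diagonal.
```

The optimized kernels (e.g. flash attention) implement the same constraint without materializing the full mask; having it in the reference torch path makes the two directly comparable.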

Bug fixes

Extra BOS token for Llama 3.1 with completion data (1476)
A bug resulted in an extra BOS token being added between prompt and response during finetuning. This is fixed so that the prompt and response supplied by the user are concatenated without any extra tokens put between them.
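The effect of the fix can be sketched with made-up token ids (BOS = 1 here; this is not the llm-foundry code): if the tokenizer prepends a BOS to the response, that duplicate is dropped before concatenation so no extra token sits between prompt and response.

```python
# Illustrative sketch with hypothetical token ids (BOS = 1): the bug added
# an extra BOS between prompt and response; the fix concatenates them
# directly, stripping any leading BOS the tokenizer added to the response.

BOS = 1

def concat_prompt_response(prompt_ids: list[int], response_ids: list[int]) -> list[int]:
    # Drop a duplicated BOS at the start of the response, if present.
    if response_ids and response_ids[0] == BOS:
        response_ids = response_ids[1:]
    return prompt_ids + response_ids

tokens = concat_prompt_response([BOS, 42, 43], [BOS, 44, 45])
# tokens == [1, 42, 43, 44, 45] — no BOS between prompt and response
```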

What's Changed
* Add test for logged_config transforms by b-chu in https://github.com/mosaicml/llm-foundry/pull/1441
* Bump version to 0.12.0.dev0. by irenedea in https://github.com/mosaicml/llm-foundry/pull/1447
* Update pytest-codeblocks requirement from <0.17,>=0.16.1 to >=0.16.1,<0.18 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1445
* Bump coverage[toml] from 7.4.4 to 7.6.1 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1442
* Enabled generalizing build_inner_model in ComposerHFCausalLM by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1450
* Update llm foundry version in mcli yamls by irenedea in https://github.com/mosaicml/llm-foundry/pull/1451
* merge to main by XiaohanZhangCMU in https://github.com/mosaicml/llm-foundry/pull/865
* allow embedding resizing passed through by jdchang1 in https://github.com/mosaicml/llm-foundry/pull/1449
* Update packaging requirement from <23,>=21 to >=21,<25 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1444
* Update pytest requirement from <8,>=7.2.1 to >=7.2.1,<9 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1443
* Implement ruff rules enforcing PEP 585 by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1453
* Adding sliding window attn to scaled_multihead_dot_product_attention by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1455
* Add user error for UnicodeDeocdeError in convert text to mds by irenedea in https://github.com/mosaicml/llm-foundry/pull/1457
* Fix log_config by josejg in https://github.com/mosaicml/llm-foundry/pull/1432
* Add EnvironmentLogger Callback by josejg in https://github.com/mosaicml/llm-foundry/pull/1350
* Update mosaicml/ci-testing to 0.1.2 by irenedea in https://github.com/mosaicml/llm-foundry/pull/1458
* Correct error message for inference wrapper by josejg in https://github.com/mosaicml/llm-foundry/pull/1459
* Update CI tests to v0.1.2 by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1466
* Bump onnxruntime from 1.18.1 to 1.19.0 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1461
* Update tenacity requirement from <9,>=8.2.3 to >=8.2.3,<10 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1460
* Simple change to enable mapping functions for ft constructor by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1468
* use default eval interval from composer by milocress in https://github.com/mosaicml/llm-foundry/pull/1369
* Consistent Naming EnviromentLoggingCallback by josejg in https://github.com/mosaicml/llm-foundry/pull/1470
* Register NaN Monitor Callback by josejg in https://github.com/mosaicml/llm-foundry/pull/1471
* Add train subset num batches by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1472
* Parent class hf models by jdchang1 in https://github.com/mosaicml/llm-foundry/pull/1467
* Remove extra bos for prompt/response data with llama3.1 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1476
* Add prepare fsdp back by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1477
* Add date_string when applying tokenizer chat template by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1474
* Make sample tokenization extensible by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1478
* Use Streaming version 0.8.1 by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1479
* Bump hf-transfer from 0.1.3 to 0.1.8 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1480
* fix hf checkpointer by milocress in https://github.com/mosaicml/llm-foundry/pull/1489
* Fix device mismatch when running hf.generate by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1486
* Bump composer to 0.24.1 + FSDP config device_mesh deprecation by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1487
* master_weights_dtype not supported by ComposerHFCausalLM.__init__() by eldarkurtic in https://github.com/mosaicml/llm-foundry/pull/1485
* Detect loss spikes and high losses during training by joyce-chen-uni in https://github.com/mosaicml/llm-foundry/pull/1473
* Enable passing in external position ids by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1493
* Align logged attributes for errors and run metadata in kill_loss_spike_callback.py by joyce-chen-uni in https://github.com/mosaicml/llm-foundry/pull/1494
* tokenizer is never built when converting finetuning dataset by eldarkurtic in https://github.com/mosaicml/llm-foundry/pull/1496
* Removing error message for reusing kv cache with torch attn by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1497
* Fix formatting of loss spike & high loss error messages by joyce-chen-uni in https://github.com/mosaicml/llm-foundry/pull/1498
* Enable cross attention layers by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1495
* Update to ci-testing 0.2.0 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1500
* [WIP] Torch 2.4 in docker images by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1491
* [WIP] Only torch 2.4.0 compatible by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1505
* Update mlflow requirement from <2.16,>=2.14.1 to >=2.14.1,<2.17 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1506
* Update ci-testing to 0.2.2 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1503
* Allow passing key_value_statest for x-attn through MPT Block by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1511
* Fix cross attention for blocks by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1512
* Put 2.3 image back in release examples by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1513
* Sort callbacks so that CheckpointSaver goes before HuggingFaceCheckpointer by irenedea in https://github.com/mosaicml/llm-foundry/pull/1515
* Raise MisconfiguredDatasetError from original error by irenedea in https://github.com/mosaicml/llm-foundry/pull/1519
* Peft fsdp by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1520
* Raise DatasetTooSmall exception if canonical nodes is less than num samples by irenedea in https://github.com/mosaicml/llm-foundry/pull/1518
* Add permissions check for delta table reading by irenedea in https://github.com/mosaicml/llm-foundry/pull/1522
* Add HuggingFaceCheckpointer option for only registering final checkpoint by irenedea in https://github.com/mosaicml/llm-foundry/pull/1516
* Replace FSDP args by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1517
* enable correct padding_idx for embedding layers by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1527
* Revert "Replace FSDP args" by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1533
* Delete unneeded inner base model in PEFT HF Checkpointer by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1532
* Add deprecation warning to fsdp_config by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1530
* Fix reuse kv cache for torch attention by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1539
* Error on text dataset file not found by milocress in https://github.com/mosaicml/llm-foundry/pull/1534
* Make ICL tasks not required for eval by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1540
* Bumping flash attention version to 2.6.3 and adding option for softcap in attention and lm_head logits. by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1374
* Register mosaic logger by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1542
* Hfcheckpointer optional generation config by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1543
* Bump composer version to 0.25.0 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1546
* Bump streaming version to 0.9.0 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1550
* Bump version to 0.13.0.dev0 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1549
* Add proper user error for accessing schema by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1548
* Validate Cluster Access Mode by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1551

New Contributors
* jdchang1 made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1449
* joyce-chen-uni made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1473

**Full Changelog**: https://github.com/mosaicml/llm-foundry/compare/v0.11.0...v0.12.0

0.11.0

New Features

LLM Foundry CLI Commands (1337, 1345, 1348, 1354)
We've added CLI commands for our commonly used scripts.

For example, instead of calling `composer llm-foundry/scripts/train.py parameters.yaml`, you can now do `composer -c llm-foundry train parameters.yaml`.


Docker Images Contain All Optional Dependencies (1431)
[LLM Foundry Docker images](https://github.com/mosaicml/llm-foundry?tab=readme-ov-file#mosaicml-docker-images) now have all optional dependencies.


Support for Llama3 Rope Scaling (1391)
To use it, you can add the following to your parameters:

```yaml
model:
  name: mpt_causal_lm
  attn_config:
    rope: true
    ...
    rope_impl: hf
    rope_theta: 500000
    rope_hf_config:
      type: llama3
      ...
```
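For intuition, the base RoPE frequencies that `rope_theta` controls can be sketched as follows (a conceptual illustration, not the llm-foundry code; the `llama3` `rope_hf_config` additionally rescales these frequencies by wavelength band, which is omitted here):

```python
# Illustrative sketch: standard RoPE inverse frequencies for a given
# rope_theta. Each pair of head dimensions rotates at its own frequency;
# larger rope_theta stretches the long-wavelength end, which is what
# llama3-style scaling then adjusts further.

def rope_inv_freq(head_dim: int, rope_theta: float = 500000.0) -> list[float]:
    return [1.0 / (rope_theta ** (2 * i / head_dim)) for i in range(head_dim // 2)]

freqs = rope_inv_freq(head_dim=8)
# freqs[0] == 1.0; frequencies decay geometrically across dimension pairs
```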



Tokenizer Registry (1386)
We now have a tokenizer registry so you can easily add custom tokenizers.

LoadPlanner and SavePlanner Registries (1358)
We now have LoadPlanner and SavePlanner registries so you can easily add custom checkpoint loading and saving logic.


Faster Auto-packing (1435)
The auto packing startup is now much faster. To use auto packing with finetuning datasets, you can add `packing_ratio: auto` to your config like so:

```yaml
train_loader:
  name: finetuning
  dataset:
    ...
    packing_ratio: auto
```
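A packing ratio is roughly how many raw examples fit into one packed sequence of `max_seq_len` tokens. A minimal sketch of estimating it with greedy first-fit packing (an illustration of the idea, not the llm-foundry implementation):

```python
# Illustrative sketch (not the llm-foundry implementation): greedily pack
# example lengths into bins of max_seq_len tokens; the ratio of examples
# to bins approximates a good packing_ratio.

def estimate_packing_ratio(example_lengths: list[int], max_seq_len: int) -> float:
    bins: list[int] = []  # tokens used per packed sequence
    for length in sorted(example_lengths, reverse=True):
        for i, used in enumerate(bins):
            if used + length <= max_seq_len:
                bins[i] += length
                break
        else:
            bins.append(length)  # no bin has room; start a new packed sequence
    return len(example_lengths) / len(bins)

ratio = estimate_packing_ratio([100, 300, 200, 400], max_seq_len=512)
# 4 examples pack into 2 sequences (400+100, 300+200) → ratio 2.0
```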



What's Changed
* Extra serverless by XiaohanZhangCMU in https://github.com/mosaicml/llm-foundry/pull/1320
* Fixing sequence_id =-1 bug, adding tests by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1324
* Registry docs update by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1323
* Add dependabot by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1322
* `HUGGING_FACE_HUB_TOKEN` -> `HF_TOKEN` by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1321
* Bump version by b-chu in https://github.com/mosaicml/llm-foundry/pull/1326
* Relax hf hub pin by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1314
* Error if metadata matches existing keys by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1313
* Update transformers requirement from <4.41,>=4.40 to >=4.42.3,<4.43 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1327
* Bump einops from 0.7.0 to 0.8.0 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1328
* Bump onnxruntime from 1.15.1 to 1.18.1 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1329
* Bump onnx from 1.14.0 to 1.16.1 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1331
* Currently multi-gpu generate does not work with hf.generate for hf checkpoints. This PR fixes that. by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1332
* Fix registry for callbacks with configs by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1333
* Adding a child class of hf's rotary embedding to make hf generate work on multiple gpus. by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1334
* Add a config arg to just save an hf checkpoint by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1335
* Deepcopy config in callbacks_with_config by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1336
* Avoid HF race condition by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1338
* Nicer error message for undefined symbol by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1339
* Bump sentencepiece from 0.1.97 to 0.2.0 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1342
* Removing logging exception through update run metadata by jjanezhang in https://github.com/mosaicml/llm-foundry/pull/1292
* [MCLOUD-4910] Escape UC names during data prep by naren-loganathan in https://github.com/mosaicml/llm-foundry/pull/1343
* Add CLI for train.py by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1337
* Add fp32 to the set of valid inputs to attention layer by j316chuck in https://github.com/mosaicml/llm-foundry/pull/1347
* Log all extraneous_keys in one go for ease of development by josejg in https://github.com/mosaicml/llm-foundry/pull/1344
* Fix MLFlow Save Model for TE by j316chuck in https://github.com/mosaicml/llm-foundry/pull/1353
* Add flag for saving only composer checkpoint by irenedea in https://github.com/mosaicml/llm-foundry/pull/1356
* Expose flag for should_save_peft_only by irenedea in https://github.com/mosaicml/llm-foundry/pull/1357
* Command utils + train by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1361
* Readd Clear Resolver by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1365
* Add Eval to Foundry CLI by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1345
* Enhanced Logging for convert_delta_to_json and convert_text_to_mds by vanshcsingh in https://github.com/mosaicml/llm-foundry/pull/1366
* Add convert_dataset_hf to CLI by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1348
* Add missing init by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1368
* Make ICL dataloaders build lazily by josejg in https://github.com/mosaicml/llm-foundry/pull/1359
* Add option to unfuse Wqkv by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1367
* Add convert_dataset_json to CLI by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1349
* Add convert_text_to_mds to CLI by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1352
* Fix hf dataset hang on small dataset by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1370
* Add LoadPlanner and SavePlanner registries by irenedea in https://github.com/mosaicml/llm-foundry/pull/1358
* Load config on rank 0 first by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1371
* Add convert_finetuning_dataset to CLI by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1354
* Allow for transforms on the model before MLFlow registration by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1372
* Allow flash attention up to 3 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1377
* Update accelerate requirement from <0.26,>=0.25 to >=0.32.1,<0.33 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1341
* update runners by KevDevSha in https://github.com/mosaicml/llm-foundry/pull/1360
* Allow for multiple workers when autopacking by b-chu in https://github.com/mosaicml/llm-foundry/pull/1375
* Allow train.py-like config for eval.py by josejg in https://github.com/mosaicml/llm-foundry/pull/1351
* Fix load and save planner config logic by irenedea in https://github.com/mosaicml/llm-foundry/pull/1385
* Do dtype conversion in torch hook to save memory by irenedea in https://github.com/mosaicml/llm-foundry/pull/1384
* Get a shared file system safe signal file name by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1381
* Add transformation method to hf_causal_lm by irenedea in https://github.com/mosaicml/llm-foundry/pull/1383
* [kushalkodnad/tokenizer-registry] Introduce new registry for tokenizers by kushalkodn-db in https://github.com/mosaicml/llm-foundry/pull/1386
* Bump transformers version to 4.43.1 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1388
* Add convert_delta_to_json to CLI by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1355
* Revert "Use utils to get shared fs safe signal file name (1381)" by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1389
* Avoid race condition in convert text to mds script by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1390
* Refactor loss function for ComposerMPTCausalLM by irenedea in https://github.com/mosaicml/llm-foundry/pull/1387
* Revert "Allow for multiple workers when autopacking (1375)" by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1392
* Bump transformers to 4.43.2 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1393
* Support rope scaling by milocress in https://github.com/mosaicml/llm-foundry/pull/1391
* Removing the extra LlamaRotaryEmbedding import by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1394
* Dtensor oom by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1395
* Condition the meta initialization for hf_causal_lm on pretrain by irenedea in https://github.com/mosaicml/llm-foundry/pull/1397
* Fix license link in readme by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1398
* Enable passing epsilon when building norm layers by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1399
* Add pre register method for mlflow by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1396
* add it by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1400
* Remove orig params default by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1401
* Add spin_dataloaders flag by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1405
* Remove curriculum learning error when duration less than saved timestamp by b-chu in https://github.com/mosaicml/llm-foundry/pull/1406
* Set pretrained model name correctly, if provided, in HF Checkpointer by snarayan21 in https://github.com/mosaicml/llm-foundry/pull/1407
* Enable QuickGelu Function for CLIP models by gupta-abhay in https://github.com/mosaicml/llm-foundry/pull/1408
* Bump streaming version to v0.8.0 by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1411
* Kevin/ghcr build by KevDevSha in https://github.com/mosaicml/llm-foundry/pull/1413
* Update accelerate requirement from <0.33,>=0.25 to >=0.25,<0.34 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1403
* Update huggingface-hub requirement from <0.24,>=0.19.0 to >=0.19.0,<0.25 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1379
* Make Pytest log in color in Github Action by eitanturok in https://github.com/mosaicml/llm-foundry/pull/1412
* Read Package Version Better by eitanturok in https://github.com/mosaicml/llm-foundry/pull/1415
* Log original config by josejg in https://github.com/mosaicml/llm-foundry/pull/1410
* Replace pydocstyle with Ruff by eitanturok in https://github.com/mosaicml/llm-foundry/pull/1417
* test cpu by KevDevSha in https://github.com/mosaicml/llm-foundry/pull/1416
* Update pr-gpu.yaml by KevDevSha in https://github.com/mosaicml/llm-foundry/pull/1420
* Additional registry entrypoint documentation by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1414
* Remove type ignore by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1421
* Update pytest-cov requirement from <5,>=4 to >=4,<6 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1423
* Bump onnx from 1.16.1 to 1.16.2 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1425
* Add transforms to logged config by b-chu in https://github.com/mosaicml/llm-foundry/pull/1428
* Install all optional dependencies in the docker images by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1431
* Raise error when not enough data when converting text to MDS by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1430
* Bump yaml versions by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1433
* Automatically get the portion of the dataset config that is constructor args by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1434
* Remove flash patching for HF by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1436
* Fix the context size in long context gauntlet for wikiqa by bfontain in https://github.com/mosaicml/llm-foundry/pull/1439
* Update mlflow requirement from <2.15,>=2.14.1 to >=2.14.1,<2.16 by dependabot in https://github.com/mosaicml/llm-foundry/pull/1424
* Add special errors for bad chat/ift types by milocress in https://github.com/mosaicml/llm-foundry/pull/1437
* Make autopacking faster by b-chu in https://github.com/mosaicml/llm-foundry/pull/1435
* Use the pretrained generation config if it exists for HF models by irenedea in https://github.com/mosaicml/llm-foundry/pull/1440

New Contributors
* dependabot made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1327
* naren-loganathan made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1343
* vanshcsingh made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1366
* KevDevSha made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1360
* kushalkodn-db made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1386
* gupta-abhay made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1399
* bfontain made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1439

**Full Changelog**: https://github.com/mosaicml/llm-foundry/compare/v0.10.0...v0.11.0

0.10.0

New Features

Registry for ICL datasets (https://github.com/mosaicml/llm-foundry/pull/1252)
In-context learning (ICL) datasets are now exposed through a registry, so you can easily add custom ones.

Curriculum Learning Callback (https://github.com/mosaicml/llm-foundry/pull/1256)
You can now switch dataloaders while training, which enables curriculum learning.

```yaml
train_loader:
  <dataloader parameters>
callback:
  curriculum_learning:
    - duration: <number>tok
      train_loader:  # matches top level train_loader
        <dataloader parameters>
    - duration: <number>tok
      train_loader:
        <dataloader parameters>
    - duration: <number>tok
      train_loader:
        <dataloader parameters>
```
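The stage-selection logic can be sketched like this (a hypothetical helper, not the callback's API): each stage has a token budget, and the active dataloader is the one whose cumulative duration covers the tokens seen so far.

```python
# Illustrative sketch (hypothetical helper, not the callback's API):
# curriculum learning switches dataloaders once each stage's token
# budget is exhausted.

def active_stage(stages: list[dict], tokens_seen: int) -> dict:
    """Pick the stage whose cumulative token duration covers tokens_seen."""
    cumulative = 0
    for stage in stages:
        cumulative += stage["duration_tok"]
        if tokens_seen < cumulative:
            return stage
    return stages[-1]  # stay on the final stage once all durations elapse

stages = [
    {"name": "stage_a", "duration_tok": 1_000},
    {"name": "stage_b", "duration_tok": 2_000},
]
# active_stage(stages, 500)["name"] == "stage_a"
# active_stage(stages, 1_500)["name"] == "stage_b"
```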


[Experimental] Interweave Attention Layers (https://github.com/mosaicml/llm-foundry/pull/1299)
You can now override default block configs for certain layers, allowing for different sliding window sizes, reusing the previous layer's kv cache, etc.

```yaml
model:
  ...
  (usual model configs)
  ...
  block_overrides:
    order:
      - name: default
      - order:
          - name: sliding_window_layer
          - name: sliding_window_layer_reuse
          - name: sliding_window_layer
          - repeat: 2
            name: sliding_window_layer_reuse
          - name: reuse_kv_layer
        repeat: 2
    overrides:
      sliding_window_layer:
        attn_config:
          sliding_window_size: 1024
      sliding_window_layer_reuse:
        attn_config:
          sliding_window_size: 1024
          reuse_kv_layer_idx: -1  # Relative index of the layer whose kv cache to reuse
      reuse_kv_layer:
        attn_config:
          reuse_kv_layer_idx: -6  # Relative index of the layer whose kv cache to reuse
```
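Expanding such a nested `order` spec into one override name per layer can be sketched as follows (an illustration of the recursive structure, not the llm-foundry code): each entry either names a block or nests another `order`, and an optional `repeat` multiplies it.

```python
# Illustrative sketch (not the llm-foundry code): expand a nested `order`
# spec into a flat list of per-layer override names. Entries carry either
# a `name` or a nested `order`, plus an optional `repeat` count.

def expand_order(order: list[dict]) -> list[str]:
    names: list[str] = []
    for entry in order:
        repeat = entry.get("repeat", 1)
        for _ in range(repeat):
            if "order" in entry:
                names.extend(expand_order(entry["order"]))
            else:
                names.append(entry["name"])
    return names

spec = [
    {"name": "default"},
    {
        "order": [
            {"name": "sliding_window_layer"},
            {"name": "sliding_window_layer_reuse", "repeat": 2},
        ],
        "repeat": 2,
    },
]
# expand_order(spec) yields 7 names: 1 default + 2 × (1 + 2)
```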


Bug fixes
* Fix packing + streaming + resumption by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1277

What's Changed
* Bump Version to 0.10.0.dev0 by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1255
* Fix typo in setup.py by XiaohanZhangCMU in https://github.com/mosaicml/llm-foundry/pull/1263
* Update TE Dockerfile by j316chuck in https://github.com/mosaicml/llm-foundry/pull/1265
* Revert "Update TE Dockerfile (1265)" by j316chuck in https://github.com/mosaicml/llm-foundry/pull/1266
* Revert to older TE version by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1267
* Bump Composer to version 0.23.2 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1269
* fix linting by milocress in https://github.com/mosaicml/llm-foundry/pull/1270
* Add torch 2.3.1 docker images by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1275
* Make expandable segments on by default by b-chu in https://github.com/mosaicml/llm-foundry/pull/1278
* Adds CI for torch 2.3.1 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1281
* Update README.md to use variables by milocress in https://github.com/mosaicml/llm-foundry/pull/1282
* Add registry for ICL datasets by sanjari-orb in https://github.com/mosaicml/llm-foundry/pull/1252
* Fix typo in CI by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1284
* Fix backwards compatibility for ICL arg by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1286
* Fix packing + streaming + resumption by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1277
* Dbfs HF by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1214
* Bump mlflow to 2.13.2 by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1285
* Add missing dependency group by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1287
* Update Dockerfile with TE main by j316chuck in https://github.com/mosaicml/llm-foundry/pull/1273
* Fix TE HF checkpoint saving by j316chuck in https://github.com/mosaicml/llm-foundry/pull/1280
* added systemMetricsMonitor callback by JackZ-db in https://github.com/mosaicml/llm-foundry/pull/1260
* Extendability refactors by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1290
* Small refactor for update batch size by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1293
* Bump min composer version to 0.23.3 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1294
* Fix grad accum typing by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1296
* Bump composer to 0.23.4 by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1297
* Allow passing in lbl_process_group directly by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1298
* Add `all` transforms to train script by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1300
* Add Retries to run_query by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1302
* Bumping mlflow version to include buffering by JackZ-db in https://github.com/mosaicml/llm-foundry/pull/1303
* Ignore mosaicml logger for exception if excephook is active by jjanezhang in https://github.com/mosaicml/llm-foundry/pull/1301
* Add curriculum learning callback by b-chu in https://github.com/mosaicml/llm-foundry/pull/1256
* Avoid circular import in hf checkpointer by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1304
* Remove codeql workflow by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1305
* Update CI test to v0.0.8 by KuuCi in https://github.com/mosaicml/llm-foundry/pull/1306
* Upgrade ci testing to 0.0.8 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1307
* Bump ci-testing to 0.0.9 by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1310
* Fix 4 gpu tests by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1311
* Bump recommended images to 2.3.1 and remove 2.3.0 CI by dakinggg in https://github.com/mosaicml/llm-foundry/pull/1312
* Provide default seed value in TrainConfig, matching EvalConfig by mvpatel2000 in https://github.com/mosaicml/llm-foundry/pull/1315
* Refactor hf checkpointer for config transformations by irenedea in https://github.com/mosaicml/llm-foundry/pull/1318
* Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc. by ShashankMosaicML in https://github.com/mosaicml/llm-foundry/pull/1299
* Add optional logging of text output to EvalOutputLogging by sjawhar in https://github.com/mosaicml/llm-foundry/pull/1283

New Contributors
* sanjari-orb made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1252
* JackZ-db made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1260
* sjawhar made their first contribution in https://github.com/mosaicml/llm-foundry/pull/1283

**Full Changelog**: https://github.com/mosaicml/llm-foundry/compare/v0.9.1...v0.10.0

0.9.1

This is a minor patch release that bumps the minimum version of mlflow to ensure writes are buffered (https://github.com/mosaicml/composer/pull/3401).

What's Changed
* Bumping mlflow version to include buffering by JackZ-db in https://github.com/mosaicml/llm-foundry/pull/1303

**Full Changelog**: https://github.com/mosaicml/llm-foundry/compare/v0.9.0...v0.9.1
