Axolotl

Latest version: v0.5.2


0.5.2

What's Changed
* move deprecated kwargs from trainer to trainingargs by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2028
* add axolotlai docker hub org to publish list by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2031
* update actions version for node16 deprecation by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2037
* replace references to personal docker hub to org docker hub by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2036
* feat: add metharme chat_template by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2033
* change deprecated Stub to App by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2038
* fix: handle sharegpt dataset missing by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2035
* add P2P env when multi-gpu but not the full node by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2041
* invert the string in string check for p2p device check by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2044
* feat: print out dataset length even if not preprocessed by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2034
* Add example YAML file for training Mistral using DPO by olivermolenschot in https://github.com/axolotl-ai-cloud/axolotl/pull/2029
* fix: inference not using chat_template by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2019
* feat: cancel ongoing tests if new CI is triggered by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2046
* feat: upgrade to liger 0.4.1 by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2045
* run pypi release action on tag create w version by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2047
* make sure to tag images in docker for tagged releases by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2051
* retry flaky test_packing_stream_dataset test that times out on read by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2052
* install default torch version if not already, new xformers wheels for torch 2.5.x by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2049
* fix push to main and tag semver build for docker ci by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2054
* Update unsloth for torch.cuda.amp deprecation by bursteratom in https://github.com/axolotl-ai-cloud/axolotl/pull/2042
* don't cancel the tests on main automatically for concurrency by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2055
* ADOPT optimizer integration by bursteratom in https://github.com/axolotl-ai-cloud/axolotl/pull/2032
* Grokfast support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1917
* upgrade to flash-attn 2.7.0 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2048
* make sure to add tags for versioned tag on cloud docker images by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2060
* fix duplicate base build by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2061
* fix env var extraction by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2043
* gradient accumulation tests, embeddings w pad_token fix, smaller models by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2059
* upgrade datasets==3.1.0 and add upstream check by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2067
* update the to-be-deprecated evaluation_strategy by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1682
* remove the bos token from dpo outputs by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1733
* support passing trust_remote_code to dataset loading by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2050
* support for schedule free and e2e ci smoke test by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2066
* Fsdp grad accum monkeypatch by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2064
* fix: loading locally downloaded dataset by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2056
* Update `get_unpad_data` patching for multipack by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/2013
* increase worker count to 8 for basic pytests by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2075
* upgrade autoawq==0.2.7.post2 for transformers fix by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2070
* optim e2e tests to run a bit faster by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2069
* don't build bdist by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2076
* static assets, readme, and badges update v1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2077
* Readme updates v2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2078
* bump transformers for fsdp-grad-accum fix, remove patch by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2079
* Feat: Drop long samples and shuffle rl samples by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2040
* add optimizer step to prevent warning in tests by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1502
* fix brackets on docker ci builds, add option to skip e2e builds by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2080
* remove deprecated extra metadata kwarg from pydantic Field by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2081
* release version 0.5.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2082
* make sure action has permission to create release by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2083
* set manifest and fix for source dist by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2084
* add missing dunder-init for monkeypatches and add tests for install from sdist by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2085
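
Several of the changes above surface as axolotl YAML config options. The fragment below is a hypothetical sketch of how a few of them might combine; the key names are inferred from the PR titles, not verified against the 0.5.2 schema, and the model/dataset paths are placeholders:

```yaml
# Hypothetical sketch only; option names inferred from the PR titles above.
base_model: mistralai/Mistral-7B-v0.1       # placeholder model

datasets:
  - path: my-org/my-dataset                 # placeholder dataset
    type: chat_template
    trust_remote_code: true                 # #2050: forwarded to dataset loading

optimizer: adopt_adamw                      # #2032: ADOPT optimizer (name assumed)
plugins:
  - axolotl.integrations.liger.LigerPlugin  # #2045: liger upgraded to 0.4.1
```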

New Contributors
* olivermolenschot made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/2029

**Full Changelog**: https://github.com/axolotl-ai-cloud/axolotl/compare/v0.5.0...v0.5.2

0.5.0

What's Changed
* fix(log): improve warning to clarify that lora_modules_to_save expect a list by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1197
* Add: colab example by JohanWork in https://github.com/axolotl-ai-cloud/axolotl/pull/1196
* Feat/chatml add system message by mhenrichsen in https://github.com/axolotl-ai-cloud/axolotl/pull/1117
* fix learning rate scheduler's warnings by RicardoDominguez in https://github.com/axolotl-ai-cloud/axolotl/pull/1135
* precompute dpo logprobs setting and fixes by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1199
* Update deps 202401 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1204
* make sure to register the base chatml template even if no system message is provided by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1207
* workaround for transformers bug requiring do_sample for saving pretrained by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1206
* more checks and fixes for deepspeed and fsdp by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1208
* drop py39 docker images, add py311, upgrade pytorch to 2.1.2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1205
* Update qlora.yml - DeprecationWarning: `max_packed_sequence_len` is n… by 7flash in https://github.com/axolotl-ai-cloud/axolotl/pull/1210
* Respect sliding_window=None by DreamGenX in https://github.com/axolotl-ai-cloud/axolotl/pull/1214
* ensure the tests use the same version of torch as the latest base docker images by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1215
* ADD: warning if hub_model_id is set but no save strategy is by JohanWork in https://github.com/axolotl-ai-cloud/axolotl/pull/1202
* run PR e2e docker CI tests in Modal by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1217
* Revert "run PR e2e docker CI tests in Modal" by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1220
* FEAT: add tagging support to axolotl for DPOTrainer by filippo82 in https://github.com/axolotl-ai-cloud/axolotl/pull/1209
* Peft LoftQ by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1222
* Fix typos (pretained -> pretrained) by xhedit in https://github.com/axolotl-ai-cloud/axolotl/pull/1231
* Fix and document test_datasets by DreamGenX in https://github.com/axolotl-ai-cloud/axolotl/pull/1228
* set torch version to what is installed during axolotl install by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1234
* Cloud motd by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1235
* [Nit] Fix callout by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1237
* Support for additional_special_tokens by DreamGenX in https://github.com/axolotl-ai-cloud/axolotl/pull/1221
* Peft deepspeed resume by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1227
* support for true batches with multipack by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1230
* add contact info for dedicated support for axolotl by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1243
* fix(model): apply gate fp32 only for mixtral by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1241
* relora: magnitude pruning of the optimizer by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1245
* Pretrain transforms by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1261
* Fix typo `bloat16` -> `bfloat16` by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/1257
* Add more save strategies for DPO training. by PhilipMay in https://github.com/axolotl-ai-cloud/axolotl/pull/1255
* BUG FIX: lock pytorch version in colab example by JohanWork in https://github.com/axolotl-ai-cloud/axolotl/pull/1247
* Fix typo preventing `model_kwargs` being injected by zacbrannelly in https://github.com/axolotl-ai-cloud/axolotl/pull/1262
* contributor avatars by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1269
* simplify handling for newer multipack patches so they can be added in a single place by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1270
* Add link to axolotl cloud image on latitude by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1275
* copy edits by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1276
* allow remote data paths by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1278
* add support for https remote yamls by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1277
* run the docker image builds and push on gh action gpu runners by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1218
* Update README.md by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1281
* don't use load and push together by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1284
* Add MPS support by maximegmd in https://github.com/axolotl-ai-cloud/axolotl/pull/1264
* allow the optimizer prune ratio for relora to be configurable by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1287
* Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? by jinwonkim93 in https://github.com/axolotl-ai-cloud/axolotl/pull/1273
* Add seq2seq eval benchmark callback by LeonardoEmili in https://github.com/axolotl-ai-cloud/axolotl/pull/1274
* Validation always happens on first step by LeonardoEmili in https://github.com/axolotl-ai-cloud/axolotl/pull/1300
* fix(examples): remove is_*_derived as it's parsed automatically by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1297
* Allow load_best_model_at_end to be configured for early stopping on custom evaluation datasets by dameikle in https://github.com/axolotl-ai-cloud/axolotl/pull/1291
* Add instructions for playing with qlora model to colab example by jaredpalmer in https://github.com/axolotl-ai-cloud/axolotl/pull/1290
* fix(readme): update inference md link by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1311
* Adding Google's gemma Model by monk1337 in https://github.com/axolotl-ai-cloud/axolotl/pull/1312
* multipack for gemma by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1313
* deprecate: pytorch 2.0.1 image by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1315
* fix(readme): Clarify doc for tokenizer_config by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1323
* [bug-report template] Use yaml codeblock for config.yaml field by kallewoof in https://github.com/axolotl-ai-cloud/axolotl/pull/1303
* make mlflow optional by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1317
* Pydantic 2.x cfg by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1239
* chore: update readme to be more clear by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1326
* ADD: push checkpoints to mlflow artifact registry by JohanWork in https://github.com/axolotl-ai-cloud/axolotl/pull/1295
* hotfix for capabilities loading by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1331
* hotfix for lora rank by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1332
* hotfix for missing outputs params by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1333
* hotfix to exclude_unset from pydantic config when converting back to a dict by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1334
* Add StableLM 2 Example Scripts by ncoop57 in https://github.com/axolotl-ai-cloud/axolotl/pull/1327
* add lion-pytorch optimizer by maximegmd in https://github.com/axolotl-ai-cloud/axolotl/pull/1299
* Support user-defined prompt processing strategies for dpo by nopperl in https://github.com/axolotl-ai-cloud/axolotl/pull/1248
* more pydantic fixes by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1338
* Mps mistral lora by maximegmd in https://github.com/axolotl-ai-cloud/axolotl/pull/1292
* fix: checkpoint saving with deepspeed by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1321
* Update debugging.md by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1339
* fix steps check for anneal on first cycle by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1316
* Update fastchat_conversation_turns.py by eltociear in https://github.com/axolotl-ai-cloud/axolotl/pull/1294
* add gemma instruct chat template by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1341
* more fixes 20240228 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1342
* deprecate py 3.9 support, set min pytorch version by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1343
* Fix `use_mlflow` to be bool instead of str by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/1344
* fix for protected model_ namespace w pydantic by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1345
* run tests again on Modal by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1289
* chore: enable sample_packing for Gemma [skip ci] by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1351
* Fix validation for early stopping by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/1358
* plain input/output prompt strategy w/o chat templates by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1346
* lora+ support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1352
* allow the sharegpt handler to also better handle datasets destined for openai finetuning by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1361
* Update tinyllama lora.yml to fix eval packing issue by rasbt in https://github.com/axolotl-ai-cloud/axolotl/pull/1362
* add starcoder2 by ehartford in https://github.com/axolotl-ai-cloud/axolotl/pull/1349
* Fix supported python versions in README, as python 3.9 was recently deprecated by nirogu in https://github.com/axolotl-ai-cloud/axolotl/pull/1364
* support for DoRA w/ PEFT by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1363
* add docs for `input_output` format by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1367
* update flash attention for gemma support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1368
* JarvisLabs by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1372
* FSDP + QLoRA by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1378
* validation for fsdp and deepspeed by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1388
* support for rslora by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1387
* Fix pydantic configuration for the max_memory input by dandm1 in https://github.com/axolotl-ai-cloud/axolotl/pull/1385
* Set `gradient_clipping` to `auto` in DeepSpeed configs by seungduk-yanolja in https://github.com/axolotl-ai-cloud/axolotl/pull/1382
* Add Glaive conversation format support by brianfitzgerald in https://github.com/axolotl-ai-cloud/axolotl/pull/1365
* chore: lint by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1389
* add handling for argilla dpo-mix by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1397
* Update ChatTemplate enum to include alpaca and gemma by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/1396
* Add QLoRA + FSDP Docs by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1403
* Don't disable existing loggers when configuring axolotl logging by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/1395
* Train parameters exclusively in specific ranges by seungduk-yanolja in https://github.com/axolotl-ai-cloud/axolotl/pull/1390
* Fix Gemma 7b qlora.yml by rasbt in https://github.com/axolotl-ai-cloud/axolotl/pull/1405
* beta support for multipack with gemmoe by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1402
* Feat(readme): Add instructions for Google GPU VM instances by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1410
* Fix(readme): Improve README QuickStart info by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1408
* chore(script): remove redundant setting by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1411
* Add Phorm AI Badge (Morph Labs) by bentleylong in https://github.com/axolotl-ai-cloud/axolotl/pull/1418
* ORPO by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1419
* fix(config): passing gradient_checkpoint_kwargs by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1412
* Add a config not to shuffle merged dataset by seungduk-yanolja in https://github.com/axolotl-ai-cloud/axolotl/pull/1394
* Feat: Add sharegpt multirole by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1137
* support galore once upstreamed into transformers by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1409
* fixes for dpo and orpo template loading by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1424
* HF / FEAT: Optimize HF tags by younesbelkada in https://github.com/axolotl-ai-cloud/axolotl/pull/1425
* strip out hacky qlora-fsdp workarounds now that qlora-fsdp fixes are upstreamed by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1428
* Bootstrap Hosted Axolotl Docs w/Quarto by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1429
* Orpo fix wip by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1433
* chore(config): refactor old mistral config by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1435
* docs: update link to docs of advanced topics in README.md by pphuc25 in https://github.com/axolotl-ai-cloud/axolotl/pull/1437
* fix(dataset): normalize tokenizer config and change hash from tokenizer class to tokenizer path by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1298
* make sure to capture non-null defaults from config validation by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1415
* Turn on sample_packing for Gemma training by satpalsr in https://github.com/axolotl-ai-cloud/axolotl/pull/1438
* Fix falcon tokenization step by pharaouk in https://github.com/axolotl-ai-cloud/axolotl/pull/1441
* Remove seq_len arg in rotary_emb by BMPixel in https://github.com/axolotl-ai-cloud/axolotl/pull/1443
* fix for accelerate env var for auto bf16, add new base image and expand torch_cuda_arch_list support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1413
* support layer replication for peft and fix rslora integration by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1445
* fix layer_replication arg to peft by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1446
* Jamba by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1451
* Support loading datasets saved via save_to_disk by fozziethebeat in https://github.com/axolotl-ai-cloud/axolotl/pull/1432
* fix some of the edge cases for Jamba by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1452
* configure nightly docker builds by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1454
* fix how nightly tag is generated by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1456
* fix yaml parsing for workflow by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1457
* Nightlies fix v4 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1458
* qwen2_moe support w multipack by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1455
* make sure to install causal_conv1d in docker by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1459
* Lisa by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1469
* feat: add deepspeed 3 with cpuoffload by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1466
* reduce verbosity of the special tokens by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1472
* Reorganize Docs by hamelsmu in https://github.com/axolotl-ai-cloud/axolotl/pull/1468
* fix pretraining_ on odd datasets by mapmeld in https://github.com/axolotl-ai-cloud/axolotl/pull/1463
* Added pip install ninja to accelerate installation of flash-attn by melvinebenezer in https://github.com/axolotl-ai-cloud/axolotl/pull/1461
* Pretrain multipack v2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1470
* Feat: update doc by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1475
* refactor utils.data module for line count linter by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1476
* don't use deepspeed or fsdp when merging loras by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1479
* add support for cohere chat template by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1478
* feat: validate sample packing requires flash_attention by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1465
* fix: reduce sample_packing FA error to warning by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1484
* drop empty token from beginning if tokenizer has no bos_token (in the case of qwen) by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1490
* Remove `validate_quantized_dora` by xzuyn in https://github.com/axolotl-ai-cloud/axolotl/pull/1485
* ignore issues with calculating params when printing by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1493
* add field to sft dataset pydantic for completion support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1497
* Fix the wrong adapter in qwen2-moe-qlora example by maziyarpanahi in https://github.com/axolotl-ai-cloud/axolotl/pull/1501
* Print versions by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1496
* Correctly handle splits for datasets.arrow_dataset.Dataset objects by scottfleming in https://github.com/axolotl-ai-cloud/axolotl/pull/1504
* WIP: Support table logging for mlflow, too by DavidFarago in https://github.com/axolotl-ai-cloud/axolotl/pull/1506
* use locale-agnostic separator to make large nums easier to read by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1503
* Update SaveAxolotlConfigtoWandBCallback to use artifact instead of save by tcapelle in https://github.com/axolotl-ai-cloud/axolotl/pull/1483
* DBRX Model Support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1462
* Unsloth gradient checkpointing offload by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1528
* add docs around pre-processing by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1529
* Update README.md by emilytin0206 in https://github.com/axolotl-ai-cloud/axolotl/pull/1521
* Update Readme to include support for Mixtral8X22B by Barbarian7676 in https://github.com/axolotl-ai-cloud/axolotl/pull/1518
* Create mixtral_22.yml by Barbarian7676 in https://github.com/axolotl-ai-cloud/axolotl/pull/1514
* feat(doc): Add config example for pad_token by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1535
* llama-3 examples by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1537
* Adding Llama-3 qlora by monk1337 in https://github.com/axolotl-ai-cloud/axolotl/pull/1536
* fix broken linting by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1541
* fix(packages): lock datasets version by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1545
* fix(yml): update llama-3 config by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1543
* ORPO Trainer replacement by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1551
* wrap prepared_ds_path in str() to avoid TypeError in fsspec package by FrankRuis in https://github.com/axolotl-ai-cloud/axolotl/pull/1548
* Add support for Gemma chat template by Haoxiang-Wang in https://github.com/axolotl-ai-cloud/axolotl/pull/1530
* make sure everything stays in the same dtype when using dpo + FSDP by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1559
* Add ORPO example and e2e test by tokestermw in https://github.com/axolotl-ai-cloud/axolotl/pull/1572
* Pose context length ext by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1567
* chore: clarify microbatch size by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1579
* Add debug option for RL dataset preprocessing by abhinand5 in https://github.com/axolotl-ai-cloud/axolotl/pull/1404
* ADD: warning hub model by JohanWork in https://github.com/axolotl-ai-cloud/axolotl/pull/1301
* FIX: TRL trainer preprocessing step was running in one process by ali-mosavian in https://github.com/axolotl-ai-cloud/axolotl/pull/1583
* Pass weakref to model in the SIGINT handler to free up model post train function by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/1581
* improve save callbacks by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1592
* fix for jupyterlab on cloud start by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1594
* add torch 2.3.0 to builds by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1593
* docs(config.qmd): add loraplus example by tpoisonooo in https://github.com/axolotl-ai-cloud/axolotl/pull/1577
* Gradio configuration parameters by marijnfs in https://github.com/axolotl-ai-cloud/axolotl/pull/1591
* Pass `deepspeed` and `fsdp` as `None` explicitly when merging adapters to allow custom device_map by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/1575
* feat: exclude mamba blocks for jamba when load8bit by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1578
* improve tool handling roles by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1587
* make sure to save the lora adapter at the end of RL/dpo training by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1573
* ignore the fsdp_config section too by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1606
* adding llama3 fastchat conversation monkeypatch by TJ-Solergibert in https://github.com/axolotl-ai-cloud/axolotl/pull/1539
* feat: Add LLaMA-3 instruct prompt strategies for fine-tuning by 0-hero in https://github.com/axolotl-ai-cloud/axolotl/pull/1553
* Llama3 dpo by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1610
* add dstack section by deep-diver in https://github.com/axolotl-ai-cloud/axolotl/pull/1612
* fix attention mask collation by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1603
* make sure to save on the last step by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1615
* FIX: max_length and max_prompt_length was not being sent to ORPOTrainer by ali-mosavian in https://github.com/axolotl-ai-cloud/axolotl/pull/1584
* Fix `total_num_steps` by bofenghuang in https://github.com/axolotl-ai-cloud/axolotl/pull/1566
* update torch 2.2.1 -> 2.2.2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1622
* update outputs path so that we can mount workspace to /workspace/data by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1623
* bump versions of deps by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1621
* fix symlinks for axolotl outputs by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1625
* fix setting the authorized keys when there are more than one in the env var by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1626
* install rsync too by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1627
* cloud image w/o tmux by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1628
* more fixes to work with runpod + skypilot by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1629
* fix ray install by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1630
* add save_only_model option by jquesnelle in https://github.com/axolotl-ai-cloud/axolotl/pull/1634
* Unsloth optims for Llama by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1609
* fixes to save on fractional save_steps by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1643
* Add KTO support by benredmond in https://github.com/axolotl-ai-cloud/axolotl/pull/1640
* Fix llama3 chat_template (extra <|eot_id|> on last turn) by lhl in https://github.com/axolotl-ai-cloud/axolotl/pull/1635
* allow report_to for multiple providers by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1647
* Enable LoRA+ setting for dpo trainer by thepowerfuldeez in https://github.com/axolotl-ai-cloud/axolotl/pull/1646
* Update tiny-llama qlora.yml addressing eval packing error by jaydeepthik in https://github.com/axolotl-ai-cloud/axolotl/pull/1638
* support for custom messages field in sharegpt by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1651
* Switch to parallel FFD bin packing algorithm. by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1619
* document how to use `share_strategy="no"` by charlesfrye in https://github.com/axolotl-ai-cloud/axolotl/pull/1653
* update deps by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1663
* Fix Google Colab notebook 2024-05 by maciejgryka in https://github.com/axolotl-ai-cloud/axolotl/pull/1662
* Generalizing the chat_template prompt strategy by fozziethebeat in https://github.com/axolotl-ai-cloud/axolotl/pull/1660
* Fix Lora config error for Llama3 by oaishi in https://github.com/axolotl-ai-cloud/axolotl/pull/1659
* fix lint issue that snuck through by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1665
* Fix: ensure correct handling of `val_set_size` as `float` or `int` by davidecaroselli in https://github.com/axolotl-ai-cloud/axolotl/pull/1655
* Correct name of MixtralBlockSparseTop2MLP (L -> l) by seungduk-yanolja in https://github.com/axolotl-ai-cloud/axolotl/pull/1667
* Fix README quick start example usage model dirs by abevoelker in https://github.com/axolotl-ai-cloud/axolotl/pull/1668
* make sure the CI fails when pytest script fails by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1669
* handle the system role too for chat templates by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1671
* revert multipack batch sampler changes by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1672
* re-enable phi for tests in modal ci by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1373
* use mixins for orpo and kto configs so they work with axolotl customizations by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1674
* set chat_template in datasets config automatically by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1664
* load explicit splits on datasets by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1652
* cleanup the deepspeed proxy model at the end of training by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1675
* need to add back drop_last for sampler by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1676
* Fix the broken link in README by saeedesmaili in https://github.com/axolotl-ai-cloud/axolotl/pull/1678
* re-enable DPO for tests in modal ci by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1374
* add support for rpo_alpha by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1681
* Phi-3 conversation format, example training script and perplexity metric by brianfitzgerald in https://github.com/axolotl-ai-cloud/axolotl/pull/1582
* Adding Phi-3 model by monk1337 in https://github.com/axolotl-ai-cloud/axolotl/pull/1580
* ensure explicit eval_sample_packing to avoid mismatch issues by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1692
* add qwen2-72b fsdp example by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1696
* add back packing efficiency estimate so epochs and multi-gpu works properly by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1697
* Sample packing eval fix by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1695
* bump deepspeed for fix for grad norm compute putting tensors on different devices by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1699
* verbose failure message by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1694
* download model weights on preprocess step by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1693
* drop length column for issues with eval without packing by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1711
* add support for multipack for deepseek_v2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1712
* Allow "weight: 0" in messages to mask them by DavidFarago in https://github.com/axolotl-ai-cloud/axolotl/pull/1703
* improve Pre-Tokenized Dataset docs by josharian in https://github.com/axolotl-ai-cloud/axolotl/pull/1684
* support for gemma2 w sample packing by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1718
* add support for .env files for env vars by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1724
* full weights fsdp training seems broken with fsdp_cpu_ram_efficient_loading by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1726
* sanity check ranges in freeze.py by josharian in https://github.com/axolotl-ai-cloud/axolotl/pull/1686
* bump trl and accelerate for latest releases by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1730
* Fixes the urls after org move by mhenrichsen in https://github.com/axolotl-ai-cloud/axolotl/pull/1734
* add tests so CI can catch updates where patches will break with unsloth by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1737
* typo by Klingefjord in https://github.com/axolotl-ai-cloud/axolotl/pull/1685
* add torch 2.3.1 base image by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1745
* fixes to prevent vram spike when train starts by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1742
* update to pytorch 2.3.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1746
* bump xformers to 0.0.27 by akshaylive in https://github.com/axolotl-ai-cloud/axolotl/pull/1740
* Changed URL for dataset docs by dameikle in https://github.com/axolotl-ai-cloud/axolotl/pull/1744
* Fix eval_sample_packing in llama-3 lora example by RodriMora in https://github.com/axolotl-ai-cloud/axolotl/pull/1716
* bump flash attention 2.5.8 -> 2.6.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1738
* add basic support for the optimi adamw optimizer by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1727
* update modal package and don't cache pip install by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1757
* torch compile and cuda alloc improvements by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1755
* support for llama multipack using updated code/patches by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1754
* fix num gpu check by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1760
* fixes to accelerator so that iterable pretraining datasets work by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1759
* add torch_compile_mode options by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1763
* re-enable PYTORCH_CUDA_ALLOC_CONF expandable_segments by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1765
* set the number of dataset processes on the DPO Config rather than the trainer by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1762
* Unsloth rope by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1767
* bump transformers and set roundup_power2_divisions for more VRAM improvements, low bit ao optimizers by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1769
* Fix untrained tokens by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1771
* Add a `chat_template` prompt strategy for DPO by fozziethebeat in https://github.com/axolotl-ai-cloud/axolotl/pull/1725
* swaps to use newer sample packing for mistral by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1773
* bump transformers for updated llama 3.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1778
* bump flash attention to 2.6.2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1781
* fix fsdp loading of models, esp 70b by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1780
* add support for simpo via cpo trainer by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1772
* Bump deepspeed 20240727 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1790
* various batch of fixes by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1785
* Add flexible configuration options for `chat_template` dataset training by Tostino in https://github.com/axolotl-ai-cloud/axolotl/pull/1756
* Update README.md by mhenrichsen in https://github.com/axolotl-ai-cloud/axolotl/pull/1792
* move to supporting mostly 12.1 w 2.3.1 and add new 12.4 with 2.4.0 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1793
* fix dockerfile and base builder by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1795
* use 12.4.1 instead of 12.4 [skip-ci] by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1796
* update test and main/nightly builds by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1797
* publish axolotl images without extras in the tag name by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1798
* qlora-fsdp ram efficient loading with hf trainer by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1791
* fix roles to train defaults and make logging less verbose by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1801
* Fix colab example notebook by srib in https://github.com/axolotl-ai-cloud/axolotl/pull/1805
* Fix setting correct repo id when pushing dataset to hub by chrislee973 in https://github.com/axolotl-ai-cloud/axolotl/pull/1657
* Update instruct-lora-8b.yml by monk1337 in https://github.com/axolotl-ai-cloud/axolotl/pull/1789
* Update conversation.qmd by penfever in https://github.com/axolotl-ai-cloud/axolotl/pull/1788
* One cycle lr by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1803
* remove un-necessary zero-first guard as it's already called in a parent fn by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1810
* set z3 leaf for deepseek v2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1809
* logging improvements by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1808
* update peft and transformers by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1811
* skip no commit to main on ci by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1814
* fix z3 leaf configuration when not using lists by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1817
* update tinyllama to use final instead of checkpoints [skip ci] by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1820
* Attempt to run multigpu in PR CI for now to ensure it works by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1815
* fix the incorrect `max_length` for chat template by chiwanpark in https://github.com/axolotl-ai-cloud/axolotl/pull/1818
* bump hf dependencies by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1823
* fix: parse eager_attention from cfg by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1824
* fix: parse model_kwargs by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1825
* update sklearn version, torch compile env vars, don't worry about failure on preprocess load model by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1821
* add validation to prevent 8bit lora finetuning on H100s by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1827
* optionally save the final FSDP model as a sharded state dict by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1828
* fix: dont change quant storage dtype in case of fsdp by xgal in https://github.com/axolotl-ai-cloud/axolotl/pull/1837
* pretrain: fix with sample_packing=false by tmm1 in https://github.com/axolotl-ai-cloud/axolotl/pull/1841
* feat: add jamba chat_template by xgal in https://github.com/axolotl-ai-cloud/axolotl/pull/1843
* examples: fix tiny-llama pretrain yml syntax by tmm1 in https://github.com/axolotl-ai-cloud/axolotl/pull/1840
* rename jamba example by xgal in https://github.com/axolotl-ai-cloud/axolotl/pull/1846
* numpy 2.1.0 was released, but incompatible with numba by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1849
* ensure that the bias is also in the correct dtype by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1848
* make the train_on_eos default to turn so all eos tokens are treated the same by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1847
* fix: prompt phi by JohanWork in https://github.com/axolotl-ai-cloud/axolotl/pull/1845
* docs: minor syntax highlight fix by tmm1 in https://github.com/axolotl-ai-cloud/axolotl/pull/1839
* ensure that the hftrainer deepspeed config is set before the trainer class is ever init'ed by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1850
* run nightly ci builds against upstream main by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1851
* rename nightly test and add badge by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1853
* most model types now support flash attention 2 regardless of multipack support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1854
* add axolotl community license by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1862
* don't mess with bnb since it needs compiled wheels by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1859
* Liger Kernel integration by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1861
* add liger example by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1864
* add liger to readme by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1865
* change up import to prevent AttributeError by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1863
* simplify logic by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1856
* better handling of llama-3 tool role by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1782
* Spectrum plugin by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1866
* update spectrum authors by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1869
* Fix `drop_long_seq` bug due to truncation in prompt tokenization strategies when using `chat_template` by chiwanpark in https://github.com/axolotl-ai-cloud/axolotl/pull/1867
* clear cuda cache to help with memory leak/creep by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1858
* Add Liger Kernel support for Qwen2 by chiwanpark in https://github.com/axolotl-ai-cloud/axolotl/pull/1871
* Sample pack trust remote code v2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1873
* monkey-patch transformers to simplify monkey-patching modeling code by tmm1 in https://github.com/axolotl-ai-cloud/axolotl/pull/1877
* fix liger plugin load issues by tmm1 in https://github.com/axolotl-ai-cloud/axolotl/pull/1876
* deepseekv2 liger support by tmm1 in https://github.com/axolotl-ai-cloud/axolotl/pull/1878
* Add liger kernel to features section by ByronHsu in https://github.com/axolotl-ai-cloud/axolotl/pull/1881
* pin liger-kernel to latest 0.2.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1882
* Update supported models for Liger Kernel by DocShotgun in https://github.com/axolotl-ai-cloud/axolotl/pull/1875
* run pytests with varied pytorch versions too by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1883
* Fix RMSNorm monkey patch for Gemma models by chiwanpark in https://github.com/axolotl-ai-cloud/axolotl/pull/1886
* add e2e smoke tests for llama liger integration by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1884
* support for auto_find_batch_size when packing by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1885
* fix optimizer + fsdp combination in example [skip ci] by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1893
* Docs for AMD-based HPC systems by tijmen in https://github.com/axolotl-ai-cloud/axolotl/pull/1891
* lint fix and update gha regex by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1899
* Fix documentation for pre-tokenized dataset by alpayariyak in https://github.com/axolotl-ai-cloud/axolotl/pull/1894
* fix zero3 integration by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1897
* bump accelerate to 0.34.2 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1901
* remove dynamic module loader monkeypatch as this was fixed upstream by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1914
* Trigger the original tokenization behavior when no advanced turn settings are provided by fozziethebeat in https://github.com/axolotl-ai-cloud/axolotl/pull/1915
* validation fixes 20240923 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1925
* update upstream deps versions and replace lora+ by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1928
* fix for empty lora+ lr embedding by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1932
* bump transformers to 4.45.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1936
* Multimodal Vision Llama - rudimentary support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1940
* add 2.4.1 to base models by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1953
* upgrade pytorch from 2.4.0 => 2.4.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1950
* fix(log): update perplexity log to clarify from eval split by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1952
* Fix type annotations in relora.py by bxptr in https://github.com/axolotl-ai-cloud/axolotl/pull/1941
* Comet integration by Lothiraldan in https://github.com/axolotl-ai-cloud/axolotl/pull/1939
* Fixing/Adding Mistral Templates by pandora-s-git in https://github.com/axolotl-ai-cloud/axolotl/pull/1927
* lm_eval harness post train by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1926
* Axo logo new by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1956
* Add Support for `revision` Dataset Parameter to specify reading from Huggingface Dataset Revision by thomascleberg in https://github.com/axolotl-ai-cloud/axolotl/pull/1912
* Add MLFlow run name option in config by awhazell in https://github.com/axolotl-ai-cloud/axolotl/pull/1961
* add warning that sharegpt will be deprecated by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1957
* Handle image input as string paths for MMLMs by afrizalhasbi in https://github.com/axolotl-ai-cloud/axolotl/pull/1958
* update hf deps by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1964
* only install torchao for torch versions >= 2.4.0 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1963
* Fixing Validation - Mistral Templates by pandora-s-git in https://github.com/axolotl-ai-cloud/axolotl/pull/1962
* fix(doc): update eval causal lm metrics doc to add perplexity by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1951
* Add support for qwen 2.5 chat template by amazingvince in https://github.com/axolotl-ai-cloud/axolotl/pull/1934
* wip add new proposed message structure by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1904
* Reward model by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1879
* add ds zero3 to multigpu biweekly tests by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1900
* upgrade accelerate to 1.0.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1969
* examples: Fix config llama3 by JohanWork in https://github.com/axolotl-ai-cloud/axolotl/pull/1833
* also debug if other debug args are set by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1977
* memoize dataset length for eval sample packing by bursteratom in https://github.com/axolotl-ai-cloud/axolotl/pull/1974
* add pytorch 2.5.0 base images by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1979
* first pass at pytorch 2.5.0 support by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1982
* fix builds so pytorch version isn't clobbered by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1986
* use torch 2.4.1 images as latest now that torch 2.5.0 is out by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1987
* Log checkpoints as mlflow artifacts by awhazell in https://github.com/axolotl-ai-cloud/axolotl/pull/1976
* revert image tagged as main-latest by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1990
* Refactor func load_model to class ModelLoader by MengqingCao in https://github.com/axolotl-ai-cloud/axolotl/pull/1909
* Fix: Gradient Accumulation issue by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1980
* fix zero3 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1994
* add option for resizing embeddings when adding new tokens by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2000
* Feat: Add support for tokenizer’s or custom jinja chat_template by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1970
* Hardware requirements by OliverKunc in https://github.com/axolotl-ai-cloud/axolotl/pull/1997
* feat: update yml chat_template to specify dataset field by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/2001
* remove skipped test by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2002
* feat: add Exaone3 chat_template by shing100 in https://github.com/axolotl-ai-cloud/axolotl/pull/1995
* Fix get_chat_template call for trainer builder by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/2003
* Fix: modelloader handling of model_kwargs load_in*bit by NanoCode012 in https://github.com/axolotl-ai-cloud/axolotl/pull/1999
* Add plugin manager's callback hooks to training flow by chiragjn in https://github.com/axolotl-ai-cloud/axolotl/pull/2006
* add retries for load datasets requests failures by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2007
* Base 2 5 1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2010
* only run the remainder of the gpu test suite if one case passes first by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2009
* upgrade liger to 0.4.0 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/1973
* janky workaround to install FA2 on torch 2.5.1 base image since it takes forever to build by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2022
* upgrade pytorch to 2.5.1 by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2024
* Add weighted optimisation support for trl DPO trainer integration by bursteratom in https://github.com/axolotl-ai-cloud/axolotl/pull/2016
* remove fastchat and sharegpt by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2021
* increment version to 0.5.0 for next release by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2025
* make publish to pypi manually dispatchable as a workflow by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2026
* remove unused direct dependency on fused dense lib by winglian in https://github.com/axolotl-ai-cloud/axolotl/pull/2027
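
Several of the changes above converge on `chat_template`-driven dataset handling (e.g. tokenizer/custom jinja templates in 1970, per-dataset field selection in 2001) alongside the deprecation and removal of the legacy sharegpt path (1957, 2021). As a hedged illustration only — not taken from this changelog, and option names should be checked against the config reference for your installed version — a minimal chat_template dataset entry might look like:

```yaml
# Sketch of the chat_template dataset style these releases move toward.
# The model path and data file below are placeholders; the field_* keys
# follow axolotl's documented chat_template options but may differ by version.
base_model: NousResearch/Meta-Llama-3-8B-Instruct
chat_template: llama3          # or "tokenizer_default" / a custom jinja template

datasets:
  - path: ./data/conversations.jsonl
    type: chat_template
    field_messages: messages           # field holding the list of turns
    message_field_role: role           # per-turn role key
    message_field_content: content     # per-turn content key
```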

New Contributors
* 7flash made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1210
* DreamGenX made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1214
* filippo82 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1209
* xhedit made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1231
* chiragjn made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1257
* PhilipMay made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1255
* zacbrannelly made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1262
* LeonardoEmili made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1274
* dameikle made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1291
* jaredpalmer made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1290
* monk1337 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1312
* ncoop57 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1327
* nopperl made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1248
* rasbt made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1362
* nirogu made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1364
* dandm1 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1385
* brianfitzgerald made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1365
* bentleylong made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1418
* pphuc25 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1437
* satpalsr made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1438
* pharaouk made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1441
* BMPixel made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1443
* fozziethebeat made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1432
* mapmeld made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1463
* melvinebenezer made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1461
* maziyarpanahi made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1501
* scottfleming made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1504
* DavidFarago made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1506
* tcapelle made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1483
* emilytin0206 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1521
* Barbarian7676 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1518
* FrankRuis made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1548
* abhinand5 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1404
* ali-mosavian made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1583
* tpoisonooo made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1577
* marijnfs made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1591
* TJ-Solergibert made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1539
* 0-hero made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1553
* deep-diver made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1612
* jquesnelle made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1634
* benredmond made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1640
* lhl made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1635
* thepowerfuldeez made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1646
* jaydeepthik made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1638
* charlesfrye made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1653
* maciejgryka made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1662
* oaishi made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1659
* davidecaroselli made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1655
* abevoelker made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1668
* saeedesmaili made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1678
* josharian made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1684
* Klingefjord made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1685
* akshaylive made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1740
* RodriMora made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1716
* Tostino made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1756
* srib made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1805
* chrislee973 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1657
* penfever made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1788
* chiwanpark made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1818
* xgal made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1837
* ByronHsu made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1881
* DocShotgun made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1875
* tijmen made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1891
* bxptr made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1941
* Lothiraldan made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1939
* pandora-s-git made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1927
* thomascleberg made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1912
* awhazell made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1961
* afrizalhasbi made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1958
* amazingvince made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1934
* bursteratom made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1974
* MengqingCao made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1909
* OliverKunc made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1997
* shing100 made their first contribution in https://github.com/axolotl-ai-cloud/axolotl/pull/1995

**Full Changelog**: https://github.com/axolotl-ai-cloud/axolotl/compare/v0.4.0...v0.5.0

0.4.0

New Features (highlights)

- Streaming multipack for continued pre-training
- Mistral & Mixtral support
- Simplified Multipack for Mistral, Falcon, Qwen2, and Phi
- DPO/IPO/KTO-pairs RL-training support via trl
- Improve BatchSampler for multipack support, allowing resume from checkpoints and data shuffling each epoch
- bf16: auto support
- add MLFlow support
- save YAML configs to WandB
- save predictions during evals to WandB
- more tests! more smoke tests for smol model training
- NEFTune support
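
Most of the highlights above surface as top-level options in the training YAML. As a hedged sketch (option names follow the axolotl docs of this era and may have changed since; values shown are illustrative, not recommendations):

```yaml
# Illustrative fragment combining several 0.4.0 highlights in one config.
bf16: auto                              # "bf16: auto" support
neftune_noise_alpha: 5                  # NEFTune noisy-embedding tuning
mlflow_experiment_name: my-experiment   # MLFlow support
wandb_project: my-project               # YAML config + eval predictions logged to WandB
rl: dpo                                 # DPO/IPO/KTO-pairs RL training via trl
```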

What's Changed

* document that packaging needs to be installed before flash-attn by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/559
* Fix pretraining with iterable/streaming Dataset by jphme in https://github.com/OpenAccess-AI-Collective/axolotl/pull/556
* Add training callback to send predictions to WandB table by Glavin001 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/521
* fix wandb so mypy doesn't complain by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/562
* check for the existence of the default accelerate config that can create headaches by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/561
* add optimization for group-by-len by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/563
* gracefully handle length feature used for group by by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/565
* improve how we setup eval/save strategies and steps by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/547
* let hf trainer handle torch compile by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/516
* Model parallel by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/538
* fix save_steps so it doesn't get duplicated by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/567
* set auto for other params that hf trainer sets for ds. include zero1 json by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/570
* remove columns after tokenizing for pretraining by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/571
* mypy wandb ignore by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/572
* Phi examples by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/569
* e2e testing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/574
* E2e device cuda by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/575
* E2e passing tests by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/576
* refactor scripts/finetune.py into new cli modules by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/550
* update support matrix with btlm and phi by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/579
* prevent cli functions from getting fired on import by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/581
* Fix Codellama examples by Kimiko-AI in https://github.com/OpenAccess-AI-Collective/axolotl/pull/582
* support custom field for completion from yml by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/580
* Feat(doc): Add features to doc by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/583
* Support Sample packing for phi arch by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/586
* don't resize embeddings if it's already large enough by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/577
* Enable full (non-sharded) model saving with SHARDED_STATE_DICT by jphme in https://github.com/OpenAccess-AI-Collective/axolotl/pull/584
* make phi training work with Loras by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/588
* optionally configure sample packing for evals by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/589
* don't add position_ids for evals when not using eval sample packing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/591
* gather/broadcast the max value of the packing efficiency automatically by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/463
* Feat(data): Allow loading local csv and text by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/594
* add bf16 check by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/587
* btlm and falcon monkey patches for flash attn by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/566
* minor tweaks to simplify by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/597
* Fix for check with cfg and merge_lora by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/600
* improve handling for empty text on the tokenization step by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/502
* more sane defaults for openllama 3b used for quickstarts by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/602
* update dockerfile to not build evoformer since it fails the build by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/607
* Delete duplicate lines in models.py by bofenghuang in https://github.com/OpenAccess-AI-Collective/axolotl/pull/606
* support to disable exllama for gptq by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/604
* Update requirements.txt - Duplicated package by Psancs05 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/610
* Only run tests when a change to python files is made by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/614
* Create multi-node.md by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/613
* fix distributed devices by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/612
* ignore wandb to resolve isort headaches by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/619
* skip the gpu memory checks if the device is set to 'auto' by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/609
* let MAX_JOBS use the default since we're not resource constrained on our self-hosted runners by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/427
* run eval on the first step to get a baseline by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/617
* split completion text to sequence_len by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/616
* misc fixes to add gptq tests by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/621
* chore(callback): Remove old peft saving code by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/510
* update README w deepspeed info by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/605
* create a model card with axolotl badge by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/624
* better handling and logging of empty sharegpt turns by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/603
* tweak: improve base builder for smaller layers by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/500
* Feat(doc): Add eval_sample_packing to doc by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/625
* Fix: Fail bf16 check when running on cpu during merge by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/631
* default model changed by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/629
* Added quotes to the pip install -e command in the documentation to fix an incompatibility … by Nan-Do in https://github.com/OpenAccess-AI-Collective/axolotl/pull/632
* Feat: Add support for upstream FA2 by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/626
* eval_table isn't quite stable enough to be in default llama configs by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/637
* attention_mask not needed for training by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/642
* update for recent transformers updates by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/636
* use fastchat conversations template by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/578
* skip some flash attn patches unless explicitly enabled by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/643
* Correct typos in datasets.py by felixonmars in https://github.com/OpenAccess-AI-Collective/axolotl/pull/639
* Fix bug in dataset loading by ethanhs in https://github.com/OpenAccess-AI-Collective/axolotl/pull/284
* Warn users to login to HuggingFace by Napuh in https://github.com/OpenAccess-AI-Collective/axolotl/pull/645
* Mistral flash attn packing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/646
* Fix(cfg): Add validation for save_strategy and eval_strategy by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/633
* Feat: Add example for Mistral by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/644
* Add mistral/README.md by adarshxs in https://github.com/OpenAccess-AI-Collective/axolotl/pull/647
* fix for flash attn w mistral w/o sample packing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/648
* don't strip the prompt for check since we don't strip to tokenize anymore by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/650
* add support for defined train split by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/654
* Fix bug when using pretokenized datasets by ein-ich in https://github.com/OpenAccess-AI-Collective/axolotl/pull/652
* Make dataset_processes configurable by corbt in https://github.com/OpenAccess-AI-Collective/axolotl/pull/651
* add mistral e2e tests by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/649
* removed duplicate on requirements.txt by Napuh in https://github.com/OpenAccess-AI-Collective/axolotl/pull/661
* make sure we also run CI tests when requirements.txt changes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/663
* prepared dataset caching, other misc fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/665
* remove patch fix for phi by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/664
* refactor to set eval_batch_size earlier if unset, so we can warn if mismatched by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/662
* Feat: Add config yaml to section for reprod in bug-report.yaml by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/667
* Feat: Allow usage of native Mistral FA when no sample_packing by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/669
* chore: Clean up repetitive model kwargs by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/670
* Fix(version): Update FA to work with Mistral SWA by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/673
* Fix(tokenizer): Set rstrip,lstrip,norm to False by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/678
* Fix: Future deprecation warning with use_auth_token by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/680
* Feat: Set WORKDIR to /workspace/axolotl by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/679
* Fix: ValueError when FA + Mistral when padding_side=right by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/681
* flash_attention + sample packing for stablelm 3b by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/671
* Adding qlora config for Mistral by TokenBender in https://github.com/OpenAccess-AI-Collective/axolotl/pull/675
* Fix: Higher vram usage for mistral and sample_packing by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/691
* fix multiline for docker by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/694
* update mistral lr, sample pack by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/693
* apex not needed as amp is part of pytorch by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/696
* add docker images for pytorch 2.1.0 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/697
* fix unneeded space by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/699
* Update README with some explanations by seungduk-yanolja in https://github.com/OpenAccess-AI-Collective/axolotl/pull/700
* Get qlora mistral-7b fine tuning working on a single 4090 by lukemarsden in https://github.com/OpenAccess-AI-Collective/axolotl/pull/708
* fix(doc): Add note on inference w sample packing by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/712
* Fix: lowercase `True` values in config by atgctg in https://github.com/OpenAccess-AI-Collective/axolotl/pull/713
* fix(doc): update default doc according to arg by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/714
* Save Axolotl config as WandB artifact by jphme in https://github.com/OpenAccess-AI-Collective/axolotl/pull/716
* improve handling of the prepared ds path and other cfg defaults by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/701
* fix pytorch 2.1.0 build, add multipack docs by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/722
* add noisy embedding by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/721
* pin xformers >= 0.0.22 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/724
* misc sharegpt fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/723
* workaround for installing xformers w torch 2.1.0 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/725
* tweak for xformers install w pytorch 2.1.0 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/727
* fixes for alpaca w chatml, and don't include attention_mask w mistral for flash attention by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/728
* Clarify custom format example by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/729
* Mistral: Sliding Window Attention with Flash Attention and Sample Packing by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/732
* badge by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/739
* catch ConnectionError when checking dataset from HuggingFace by Napuh in https://github.com/OpenAccess-AI-Collective/axolotl/pull/743
* Fix(model): Linear detected and added to target module with rope linear by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/738
* improve: Enhance code readability of prompt_tokenizers.py by seungduk-yanolja in https://github.com/OpenAccess-AI-Collective/axolotl/pull/707
* add a latest tag for regular axolotl image, cleanup extraneous print statement by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/746
* Fix DeepSpeed Zero 3 Saving by tokestermw in https://github.com/OpenAccess-AI-Collective/axolotl/pull/709
* chore: bump transformers to v4.34.1 to fix tokenizer issue by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/745
* add to docs by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/703
* Implement fused modules by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/747
* remove lora fused packing test by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/758
* Fix: eval table conflict with eval_sample_packing by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/769
* Fix: Cannot tokenize with bf16 and on cpu by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/766
* Hotfix for fused QKV not saving the trained weights of o_proj by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/762
* convert exponential notation lr to floats by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/771
* Fix: Warn when fullfinetune without adapter by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/770
* simplify by removing duplicate base_model_config by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/772
* disable eval table w sample packing in examples by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/778
* refactor setup trainer so we can add more hooks by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/773
* chore: refactor truthy check and fix mypy by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/780
* chore(readme): Improve documentation on conversation field by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/782
* Threaded MultipackDistributedDataloader with prefetched samples by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/759
* Create preprocess CLI by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/785
* Add docker advanced instruction to README by gordicaleksa in https://github.com/OpenAccess-AI-Collective/axolotl/pull/792
* Fix Deepspeed Zero3 Config by teknium1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/791
* Update to adapt to sharegpt datasets with "assistant" rather than "gp… by MilesQLi in https://github.com/OpenAccess-AI-Collective/axolotl/pull/774
* fix eval_steps to be a sane default by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/797
* refactor neft patch to be more re-usable similar to trl's impl by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/796
* fix(config): Set eos/bos to tokenizer if different by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/801
* feat(doc): add dummyoptim faq fix by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/802
* fix(tokenizer): update log order after update by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/806
* fix model parallel by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/816
* fix: pin autogptq by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/818
* update table for rwkv4 support, fix process count for dataset by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/822
* Feat: Added Gradio support by Stillerman in https://github.com/OpenAccess-AI-Collective/axolotl/pull/812
* Dockerfile: add deepspeed-kernels dependency for deepspeed>=0.12.0 by fpreiss in https://github.com/OpenAccess-AI-Collective/axolotl/pull/827
* cleanup verbosity a bit by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/799
* make sure to cleanup tmp output_dir for e2e tests by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/831
* multipack w batch sampler by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/795
* don't compile deepspeed or bitsandbytes from source by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/837
* Pin optimum package by brthor in https://github.com/OpenAccess-AI-Collective/axolotl/pull/838
* cleanup the old multipack dataloader by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/841
* include the suffix modified string in ascii art by fpreiss in https://github.com/OpenAccess-AI-Collective/axolotl/pull/852
* feat(doc): add more info on train_on_split by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/855
* chore(doc): Separate section on runpod by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/860
* various bugfixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/856
* adds llama and mistral dropout support by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/858
* multipack len should use max, not min by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/863
* Docs: add instructions to 1-click launching on public clouds by concretevitamin in https://github.com/OpenAccess-AI-Collective/axolotl/pull/862
* Update data.py for signature generation by MilesQLi in https://github.com/OpenAccess-AI-Collective/axolotl/pull/851
* lint fix that didn't get caught by linter by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/866
* make docker command more robust by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/861
* add e2e tests for checking functionality of resume from checkpoint by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/865
* allow overriding of model_config parameters from the YML by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/853
* Feat: Add dataset loading from S3, GCS by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/765
* try 2: pin hf transformers and accelerate to latest release, don't reinstall pytorch by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/867
* don't train if eval split is too small by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/873
* Phi update 202311 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/876
* Install from git url by msaroufim in https://github.com/OpenAccess-AI-Collective/axolotl/pull/874
* fix: revert local dir dataset load by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/878
* chore(doc): Add info on changing role in sharegpt by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/886
* Feat: Add warmup_ratio by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/893
* fix: warning should not show if eval_batch_size not provided by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/896
* Feat: Add Qwen by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/894
* update datasets version to cut down the warnings due to pyarrow arg change by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/897
* fix: remove FA for qwen examples by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/900
* Determine FSDP/deepspeed settings on device select. by kallewoof in https://github.com/OpenAccess-AI-Collective/axolotl/pull/883
* ensure merged model matches the training dtype by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/902
* fix for qwen w lora by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/906
* Remove lr scheduler in DeepSpeed config to avoid conflict by Haoxiang-Wang in https://github.com/OpenAccess-AI-Collective/axolotl/pull/909
* feature: loss watchdog for terminating training runs that are failing by kallewoof in https://github.com/OpenAccess-AI-Collective/axolotl/pull/899
* Feat(wandb): Refactor to be more flexible by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/767
* Support device_map=sequential & max_memory config parameters by brthor in https://github.com/OpenAccess-AI-Collective/axolotl/pull/903
* feat: add check for quantized model by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/913
* Pin flash-attn to 2.3.3 by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/919
* fix(tokenizer): handle fast tokenizer properly for bos/eos by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/914
* support for mamba by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/915
* fixing prompt template of chatml by removal of linebreak by timothylimyl in https://github.com/OpenAccess-AI-Collective/axolotl/pull/922
* Mixtral multipack by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/928
* update to latest transformers for mixtral support by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/929
* Mixtral: More correct MoE, lower loss by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/932
* Update requirements.txt (fschat==0.2.34) by tokestermw in https://github.com/OpenAccess-AI-Collective/axolotl/pull/940
* Mixtral official by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/942
* Respect sequence_len in config for `type: llama2_chat` by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/926
* new evals_per_epoch and saves_per_epoch to make things cleaner by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/944
* More hints on what to do with CUDA Out of memory errors by jooray in https://github.com/OpenAccess-AI-Collective/axolotl/pull/925
* fix: remove excessive newlines in system prompt(s) for alpaca by kallewoof in https://github.com/OpenAccess-AI-Collective/axolotl/pull/936
* Flash attn hotfix by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/951
* Fix Deepspeed loading by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/950
* fix: switch to using the HuggingFace Transformers NEFT implementation by kallewoof in https://github.com/OpenAccess-AI-Collective/axolotl/pull/941
* Add docs by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/947
* Fix prompt assembly for llama by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/952
* update transformers to fix checkpoint saving by dumpmemory in https://github.com/OpenAccess-AI-Collective/axolotl/pull/963
* update to latest nccl in docker image by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/965
* fix for build for nccl in dockerfile by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/970
* fix: add lr scheduler kwargs to Trainer by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/972
* Update README.md by eltociear in https://github.com/OpenAccess-AI-Collective/axolotl/pull/966
* Dockerfile torch fix by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/987
* fix mistral prompt assembly by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/982
* Feat: Warns to add to modules_to_save when adding tokens or switching special_tokens by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/787
* Add tests to Docker by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/993
* change val size by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/992
* chore: Update transformers to latest by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/986
* support for cuda 12.1 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/989
* set output_router_logits for mixtral config: by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/995
* Add an example config for finetuning a 34B model on a 24GB GPU by evangriffiths in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1000
* FEAT: add tagging support to axolotl by younesbelkada in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1004
* Set eval_sample_packing to false in mistral config.yaml by kmsydney in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1003
* add config to model card by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1005
* remove landmark attn and xpos rope implementations by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1010
* [Docs] Nit: clarify what inference is by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1012
* [Docs] Nit: Remind people to auth to wandb if they are going to use it by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1013
* feat: remove need to add load_in* during merge by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1017
* feat: expose bnb kwargs by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1018
* add ultrachat prompt strategies by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/996
* [WandB] Push axolotl config to top level wandb files by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1014
* Adds chat templates by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1022
* Fix: bf16 support for inference by taziksh in https://github.com/OpenAccess-AI-Collective/axolotl/pull/981
* use recommended setting for use_reentrant w gradient checkpointing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1021
* added tiny llama examples for lora and qlora by tdolan21 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1027
* chore(readme): update instruction to set config to load from cache by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1030
* [Docs] delete unused cfg value `lora_out_dir` by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1029
* fix: lint by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1037
* chore(config): clean up old log for Qwen by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1034
* bump transformers and update attention class map name by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1023
* Added chatglm3 conversation type for training models like TinyLLama by xaviviro in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1036
* fix HF model card upload for PEFT models by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1043
* Clean Up LorA Merge by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1044
* feature: better device mapping for large models by kallewoof in https://github.com/OpenAccess-AI-Collective/axolotl/pull/918
* feat: always push checkpoint to hub if set by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1049
* Update tests-docker.yml by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1052
* streaming multipack for pretraining dataset by jinwonkim93 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/959
* Simplify Docker Unit Test CI by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1055
* Phi2 rewrite by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1058
* Efficiently get the length of the tokenized docs by RicardoDominguez in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1063
* Sponsors by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1065
* Update FUNDING.yml for Kofi link by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1067
* fix: torch_dtype mistral default to fp32 by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1050
* Cosine learning rate schedule - minimum learning rate by RicardoDominguez in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1062
* fix double eos token for chatml by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1054
* Add: mlflow for experiment tracking by JohanWork in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1059
* update peft to 0.7.0 by mtenenholtz in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1073
* paired kto support by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1069
* Separate AutoGPTQ dep to `pip install -e .[auto-gptq]` by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1077
* attempt to also run e2e tests that needs gpus by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1070
* Update FUNDING.yml with bitcoin by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1079
* swap the data collator for evals if not using sample packing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1076
* be more robust about checking embedding modules for lora finetunes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1074
* fix: `train_on_inputs: true` ignored for sharegpt by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1045
* update sharegpt conversations when chatml chat template is set by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1075
* additional logging to get maximum token length of a sequence in the dataset by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1066
* pin accelerate for deepspeed fix by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1080
* fix: warn user to install mamba_ssm package by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1019
* use tags again for test image, only run docker e2e after pre-commit checks by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1081
* optimize calculation of cu_seqlens from position_ids by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1084
* add python 3.11 to the matrix for unit tests by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1085
* Remove fused-dense-lib from requirements.txt by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1087
* misc fixes from 943 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1086
* add gptneox embeddings, fix phi2 inputs, also fix the casting by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1083
* Add Debugging Guide by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1089
* Fix debugging.md by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1091
* feat: enable trl's autounwrap by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1060
* Fix broken pypi.yml by msaroufim in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1099
* Update README.md by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1103
* Add section for debugging with Docker by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1104
* Add link on README to Docker Debugging by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1107
* keep gate in fp32 for loras by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1105
* Fix debugging video by hamelsmu in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1111
* Disable caching on `--disable_caching` in CLI by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1110
* Reverse caching PR by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1115
* Enable or disable bf16 support based on availability by simhallq in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1116
* update PR template so we can capture twitter or discord handles by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1121
* pin model_revision for phi2 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1123
* fix(readme): clarify custom user prompt [no-ci] by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1124
* Add `layers_to_transform` for `lora_config` by xzuyn in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1118
* Agnostic cloud gpu docker image and Jupyter lab by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1097
* Preprocess dataset size fix by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1131
* fix(preprocess): Make sure dataset not loaded from cache when using preprocess cli by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1136
* fix bf16 check when preprocessing data by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1140
* Add shifted sparse attention by joecummings in https://github.com/OpenAccess-AI-Collective/axolotl/pull/973
* Multipack simplify for Mixtral by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1142
* Fix link for Minotaur model by joecummings in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1146
* Dockerfile cloud ports by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1148
* fix check for env var by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1151
* feat(dataset): add config to keep processed dataset in memory by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1152
* Deprecate max packed sequence len by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1141
* make sure the model config loader respects the model_revision too by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1160
* Qwen2 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1166
* jupyter lab fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1139
* set fp16 to false if bf16, update bf16: auto in example YAMLs by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1122
* Add mlflow callback for pushing config to mlflow artifacts by JohanWork in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1125
* improve vram use w gradient checkpointing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1167
* Vram fix attempt by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1164
* add commit message option to skip docker image builds in ci by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1168
* Falcon embeddings by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1149
* support for explicit test_dataset definition for evals by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/786
* Add desc to map/filter by casper-hansen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1162
* Feat(test): Add tests for alpaca chatml prompt tokenizer by JohanWork in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1088
* DPO cleanup by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1126
* Update README.md by singhay in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1169
* Fine-Tuning Mistral-7b for Real-World Chatbot Applications Using Axolotl (Lora used) by Tilemachoc in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1155
* don't fail if can't cast weights due to offload when merging by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1172
* update docs by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1176
* Phi2 multipack by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1173
* DPO fixes v2 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1174
* Docs: RLHF Update after cleanup by AlekseyKorshuk in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1178
* Add support for offline mode with HF_HUB_OFFLINE envvar by JamesHWade in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1182
* Fix do_merge_lora raises an Exception in transformers v4.37.0 by tisorlawan in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1184
* report min length of tokenized data by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1186
* more dpo fixes for dataset loading and docs by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1185
* upgrade deepspeed to 0.13.1 for mixtral fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1189
* Standardize system prompt format for AlpacaPrompter (instruct case) by sadaisystems in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1190
* Mixtral fixes 20240124 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1192
* prepare for release v0.4.0 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1175
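
Several of the packing-related changes above (notably the `cu_seqlens` optimization in #1084) revolve around deriving cumulative sequence lengths for flash-attention's varlen kernels from packed `position_ids`. A minimal sketch of the idea, assuming position IDs reset to zero at each packed sequence boundary; the function name and pure-Python form are illustrative, not Axolotl's actual implementation:

```python
def cu_seqlens_from_position_ids(position_ids):
    """Each packed sequence restarts its positions at 0, so a boundary is
    wherever position_ids drops back to 0. Returns cumulative offsets
    [0, len(seq1), len(seq1) + len(seq2), ...] as varlen attention expects."""
    starts = [i for i, p in enumerate(position_ids) if p == 0]
    return starts + [len(position_ids)]

# Three sequences of lengths 3, 2, and 4 packed into one row:
packed = [0, 1, 2, 0, 1, 0, 1, 2, 3]
print(cu_seqlens_from_position_ids(packed))  # [0, 3, 5, 9]
```

In practice this runs on tensors rather than lists, but the invariant is the same: a single scan over `position_ids` recovers the per-sequence boundaries without storing a separate attention mask.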

New Contributors
* Kimiko-AI made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/582
* bofenghuang made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/606
* Psancs05 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/610
* Nan-Do made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/632
* felixonmars made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/639
* Napuh made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/645
* adarshxs made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/647
* ein-ich made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/652
* corbt made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/651
* TokenBender made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/675
* seungduk-yanolja made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/700
* lukemarsden made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/708
* atgctg made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/713
* casper-hansen made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/729
* tokestermw made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/709
* gordicaleksa made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/792
* MilesQLi made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/774
* Stillerman made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/812
* fpreiss made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/827
* brthor made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/838
* concretevitamin made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/862
* msaroufim made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/874
* kallewoof made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/883
* Haoxiang-Wang made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/909
* timothylimyl made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/922
* hamelsmu made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/926
* jooray made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/925
* dumpmemory made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/963
* eltociear made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/966
* evangriffiths made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1000
* younesbelkada made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1004
* kmsydney made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1003
* taziksh made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/981
* tdolan21 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1027
* xaviviro made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1036
* jinwonkim93 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/959
* RicardoDominguez made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1063
* JohanWork made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1059
* mtenenholtz made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1073
* simhallq made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1116
* xzuyn made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1118
* joecummings made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/973
* singhay made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1169
* Tilemachoc made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1155
* AlekseyKorshuk made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1178
* JamesHWade made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1182
* tisorlawan made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1184
* sadaisystems made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/1190

**Full Changelog**: https://github.com/OpenAccess-AI-Collective/axolotl/compare/v0.3.0...v0.4.0

0.3.0

What's Changed
* Fix sharegpt type in doc by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/202
* add support for optimum bettertransformers by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/92
* Use AutoTokenizer for redpajama example by sroecker in https://github.com/OpenAccess-AI-Collective/axolotl/pull/209
* issue 205 bugfix by MaciejKarasek in https://github.com/OpenAccess-AI-Collective/axolotl/pull/206
* Fix tokenizing labels by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/214
* add float16 docs and tweak typehints by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/212
* support adamw and grad norm hyperparams by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/215
* Fixing Data Readme by msinha251 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/235
* don't fail fast by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/218
* better py3 support w pre-commit by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/241
* optionally define whether to use_fast tokenizer by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/240
* skip the system prompt by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/243
* push intermediate model checkpoints to hub by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/244
* System prompt data by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/224
* Add cfg.push_to_hub_model_id to readme by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/252
* Fix typing list in prompt tokenizer by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/249
* add option for instruct w sys prompts by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/246
* open orca support by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/255
* update pip install command for apex by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/247
* Fix future deprecation push_to_hub_model_id by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/258
* [WIP] Support loading data files from a local directory by utensil in https://github.com/OpenAccess-AI-Collective/axolotl/pull/221
* Fix(readme): local path loading and custom strategy type by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/264
* don't use llama if trust_remote_code is set since that needs to use AutoModel path by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/266
* params are adam_*, not adamw_* by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/268
* Quadratic warmup by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/271
* support for loading a model by git revision by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/272
* Feat(docs): Add model_revision arg by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/273
* Feat: Add save_safetensors by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/275
* Feat: Set push to hub as private by default by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/274
* Allow non-default dataset configurations by cg123 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/277
* Feat(readme): improve docs on multi-gpu by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/279
* Update requirements.txt by teknium1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/280
* Logging update: added PID and formatting by theobjectivedad in https://github.com/OpenAccess-AI-Collective/axolotl/pull/276
* git fetch fix for docker by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/283
* misc fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/286
* fix axolotl training args dataclass annotation by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/287
* fix(readme): remove accelerate config by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/288
* add hf_transfer to requirements for faster hf upload by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/289
* Fix(tokenizing): Use multi-core by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/293
* Pytorch 2.0.1 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/300
* Fix(readme): Improve wording for push model by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/304
* add apache 2.0 license by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/308
* Flash attention 2 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/299
* don't resize embeddings to multiples of 32x by default by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/313
* Add XGen info to README and example config by ethanhs in https://github.com/OpenAccess-AI-Collective/axolotl/pull/306
* better handling since xgen tokenizer breaks with convert_tokens_to_ids by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/307
* add runpod envs to .bashrc, fix bnb env by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/316
* update prompts for open orca to match the paper by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/317
* latest HEAD of accelerate causes 0 loss immediately w FSDP by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/321
* Prune cuda117 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/327
* update README for updated docker images by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/328
* fix FSDP save of final model by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/329
* pin accelerate so it works with llama2 by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/330
* add peft install back since it doesn't get installed by setup.py by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/331
* lora/qlora w flash attention fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/333
* feat/llama-2 examples by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/319
* update README by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/337
* Fix flash-attn + qlora not working with llama models by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/336
* optimize the iteration when tokenizing large datasets by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/332
* Added Orca Mini prompt strategy by jphme in https://github.com/OpenAccess-AI-Collective/axolotl/pull/263
* Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) by ssmi153 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/339
* add a basic ds zero3 config by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/347
* experimental llama 2 chat support by jphme in https://github.com/OpenAccess-AI-Collective/axolotl/pull/296
* ensure enable_input_require_grads is called on model before getting the peft model by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/345
* set `group_by_length` to false in all examples by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/350
* GPU memory usage logging by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/354
* simplify `load_model` signature by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/356
* Clarify pre-tokenize before multigpu by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/359
* Update README.md on pretraining_dataset by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/360
* bump to latest bitsandbytes release with major bug fixes by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/355
* feat(merge): save tokenizer on merge by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/362
* Feat: Add rope scaling by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/343
* Fix(message): Improve error message for bad format by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/365
* fix(model loading): warn when model revision is passed to gptq by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/364
* Add wandb_entity to wandb options, update example configs, update README by morganmcg1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/361
* fix(save): save as safetensors by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/363
* Attention mask and position id fixes for packing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/285
* attempt to run non-base docker builds on regular cpu hosts by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/369
* revert previous change and build ax images w docker on gpu by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/371
* extract module for working with cfg by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/372
* quiet noise from llama tokenizer by setting pad token earlier by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/374
* improve GPU logging to break out pytorch cache and system mem by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/376
* simplify `load_tokenizer` by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/375
* fix check for flash attn branching by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/377
* fix for models loading on cpu when not using accelerate launch by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/373
* save tokenizer before training starts by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/380
* Feat(doc): Improve sharegpt doc by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/378
* Fix crash when running without CUDA by cg123 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/384
* bump flash-attn to 2.0.4 for the base docker image by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/382
* don't pass rope_scaling kwarg if it's None by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/383
* new llama-2 default settings by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/370
* Error msg for sharegpt if conv has less than 2 msg by flotos in https://github.com/OpenAccess-AI-Collective/axolotl/pull/379
* Feat(config): Add hub_strategy by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/386
* Added "epoch" evaluation_strategy by flotos in https://github.com/OpenAccess-AI-Collective/axolotl/pull/388
* Feat(config): add max steps by ittailup in https://github.com/OpenAccess-AI-Collective/axolotl/pull/387
* Feat(doc): Add max_steps to readme by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/389
* don't use mask expansion for inference by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/392
* use context manager to run things on rank0 before others by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/397
* Feat(doc): Add how to save by epochs by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/396
* add `utils.data.prepare_dataset` by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/398
* better handling of empty input ids when tokenizing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/395
* fix eval steps and strategy by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/403
* add templates, CoC and contributing guide by lightningRalf in https://github.com/OpenAccess-AI-Collective/axolotl/pull/126
* Ax art by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/405
* Fix(template): Remove iPhone/android from Issue template by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/407
* update docs for tokenizer_legacy by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/401
* Fix(docs): Update flash attn requirements by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/409
* Fix(config): Update handling of deepspeed config by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/404
* Feat(doc): Add lr_quadratic_warmup to readme by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/412
* update path to align with fsdp example by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/413
* tag with latest as well for axolotl-runpod by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/418
* hopefully improve the README by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/419
* use inputs for image rather than outputs for docker metadata by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/420
* Fix(template): Inform to place stack trace to Issue by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/417
* just resort to tags and use main-latest by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/424
* Fix(docs): Remove gptq+lora and fix xformer compat list by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/423
* fix orca prompts by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/422
* fix fixture for new tokenizer handling in transformers by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/428
* remove extra accelerate in requirements by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/430
* adds color by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/425
* standardize attn hijack patches by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/381
* flash attn pip install by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/426
* set env for FSDP offload params by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/433
* use save_strategy from config if available by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/434
* fix comma, not a tuple by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/436
* disable eval using multipack for now by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/437
* docs(readme): add `cd axolotl` by philpax in https://github.com/OpenAccess-AI-Collective/axolotl/pull/440
* support user defined prompters, pretokenized datasets in config, local parquet, local arrow files by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/348
* gracefully handle empty input by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/442
* fix evals by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/447
* feat(doc): add pillow to lambda instructions by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/445
* feat(docs): improve user customized prompts by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/443
* add missing positional arg by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/450
* is_causal fix for evals? by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/451
* set env var for FSDP layer to wrap by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/453
* always drop samples that are too long by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/452
* recast loralayer, norm, lmhead + embed token weights per original qlora by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/393
* feat: add Metharme prompt strategy by TearGosling in https://github.com/OpenAccess-AI-Collective/axolotl/pull/446
* fix test fixture b/c hf trainer tokenization changed by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/464
* workaround so training doesn't hang when packed dataloader batches aren't even by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/461
* Fix(doc): Clarify config by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/466
* ReLoRA implementation (with quantization) by cg123 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/322
* improve llama pad token handling by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/475
* Fix(tokenizer): Fix condition to add pad token by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/477
* fix types w lora by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/478
* allow newer deps in requirements.txt by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/484
* fix checkpoints on multigpu by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/481
* Fix missing 'packaging' wheel by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/482
* fix: inference did not move the model to the correct device by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/483
* let transformers handle adamw_bnb_8bit by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/486
* Add example Llama 2 ReLoRA config by cg123 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/471
* Feat(doc): Update eval_steps doc by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/487
* zero2 config by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/476
* Feat(cfg): Add code-llama configs for all sizes by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/479
* fix: finetune model inference needs the dtype fix to work with flash-attn by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/485
* Fix(tokenizer): Make sure to add pad for CodeLlamaTokenizer by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/489
* fsdp requires params be the same type too by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/493
* simplify linear layer locator by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/495
* `pad_to_sequence_len`, for reduced VRAM peak usage due to memory fragmentation by Birch-san in https://github.com/OpenAccess-AI-Collective/axolotl/pull/498
* Refactor train cfg cli by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/499
* tweak: use default config file when only one file is present by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/501
* Fix(doc): Clarify no amp to full yaml docs by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/496
* remove --force-reinstall from Dockerfile to ensure correct pytorch version by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/492
* support for datasets with multiple names by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/480
* customizable ascii art by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/506
* add eval benchmark callback by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/441
* set zero3 optimizer betas to auto so they inherit from HF trainer config by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/507
* drop empty tokenized rows too by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/509
* Changed Bench Eval to report metrics correctly by split. Added total accuracy and renamed previously used bench_accuracy to bench_average_accuracy. by alpayariyak in https://github.com/OpenAccess-AI-Collective/axolotl/pull/512
* split train from other cli options by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/503
* Added advanced DDP args by jphme in https://github.com/OpenAccess-AI-Collective/axolotl/pull/515
* Debug tokenization output: Add ability to output text only (no tokens), and/or specify num samples to see by TheBloke in https://github.com/OpenAccess-AI-Collective/axolotl/pull/511
* log supervised token count by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/448
* Fix(doc): Inform Windows users to use WSL/docker by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/518
* fix: bad dtype for full finetune by maximegmd in https://github.com/OpenAccess-AI-Collective/axolotl/pull/504
* No gather single gpu by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/523
* move is_llama_derived_model into normalize_config by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/524
* use flash_attn xentropy when available by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/525
* use flash_attn rmsnorm when available by tmm1 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/526
* Allow for custom system prompts with ShareGPT by bdashore3 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/520
* Add support for GPTQ using native transformers/peft by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/468
* misc fixes/improvements by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/513
* log rank by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/527
* recommend padding when using sample packing by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/531
* Early stopping metric by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/537
* Adding NCCL Timeout Guide by theobjectivedad in https://github.com/OpenAccess-AI-Collective/axolotl/pull/536
* update readme to point to direct link to runpod template, cleanup install instructions by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/532
* add git environment variables to compose: avoid checkout failure erro… by SlapDrone in https://github.com/OpenAccess-AI-Collective/axolotl/pull/534
* workaround for md5 variations by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/533
* Update requirements.txt by dongxiaolong in https://github.com/OpenAccess-AI-Collective/axolotl/pull/543
* fix for quant config from model by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/540
* publish to pypi workflow on tagged release by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/549
* remove with section, doesn't seem to work by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/551
* pypi on tag push by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/552
* Ergonomic update to optimizer config documentation by theobjectivedad in https://github.com/OpenAccess-AI-Collective/axolotl/pull/548
* replace tags, build dist for pypi publish by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/553
* add long_description for pypi push by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/555
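Several entries above concern the interplay of sample packing and padding: PR 498 adds `pad_to_sequence_len` to reduce VRAM fragmentation, and PR 531 recommends enabling padding when sample packing is used. As an illustration only (the config keys come from the PR titles; the function itself is a hypothetical sketch, not Axolotl's actual validation code), that recommendation could be expressed as a config check like this:

```python
# Hedged sketch: a minimal advisory check mirroring the recommendation in
# PR 531 ("recommend padding when using sample packing"). The keys
# `sample_packing` and `pad_to_sequence_len` are the options referenced in
# this release; the function is illustrative, not Axolotl's implementation.

def check_sample_packing(cfg: dict) -> list:
    """Return advisory warnings for a parsed config dict."""
    warnings = []
    if cfg.get("sample_packing") and not cfg.get("pad_to_sequence_len"):
        warnings.append(
            "sample_packing is enabled without pad_to_sequence_len; "
            "padding to the sequence length reduces peak VRAM usage from "
            "memory fragmentation (see PR 498)."
        )
    return warnings

# A config that packs samples but does not pad triggers the advisory.
print(check_sample_packing({"sample_packing": True}))
```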

New Contributors
* sroecker made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/209
* MaciejKarasek made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/206
* msinha251 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/235
* cg123 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/277
* teknium1 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/280
* theobjectivedad made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/276
* ethanhs made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/306
* tmm1 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/337
* ssmi153 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/339
* morganmcg1 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/361
* flotos made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/379
* ittailup made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/387
* lightningRalf made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/126
* philpax made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/440
* TearGosling made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/446
* maximegmd made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/482
* Birch-san made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/498
* alpayariyak made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/512
* TheBloke made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/511
* bdashore3 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/520
* SlapDrone made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/534
* dongxiaolong made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/543

**Full Changelog**: https://github.com/OpenAccess-AI-Collective/axolotl/compare/v0.2.1...v0.3.0

0.2.1

What's Changed
* docker fixes: py310, fix cuda arg in deepspeed by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/115
* add support for gradient accumulation steps by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/123
* split up llama model loading so config can be loaded from base config and models can be loaded from a path by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/120
* copy xformers attn from ooba since we removed dep on alpaca_lora_4bit by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/124
* Fix(readme): Fix torch missing from readme by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/118
* Add accelerate dep by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/114
* Feat(inference): Swap to GenerationConfig by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/119
* add py310 support from base image by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/127
* add badge info to readme by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/129
* fix packing so that concatenated sequences reset the attention by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/131
* swap batch size for gradient accumulation steps to decouple from num gpu by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/130
* fix batch size calculation by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/134
* Fix: Update doc for grad_accu and add validation tests for batch size by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/135
* Feat: Add lambdalabs instruction by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/141
* Feat: Add custom prompt readme and add missing prompt strategies to Readme by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/142
* added docker-compose file by FarisHijazi in https://github.com/OpenAccess-AI-Collective/axolotl/pull/146
* Update README.md for correct image tags by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/147
* fix device map by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/148
* clone in docker by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/149
* new prompters, misc fixes for output dir missing using fsdp, and changing max seq len by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/155
* fix camel ai, add guanaco/oasst mapping for sharegpt by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/158
* Fix: Update peft and gptq instruction by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/161
* Fix: Move custom prompts out of hidden by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/162
* Fix future deprecate prepare_model_for_int8_training by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/143
* Feat: Set matmul tf32=True when tf32 passed by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/163
* Fix: Validate falcon with fsdp by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/164
* Axolotl supports falcon + qlora by utensil in https://github.com/OpenAccess-AI-Collective/axolotl/pull/132
* Fix: Set to use cfg.seed or 42 for seed by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/166
* Fix: Refactor out unmodified save_steps and eval_steps by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/167
* Disable Wandb if no wandb project is specified by bratao in https://github.com/OpenAccess-AI-Collective/axolotl/pull/168
* Feat: Improve lambda labs instruction by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/170
* Fix falcon support lora by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/171
* Feat: Add landmark attention by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/169
* Fix backward compat for peft by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/176
* Update README.md to reflect current gradient checkpointing support by PocketDocLabs in https://github.com/OpenAccess-AI-Collective/axolotl/pull/178
* fix for max sequence len across different model types by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/179
* Add streaming inference & fix stopping at EOS by Glavin001 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/180
* add support to extend context with xpos rope by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/181
* fix for local variable 'LlamaForCausalLM' referenced before assignment by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/182
* pass a prompt in from stdin for inference by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/183
* Update FAQS.md by akj2018 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/186
* various fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/189
* more config pruning and migrating by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/190
* Add save_steps and eval_steps to Readme by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/191
* Fix config path after config moved by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/194
* Fix training over existing lora by AngainorDev in https://github.com/OpenAccess-AI-Collective/axolotl/pull/159
* config fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/193
* misc fixes by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/192
* Fix landmark attention patch by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/177
* peft no longer needs device_map by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/187
* chore: Fix inference README. by mhenrichsen in https://github.com/OpenAccess-AI-Collective/axolotl/pull/197
* Update README.md to include a community showcase by PocketDocLabs in https://github.com/OpenAccess-AI-Collective/axolotl/pull/200
* chore: Refactor inf_kwargs out by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/199
* tweak config to work by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/196
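Several changes in this release rework batch-size handling (PRs 123, 130, and 134), with PR 130 swapping batch size for gradient accumulation steps so the per-update batch is decoupled from the GPU count. The standard arithmetic behind that decoupling is shown below; the function name is illustrative, not Axolotl's API:

```python
# Hedged sketch: the effective (global) batch size arithmetic underlying the
# gradient-accumulation changes above. This is the standard formula; the
# function name is a hypothetical illustration, not part of Axolotl.

def effective_batch_size(micro_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_gpus: int) -> int:
    """Samples contributing to one optimizer step across all devices."""
    return micro_batch_size * gradient_accumulation_steps * num_gpus

# e.g. 2 samples per GPU per step, 4 accumulation steps, 8 GPUs
# gives 64 samples per optimizer update.
print(effective_batch_size(2, 4, 8))
```

Decoupling the two knobs means scaling from 1 GPU to 8 no longer silently changes the per-update batch; only `num_gpus` changes, and the other two stay under explicit user control.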

New Contributors
* FarisHijazi made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/146
* utensil made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/132
* bratao made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/168
* PocketDocLabs made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/178
* Glavin001 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/180
* akj2018 made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/186
* AngainorDev made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/159
* mhenrichsen made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/197

**Full Changelog**: https://github.com/OpenAccess-AI-Collective/axolotl/compare/v0.2.0...v0.2.1

0.2.0

What's Changed
* Add pre-commit: black+flake8+pylint+mypy+isort+bandit by NanoCode012 in https://github.com/OpenAccess-AI-Collective/axolotl/pull/98
* Qlora openllama 3b example by fearnworks in https://github.com/OpenAccess-AI-Collective/axolotl/pull/106
* Viktoriussuwandi patch by viktoriussuwandi in https://github.com/OpenAccess-AI-Collective/axolotl/pull/105
* default to qlora support, make gptq specific image by winglian in https://github.com/OpenAccess-AI-Collective/axolotl/pull/108

New Contributors
* fearnworks made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/106
* viktoriussuwandi made their first contribution in https://github.com/OpenAccess-AI-Collective/axolotl/pull/105

**Full Changelog**: https://github.com/OpenAccess-AI-Collective/axolotl/compare/v0.1.0...v0.2.0
