FineTrainers is a work-in-progress library to support (accessible) training of diffusion models. The following models are currently supported (based on Diffusers):
- CogVideoX T2V (versions 1.0 and 1.5)
- LTX Video
- Hunyuan Video
The legacy/deprecated scripts also support CogVideoX I2V and Mochi.
Currently, LoRA and full-rank finetuning are supported. Over time, more models and training techniques will be added. We thank our many contributors for their amazing work on improving `finetrainers`. They are mentioned below in the "New Contributors" section.
In a short timespan, finetrainers has found its way into multiple research works, which has been very motivating for us. These works are mentioned in the "Featured Projects" section of the README. We hope you find them interesting, and that you continue to build and work on interesting ideas while sharing your research artifacts openly!
Some artifacts that we've released ourselves are available here: https://huggingface.co/finetrainers
We plan to focus on the core algorithms/models that users most want supported quickly, based primarily on the feedback we've received (thanks to everyone who has spoken with us about this; your time is invaluable!). The major asks are:
- more models and faster support for newer models (we will open this up for contributions once a major, currently pending PR lands, and will add many ourselves!)
- compatibility with UIs that do not support standardized implementations from diffusers (we will write two-way conversion scripts for new models that are added to diffusers, so that it is easy to obtain original-format weights from diffusers-format weights)
- more algorithms (Control LoRA, ControlNets for video models and VideoJAM are some of the highly asked techniques -- we will prioritize this!)
- Dataset QoL changes (this is a WIP in an open, pending PR)
Let us know what you'd like to see next & stay tuned for interesting updates!
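To illustrate the idea behind the two-way conversion scripts mentioned above, here is a minimal sketch of renaming state-dict keys between a diffusers-style layout and an original-format layout. The key names and mapping table below are purely hypothetical, chosen for illustration; the actual finetrainers conversion scripts define a full mapping per model.

```python
# Hypothetical sketch of two-way weight-name conversion between
# diffusers-format and original-format state dicts. The mapping below
# is illustrative only; a real script covers every layer of a model.
DIFFUSERS_TO_ORIGINAL = {
    "transformer.blocks.0.attn.to_q.weight": "blocks.0.attention.q_proj.weight",
    "transformer.blocks.0.attn.to_k.weight": "blocks.0.attention.k_proj.weight",
}
# Invert the table to get the reverse direction for free.
ORIGINAL_TO_DIFFUSERS = {v: k for k, v in DIFFUSERS_TO_ORIGINAL.items()}

def convert(state_dict, mapping):
    """Rename every key via the mapping; unknown keys pass through unchanged."""
    return {mapping.get(k, k): v for k, v in state_dict.items()}

diffusers_weights = {"transformer.blocks.0.attn.to_q.weight": [1.0, 2.0]}
original_weights = convert(diffusers_weights, DIFFUSERS_TO_ORIGINAL)
roundtrip = convert(original_weights, ORIGINAL_TO_DIFFUSERS)
assert roundtrip == diffusers_weights  # conversion is lossless both ways
```

Because the mapping is a plain bijective table, converting diffusers-format weights to the original format and back recovers the input exactly, which is what makes obtaining original-format weights for UI compatibility straightforward.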
<table align="center">
<tr><td align="center"><video src="https://github.com/user-attachments/assets/0a22a40c-7291-4a6c-86a6-c212503504b9"> Your browser does not support the video tag. </video></td></tr>
</table>
## What's Changed
* CogVideoX LoRA and full finetuning by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/1
* Low-bit memory optimizers, CpuOffloadOptimizer, Memory Reports by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/3
* Pin memory support in dataloader by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/5
* DeepSpeed fixes by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/7
* refactor readme i. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/8
* DeepSpeed and DDP Configs by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/10
* Full finetuning memory requirements by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/9
* Multi-GPU parallel encoding support for training videos. by zRzRzRzRzRzRzR in https://github.com/a-r-r-o-w/finetrainers/pull/6
* CogVideoX I2V; CPU offloading; Model README descriptions by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/11
* add VideoDatasetWithResizeAndRectangleCrop dataset resize crop by glide-the in https://github.com/a-r-r-o-w/finetrainers/pull/13
* add "max_sequence_length": model_config.max_text_seq_length, by glide-the in https://github.com/a-r-r-o-w/finetrainers/pull/15
* readme updates + refactor by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/14
* Update README.md by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/17
* merge by zRzRzRzRzRzRzR in https://github.com/a-r-r-o-w/finetrainers/pull/18
* Draft of Chinese README by zRzRzRzRzRzRzR in https://github.com/a-r-r-o-w/finetrainers/pull/19
* docs: update README.md by eltociear in https://github.com/a-r-r-o-w/finetrainers/pull/21
* Update requirements.txt (fixed typo) by Nojahhh in https://github.com/a-r-r-o-w/finetrainers/pull/24
* Update README and Contribution guide by zRzRzRzRzRzRzR in https://github.com/a-r-r-o-w/finetrainers/pull/20
* Lower requirements versions by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/27
* Update for Windows compatibility by Nojahhh in https://github.com/a-r-r-o-w/finetrainers/pull/32
* [Docs] : Update README.md by FarukhS52 in https://github.com/a-r-r-o-w/finetrainers/pull/35
* Improve dataset preparation support + multiresolution prep by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/39
* Update prepare_dataset.sh by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/42
* improve dataset preparation by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/43
* more dataset fixes by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/49
* fix: correct type in .py files by DhanushNehru in https://github.com/a-r-r-o-w/finetrainers/pull/52
* fix: resuming from a checkpoint when using deepspeed. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/38
* Windows support for T2V scripts by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/48
* Fixed optimizers parsing error in bash scripts by Nojahhh in https://github.com/a-r-r-o-w/finetrainers/pull/61
* Update readme to install diffusers from source by Yuancheng-Xu in https://github.com/a-r-r-o-w/finetrainers/pull/59
* Update README.md by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/73
* add some script of lora test by zRzRzRzRzRzRzR in https://github.com/a-r-r-o-w/finetrainers/pull/66
* I2V multiresolution finetuning by removing learned PEs by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/31
* adaption for CogVideoX1.5 by jiashenggu in https://github.com/a-r-r-o-w/finetrainers/pull/92
* docs: fix help message in args.py by Leojc in https://github.com/a-r-r-o-w/finetrainers/pull/98
* sft with multigpu by zhipuch in https://github.com/a-r-r-o-w/finetrainers/pull/84
* [feat] add Mochi-1 trainer by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/90
* wandb tracker in scheduling problems during the training initiation and training stages by glide-the in https://github.com/a-r-r-o-w/finetrainers/pull/100
* fix format specifier. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/104
* Unbound fix by glide-the in https://github.com/a-r-r-o-w/finetrainers/pull/105
* feat: support checkpointing saving and loading by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/106
* RoPE fixes for 1.5, bfloat16 support in prepare_dataset, gradient_accumulation grad norm undefined fix by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/107
* Update README.md to include mochi-1 trainer by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/112
* add I2V sft and fix an error by jiashenggu in https://github.com/a-r-r-o-w/finetrainers/pull/97
* LTX Video by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/123
* Hunyuan Video LoRA by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/126
* Precomputation of conditions and latents by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/129
* Grad Norm tracking in DeepSpeed by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/148
* fix validation bug by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/149
* [feat] support DeepSpeed. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/139
* [optimization] support 8bit optims from bitsandbytes by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/163
* [Chore] bulk update styling and formatting by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/170
* Update README.md to fix graph paths by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/171
* Support CogVideoX T2V by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/165
* Fix scheduler bugs by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/177
* scheduler fixes part ii by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/178
* [CI] add a workflow to do quality checks. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/180
* support model cards by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/176
* [docs] refactor docs for easier info parsing by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/175
* Allow images; Remove LLM generated prefixes; Allow JSON/JSONL; Fix bugs by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/158
* simplify docs part ii by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/190
* Update requirements by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/189
* Fix minor bug with function call that doesn't exist. by ArEnSc in https://github.com/a-r-r-o-w/finetrainers/pull/195
* Precomputation folder name based on model name by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/196
* Better defaults for LTXV by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/198
* [core] Fix loading of precomputed conditions and latents by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/199
* Epoch loss by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/201
* Shell script to minimally test supported models on a real dataset by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/204
* Update pr_tests.yml to update ruff version by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/205
* Fix the checkpoint dir bug in `get_intermediate_ckpt_path` by Awcrr in https://github.com/a-r-r-o-w/finetrainers/pull/207
* Argument descriptions by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/208
* Improve argument handling by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/209
* Helpful messages by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/210
* Full Finetuning for LTX, possibly extended to other models, by ArEnSc in https://github.com/a-r-r-o-w/finetrainers/pull/192
* Refactor private methods by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/213
* Rename lora files by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/214
* Fix: Utils error in mochi finetuning script by guptaaryan16 in https://github.com/a-r-r-o-w/finetrainers/pull/218
* (fake*) FP8 training support by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/184
* Restructure model folder by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/219
* [docs] add a note on MP. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/224
* [Chore] reset unnecessary args by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/225
* [core] fix pipeline loading by waiting till `transformer` is saved. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/226
* fix(typo): captions path in disney dataset by badayvedat in https://github.com/a-r-r-o-w/finetrainers/pull/236
* fix(tests/hunyuanvideo-lora): typo in id token by badayvedat in https://github.com/a-r-r-o-w/finetrainers/pull/237
* Fix LTX frame rate for rope interpolation scale calculation by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/244
* fix(test): lora inference script by badayvedat in https://github.com/a-r-r-o-w/finetrainers/pull/247
* fix: model card info. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/251
* [core] Ensure loading mp first by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/252
* add script to convert HunyuanVideo diffusers lora to original by spacepxl in https://github.com/a-r-r-o-w/finetrainers/pull/255
* Update Makefile to include `examples/` by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/256
* updating weight names to match original instead of comfyui by spacepxl in https://github.com/a-r-r-o-w/finetrainers/pull/258
* mention cool stuff. by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/262
* Update README.md by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/263
* Move legacy scripts to `examples/_legacy` by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/257
* [WIP][tests] add precomputation tests by sayakpaul in https://github.com/a-r-r-o-w/finetrainers/pull/234
* featured projects by a-r-r-o-w in https://github.com/a-r-r-o-w/finetrainers/pull/270
## New Contributors
* glide-the made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/13
* eltociear made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/21
* Nojahhh made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/24
* FarukhS52 made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/35
* DhanushNehru made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/52
* Yuancheng-Xu made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/59
* jiashenggu made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/92
* Leojc made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/98
* zhipuch made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/84
* ArEnSc made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/195
* Awcrr made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/207
* guptaaryan16 made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/218
* badayvedat made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/236
* spacepxl made their first contribution in https://github.com/a-r-r-o-w/finetrainers/pull/255
**Full Changelog**: https://github.com/a-r-r-o-w/finetrainers/commits/v0.0.1