This patch release adds Diffusers support for the upcoming CogVideoX-5B-I2V release (an Image-to-Video generation model)! The model weights will be available by the end of the week on the HF Hub at `THUDM/CogVideoX-5b-I2V` ([Link](https://huggingface.co/THUDM/CogVideoX-5b-I2V)). Stay tuned for the release!
This release features two new pipelines:
- `CogVideoXImageToVideoPipeline`
- `CogVideoXVideoToVideoPipeline`
Additionally, we now have support for tiled encoding in the CogVideoX VAE. This can be enabled by calling the `vae.enable_tiling()` method, and it is used in the new Video-to-Video pipeline to encode sample videos to latents in a memory-efficient manner.
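For reference, here is a minimal sketch of using tiled encoding on the VAE directly (the pipelines do this internally when tiling is enabled); the checkpoint, input shape, and dtype below are illustrative assumptions, not taken from this release:

```python
import torch
from diffusers import AutoencoderKLCogVideoX

# Load just the VAE from a CogVideoX checkpoint (illustrative choice).
vae = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-5b", subfolder="vae", torch_dtype=torch.bfloat16
).to("cuda")

# Tiled encoding processes frames in overlapping spatial tiles and blends
# the results, keeping peak memory low at large resolutions.
vae.enable_tiling()

# Dummy input: (batch, channels, num_frames, height, width), values in [-1, 1].
video = torch.randn(1, 3, 49, 480, 720, dtype=torch.bfloat16, device="cuda")
with torch.no_grad():
    latents = vae.encode(video).latent_dist.sample()
```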
## CogVideoXImageToVideoPipeline
The code below demonstrates how to use the new image-to-video pipeline:
```python
import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image
pipe = CogVideoXImageToVideoPipeline.from_pretrained("THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16)
pipe.to("cuda")
# Optionally, enable memory optimizations.
# If enabling CPU offloading, remember to remove `pipe.to("cuda")` above
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()
prompt = "An astronaut hatching from an egg, on the surface of the moon, the darkness and depth of space realised in the background. High quality, ultrarealistic detail and breath-taking movie-like camera shot."
image = load_image(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/astronaut.jpg"
)
video = pipe(image, prompt, use_dynamic_cfg=True)
export_to_video(video.frames[0], "output.mp4", fps=8)
```
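If you opt into CPU offloading as the comments above suggest, one possible setup looks like the sketch below (same checkpoint as above; the key point is that `pipe.to("cuda")` is dropped):

```python
import torch
from diffusers import CogVideoXImageToVideoPipeline

# CPU-offload variant: note there is no `pipe.to("cuda")` call here.
# enable_model_cpu_offload() moves each submodule to the GPU only while
# it runs, trading some speed for a much lower peak VRAM footprint.
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()
```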
<table align=center>
<tr>
<td align=center colspan=1><img src="https://github.com/user-attachments/assets/1c7c1d86-f97e-44dd-9b17-4fec2bbc2b1a" /></td>
<td align=center colspan=1><video src="https://github.com/user-attachments/assets/a115372e-c539-4ca0-b0d0-770d62862257"> Your browser does not support the video tag. </video></td>
</tr>
</table>
## CogVideoXVideoToVideoPipeline
The code below demonstrates how to use the new video-to-video pipeline:
```python
import torch
from diffusers import CogVideoXDPMScheduler, CogVideoXVideoToVideoPipeline
from diffusers.utils import export_to_video, load_video
# Models: "THUDM/CogVideoX-2b" or "THUDM/CogVideoX-5b"
pipe = CogVideoXVideoToVideoPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.scheduler = CogVideoXDPMScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")
input_video = load_video(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/hiker.mp4"
)
prompt = (
"An astronaut stands triumphantly at the peak of a towering mountain. Panorama of rugged peaks and "
"valleys. Very futuristic vibe and animated aesthetic. Highlights of purple and golden colors in "
"the scene. The sky looks like an animated/cartoonish dream of galaxies, nebulae, stars, planets, "
"moons, but the remainder of the scene is mostly realistic."
)
video = pipe(
video=input_video, prompt=prompt, strength=0.8, guidance_scale=6, num_inference_steps=50
).frames[0]
export_to_video(video, "output.mp4", fps=8)
```
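A note on `strength`: as in other Diffusers image/video-to-video pipelines, it sets how much noise is added to the encoded input, so higher values stray further from the source video. A rough sketch of the usual relationship between `strength` and the number of denoising steps actually run (an assumption based on the common Diffusers pattern, not confirmed for this pipeline) is:

```python
def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    # The pipeline skips the earliest timesteps and only denoises the
    # remainder; min() guards against strength values above 1.0.
    return min(int(num_inference_steps * strength), num_inference_steps)

print(effective_denoising_steps(50, 0.8))  # 40 steps actually run above
```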
<table align=center>
<tr>
<td align=center><video src="https://github.com/user-attachments/assets/bc9273ff-e459-42f9-af1e-c9b084b28f4d"> Your browser does not support the video tag. </video></td>
</tr>
</table>
Shoutout to tin2tin for the awesome demonstration!
Refer to our [documentation](https://huggingface.co/docs/diffusers/api/pipelines/cogvideox) to learn more about these pipelines.
## All commits
* [core] Support VideoToVideo with CogVideoX by @a-r-r-o-w in #9333
* [core] CogVideoX memory optimizations in VAE encode by @a-r-r-o-w in #9340
* [CI] Quick fix for Cog Video Test by @DN6 in #9373
* [refactor] move positional embeddings to patch embed layer for CogVideoX by @a-r-r-o-w in #9263
* CogVideoX-5b-I2V support by @zRzRzRzRzRzRzR in #9418