Diffusers

Latest version: v0.32.2

0.30.0

Significant community contributions

The following contributors have made significant changes to the library over the last release:

* DN6
  * [Tests] Fix precision related issues in slow pipeline tests (8720)
  * Remove legacy single file model loading mixins (8754)
  * Enforce ordering when running Pipeline slow tests (8763)
  * Fix warning in UNetMotionModel (8756)
  * Fix indent in dreambooth lora advanced SD 15 script (8753)
  * Fix mistake in Single File Docs page (8765)
  * [Single File] Allow loading T5 encoder in mixed precision (8778)
  * Fix saving text encoder weights and kohya weights in advanced dreambooth lora script (8766)
  * Add VAE tiling option for SD3 (8791)
  * Add single file loading support for AnimateDiff (8819)
  * Add option to SSH into CPU runner. (8884)
  * SSH into cpu runner fix (8888)
  * SSH into cpu runner additional fix (8893)
  * Update pipeline test fetcher (8931)
  * Fix name when saving text inversion embeddings in dreambooth advanced scripts (8927)
  * [CI] Skip flaky download tests in PR CI (8945)
  * [CI] Slow Test Updates (8870)
  * [CI] Fix parallelism in nightly tests (8983)
  * [CI] Nightly Test Runner explicitly set runner for Setup Pipeline Matrix (8986)
  * Updates deps for pipeline test fetcher (9033)
  * Fix Nightly Deps (9036)
  * update
  * [Docs] Add community projects section to docs (9013)
  * [Single File] Add single file support for Flux Transformer (9083)
  * Freenoise change `vae_batch_size` to `decode_chunk_size` (9110)
* shauray8
  * add PAG support for SD architecture (8725)
* gnobitab
  * [Tencent Hunyuan Team] Add HunyuanDiT-v1.2 Support (8747)
  * [Tencent Hunyuan Team] Add checkpoint conversion scripts and changed controlnet (8783)
* yiyixuxu
  * [doc] add a tip about using SDXL refiner with hunyuan-dit and pixart (8735)
  * [hunyuan-dit] refactor `HunyuanCombinedTimestepTextSizeStyleEmbedding` (8761)
  * correct `attention_head_dim` for `JointTransformerBlock` (8608)
  * fix loading sharded checkpoints from subfolder (8798)
  * Revert "[LoRA] introduce LoraBaseMixin to promote reusability." (8976)
  * fix load sharded checkpoint from a subfolder (local path) (8913)
  * add sentencepiece as a soft dependency (9065)
* PommesPeter
  * [Alpha-VLLM Team] Add Lumina-T2X to diffusers (8652)
* IrohXu
  * Add pipeline_stable_diffusion_3_inpaint.py for SD3 Inference (8709)
* maxin-cn
  * Latte: Latent Diffusion Transformer for Video Generation (8404)
* ustcuna
  * [Community Pipelines] Accelerate inference of AnimateDiff by IPEX on CPU (8643)
* tuanh123789
  * add PAG support sd15 controlnet (8820)
* Snailpong
  * 🌐 [i18n-KO] Translated docs to Korean (added 7 docs and etc) (8804)
* asfiyab-nvidia
  * Update TensorRT img2img community pipeline (8899)
  * Update TensorRT txt2img and inpaint community pipelines (9037)
* ylacombe
  * Stable Audio integration (8716)
  * Fix Stable Audio repository id (9016)
* sunovivid
  * add PAG support for Stable Diffusion 3 (8861)
* zRzRzRzRzRzRzR
  * Add CogVideoX text-to-video generation model (9082)

0.29.2

0.29.1

SD3 ControlNet
<img width="624" alt="image" src="https://github.com/huggingface/diffusers/assets/46553287/db384753-cfbb-488c-bc74-8280f9bee24e">

```python
import torch
from diffusers import StableDiffusion3ControlNetPipeline
from diffusers.models import SD3ControlNetModel, SD3MultiControlNetModel
from diffusers.utils import load_image

# Load the Canny-edge ControlNet for SD3 in half precision.
controlnet = SD3ControlNetModel.from_pretrained("InstantX/SD3-Controlnet-Canny", torch_dtype=torch.float16)

# Attach the ControlNet to the base SD3 pipeline.
pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.to("cuda")

# The control image supplies the edge map that guides generation.
control_image = load_image("https://huggingface.co/InstantX/SD3-Controlnet-Canny/resolve/main/canny.jpg")
prompt = "A girl holding a sign that says InstantX"
image = pipe(prompt, control_image=control_image, controlnet_conditioning_scale=0.7).images[0]
image.save("sd3.png")
```

📜 Refer to the official docs [here](https://huggingface.co/docs/diffusers/api/pipelines/controlnet_sd3) to learn more about it.

Thanks to haofanwang and wangqixun from the ResearcherXman team for contributing this pipeline!

Expanded single file support
We now support all available single-file checkpoints for SD3 in diffusers! To load the single-file checkpoint that includes the T5 text encoder:

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Load the pipeline from a single checkpoint file that bundles the CLIP and T5 (fp8) text encoders.
pipe = StableDiffusion3Pipeline.from_single_file(
    "https://huggingface.co/stabilityai/stable-diffusion-3-medium/blob/main/sd3_medium_incl_clips_t5xxlfp8.safetensors",
    torch_dtype=torch.float16,
)
# Keep submodels on the CPU until they are needed, to reduce GPU memory use.
pipe.enable_model_cpu_offload()

image = pipe("a picture of a cat holding a sign that says hello world").images[0]
image.save("sd3-single-file-t5-fp8.png")
```
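
The checkpoint link above follows the usual Hub file URL layout (`<owner>/<repo>/blob/<revision>/<filename>`). As a rough illustration of how such a link breaks down, here is a minimal sketch using only the standard library. The helper name `split_hub_file_url` is hypothetical and is not part of diffusers; it does not reflect how `from_single_file` resolves the link internally.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_hub_file_url(url: str) -> Tuple[str, str, str]:
    """Split a Hub file URL into (repo_id, revision, filename).

    Hypothetical helper for illustration only; assumes the layout
    <owner>/<repo>/<blob|resolve>/<revision>/<path/to/file>.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] not in ("blob", "resolve"):
        raise ValueError(f"Not a Hub file URL: {url}")
    repo_id = "/".join(parts[:2])    # owner/repo
    revision = parts[3]              # branch, tag, or commit
    filename = "/".join(parts[4:])   # may include subfolders
    return repo_id, revision, filename


url = (
    "https://huggingface.co/stabilityai/stable-diffusion-3-medium/"
    "blob/main/sd3_medium_incl_clips_t5xxlfp8.safetensors"
)
repo_id, revision, filename = split_hub_file_url(url)
print(repo_id, revision, filename)
```

Splitting the link this way can be handy when you want to record which repository and file a checkpoint came from.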

Using Long Prompts with the T5 Text Encoder
We increased the default sequence length for the T5 Text Encoder from `77` to `256` tokens! Set `max_sequence_length` to accept fewer or more tokens, up to a maximum of `512`. Keep in mind that longer sequences require more memory and compute and lead to longer generation times. This effect is particularly noticeable during batch inference.

```python
# `pipe` is the StableDiffusion3Pipeline loaded in the previous example.
prompt = "A whimsical and creative image depicting a hybrid creature that is a mix of a waffle and a hippopotamus. This imaginative creature features the distinctive, bulky body of a hippo, but with a texture and appearance resembling a golden-brown, crispy waffle. The creature might have elements like waffle squares across its skin and a syrup-like sheen. It’s set in a surreal environment that playfully combines a natural water habitat of a hippo with elements of a breakfast table setting, possibly including oversized utensils or plates in the background. The image should evoke a sense of playful absurdity and culinary fantasy."

image = pipe(
    prompt=prompt,
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=4.5,
    max_sequence_length=512,  # allow the full long prompt to reach the T5 encoder
).images[0]
```

|Before|max_sequence_length=256|max_sequence_length=512|
|---|---|---|
|![20240612204503_2888268196](https://github.com/huggingface/diffusers/assets/5442875/e5ab1053-f819-4314-b676-80bef759aa71)|![20240612204440_2888268196](https://github.com/huggingface/diffusers/assets/5442875/6bda088f-8ee4-42ff-88bc-ac3129a92d31)|![20240613195139_569754043](https://github.com/huggingface/diffusers/assets/5442875/ca6940d4-7459-451f-80f9-c591c611aba0)|
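
To make the effect of the sequence-length limit concrete, here is a minimal sketch of the truncation idea. It is a simplification under stated assumptions: whitespace-separated words stand in for T5 subword tokens, the function name `truncate_tokens` is hypothetical, and raising an error above `512` is an assumed way of enforcing the documented maximum, not a description of the pipeline's code.

```python
def truncate_tokens(tokens, max_sequence_length=256, hard_limit=512):
    """Keep at most `max_sequence_length` tokens (illustrative only).

    `hard_limit` mirrors the documented maximum of 512; rejecting larger
    values with an error is an assumption made for this sketch.
    """
    if max_sequence_length > hard_limit:
        raise ValueError(
            f"max_sequence_length cannot exceed {hard_limit}, got {max_sequence_length}"
        )
    return tokens[:max_sequence_length]


# Stand-in for a long prompt: 400 whitespace-separated "tokens".
prompt_tokens = ("waffle hippo " * 200).split()

kept_default = truncate_tokens(prompt_tokens)                          # default limit of 256
kept_long = truncate_tokens(prompt_tokens, max_sequence_length=512)    # raised limit

print(len(prompt_tokens), len(kept_default), len(kept_long))
```

With the default limit, the tail of a long prompt never reaches the encoder, which is consistent with the difference between the 256 and 512 columns in the comparison above.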

All commits

0.29.0

This release emphasizes Stable Diffusion 3, Stability AI’s latest iteration of the Stable Diffusion family of models. It was introduced in [Scaling Rectified Flow Transformers for High-Resolution Image Synthesis](https://arxiv.org/abs/2403.03206) by Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, Kyle Lacey, Alex Goodwin, Yannik Marek, and Robin Rombach.

As the model is gated, before using it with `diffusers`, you first need to go to the [Stable Diffusion 3 Medium Hugging Face page](https://huggingface.co/stabilityai/stable-diffusion-3-medium-diffusers), fill in the form and accept the gate. Once you are in, you need to log in so that your system knows you’ve accepted the gate.

```bash
huggingface-cli login
```

The code below shows how to perform text-to-image generation with SD3:

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Load the gated SD3 Medium weights in half precision and move them to the GPU.
pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

image = pipe(
    "A cat holding a sign that says hello world",
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image  # displays the image in a notebook
```

![image](https://github.com/huggingface/diffusers/assets/22957388/30917935-6649-447e-8bf2-c4c9378562de)

Refer to [our documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/stable_diffusion_3) to learn about the optimizations you can apply to SD3, as well as the image-to-image pipeline.

Additionally, we support DreamBooth + LoRA fine-tuning of Stable Diffusion 3 through rectified flow. Check out [this README](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_sd3.md) for more details.

0.28.2

* Change checkpoint key used to identify CLIP models in single file checkpoints by DN6 in 8319

0.28.1

Significant community contributions

The following contributors have made significant changes to the library over the last release:

* gnobitab
  * Tencent Hunyuan Team: add HunyuanDiT related updates (8240)
  * Tencent Hunyuan Team - Updated Doc for HunyuanDiT (8383)
