Diffusers

Latest version: v0.32.2

Page 13 of 16

0.9

* https://civitai.com/models/22279?modelVersionId=118556
* https://civitai.com/models/104515/sdxlor30costumesrevue-starlight-saijoclaudine-lora
* https://civitai.com/models/108448/daiton-sdxl-test
* https://filebin.net/2ntfqqnapiu9q3zx/pixelbuildings128-v1.safetensors

To know more details and the known limitations, please check out the [documentation](https://huggingface.co/docs/diffusers/main/en/training/lora#supporting-a1111-themed-lora-checkpoints-from-diffusers).

Thanks to isidentical for their sincere help in the PR.

Batched inference

bghira found that batched inference with SDXL Img2Img led to weird artifacts. This is fixed in https://github.com/huggingface/diffusers/pull/4327.

Downloads

Under some circumstances, SD-XL 1.0 could download ONNX weights. This is corrected in https://github.com/huggingface/diffusers/pull/4338.

Improved SDXL behavior

https://github.com/huggingface/diffusers/pull/4346 allows the user to disable the watermarker under certain circumstances to improve the usability of SDXL.

All commits:

* [SDXL Refiner] Fix refiner forward pass for batched input by patrickvonplaten in 4327
* [ONNX] Don't download ONNX model by default by patrickvonplaten in 4338
* [SDXL] Make watermarker optional under certain circumstances to improve usability of SDXL 1.0 by patrickvonplaten in 4346
* [Feat] Support SDXL Kohya-style LoRA by sayakpaul in 4287

0.9.0

:art: Stable Diffusion 2 is here!

Installation

`pip install diffusers[torch]==0.9 transformers`

Stable Diffusion 2.0 is available in several flavors:

Stable Diffusion 2.0-V at `768x768`

A new Stable Diffusion model (Stable Diffusion 2.0-v) at 768x768 resolution. It has the same number of parameters in the U-Net as `1.5`, but uses [OpenCLIP-ViT/H](https://github.com/mlfoundations/open_clip) as the text encoder and is trained from scratch. SD 2.0-v is a so-called [v-prediction](https://arxiv.org/abs/2202.00512) model.
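As a rough illustration of what "v-prediction" means, the sketch below computes the v target from a clean sample and noise, then recovers both from it. It assumes a variance-preserving parameterization (`alpha**2 + sigma**2 == 1`). The function names and the angle `phi` are hypothetical and are not part of `diffusers`.

```python
import math
import random

def v_target(x0, eps, alpha, sigma):
    """v-prediction target: v = alpha * eps - sigma * x0 (elementwise)."""
    return [alpha * e - sigma * x for x, e in zip(x0, eps)]

def recover_x0(x_t, v, alpha, sigma):
    """Recover the clean sample: x0 = alpha * x_t - sigma * v."""
    return [alpha * xt - sigma * vi for xt, vi in zip(x_t, v)]

def recover_eps(x_t, v, alpha, sigma):
    """Recover the noise: eps = sigma * x_t + alpha * v."""
    return [sigma * xt + alpha * vi for xt, vi in zip(x_t, v)]

random.seed(0)
x0 = [random.gauss(0, 1) for _ in range(4)]   # toy "clean" sample
eps = [random.gauss(0, 1) for _ in range(4)]  # toy Gaussian noise

phi = 0.3  # hypothetical position on the noise schedule
alpha, sigma = math.cos(phi), math.sin(phi)   # satisfies alpha^2 + sigma^2 = 1

# Noisy sample the model would see at this noise level.
x_t = [alpha * x + sigma * e for x, e in zip(x0, eps)]

v = v_target(x0, eps, alpha, sigma)
x0_hat = recover_x0(x_t, v, alpha, sigma)
eps_hat = recover_eps(x_t, v, alpha, sigma)
```

A noise-prediction model would instead be trained to output `eps` directly. Under the assumption above, the two targets carry the same information and can be converted into each other.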

![image](https://user-images.githubusercontent.com/26864830/204018236-259ace29-c007-4002-ad19-98fd35464954.png)

```python
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "stabilityai/stable-diffusion-2"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "High quality photo of an astronaut riding a horse in space"
image = pipe(prompt, guidance_scale=9, num_inference_steps=25).images[0]
image.save("astronaut.png")
```


Stable Diffusion 2.0-base at `512x512`

The above model is finetuned from SD 2.0-base, which was trained as a standard noise-prediction model on 512x512 images and is also made available.

![image](https://user-images.githubusercontent.com/26864830/204019534-3e4febce-55f8-4e27-9cc0-d8058ed00486.png)

```python
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "stabilityai/stable-diffusion-2-base"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "High quality photo of an astronaut riding a horse in space"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("astronaut.png")
```


Stable Diffusion 2.0 for Inpainting

This model for text-guided inpainting is finetuned from SD 2.0-base. It follows the mask-generation strategy presented in [LAMA](https://github.com/saic-mdal/lama); the generated masks, in combination with the latent VAE representations of the masked image, are used as additional conditioning.

![image](https://user-images.githubusercontent.com/26864830/204019798-e03b7905-73d5-4eda-abd4-c31f46bd0c49.png)

```python
import requests
import torch
from io import BytesIO
from PIL import Image
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

def download_image(url):
    # Fetch the image over HTTP and decode it as an RGB PIL image
    response = requests.get(url)
    return Image.open(BytesIO(response.content)).convert("RGB")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

repo_id = "stabilityai/stable-diffusion-2-inpainting"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipe(prompt=prompt, image=init_image, mask_image=mask_image, num_inference_steps=25).images[0]
image.save("yellow_cat.png")
```


Stable Diffusion X4 Upscaler

The model was trained on crops of size 512x512 and is a text-guided [latent upscaling diffusion model](https://arxiv.org/abs/2112.10752). In addition to the textual input, it receives a `noise_level` as an input parameter, which can be used to add noise to the low-resolution input according to a [predefined diffusion schedule](https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/blob/main/low_res_scheduler/scheduler_config.json).
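To make the role of `noise_level` more concrete, here is a minimal sketch of adding noise to an input at a chosen step of a diffusion schedule. The linear beta schedule and its values are assumptions chosen for illustration, not values read from the linked scheduler config. The helper names are hypothetical.

```python
import math
import random

def alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed values)."""
    out, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out

def add_noise(pixels, noise_level, acp, rng):
    """Forward-diffuse `pixels` to step `noise_level`:
    sqrt(acp[t]) * x + sqrt(1 - acp[t]) * gaussian_noise."""
    a = acp[noise_level]
    return [math.sqrt(a) * p + math.sqrt(1.0 - a) * rng.gauss(0, 1) for p in pixels]

acp = alphas_cumprod()
rng = random.Random(0)
pixels = [0.5, -0.25, 0.0, 1.0]  # toy low-resolution "image" values

slightly_noisy = add_noise(pixels, noise_level=20, acp=acp, rng=rng)
very_noisy = add_noise(pixels, noise_level=900, acp=acp, rng=rng)
```

Under this schedule, a larger `noise_level` keeps less of the original signal, since `acp` shrinks as the step index grows.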

![image](https://user-images.githubusercontent.com/26864830/204020264-86807d85-3097-4755-ace6-cc1e6f24633d.png)

```python
import requests
from PIL import Image
from io import BytesIO
from diffusers import StableDiffusionUpscalePipeline
import torch

model_id = "stabilityai/stable-diffusion-x4-upscaler"
pipeline = StableDiffusionUpscalePipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16)
pipeline = pipeline.to("cuda")

url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd2-upscale/low_res_cat.png"
response = requests.get(url)
low_res_img = Image.open(BytesIO(response.content)).convert("RGB")
low_res_img = low_res_img.resize((128, 128))

prompt = "a white cat"
upscaled_image = pipeline(prompt=prompt, image=low_res_img).images[0]
upscaled_image.save("upsampled_cat.png")
```


Saving & Loading is fixed for Versatile Diffusion

Previously there was a :bug: when saving & loading Versatile Diffusion. This is fixed now, so memory-efficient saving & loading works as expected.

* [Versatile Diffusion] Fix remaining tests by patrickvonplaten in 1418

:memo: Changelog
* add v prediction by patil-suraj in 1386
* Adapt UNet2D for super-resolution by patil-suraj in 1385
* Version 0.9.0.dev0 by anton-l in 1394
* Make height and width optional by patrickvonplaten in 1401
* [Config] Add optional arguments by patrickvonplaten in 1395
* Upscaling fixed by patrickvonplaten in 1402
* Add the new SD2 attention params to the VD text unet by anton-l in 1400
* Deprecate sample size by patrickvonplaten in 1406
* Support SD2 attention slicing by anton-l in 1397
* Add SD2 inpainting integration tests by anton-l in 1412
* Fix sample size conversion script by patrickvonplaten in 1408
* fix clip guided by patrickvonplaten in 1414
* Fix all stable diffusion by patrickvonplaten in 1415
* [MPS] call contiguous after permute by kashif in 1411
* Deprecate `predict_epsilon` by pcuenca in 1393
* Fix ONNX conversion and inference by anton-l in 1416
* Allow to set config params directly in init by patrickvonplaten in 1419
* Add tests for Stable Diffusion 2 V-prediction 768x768 by anton-l in 1420
* StableDiffusionUpscalePipeline by patil-suraj in 1396
* added initial v-pred support to DPM-solver by kashif in 1421
* SD2 docs by patrickvonplaten in 1424

0.8.1

This patch release fixes an error with `CLIPVisionModelWithProjection` imports on a non-git `transformers` installation.

:warning: Please upgrade with `pip install --upgrade diffusers` or `pip install diffusers==0.8.1`

* [Bad dependencies] Fix imports (https://github.com/huggingface/diffusers/pull/1382) by patrickvonplaten

0.8.0

🙆‍♀️ New Models

VersatileDiffusion

VersatileDiffusion, released by [SHI-Labs](https://github.com/SHI-Labs), is a unified multi-flow multimodal diffusion model capable of multiple tasks such as text2image, image variations, dual-guided (text+image) image generation, and image2text.

- [Versatile Diffusion] Add versatile diffusion model by patrickvonplaten anton-l 1283

Make sure to install `transformers` from "main":

```bash
pip install git+https://github.com/huggingface/transformers
```


Then you can run:

```python
from diffusers import VersatileDiffusionPipeline
import torch
import requests
from io import BytesIO
from PIL import Image

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# initial image
url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
response = requests.get(url)
image = Image.open(BytesIO(response.content)).convert("RGB")

# prompt
prompt = "a red car"

# text to image
image = pipe.text_to_image(prompt).images[0]

# image variation
image = pipe.image_variation(image).images[0]

# dual guided (text + image)
image = pipe.dual_guided(prompt, image).images[0]
```


More in-depth details can be found on:
- [Model card](https://huggingface.co/shi-labs/versatile-diffusion)
- [Docs](https://huggingface.co/docs/diffusers/api/pipelines/versatile_diffusion)

AltDiffusion

AltDiffusion is a multilingual latent diffusion model that supports text-to-image generation for 9 different languages: English, Chinese, Spanish, French, Japanese, Korean, Arabic, Russian and Italian.

- Add AltDiffusion by patrickvonplaten patil-suraj 1299


Stable Diffusion Image Variations

`StableDiffusionImageVariationPipeline` by justinpinkney is a stable diffusion model that takes an image as an input and generates variations of that image. It is conditioned on CLIP image embeddings instead of text.

- StableDiffusionImageVariationPipeline by patil-suraj 1365


Safe Latent Diffusion

Safe Latent Diffusion (SLD), released by the [ml-research@TUDarmstadt](https://github.com/ml-research) group, is a new practical and sophisticated approach to prevent unsolicited content from being generated by diffusion models. One of the authors of the research contributed their implementation to `diffusers`.

- Add Safe Stable Diffusion Pipeline by manuelbrack 1244

VQ-Diffusion with classifier-free sampling

- vq diffusion classifier free sampling by williamberman 1294
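For readers unfamiliar with classifier-free sampling over discrete tokens, the following sketch shows one common formulation: mix conditional and unconditional log-probabilities with a guidance scale, then renormalize. It is an illustration under that assumption and may differ from the pipeline's actual implementation. All names and the toy probabilities are hypothetical.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def guided_log_probs(log_p_cond, log_p_uncond, guidance_scale):
    """Classifier-free guidance on log-probabilities, followed by renormalization."""
    mixed = [u + guidance_scale * (c - u) for c, u in zip(log_p_cond, log_p_uncond)]
    norm = logsumexp(mixed)
    return [m - norm for m in mixed]

# Toy distributions over three codebook entries.
cond = [math.log(p) for p in (0.7, 0.2, 0.1)]    # conditioned on the prompt
uncond = [math.log(p) for p in (0.4, 0.3, 0.3)]  # unconditional

guided = guided_log_probs(cond, uncond, guidance_scale=3.0)
probs = [math.exp(g) for g in guided]

# With guidance_scale=1.0 the conditional distribution is returned unchanged.
unguided = [math.exp(g) for g in guided_log_probs(cond, uncond, guidance_scale=1.0)]
```

A scale above 1 pushes probability mass toward entries that the prompt makes more likely than the unconditional model does.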

LDM super resolution

LDM super resolution is a latent 4x super-resolution diffusion model released by [CompVis](https://github.com/CompVis).

- Add LDM Super Resolution pipeline by duongna21 1116

CycleDiffusion

CycleDiffusion is a method that uses Text-to-Image Diffusion Models for Image-to-Image Editing. It is capable of:

1. Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
2. Traditional unpaired image-to-image translation with diffusion models trained on two related domains.

- Add CycleDiffusion pipeline using Stable Diffusion by ChenWu98 888

CLIPSeg + StableDiffusionInpainting

Uses [CLIPSeg](https://huggingface.co/docs/transformers/main/en/model_doc/clipseg) to automatically generate a mask using segmentation, and then applies Stable Diffusion in-painting.

K-Diffusion wrapper

K-Diffusion Pipeline is a community pipeline that allows using any sampler from [K-diffusion](https://github.com/crowsonkb/k-diffusion) with `diffusers` models.

- [Community Pipelines] K-Diffusion Pipeline by patrickvonplaten 1360

🌀 New SOTA Scheduler

`DPMSolverMultistepScheduler` is the 🧨 `diffusers` implementation of [DPM-Solver++](https://github.com/LuChengTHU/dpm-solver), a state-of-the-art scheduler that was contributed by one of the authors of the paper. This scheduler is able to achieve great quality in as few as 20 steps. It's a drop-in replacement for the default Stable Diffusion scheduler, so you can use it to essentially halve generation times. It works so well that we adopted it for the Stable Diffusion demo Spaces: https://huggingface.co/spaces/stabilityai/stable-diffusion, https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5.

You can use it like this:

```python
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"
scheduler = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)
```


🌐 Better scheduler API
The example above also demonstrates how to load schedulers using a new API that is coherent with model loading and therefore more natural and intuitive.

You can load a scheduler using `from_pretrained`, as demonstrated above, or you can instantiate one from an existing scheduler configuration. This is a way to replace the scheduler of a pipeline that was previously loaded:

```python
from diffusers import DiffusionPipeline, EulerDiscreteScheduler

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# Build the new scheduler from the configuration of the one already loaded
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
```


Read more about these changes in [the documentation](https://huggingface.co/docs/diffusers/v0.8.0/en/using-diffusers/schedulers). See also the community pipeline that allows using any of the [K-diffusion](https://github.com/crowsonkb/k-diffusion) samplers with `diffusers`, as mentioned above!
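One way to picture why a scheduler can be built from another scheduler's configuration: if schedulers share most of their constructor arguments, a new one can take the shared keys from an existing config and ignore the rest. The toy classes below are hypothetical and only sketch that idea. Whether `diffusers` filters keys in exactly this way is an assumption.

```python
import inspect

class ToyScheduler:
    """Hypothetical base class; not the diffusers implementation."""

    @classmethod
    def from_config(cls, config):
        # Keep only the config keys that this class's __init__ accepts.
        accepted = inspect.signature(cls.__init__).parameters
        kwargs = {k: v for k, v in config.items() if k in accepted}
        return cls(**kwargs)

class ToyDDIM(ToyScheduler):
    def __init__(self, num_train_timesteps=1000, beta_end=0.02, clip_sample=True):
        self.config = {
            "num_train_timesteps": num_train_timesteps,
            "beta_end": beta_end,
            "clip_sample": clip_sample,
        }

class ToyEuler(ToyScheduler):
    def __init__(self, num_train_timesteps=1000, beta_end=0.02):
        self.config = {
            "num_train_timesteps": num_train_timesteps,
            "beta_end": beta_end,
        }

old = ToyDDIM(num_train_timesteps=500, beta_end=0.012)
new = ToyEuler.from_config(old.config)  # `clip_sample` is dropped, the rest carries over
```

The point of the pattern is that schedule-defining values travel with the config, so swapping the sampling algorithm does not require restating them.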

🎉 Performance

We work relentlessly to incorporate performance optimizations and memory reduction techniques into 🧨 diffusers. These are two of the most noteworthy additions in this release:

- Enable memory-efficient attention by default if [xFormers](https://github.com/facebookresearch/xformers) is installed.
- Use batched-matmuls when possible.


🎁 Quality of Life improvements
- Fix/Enable all schedulers for in-painting
- Easier loading of local pipelines
- CPU offloading: multi-GPU support

:memo: Changelog
* Add multistep DPM-Solver discrete scheduler by LuChengTHU in 1132
* Remove warning about half precision on MPS by pcuenca in 1163
* Fix typo latens -> latents by duongna21 in 1171
* Fix community pipeline links by pcuenca in 1162
* [Docs] Add loading script by patrickvonplaten in 1174
* Fix dtype safety checker inpaint legacy by patrickvonplaten in 1137
* Community pipeline img2img inpainting by vvvm23 in 1114
* [Community Pipeline] Add multilingual stable diffusion to community pipelines by juancopi81 in 1142
* [Flax examples] Load text encoder from subfolder by duongna21 in 1147
* Link to Dreambooth blog post instead of W&B report by pcuenca in 1180
* Fix small typo by pcuenca in 1178
* [DDIMScheduler] fix noise device in ddim step by patil-suraj in 1189
* MPS schedulers: don't use float64 by pcuenca in 1169
* Warning for invalid options without "--with_prior_preservation" by shirayu in 1065
* [ONNX] Improve ONNXPipeline scheduler compatibility, fix safety_checker by anton-l in 1173
* Restore compatibility with deprecated `StableDiffusionOnnxPipeline` by pcuenca in 1191
* Update pr docs actions by mishig25 in 1194
* handle dtype xformers attention by patil-suraj in 1196
* [Scheduler] Move predict epsilon to init by patrickvonplaten in 1155
* add licenses to pipelines by natolambert in 1201
* Fix cpu offloading by anton-l in 1177
* Fix slow tests by patrickvonplaten in 1210
* [Flax] fix extra copy pasta 🍝 by camenduru in 1187
* [CLIPGuidedStableDiffusion] support DDIM scheduler by patil-suraj in 1190
* Fix layer names convert LDM script by duongna21 in 1206
* [Loading] Make sure loading edge cases work by patrickvonplaten in 1192
* Add LDM Super Resolution pipeline by duongna21 in 1116
* [Conversion] Improve conversion script by patrickvonplaten in 1218
* DDIM docs by patrickvonplaten in 1219
* apply `repeat_interleave` fix for `mps` to stable diffusion image2image pipeline by jncasey in 1135
* Flax tests: don't hardcode number of devices by pcuenca in 1175
* Improve documentation for the LPW pipeline by exo-pla-net in 1182
* Factor out encode text with Copied from by patrickvonplaten in 1224
* Match the generator device to the pipeline for DDPM and DDIM by anton-l in 1222
* [Tests] Fix mps+generator fast tests by anton-l in 1230
* [Tests] Adjust TPU test values by anton-l in 1233
* Add a reference to the name 'Sampler' by apolinario in 1172
* Fix Flax usage comments by pcuenca in 1211
* [Docs] improve img2img example by ruanrz in 1193
* [Stable Diffusion] Fix padding / truncation by patrickvonplaten in 1226
* Finalize stable diffusion refactor by patrickvonplaten in 1269
* Edited attention.py for older xformers by Lime-Cakes in 1270
* Fix wrong link in text2img fine-tuning documentation by daspartho in 1282
* [StableDiffusionInpaintPipeline] fix batch_size for mask and masked latents by patil-suraj in 1279
* Add UNet 1d for RL model for planning + colab by natolambert in 105
* Fix documentation typo for `UNet2DModel` and `UNet2DConditionModel` by xenova in 1275
* add source link to composable diffusion model by nanliu1 in 1293
* Fix incorrect link to Stable Diffusion notebook by dhruvrnaik in 1291
* [dreambooth] link to bitsandbytes readme for installation by 0xdevalias in 1229
* Add Scheduler.from_pretrained and better scheduler changing by patrickvonplaten in 1286
* Add AltDiffusion by patrickvonplaten in 1299
* Better error message for transformers dummy by patrickvonplaten in 1306
* Revert "Update pr docs actions" by mishig25 in 1307
* [AltDiffusion] add tests by patil-suraj in 1311
* Add improved handling of pil by patrickvonplaten in 1309
* cpu offloading: multi GPU support by dblunk88 in 1143
* vq diffusion classifier free sampling by williamberman in 1294
* doc string args shape fix by kamalkraj in 1243
* [Community Pipeline] CLIPSeg + StableDiffusionInpainting by unography in 1250
* Temporary local test for PIL_INTERPOLATION by pcuenca in 1317
* Fix gpu_id by anton-l in 1326
* integrate ort by prathikr in 1110
* [Custom pipeline] Easier loading of local pipelines by patrickvonplaten in 1327
* [ONNX] Support Euler schedulers by anton-l in 1328
* img2text Typo by patrickvonplaten in 1329
* add docs for multi-modal examples by natolambert in 1227
* [Flax] Fix loading scheduler from subfolder by skirsten in 1319
* Fix/Enable all schedulers for in-painting by patrickvonplaten in 1331
* Correct path to scheduler by patrickvonplaten in 1322
* Avoid nested fix-copies by anton-l in 1332
* Fix img2img speed with LMS-Discrete Scheduler by NotNANtoN in 896
* Fix the order of casts for onnx inpainting by anton-l in 1338
* Legacy Inpainting Pipeline for Onnx Models by ctsims in 1237
* Jax infer support negative prompt by entrpn in 1337
* Update README.md: IMAGIC example code snippet misspelling by ki-arie in 1346
* Update README.md: Minor change to Imagic code snippet, missing dir error by ki-arie in 1347
* Handle batches and Tensors in `pipeline_stable_diffusion_inpaint.py:prepare_mask_and_masked_image` by vict0rsch in 1003
* change the sample model by shunxing1234 in 1352
* Add bit diffusion [WIP] by kingstut in 971
* perf: prefer batched matmuls for attention by Birch-san in 1203
* [Community Pipelines] K-Diffusion Pipeline by patrickvonplaten in 1360
* Add Safe Stable Diffusion Pipeline by manuelbrack in 1244
* [examples] fix mixed_precision arg by patil-suraj in 1359
* use memory_efficient_attention by default by patil-suraj in 1354
* Replace logger.warn by logger.warning by regisss in 1366
* Fix using non-square images with UNet2DModel and DDIM/DDPM pipelines by jenkspt in 1289
* handle fp16 in `UNet2DModel` by patil-suraj in 1216
* StableDiffusionImageVariationPipeline by patil-suraj in 1365

0.7.2

This patch release fixes a bug that broke Flax Stable Diffusion inference.
Thanks a mille to camenduru for spotting it in https://github.com/huggingface/diffusers/issues/1145, and thanks a lot to pcuenca and kashif for fixing it in https://github.com/huggingface/diffusers/pull/1149.


* Flax: Flip sin to cos in time embeddings 1149 by pcuenca
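To show what flipping "sin to cos" means for a time embedding, here is a small sketch of a sinusoidal timestep embedding with a flag that swaps the order of the sine and cosine halves. The function and parameter names are illustrative assumptions, and the actual Flax code may differ.

```python
import math

def timestep_embedding(t, dim, flip_sin_to_cos=False, max_period=10000):
    """Sinusoidal embedding of a scalar timestep `t` into `dim` values.

    Default layout is [sin..., cos...]; with `flip_sin_to_cos=True`
    the halves are swapped to [cos..., sin...].
    """
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    sin_part = [math.sin(t * f) for f in freqs]
    cos_part = [math.cos(t * f) for f in freqs]
    return cos_part + sin_part if flip_sin_to_cos else sin_part + cos_part

emb_default = timestep_embedding(10, 8)
emb_flipped = timestep_embedding(10, 8, flip_sin_to_cos=True)
```

Both layouts contain the same values, but a model trained with one ordering will see scrambled embeddings if inference uses the other, which is why the ordering has to match the pretrained weights.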

0.7.1

This patch release makes `accelerate` a soft dependency to avoid an error when installing `diffusers` with pre-existing `torch`.


* Move accelerate to a soft-dependency 1134 by patrickvonplaten
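A "soft dependency" is commonly implemented by checking whether a package can be imported and choosing a code path accordingly, instead of importing it unconditionally. The sketch below shows that general pattern with the standard library. The helper names are hypothetical and this is not the code from the PR.

```python
import importlib.util

def is_available(package_name):
    """Return True if `package_name` can be imported, without importing it."""
    return importlib.util.find_spec(package_name) is not None

# Evaluated once at import time; nothing fails if the package is absent.
ACCELERATE_AVAILABLE = is_available("accelerate")

def describe_loading_path():
    """Pick a code path depending on whether the optional package is present."""
    if ACCELERATE_AVAILABLE:
        return "accelerate found: optional features enabled"
    return "accelerate not installed: using the plain loading path"

message = describe_loading_path()
```

With this pattern, the optional package is only required by the features that actually use it, so installation does not need to pull it in.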


© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.