Diffusers


0.10.2

This patch removes the hard requirement for `transformers>=4.25.1`, since some external libraries would downgrade `transformers` on startup in a way the user could not control.

* do not automatically enable xformers by patrickvonplaten in 1640
* Adapt to forced transformers version in some dependent libraries by anton-l in 1638
* Re-add xformers enable to UNet2DCondition by patrickvonplaten in 1627

🚨🚨🚨 **Note that xformers is not automatically enabled anymore** 🚨🚨🚨

The reasons for this are given here: https://github.com/huggingface/diffusers/pull/1640#discussion_r1044651551:

> We should not automatically enable xformers for three reasons:
>
> 1. It's not a PyTorch-like API. PyTorch doesn't enable all the fastest options available by default.
> 2. We allocate GPU memory before the user even calls `.to("cuda")`.
> 3. This behavior is not consistent with cases where xformers is not installed.

**=> This means**: if you relied on xformers being enabled automatically, you now have to enable it explicitly:

```python
import logging

from diffusers.utils.import_utils import is_xformers_available

logger = logging.getLogger(__name__)

unet = ...  # load the UNet

if is_xformers_available():
    try:
        unet.enable_xformers_memory_efficient_attention()
    except Exception as e:
        logger.warning(
            "Could not enable memory efficient attention. Make sure xformers is installed"
            f" correctly and a GPU is available: {e}"
        )
```


for the UNet (e.g. in dreambooth) or for the pipeline:

```py
import logging

from diffusers.utils.import_utils import is_xformers_available

logger = logging.getLogger(__name__)

pipe = ...  # load pipeline

if is_xformers_available():
    try:
        pipe.enable_xformers_memory_efficient_attention()
    except Exception as e:
        logger.warning(
            "Could not enable memory efficient attention. Make sure xformers is installed"
            f" correctly and a GPU is available: {e}"
        )
```

0.10.1

This patch returns `enable_xformers_memory_efficient_attention()` to `UNet2DCondition` to restore backward compatibility.

* Re-add xformers enable to UNet2DCondition by patrickvonplaten in 1627
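
For reference, a minimal sketch of calling the restored method directly on a UNet (the model id and `subfolder` below are illustrative assumptions):

```python
# Sketch: enable xformers attention on a bare UNet
# (model id and subfolder are assumptions for illustration).
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)
unet.enable_xformers_memory_efficient_attention()
```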

0.10.0

🐳 Depth-Guided Stable Diffusion and 2.1 checkpoints

The new depth-guided stable diffusion model is fully supported in this release. The model is conditioned on monocular depth estimates inferred via [MiDaS](https://github.com/isl-org/MiDaS) and can be used for structure-preserving img2img and shape-conditional synthesis.

![image](https://user-images.githubusercontent.com/26864830/206480602-d0b0969b-3e4a-4c33-a1d0-40fe5b877656.png)

Installing the `transformers` library from source is required for the MiDaS model:
```bash
pip install --upgrade git+https://github.com/huggingface/transformers/
```

```python
import torch
import requests
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)

prompt = "two tigers"
n_prompt = "bad, deformed, ugly, bad anatomy"
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]
```


The updated Stable Diffusion 2.1 checkpoints are also released and fully supported:
* https://huggingface.co/stabilityai/stable-diffusion-2-1
* https://huggingface.co/stabilityai/stable-diffusion-2-1-base

:safety_vest: Safe Tensors
We now support [SafeTensors](https://github.com/huggingface/safetensors/): a new simple format for storing tensors safely (as opposed to pickle) that is still fast (zero-copy).
* [Proposal] Support loading from safetensors if file is present. by Narsil in 1357
* [Proposal] Support saving to safetensors by MatthieuBizien in 1494

| Format | Safe | Zero-copy | Lazy loading | No file size limit | Layout control | Flexibility | Bfloat16 |
| ----------------------- | --- | --- | --- | --- | --- | --- | --- |
| pickle (PyTorch) | βœ— | βœ— | βœ— | βœ“ | βœ— | βœ“ | βœ“ |
| H5 (Tensorflow) | βœ“ | βœ— | βœ“ | βœ“ | ~ | ~ | βœ— |
| SavedModel (Tensorflow) | βœ“ | βœ— | βœ— | βœ“ | βœ“ | βœ— | βœ“ |
| MsgPack (flax) | βœ“ | βœ“ | βœ— | βœ“ | βœ— | βœ— | βœ“ |
| SafeTensors | βœ“ | βœ“ | βœ“ | βœ“ | βœ“ | βœ— | βœ“ |

**More details about the comparison here:** https://github.com/huggingface/safetensors#yet-another-format-

```bash
pip install safetensors
```

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.save_pretrained("./safe-stable-diffusion-2-1", safe_serialization=True)

# you can also push this checkpoint to the HF Hub and load from there
safe_pipe = StableDiffusionPipeline.from_pretrained("./safe-stable-diffusion-2-1")
```

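To share the safetensors checkpoint, one option is pushing the saved folder to the Hub with `huggingface_hub`; a minimal sketch (the repo id is a placeholder, and you need to be logged in):

```python
# Sketch: upload the safetensors checkpoint to the Hub
# (repo id is a placeholder; requires `huggingface-cli login` beforehand).
from huggingface_hub import HfApi

api = HfApi()
api.create_repo("your-username/safe-stable-diffusion-2-1", exist_ok=True)
api.upload_folder(
    folder_path="./safe-stable-diffusion-2-1",
    repo_id="your-username/safe-stable-diffusion-2-1",
)
```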

New Pipelines
:paintbrush: Paint-by-example
An implementation of [Paint by Example: Exemplar-based Image Editing with Diffusion Models](https://arxiv.org/abs/2211.13227) by Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen
* Add paint by example by patrickvonplaten in 1533

![image](https://user-images.githubusercontent.com/26864830/206482481-ce91ca9d-1cf3-441a-9dd5-f012f3160431.png)

```python
import PIL
import requests
import torch
from io import BytesIO
from diffusers import DiffusionPipeline


def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/image/example_1.png"
mask_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/mask/example_1.png"
example_url = "https://raw.githubusercontent.com/Fantasy-Studio/Paint-by-Example/main/examples/reference/example_1.jpg"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
example_image = download_image(example_url).resize((512, 512))

pipe = DiffusionPipeline.from_pretrained("Fantasy-Studio/Paint-by-Example", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

image = pipe(image=init_image, mask_image=mask_image, example_image=example_image).images[0]
```


Audio Diffusion and Latent Audio Diffusion
Audio Diffusion leverages the recent advances in image generation using diffusion models by converting audio samples to and from mel spectrogram images.
* add AudioDiffusionPipeline and LatentAudioDiffusionPipeline 1334 by teticio in 1426
```python
from IPython.display import Audio, display
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("teticio/audio-diffusion-ddim-256").to("cuda")

output = pipe()
display(output.images[0])
display(Audio(output.audios[0], rate=pipe.mel.get_sample_rate()))
```

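Outside a notebook, the generated waveform (a NumPy array) can be written straight to a file; a small sketch using `scipy` (the use of scipy is an assumption, any WAV writer works):

```python
# Sketch: write the generated audio to disk
# (scipy is an assumption; output.audios[0] is a float NumPy array).
from scipy.io import wavfile

# squeeze() drops the leading channel dimension for mono audio
wavfile.write("sample.wav", pipe.mel.get_sample_rate(), output.audios[0].squeeze())
```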

[Experimental] K-Diffusion pipeline for Stable Diffusion
This pipeline is added to support the latest schedulers from crowsonkb's [k-diffusion](https://github.com/crowsonkb/k-diffusion).
The purpose of this pipeline is to compare scheduler implementations and updates, so new features from other pipelines are unlikely to be supported!

* [K Diffusion] Add k diffusion sampler natively by patrickvonplaten in 1603

```bash
pip install k-diffusion
```

```python
from diffusers import StableDiffusionKDiffusionPipeline
import torch

pipe = StableDiffusionKDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")
pipe = pipe.to("cuda")

pipe.set_scheduler("sample_heun")
image = pipe("astronaut riding horse", num_inference_steps=25).images[0]
```

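Since the pipeline exists to compare scheduler implementations, a natural use is sweeping several samplers over the same prompt and seed. A sketch (the sampler names are standard k-diffusion functions, listed here as assumptions):

```python
import torch

# Compare a few k-diffusion samplers on a fixed prompt and seed
# (sampler names are standard k-diffusion functions; treat as assumptions).
for sampler in ["sample_euler", "sample_heun", "sample_dpm_2"]:
    pipe.set_scheduler(sampler)
    generator = torch.Generator(device="cuda").manual_seed(0)
    image = pipe("astronaut riding horse", num_inference_steps=25, generator=generator).images[0]
    image.save(f"astronaut_{sampler}.png")
```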


New Schedulers
Heun scheduler inspired by Karras et al.
Algorithm 1 of [Karras et al.](https://arxiv.org/abs/2206.00364). Scheduler ported from crowsonkb's [k-diffusion](https://github.com/crowsonkb/k-diffusion).

* Add 2nd order heun scheduler by patrickvonplaten in 1336
```python
from diffusers import StableDiffusionPipeline, HeunDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = HeunDiscreteScheduler.from_config(pipe.scheduler.config)
```

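Heun is a second-order method, so each step costs roughly two UNet evaluations; a short usage sketch (prompt and step count are illustrative):

```python
# Each Heun step runs roughly two UNet evaluations, so 25 steps cost
# about as much as 50 first-order steps (prompt/steps are illustrative).
pipe = pipe.to("cuda")
image = pipe("astronaut riding horse", num_inference_steps=25).images[0]
```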

Single step DPM-Solver
The original paper can be found [here](https://arxiv.org/abs/2206.00927), along with the [improved version](https://arxiv.org/abs/2211.01095). The original implementation is available [here](https://github.com/LuChengTHU/dpm-solver).
* Add Singlestep DPM-Solver (singlestep high-order schedulers) by LuChengTHU in 1442
```python
from diffusers import StableDiffusionPipeline, DPMSolverSinglestepScheduler

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.scheduler = DPMSolverSinglestepScheduler.from_config(pipe.scheduler.config)
```

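DPM-Solver is designed for fast, few-step sampling, so it is typically run with a much smaller step count; a sketch (the step count is illustrative):

```python
# DPM-Solver targets few-step sampling; ~20 steps is a common choice
# (prompt and step count are illustrative).
pipe = pipe.to("cuda")
image = pipe("astronaut riding horse", num_inference_steps=20).images[0]
```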


:memo: Changelog
* [Proposal] Support loading from safetensors if file is present. by Narsil in 1357
* Hotfix for AttributeErrors in OnnxStableDiffusionInpaintPipelineLegacy by anton-l in 1448
* Speed up test and remove kwargs from call by patrickvonplaten in 1446
* v-prediction training support by patil-suraj in 1455
* Fix Flax `from_pt` by pcuenca in 1436
* Ensure Flax pipeline always returns numpy array by pcuenca in 1435
* Add 2nd order heun scheduler by patrickvonplaten in 1336
* fix slow tests by patrickvonplaten in 1467
* Flax support for Stable Diffusion 2 by pcuenca in 1423
* Updates Image to Image Inpainting community pipeline README by vvvm23 in 1370
* StableDiffusion: Decode latents separately to run larger batches by kig in 1150
* Fix bug in half precision for DPMSolverMultistepScheduler by rtaori in 1349
* [Train unconditional] Unwrap model before EMA by anton-l in 1469
* Add `ort_nightly_directml` to the `onnxruntime` candidates by anton-l in 1458
* Allow saving trained betas by patrickvonplaten in 1468
* Fix dtype model loading by patrickvonplaten in 1449
* [Dreambooth] Make compatible with alt diffusion by patrickvonplaten in 1470
* Add better docs xformers by patrickvonplaten in 1487
* Remove reminder comment by pcuenca in 1489
* Bump to 0.10.0.dev0 + deprecations by anton-l in 1490
* Add doc for Stable Diffusion on Habana Gaudi by regisss in 1496
* Replace deprecated hub utils in `train_unconditional_ort` by anton-l in 1504
* [Deprecate] Correct stacklevel by patrickvonplaten in 1483
* simplyfy AttentionBlock by patil-suraj in 1492
* Standardize on using `image` argument in all pipelines by fboulnois in 1361
* support v prediction in other schedulers by patil-suraj in 1505
* Fix Flax flip_sin_to_cos by akashgokul in 1369
* Add an explicit `--image_size` to the conversion script by anton-l in 1509
* fix heun scheduler by patil-suraj in 1512
* [docs] [dreambooth training] accelerate.utils.write_basic_config by williamberman in 1513
* [docs] [dreambooth training] num_class_images clarification by williamberman in 1508
* [From pretrained] Allow returning local path by patrickvonplaten in 1450
* Update conversion script to correctly handle SD 2 by patrickvonplaten in 1511
* [refactor] Making the xformers mem-efficient attention activation recursive by blefaudeux in 1493
* Do not use torch.long in mps by pcuenca in 1488
* Fix Imagic example by dhruvrnaik in 1520
* Fix training docs to install datasets by pedrogengo in 1476
* Finalize 2nd order schedulers by patrickvonplaten in 1503
* Fixed mask+masked_image in sd inpaint pipeline by antoche in 1516
* Create train_dreambooth_inpaint.py by thedarkzeno in 1091
* Update FlaxLMSDiscreteScheduler by dzlab in 1474
* [Proposal] Support saving to safetensors by MatthieuBizien in 1494
* Add xformers attention to VAE by kig in 1507
* [CI] Add slow MPS tests by anton-l in 1104
* [Stable Diffusion Inpaint] Allow tensor as input image & mask by patrickvonplaten in 1527
* Compute embedding distances with torch.cdist by blefaudeux in 1459
* [Upscaling] Fix batch size by patrickvonplaten in 1525
* Update bug-report.yml by patrickvonplaten in 1548
* [Community Pipeline] Checkpoint Merger based on Automatic1111 by Abhinay1997 in 1472
* [textual_inversion] Add an option for only saving the embeddings by allo- in 781
* [examples] use from_pretrained to load scheduler by patil-suraj in 1549
* fix mask discrepancies in train_dreambooth_inpaint by thedarkzeno in 1529
* [refactor] make set_attention_slice recursive by patil-suraj in 1532
* Research folder by patrickvonplaten in 1553
* add AudioDiffusionPipeline and LatentAudioDiffusionPipeline 1334 by teticio in 1426
* [Community download] Fix cache dir by patrickvonplaten in 1555
* [Docs] Correct docs by patrickvonplaten in 1554
* Fix typo by pcuenca in 1558
* [docs] [dreambooth training] default accelerate config by williamberman in 1564
* Mega community pipeline by patrickvonplaten in 1561
* [examples] add check_min_version by patil-suraj in 1550
* [dreambooth] make collate_fn global by patil-suraj in 1547
* Standardize fast pipeline tests with PipelineTestMixin by anton-l in 1526
* Add paint by example by patrickvonplaten in 1533
* [Community Pipeline] fix lpw_stable_diffusion by SkyTNT in 1570
* [Paint by Example] Better default for image width by patrickvonplaten in 1587
* Add from_pretrained telemetry by anton-l in 1461
* Correct order height & width in pipeline_paint_by_example.py by Fantasy-Studio in 1589
* Fix common tests for FP16 by anton-l in 1588
* [UNet2DConditionModel] add an option to upcast attention to fp32 by patil-suraj in 1590
* Flax: avoid recompilation when params change by pcuenca in 1096
* Add Singlestep DPM-Solver (singlestep high-order schedulers) by LuChengTHU in 1442
* fix upcast in slice attention by patil-suraj in 1591
* Update scheduling_repaint.py by Randolph-zeng in 1582
* Update RL docs for better sharing / adding models by natolambert in 1563
* Make cross-attention check more robust by pcuenca in 1560
* [ONNX] Fix flaky tests by anton-l in 1593
* Trivial fix for undefined symbol in train_dreambooth.py by bcsherma in 1598
* [K Diffusion] Add k diffusion sampler natively by patrickvonplaten in 1603
* [Versatile Diffusion] add upcast_attention by patil-suraj in 1605
* Fix PyCharm/VSCode static type checking for dummy objects by anton-l in 1596

0.19.2

Kohya-style LoRA

Kohya-style (A1111-themed) LoRA checkpoints for SDXL can now be loaded, for example:

* https://civitai.com/models/22279?modelVersionId=118556
* https://civitai.com/models/104515/sdxlor30costumesrevue-starlight-saijoclaudine-lora
* https://civitai.com/models/108448/daiton-sdxl-test
* https://filebin.net/2ntfqqnapiu9q3zx/pixelbuildings128-v1.safetensors

To know more details and the known limitations, please check out the [documentation](https://huggingface.co/docs/diffusers/main/en/training/lora#supporting-a1111-themed-lora-checkpoints-from-diffusers).

Thanks to isidentical for their sincere help in the PR.

Batched inference

bghira found that batched inference for SDXL img2img produced strange artifacts. This is fixed in https://github.com/huggingface/diffusers/pull/4327.
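
A hedged sketch of what batched SDXL img2img looks like (the model id, image URL, and prompts are illustrative assumptions):

```python
# Sketch: batched SDXL img2img, the case fixed by #4327
# (model id, image URL, and prompts are illustrative assumptions).
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16
).to("cuda")

init_image = load_image(
    "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd2-upscale/low_res_cat.png"
)
prompts = ["a photo of a cat", "a photo of a tiger"]
images = pipe(prompt=prompts, image=[init_image] * 2, strength=0.6).images
```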

Downloads

Under some circumstances, SD-XL 1.0 would download ONNX weights even when they were not needed. This is corrected in https://github.com/huggingface/diffusers/pull/4338.

Improved SDXL behavior

https://github.com/huggingface/diffusers/pull/4346 allows the user to disable the watermarker under certain circumstances to improve the usability of SDXL.
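
A sketch of disabling it (assuming the `add_watermarker` flag this PR introduces):

```python
# Sketch: load SDXL without the invisible watermarker
# (assumes the `add_watermarker` flag from #4346).
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    add_watermarker=False,
).to("cuda")
```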

All commits:

* [SDXL Refiner] Fix refiner forward pass for batched input by patrickvonplaten in 4327
* [ONNX] Don't download ONNX model by default by patrickvonplaten in 4338
* [SDXL] Make watermarker optional under certain circumstances to improve usability of SDXL 1.0 by patrickvonplaten in 4346
* [Feat] Support SDXL Kohya-style LoRA by sayakpaul in 4287

0.9.0

:art: Stable Diffusion 2 is here!

Installation

`pip install diffusers[torch]==0.9 transformers`

Stable Diffusion 2.0 is available in several flavors:

Stable Diffusion 2.0-V at `768x768`

New stable diffusion model (Stable Diffusion 2.0-v) at 768x768 resolution. Same number of parameters in the U-Net as `1.5`, but uses [OpenCLIP-ViT/H](https://github.com/mlfoundations/open_clip) as the text encoder and is trained from scratch. SD 2.0-v is a so-called [v-prediction](https://arxiv.org/abs/2202.00512) model.
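
(For reference: instead of predicting the noise directly, a v-prediction model is trained to predict the velocity `v = alpha_t * noise - sigma_t * x_0`, from which both the noise and the clean sample can be recovered at any timestep; see the linked paper for details.)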

![image](https://user-images.githubusercontent.com/26864830/204018236-259ace29-c007-4002-ad19-98fd35464954.png)

```python
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "stabilityai/stable-diffusion-2"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "High quality photo of an astronaut riding a horse in space"
image = pipe(prompt, guidance_scale=9, num_inference_steps=25).images[0]
image.save("astronaut.png")
```


Stable Diffusion 2.0-base at `512x512`

The above model is finetuned from SD 2.0-base, which was trained as a standard noise-prediction model on 512x512 images and is also made available.

![image](https://user-images.githubusercontent.com/26864830/204019534-3e4febce-55f8-4e27-9cc0-d8058ed00486.png)

```python
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "stabilityai/stable-diffusion-2-base"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "High quality photo of an astronaut riding a horse in space"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("astronaut.png")
```


Stable Diffusion 2.0 for Inpainting

This model for text-guided inpainting is finetuned from SD 2.0-base. It follows the mask-generation strategy presented in [LAMA](https://github.com/saic-mdal/lama); the mask, in combination with the latent VAE representation of the masked image, is used as additional conditioning.

![image](https://user-images.githubusercontent.com/26864830/204019798-e03b7905-73d5-4eda-abd4-c31f46bd0c49.png)

```python
import PIL
import requests
import torch
from io import BytesIO
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler


def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

repo_id = "stabilityai/stable-diffusion-2-inpainting"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipe(prompt=prompt, image=init_image, mask_image=mask_image, num_inference_steps=25).images[0]
image.save("yellow_cat.png")
```


Stable Diffusion X4 Upscaler

The model was trained on crops of size 512x512 and is a text-guided [latent upscaling diffusion model](https://arxiv.org/abs/2112.10752). In addition to the textual input, it receives a `noise_level` as an input parameter, which can be used to add noise to the low-resolution input according to a [predefined diffusion schedule](https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/blob/main/low_res_scheduler/scheduler_config.json).

![image](https://user-images.githubusercontent.com/26864830/204020264-86807d85-3097-4755-ace6-cc1e6f24633d.png)

```python
import requests
from PIL import Image
from io import BytesIO
from diffusers import StableDiffusionUpscalePipeline
import torch

model_id = "stabilityai/stable-diffusion-x4-upscaler"
pipeline = StableDiffusionUpscalePipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16)
pipeline = pipeline.to("cuda")

url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd2-upscale/low_res_cat.png"
response = requests.get(url)
low_res_img = Image.open(BytesIO(response.content)).convert("RGB")
low_res_img = low_res_img.resize((128, 128))

prompt = "a white cat"
upscaled_image = pipeline(prompt=prompt, image=low_res_img).images[0]
upscaled_image.save("upsampled_cat.png")
```

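The `noise_level` described above can also be passed to the call directly; a sketch (the value is illustrative, and 20 is believed to be the default):

```python
# Add extra noise to the low-res input before upscaling; higher values give
# the model more freedom (value is illustrative; 20 is the assumed default).
upscaled_image = pipeline(prompt=prompt, image=low_res_img, noise_level=50).images[0]
```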

Saving & Loading is fixed for Versatile Diffusion

Previously there was a :bug: when saving & loading Versatile Diffusion; this is now fixed so that memory-efficient saving & loading works as expected.

* [Versatile Diffusion] Fix remaining tests by patrickvonplaten in 1418
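
A minimal round-trip sketch (the local path is illustrative):

```python
# Sketch: save & reload Versatile Diffusion (local path is illustrative).
from diffusers import VersatileDiffusionPipeline

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion")
pipe.save_pretrained("./versatile-diffusion")
pipe = VersatileDiffusionPipeline.from_pretrained("./versatile-diffusion")
```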

:memo: Changelog
* add v prediction by patil-suraj in 1386
* Adapt UNet2D for supre-resolution by patil-suraj in 1385
* Version 0.9.0.dev0 by anton-l in 1394
* Make height and width optional by patrickvonplaten in 1401
* [Config] Add optional arguments by patrickvonplaten in 1395
* Upscaling fixed by patrickvonplaten in 1402
* Add the new SD2 attention params to the VD text unet by anton-l in 1400
* Deprecate sample size by patrickvonplaten in 1406
* Support SD2 attention slicing by anton-l in 1397
* Add SD2 inpainting integration tests by anton-l in 1412
* Fix sample size conversion script by patrickvonplaten in 1408
* fix clip guided by patrickvonplaten in 1414
* Fix all stable diffusion by patrickvonplaten in 1415
* [MPS] call contiguous after permute by kashif in 1411
* Deprecate `predict_epsilon` by pcuenca in 1393
* Fix ONNX conversion and inference by anton-l in 1416
* Allow to set config params directly in init by patrickvonplaten in 1419
* Add tests for Stable Diffusion 2 V-prediction 768x768 by anton-l in 1420
* StableDiffusionUpscalePipeline by patil-suraj in 1396
* added initial v-pred support to DPM-solver by kashif in 1421
* SD2 docs by patrickvonplaten in 1424

0.8.1

This patch release fixes an error with `CLIPVisionModelWithProjection` imports on a non-git `transformers` installation.

:warning: Please upgrade with `pip install --upgrade diffusers` or `pip install diffusers==0.8.1`

* [Bad dependencies] Fix imports (https://github.com/huggingface/diffusers/pull/1382) by patrickvonplaten
