New Models
VersatileDiffusion
VersatileDiffusion, released by [SHI-Labs](https://github.com/SHI-Labs), is a unified multi-flow multimodal diffusion model capable of multiple tasks such as text-to-image, image variations, dual-guided (text + image) image generation, and image-to-text.
- [Versatile Diffusion] Add versatile diffusion model by patrickvonplaten and anton-l in 1283
Make sure to install `transformers` from "main":
```bash
pip install git+https://github.com/huggingface/transformers
```
Then you can run:
```python
from diffusers import VersatileDiffusionPipeline
import torch
import requests
from io import BytesIO
from PIL import Image

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# initial image
url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
response = requests.get(url)
image = Image.open(BytesIO(response.content)).convert("RGB")

# prompt
prompt = "a red car"

# text to image
image = pipe.text_to_image(prompt).images[0]

# image variation
image = pipe.image_variation(image).images[0]

# dual-guided (text + image) generation
image = pipe.dual_guided(prompt, image).images[0]
```
More in-depth details can be found on:
- [Model card](https://huggingface.co/shi-labs/versatile-diffusion)
- [Docs](https://huggingface.co/docs/diffusers/api/pipelines/versatile_diffusion)
AltDiffusion
AltDiffusion is a multilingual latent diffusion model that supports text-to-image generation for 9 different languages: English, Chinese, Spanish, French, Japanese, Korean, Arabic, Russian and Italian.
- Add AltDiffusion by patrickvonplaten and patil-suraj in 1299
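The pipeline works like the regular Stable Diffusion text-to-image pipeline. Here is a minimal sketch, assuming the multilingual checkpoint is named `BAAI/AltDiffusion-m9` (check the model card for the exact identifier):

```python
import torch
from diffusers import AltDiffusionPipeline

# assumed multilingual checkpoint name; see the model card for the exact id
pipe = AltDiffusionPipeline.from_pretrained("BAAI/AltDiffusion-m9", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# prompts can be written in any of the nine supported languages
prompt = "un gato con una boina, pintura al óleo"  # Spanish: "a cat with a beret, oil painting"
image = pipe(prompt).images[0]
image.save("alt_diffusion.png")
```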
Stable Diffusion Image Variations
`StableDiffusionImageVariationPipeline` by justinpinkney is a Stable Diffusion model that takes an image as input and generates variations of that image. It is conditioned on CLIP image embeddings instead of text.
- StableDiffusionImageVariationPipeline by patil-suraj in 1365
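A minimal sketch of how the pipeline can be used, assuming the `lambdalabs/sd-image-variations-diffusers` checkpoint (the checkpoint name is an assumption here, see the docs for details):

```python
import torch
import requests
from io import BytesIO
from PIL import Image
from diffusers import StableDiffusionImageVariationPipeline

# assumed checkpoint name for the image-variations weights
pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
image = Image.open(BytesIO(requests.get(url).content)).convert("RGB")

# no text prompt: the model is conditioned on the CLIP embedding of the input image
out = pipe(image, guidance_scale=3.0)
out.images[0].save("variation.png")
```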
Safe Latent Diffusion
Safe Latent Diffusion (SLD), released by the [ml-research@TU Darmstadt](https://github.com/ml-research) group, is a new, practical and sophisticated approach to prevent unsolicited content from being generated by diffusion models. One of the authors of the research contributed their implementation to `diffusers`.
- Add Safe Stable Diffusion Pipeline by manuelbrack in 1244
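A minimal sketch, assuming the pipeline class is `StableDiffusionPipelineSafe` and the checkpoint is `AIML-TUDA/stable-diffusion-safe` (both names are assumptions here, see the docs for the exact identifiers):

```python
import torch
from diffusers import StableDiffusionPipelineSafe

# assumed checkpoint name for Safe Latent Diffusion
pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe", torch_dtype=torch.float16
).to("cuda")

# generation works like regular Stable Diffusion; the safety guidance
# steers sampling away from inappropriate concepts during denoising
image = pipe("portrait of a person at a carnival").images[0]
image.save("sld.png")
```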
VQ-Diffusion with classifier-free sampling
- VQ-Diffusion classifier-free sampling by williamberman in 1294
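As a rough sketch, classifier-free sampling is exposed through the guidance scale of the VQ-Diffusion pipeline (the `microsoft/vq-diffusion-ithq` checkpoint and the `guidance_scale` argument are assumptions here):

```python
from diffusers import VQDiffusionPipeline

# assumed checkpoint name for the VQ-Diffusion ITHQ model
pipe = VQDiffusionPipeline.from_pretrained("microsoft/vq-diffusion-ithq").to("cuda")

# a guidance scale above 1 is assumed to switch on classifier-free sampling
image = pipe("a teddy bear playing in the pool", guidance_scale=5.0).images[0]
image.save("vq_diffusion.png")
```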
LDM super resolution
LDM super resolution is a latent 4x super-resolution diffusion model released by [CompVis](https://github.com/CompVis).
- Add LDM Super Resolution pipeline by duongna21 in 1116
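A minimal sketch of the new pipeline; the call arguments follow the pipeline docs, and the input URL is just an example:

```python
import requests
from io import BytesIO
from PIL import Image
from diffusers import LDMSuperResolutionPipeline

pipe = LDMSuperResolutionPipeline.from_pretrained("CompVis/ldm-super-resolution-4x-openimages")
pipe = pipe.to("cuda")

# fetch a low-resolution input image and shrink it to 128x128
url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
low_res = Image.open(BytesIO(requests.get(url).content)).convert("RGB").resize((128, 128))

# 4x upscaling: 128x128 -> 512x512
upscaled = pipe(low_res, num_inference_steps=100, eta=1).images[0]
upscaled.save("upscaled.png")
```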
CycleDiffusion
CycleDiffusion is a method that uses Text-to-Image Diffusion Models for Image-to-Image Editing. It is capable of
1. Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
2. Traditional unpaired image-to-image translation with diffusion models trained on two related domains.
- Add CycleDiffusion pipeline using Stable Diffusion by ChenWu98 in 888
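A minimal sketch of zero-shot editing with Stable Diffusion. CycleDiffusion relies on DDIM inversion, so a `DDIMScheduler` is used; the `init_image` argument name and the guidance values are assumptions based on the pipeline docs:

```python
import torch
import requests
from io import BytesIO
from PIL import Image
from diffusers import CycleDiffusionPipeline, DDIMScheduler

model_id = "CompVis/stable-diffusion-v1-4"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = CycleDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
init_image = Image.open(BytesIO(requests.get(url).content)).convert("RGB").resize((512, 512))

# source_prompt describes the input image, prompt describes the desired edit
image = pipe(
    prompt="a blue car",
    source_prompt="a red car",
    init_image=init_image,  # may be called `image` in newer versions
    strength=0.85,
    guidance_scale=2,
    source_guidance_scale=1,
).images[0]
image.save("cycle_diffusion.png")
```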
CLIPSeg + StableDiffusionInpainting
Uses [CLIPSeg](https://huggingface.co/docs/transformers/main/en/model_doc/clipseg) to automatically generate a segmentation mask for a given text query, and then applies Stable Diffusion in-painting to the masked region.
- [Community Pipeline] CLIPSeg + StableDiffusionInpainting by unography in 1250
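A minimal sketch of the community pipeline; the `text_inpainting` pipeline name and the `segmentation_model` / `segmentation_processor` arguments are taken from the community pipelines README and should be double-checked there:

```python
import requests
from io import BytesIO
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation
from diffusers import DiffusionPipeline

# CLIPSeg produces a segmentation mask from a text query
processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    custom_pipeline="text_inpainting",
    segmentation_model=model,
    segmentation_processor=processor,
)
pipe = pipe.to("cuda")

url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
image = Image.open(BytesIO(requests.get(url).content)).convert("RGB").resize((512, 512))

# segment everything that matches `text` and in-paint it with `prompt`
result = pipe(image=image, text="the car", prompt="a vintage convertible").images[0]
result.save("text_inpainting.png")
```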
K-Diffusion wrapper
The K-Diffusion pipeline is a community pipeline that allows using any sampler from [K-diffusion](https://github.com/crowsonkb/k-diffusion) with `diffusers` models.
- [Community Pipelines] K-Diffusion Pipeline by patrickvonplaten in 1360
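A minimal sketch of the community pipeline; the `sd_text2img_k_diffusion` pipeline name and the `set_scheduler` helper are taken from the community pipelines README (the wrapper also needs the `k-diffusion` package installed):

```python
from diffusers import DiffusionPipeline

# load Stable Diffusion through the k-diffusion community wrapper
pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", custom_pipeline="sd_text2img_k_diffusion"
)
pipe = pipe.to("cuda")

# pick any sampler exposed by k-diffusion by name, e.g. Heun
pipe.set_scheduler("sample_heun")

image = pipe("an astronaut riding a horse on mars", num_inference_steps=20).images[0]
image.save("k_diffusion.png")
```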
New SOTA Scheduler
`DPMSolverMultistepScheduler` is the 🧨 `diffusers` implementation of [DPM-Solver++](https://github.com/LuChengTHU/dpm-solver), a state-of-the-art scheduler that was contributed by one of the authors of the paper. This scheduler is able to achieve great quality in as few as 20 steps. It's a drop-in replacement for the default Stable Diffusion scheduler, so you can use it to essentially halve generation times. It works so well that we adopted it for the Stable Diffusion demo Spaces: https://huggingface.co/spaces/stabilityai/stable-diffusion, https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5.
You can use it like this:
```python
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"
scheduler = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)
```
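Since the scheduler converges in roughly 20 steps, you can then lower `num_inference_steps` accordingly when calling the pipeline. A minimal continuation of the snippet above (the prompt is just an example):

```python
stable_diffusion = stable_diffusion.to("cuda")
image = stable_diffusion("an astronaut riding a horse on mars", num_inference_steps=20).images[0]
image.save("dpm_solver.png")
```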
Better scheduler API
The example above also demonstrates how to load schedulers using a new API that is coherent with model loading and therefore more natural and intuitive.
You can load a scheduler using `from_pretrained`, as demonstrated above, or you can instantiate one from an existing scheduler configuration. This is a way to replace the scheduler of a pipeline that was previously loaded:
```python
from diffusers import DiffusionPipeline, EulerDiscreteScheduler

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
```
Read more about these changes in [the documentation](https://huggingface.co/docs/diffusers/v0.8.0/en/using-diffusers/schedulers). See also the community pipeline that allows using any of the [K-diffusion](https://github.com/crowsonkb/k-diffusion) samplers with `diffusers`, as mentioned above!
Performance
We work relentlessly to incorporate performance optimizations and memory-reduction techniques into 🧨 diffusers. These are two of the most noteworthy additions in this release:
- Enable memory-efficient attention by default if [xFormers](https://github.com/facebookresearch/xformers) is installed (see the sketch after this list).
- Use batched-matmuls when possible.
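To make the first bullet concrete: when xFormers is installed, memory-efficient attention is now enabled automatically, and the explicit opt-in call from previous releases remains available. A minimal sketch, assuming a CUDA device and the `runwayml/stable-diffusion-v1-5` checkpoint:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# with xFormers installed this is now the default; the explicit call is kept here for clarity
pipe.enable_xformers_memory_efficient_attention()

image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```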
Quality of Life improvements
- Fix/Enable all schedulers for in-painting
- Easier loading of local pipelines
- CPU offloading: multi-GPU support
:memo: Changelog
* Add multistep DPM-Solver discrete scheduler by LuChengTHU in 1132
* Remove warning about half precision on MPS by pcuenca in 1163
* Fix typo latens -> latents by duongna21 in 1171
* Fix community pipeline links by pcuenca in 1162
* [Docs] Add loading script by patrickvonplaten in 1174
* Fix dtype safety checker inpaint legacy by patrickvonplaten in 1137
* Community pipeline img2img inpainting by vvvm23 in 1114
* [Community Pipeline] Add multilingual stable diffusion to community pipelines by juancopi81 in 1142
* [Flax examples] Load text encoder from subfolder by duongna21 in 1147
* Link to Dreambooth blog post instead of W&B report by pcuenca in 1180
* Fix small typo by pcuenca in 1178
* [DDIMScheduler] fix noise device in ddim step by patil-suraj in 1189
* MPS schedulers: don't use float64 by pcuenca in 1169
* Warning for invalid options without "--with_prior_preservation" by shirayu in 1065
* [ONNX] Improve ONNXPipeline scheduler compatibility, fix safety_checker by anton-l in 1173
* Restore compatibility with deprecated `StableDiffusionOnnxPipeline` by pcuenca in 1191
* Update pr docs actions by mishig25 in 1194
* handle dtype xformers attention by patil-suraj in 1196
* [Scheduler] Move predict epsilon to init by patrickvonplaten in 1155
* add licenses to pipelines by natolambert in 1201
* Fix cpu offloading by anton-l in 1177
* Fix slow tests by patrickvonplaten in 1210
* [Flax] fix extra copy pasta by camenduru in 1187
* [CLIPGuidedStableDiffusion] support DDIM scheduler by patil-suraj in 1190
* Fix layer names convert LDM script by duongna21 in 1206
* [Loading] Make sure loading edge cases work by patrickvonplaten in 1192
* Add LDM Super Resolution pipeline by duongna21 in 1116
* [Conversion] Improve conversion script by patrickvonplaten in 1218
* DDIM docs by patrickvonplaten in 1219
* apply `repeat_interleave` fix for `mps` to stable diffusion image2image pipeline by jncasey in 1135
* Flax tests: don't hardcode number of devices by pcuenca in 1175
* Improve documentation for the LPW pipeline by exo-pla-net in 1182
* Factor out encode text with Copied from by patrickvonplaten in 1224
* Match the generator device to the pipeline for DDPM and DDIM by anton-l in 1222
* [Tests] Fix mps+generator fast tests by anton-l in 1230
* [Tests] Adjust TPU test values by anton-l in 1233
* Add a reference to the name 'Sampler' by apolinario in 1172
* Fix Flax usage comments by pcuenca in 1211
* [Docs] improve img2img example by ruanrz in 1193
* [Stable Diffusion] Fix padding / truncation by patrickvonplaten in 1226
* Finalize stable diffusion refactor by patrickvonplaten in 1269
* Edited attention.py for older xformers by Lime-Cakes in 1270
* Fix wrong link in text2img fine-tuning documentation by daspartho in 1282
* [StableDiffusionInpaintPipeline] fix batch_size for mask and masked latents by patil-suraj in 1279
* Add UNet 1d for RL model for planning + colab by natolambert in 105
* Fix documentation typo for `UNet2DModel` and `UNet2DConditionModel` by xenova in 1275
* add source link to composable diffusion model by nanliu1 in 1293
* Fix incorrect link to Stable Diffusion notebook by dhruvrnaik in 1291
* [dreambooth] link to bitsandbytes readme for installation by 0xdevalias in 1229
* Add Scheduler.from_pretrained and better scheduler changing by patrickvonplaten in 1286
* Add AltDiffusion by patrickvonplaten in 1299
* Better error message for transformers dummy by patrickvonplaten in 1306
* Revert "Update pr docs actions" by mishig25 in 1307
* [AltDiffusion] add tests by patil-suraj in 1311
* Add improved handling of pil by patrickvonplaten in 1309
* cpu offloading: mutli GPU support by dblunk88 in 1143
* vq diffusion classifier free sampling by williamberman in 1294
* doc string args shape fix by kamalkraj in 1243
* [Community Pipeline] CLIPSeg + StableDiffusionInpainting by unography in 1250
* Temporary local test for PIL_INTERPOLATION by pcuenca in 1317
* Fix gpu_id by anton-l in 1326
* integrate ort by prathikr in 1110
* [Custom pipeline] Easier loading of local pipelines by patrickvonplaten in 1327
* [ONNX] Support Euler schedulers by anton-l in 1328
* img2text Typo by patrickvonplaten in 1329
* add docs for multi-modal examples by natolambert in 1227
* [Flax] Fix loading scheduler from subfolder by skirsten in 1319
* Fix/Enable all schedulers for in-painting by patrickvonplaten in 1331
* Correct path to schedlure by patrickvonplaten in 1322
* Avoid nested fix-copies by anton-l in 1332
* Fix img2img speed with LMS-Discrete Scheduler by NotNANtoN in 896
* Fix the order of casts for onnx inpainting by anton-l in 1338
* Legacy Inpainting Pipeline for Onnx Models by ctsims in 1237
* Jax infer support negative prompt by entrpn in 1337
* Update README.md: IMAGIC example code snippet misspelling by ki-arie in 1346
* Update README.md: Minor change to Imagic code snippet, missing dir error by ki-arie in 1347
* Handle batches and Tensors in `pipeline_stable_diffusion_inpaint.py:prepare_mask_and_masked_image` by vict0rsch in 1003
* change the sample model by shunxing1234 in 1352
* Add bit diffusion [WIP] by kingstut in 971
* perf: prefer batched matmuls for attention by Birch-san in 1203
* [Community Pipelines] K-Diffusion Pipeline by patrickvonplaten in 1360
* Add Safe Stable Diffusion Pipeline by manuelbrack in 1244
* [examples] fix mixed_precision arg by patil-suraj in 1359
* use memory_efficient_attention by default by patil-suraj in 1354
* Replace logger.warn by logger.warning by regisss in 1366
* Fix using non-square images with UNet2DModel and DDIM/DDPM pipelines by jenkspt in 1289
* handle fp16 in `UNet2DModel` by patil-suraj in 1216
* StableDiffusionImageVariationPipeline by patil-suraj in 1365